Releases: zaneveld/full_spectrum_bioinformatics

Full Spectrum Bioinformatics Development Release 2024.2.0

15 Feb 23:11
59a443f

What's Changed

New Content

Quickly introduce command line software

The new 'Duck vs Yeast' exercise lets students who have just learned to navigate the command line immediately apply those skills to identify homologs of duck delta crystallin protein in yeast. It is intended as an immediate example of how one might use command line software with a few parameters that take values in a bioinformatic analysis.

Quickly learn to make graphs and run basic statistics in Python

This release shifts the teaching approach for new students who have just learned Python. Instead of starting with more traditional Python topics, the text now first introduces how to accomplish some useful tasks with just a little Python. After a brief introduction to strings, ints, lists, and calling functions, we immediately establish some 'quick wins' by using Python to graph (matplotlib boxplots or scatterplots) and statistically analyze (t-test or Pearson correlation) hard-coded data entered by hand as lists. We then build on this in the 'Another Quick Win' chapter by discussing the Tidy data format and how to load tabular Tidy-format data into Python. One reason for this change is to let learners see how to organize the data from their final bioinformatic analyses so that it is easy to graph and analyze, before they start writing analytical code.

  • Added Quick Wins in Python section by @zaneveld in #149
  • Updated 'another quick win' and associated resources by @zaneveld in #167
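
As a rough illustration of the kind of 'quick win' described above, here is a minimal sketch that draws a boxplot and runs a two-sample t-test on hand-entered lists. The measurements, group names, and output file name are hypothetical and are not taken from the textbook.

```python
import matplotlib.pyplot as plt
from scipy import stats

# Hypothetical, hand-entered measurements (illustration only, not real data)
control = [4.1, 3.8, 4.4, 4.0, 3.9]
treated = [5.2, 4.9, 5.6, 5.1, 5.3]

# Draw one box per group and label the x axis
fig, ax = plt.subplots()
ax.boxplot([control, treated])
ax.set_xticks([1, 2])
ax.set_xticklabels(["control", "treated"])
ax.set_ylabel("measurement (arbitrary units)")
fig.savefig("quick_win_boxplot.png")  # hypothetical output file name

# Two-sample t-test comparing the means of the two lists
t_stat, p_value = stats.ttest_ind(control, treated)
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
```

In a Jupyter notebook the figure would normally display inline, so the `savefig` call is optional there.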

Added discussion of scientific writing for bioinformatics

This release includes a guide to installing and using Zotero for reference management, contributed by Dr. Mushtaq Bilal. The goal is to make citation less time-consuming, so that it is easier to appropriately reference the papers that establish the background for our bioinformatics projects.

The release also includes updates to guidance for writing about the literature.

Added discussion of merging tables in pandas

A new chapter covers merging tables using pandas. This is important for the many projects that draw data from multiple sources. Examples include annotating gene functional categories in a table of gene IDs, or compiling demographic characteristics of cities, states, or countries from government sources.

  • Adding data files for table merging chapter by @zaneveld in #152
  • Adding the merging tables chapter by @zaneveld in #153
  • Updated the merging tables section to include information on avoiding… by @zaneveld in #154
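
To give a sense of what merging tables looks like in practice, here is a minimal sketch using `pandas.merge` to annotate a table of gene IDs with functional categories. The table contents and column names are hypothetical, and the chapter itself may use a different approach.

```python
import pandas as pd

# Hypothetical expression table keyed by gene ID (illustration only)
genes = pd.DataFrame(
    {"gene_id": ["g1", "g2", "g3"], "expression": [10.5, 3.2, 7.8]}
)

# Hypothetical annotation table; note that g3 is missing and g4 is extra
annotations = pd.DataFrame(
    {"gene_id": ["g1", "g2", "g4"],
     "category": ["metabolism", "transport", "signaling"]}
)

# A left merge keeps every row of `genes`; indicator=True adds a `_merge`
# column recording whether each row found a match in `annotations`
merged = pd.merge(genes, annotations, on="gene_id", how="left", indicator=True)
print(merged)

# Rows of `genes` that had no matching annotation
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched["gene_id"].tolist())
```

Checking the `_merge` column after a merge is one simple way to spot rows that silently failed to match.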

Edits

This release includes several edits to improve existing chapters:

  • Additional edits to the command line chapter by @zaneveld in #134
  • Update exercise_little_brother_is_missing.ipynb by @zaneveld in #135
  • Update to DataFrame chapter for data downloads and series math explanations by @zaneveld in #142
  • Edited the analyzing tabular omic data in python chapter by @zaneveld in #143
  • Updated text and the graph in the sequencing depth chapter by @zaneveld in #144
  • Updated reading response link in merging and filtering data in pandas… by @zaneveld in #158
  • Updated fasta file reading chapter text and added an exercise by @zaneveld in #165
  • Add Git bash install instructions to Command Line interfaces chapter by @zaneveld in #133

Fixes

This release includes fixes to some bugs and dead links.

Development of Future Chapters

This release includes some behind-the-scenes work drafting sections that are not yet ready for use. This includes preliminary text and graphics for a chapter introducing the use of GitHub, as well as a case study, using the NHANES data, that demonstrates how to remap column names when they are encoded as IDs, as is common in some US government datasets.

  • Added a draft version of the chapter on using GitHub and supporting images by @zaneveld in #162
  • Update to github draft chapter by @zaneveld in #164
  • Initial commit of NHANES sleep data tutorial in python by @zaneveld in #166
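
The column remapping idea mentioned above can be sketched with `DataFrame.rename` and a dictionary that maps coded names to readable ones. The codes, readable names, and values below are invented for illustration; they are not actual NHANES variable names, and the draft tutorial may do this differently.

```python
import pandas as pd

# Hypothetical table whose columns are opaque codes (not real NHANES codes)
raw = pd.DataFrame(
    {"ID001": [1, 2], "SLP001": [7.5, 6.0], "SLP002": [8.0, 9.5]}
)

# Hypothetical codebook mapping each code to a human-readable name
code_to_name = {
    "ID001": "participant_id",
    "SLP001": "sleep_hours_weekday",
    "SLP002": "sleep_hours_weekend",
}

# rename returns a new DataFrame; codes absent from the mapping are left as-is
renamed = raw.rename(columns=code_to_name)
print(renamed.columns.tolist())
```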

New Contributors

  • @yeemey made their first contribution in #168
  • Dr. Mushtaq Bilal contributed a chapter on Zotero

Full Changelog: release-2022.3.1...release-2024.2.0

Full Spectrum Bioinformatics Development Release 2022.3.1

02 Mar 19:45
38cacaf

What's Changed

The 2022.3.1 release of Full Spectrum Bioinformatics greatly expands the scope and maturity of the text, including contributions from three undergraduate co-authors. The text has now been used to support multiple classes and has 35 sections that are linked from the table of contents and ready for classroom use.

Here are some of the major changes:

The text has several new sections:
-- An overview of Python syntax now shows how to recognize Python syntax before we dive into studying the details
-- A first chapter on sequence alignment now covers Needleman-Wunsch alignment, both as worked by hand using a simple example, and an implementation in numpy.
-- The text now discusses linear models, with accompanying illustrations as well as figures
-- An Error Bingo exercise now encourages students to intentionally trigger and learn from errors
-- An extensive section has been added discussing common errors in python, why they most commonly occur, and how to fix them.
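
For readers unfamiliar with the Needleman-Wunsch alignment mentioned in the list above, here is a minimal sketch of the score-matrix fill step using numpy. The function name, scoring values (match +1, mismatch -1, gap -1), and example sequences are assumptions chosen for illustration; this is not the textbook's implementation, and it omits the traceback that recovers the alignment itself.

```python
import numpy as np

def needleman_wunsch_scores(seq1, seq2, match=1, mismatch=-1, gap=-1):
    """Fill the global alignment score matrix for two sequences.

    Rows correspond to positions in seq1, columns to positions in seq2.
    The bottom-right cell holds the best global alignment score.
    """
    n_rows, n_cols = len(seq1) + 1, len(seq2) + 1
    scores = np.zeros((n_rows, n_cols), dtype=int)

    # First column and first row: aligning a prefix against only gaps
    scores[:, 0] = gap * np.arange(n_rows)
    scores[0, :] = gap * np.arange(n_cols)

    for i in range(1, n_rows):
        for j in range(1, n_cols):
            is_match = seq1[i - 1] == seq2[j - 1]
            diagonal = scores[i - 1, j - 1] + (match if is_match else mismatch)
            up = scores[i - 1, j] + gap      # gap inserted in seq2
            left = scores[i, j - 1] + gap    # gap inserted in seq1
            scores[i, j] = max(diagonal, up, left)
    return scores

# Hypothetical example sequences
matrix = needleman_wunsch_scores("GATTACA", "GCATGCG")
print(matrix)
print("Best global alignment score:", matrix[-1, -1])
```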

-- Three undergraduate contributors have added Bioinformatics Vignettes showing how to apply the principles in the text to biological problems:
- Nia Prabhu (nucleotide composition)
- Aziz Bajouri (set analysis)
- Ayomikun Akinrinade (machine learning)

-- A section has been added on revising writing about statistical results
-- An initial draft section on visualizing correlation has been added showing how a scatterplot can be revised to add linear regression results, 95% confidence intervals, and to better meet recommendations for data visualization.
-- The Data Sources page has been greatly updated, and now includes logos for linked resources

New Draft Sections:
-- A draft section on student activism and fighting for an inclusive workplace has been added.
-- A draft section on network analysis has several in-progress code commits (not yet linked from main table of contents)

Other changes:
-- Full Spectrum Bioinformatics has now adopted a code of conduct
-- Many minor fixes
-- Exercises have been added to many sections that previously lacked them
-- The exercise on calculating CG content in the human genome has been updated
-- Several chapters have been updated to include Feedback links that were previously missing
-- Unused Jupyter Book files have been removed

Full Changelog: release-2020.12.1...release-2022.3.1

Full Spectrum Bioinformatics Development Release 2020.12.1

08 Dec 20:43
24badca

This is an initial development release of the Full Spectrum Bioinformatics online textbook. This is not a full release of the entire planned textbook, but rather an incremental development release of some content that is sufficiently developed that it has been used in classes.

Some current features include:
-- A series of open-access Jupyter Notebooks discussing topics in Bioinformatics.
-- Links to Google Colab to allow students to run notebooks in a browser without installing software
-- An outline table of contents shows planned sections, with sections that are in beta status available as live links.
-- This release includes 21 new sections, covering topics ranging from sequence analysis to how to revise one's writing about statistical results:

Foreword
The Command Line
Using the Command Line
Exercise: Little Brother is Missing
Exploring Python
Exploring Python
A Tour of Python Data Types
Project Design
Using Literature Surveys to Ask Good Questions and Propose Testable Hypotheses
Biological Sequences
An introduction to Biological Sequences
Representing and Manipulating Biological Sequences as Python Strings
Analyzing Biological Sequences with For Loops and If Statements
Reading and writing FASTA files using Python
'Omics
An Introduction to 'Omics
Working with Tabular 'Omic data in Python using Pandas
Phylogenetic Trees
Representing Phylogenetic Trees with Python Classes
Generating Trees Using Birth-Death Models
Simulation
Simulating the Population Genetics of Natural Selection and Genetic Drift
Statistics
Rank Transformations
Monte Carlo simulation of Effect Size, Sample Size, and Significance
Dealing with Multiple Comparisons
Exercise: Revising your writing about statistical results
Polishing and Publishing
Presenting Research
Careers that draw on Bioinformatics
Applying for Grants

NOTE: this release is very similar to release-2020.12.0 apart from minor edits to the readme, but I needed to re-release to trigger Zenodo to generate a DOI.

Full Spectrum Bioinformatics Development Release 2020.12.0

07 Dec 17:39
f32fd37

This is an initial development release of the Full Spectrum Bioinformatics online textbook. This is not a full release of the entire planned textbook, but rather an incremental development release of some content that is sufficiently developed that it has been used in classes.

Some current features include:
-- A series of open-access Jupyter Notebooks discussing topics in Bioinformatics.
-- Links to Google Colab to allow students to run notebooks in a browser without installing software
-- An outline table of contents shows planned sections, with sections that are in beta status available as live links.
-- This release includes 21 new sections, covering topics ranging from sequence analysis to how to revise one's writing about statistical results:

Foreword
The Command Line
Using the Command Line
Exercise: Little Brother is Missing
Exploring Python
Exploring Python
A Tour of Python Data Types
Project Design
Using Literature Surveys to Ask Good Questions and Propose Testable Hypotheses
Biological Sequences
An introduction to Biological Sequences
Representing and Manipulating Biological Sequences as Python Strings
Analyzing Biological Sequences with For Loops and If Statements
Reading and writing FASTA files using Python
'Omics
An Introduction to 'Omics
Working with Tabular 'Omic data in Python using Pandas
Phylogenetic Trees
Representing Phylogenetic Trees with Python Classes
Generating Trees Using Birth-Death Models
Simulation
Simulating the Population Genetics of Natural Selection and Genetic Drift
Statistics
Rank Transformations
Monte Carlo simulation of Effect Size, Sample Size, and Significance
Dealing with Multiple Comparisons
Exercise: Revising your writing about statistical results
Polishing and Publishing
Presenting Research
Careers that draw on Bioinformatics
Applying for Grants