Merge pull request #2031 from gregcaporaso/191-release

1.9.1 release
biocore · May 26, 2015 · fd59c20
2 parents 8c5a051 + 85ceed2
commit fd59c20

Showing 418 changed files with 494 additions and 474 deletions.
diff --git a/ChangeLog.md b/ChangeLog.md
@@ -1,12 +1,12 @@
-QIIME 1.9.0-dev
-===============
+QIIME 1.9.1
+===========
 
 Bug fixes
 ---------
 
-* Updated minimum required version of the [qiime-default-reference](http://github.com/biocore/qiime-default-reference) package to 0.1.2. **This release includes an important bug fix described in more detail in [this QIIME blog post](https://qiime.wordpress.com/2015/04/15/qiime-1-9-0-bug-affecting-pynast-alignment-of-16s-amplicons-generated-with-non-515f806r-primers/) and in [biocore/qiime-default-reference#14](https://github.com/biocore/qiime-default-reference/issues/14).**
-* Fixed bug in ``differential_abundance.py`` fitZIG algorithm ([#1960](https://github.com/biocore/qiime/pull/1960)). **This was a serious bug that was encountered when users would call ``differential_abundance.py -a metagenomeSeq_fitZIG``. Any results previously generated with that command should be re-run.**
-* Fixed serious bug in ``observation_metadata_correlation.py``, described in [#2009](https://github.com/biocore/qiime/issues/2009). **All previous output generated with ``observation_metadata_correlation.py`` was incorrect, and analyses using those results should be re-run.** This most commonly would have resulted in massive Type 2 error (false negatives), where observations whose abundance is correlated with metadata are not reported, though Type 1 error (false positives) is also possible.
+* **Critical**: Updated minimum required version of the [qiime-default-reference](http://github.com/biocore/qiime-default-reference) package to 0.1.2. **This release includes an important bug fix described in more detail in [this QIIME blog post](https://qiime.wordpress.com/2015/04/15/qiime-1-9-0-bug-affecting-pynast-alignment-of-16s-amplicons-generated-with-non-515f806r-primers/) and in [biocore/qiime-default-reference#14](https://github.com/biocore/qiime-default-reference/issues/14).**
+* **Critical**: Fixed bug in ``differential_abundance.py`` fitZIG algorithm ([#1960](https://github.com/biocore/qiime/pull/1960)). **This was a serious bug that was encountered when users would call ``differential_abundance.py -a metagenomeSeq_fitZIG``. Any results previously generated with that command should be re-run.**
+* **Critical**: Fixed bug in ``observation_metadata_correlation.py``, described in [#2009](https://github.com/biocore/qiime/issues/2009). **All previous output generated with ``observation_metadata_correlation.py`` was incorrect, and analyses using those results should be re-run.** This most commonly would have resulted in massive Type 2 error (false negatives), where observations whose abundance is correlated with metadata are not reported, though Type 1 error (false positives) is also possible.
 * ``count_seqs.py`` no longer fails on empty files. [#1991](https://github.com/biocore/qiime/issues/1991)
 * Updated minimum required version of [biom-format](http://github.com/biocore/biom-format) package to 2.1.4. This is a bug fix release. Details are available in the [biom-format ChangeLog](https://github.com/biocore/biom-format/blob/master/ChangeLog.md).
 * Updated minimum required version of [Emperor](http://github.com/biocore/emperor) package to 0.9.51.
@@ -16,7 +16,6 @@ Bug fixes
 * Fixed issue where ``filter_samples_from_otu_table.py`` could only filter the mapping file when ``--valid_states`` was passed as the filtering method ([#2003](https://github.com/biocore/qiime/issues/2003)).
 * Fixed bug where distance matrix files generated by QIIME (e.g., using ``beta_diversity.py``) could have diagonals with values that were close to zero in rare cases (depending on input data, machine architecture, installed dependencies, etc.). These files could not be loaded by QIIME scripts that accepted distance matrix files as input (e.g., ``principal_coordinates.py``) and would result in an error message stating that the distance matrix was not hollow. Values on the diagonal that are close to zero are now set to 0.0 ([#1933](https://github.com/biocore/qiime/issues/1933)).
 
-
 Usability enhancements
 ----------------------
 
@@ -25,6 +24,7 @@ Usability enhancements
 * If ``temp_dir`` is not defined in the QIIME config file, QIIME will use the system's default temporary directory instead of assuming that ``/tmp`` is present and writeable. Note that the location of this default temporary directory [can be changed with environment variables](https://docs.python.org/2/library/tempfile.html#tempfile.tempdir) ([#1995](https://github.com/biocore/qiime/issues/1995)).
 * Improve error reporting from ``filter_taxa_from_otu_table.py``, ``filter_otus_from_otu_table.py``, and ``filter_samples_from_otu_table.py`` when all OTUs/samples are filtered out resulting in an empty table ([#1963](https://github.com/biocore/qiime/issues/1963)), and generally when attempting to write an empty BIOM table from QIIME.
 * Added ability to pass user-defined runtime limit for jobs to ``start_parallel_jobs_slurm.py``. This can be achieved by setting the ``slurm_time`` variable in ``qiime_config``, or by passing ``--time`` to ``start_parallel_jobs_slurm.py``.
+* Distance matrices and UPGMA trees generated from the full (unrarefied) OTU table are now stored under ``unrarefied_bdiv`` in the output directory from ``jackknifed_beta_diversity.py``. That UPGMA tree is optionally used (if the user passes ``--master_tree full``). This change makes their content more explicit so they're less likely to be used by accident ([#2024](https://github.com/biocore/qiime/issues/2024)).
 
 QIIME 1.9.0
 ===========
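
The hollow distance matrix fix in the ChangeLog above (#1933) can be illustrated with a minimal sketch. It is not QIIME's implementation: the function names and the `tolerance` value are assumptions chosen for illustration, and the matrix is a plain list of lists rather than a QIIME distance matrix object.

```python
def zero_near_zero_diagonal(matrix, tolerance=1e-10):
    """Return a copy of a square matrix with near-zero diagonal entries set to 0.0.

    `tolerance` is an assumed threshold, not the value QIIME uses.
    """
    cleaned = [list(row) for row in matrix]
    for i in range(len(cleaned)):
        # Only touch diagonal values that are numerically indistinguishable from 0.
        if abs(cleaned[i][i]) <= tolerance:
            cleaned[i][i] = 0.0
    return cleaned


def is_hollow(matrix):
    """A distance matrix is hollow when every diagonal entry is exactly zero."""
    return all(matrix[i][i] == 0.0 for i in range(len(matrix)))


# Floating-point noise on the diagonal makes the matrix fail the hollow check.
noisy = [[1e-17, 0.5], [0.5, -2e-16]]
fixed = zero_near_zero_diagonal(noisy)
```

Diagonal values larger than the tolerance are left alone, so a genuinely malformed matrix would still be rejected by a hollow check downstream.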

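The ``temp_dir`` fallback described in the ChangeLog above (#1995) might look roughly like the sketch below. The `config` dict and the `resolve_temp_dir` name are hypothetical stand-ins for however QIIME reads its config file; only `tempfile.gettempdir()` is a real standard-library call.

```python
import tempfile


def resolve_temp_dir(config):
    """Return the configured temp_dir, or the system default when it is unset.

    `config` is a hypothetical dict of parsed config values.
    """
    configured = config.get("temp_dir")
    if configured:
        return configured
    # gettempdir() checks the TMPDIR, TEMP and TMP environment variables
    # before falling back to platform-specific directories.
    return tempfile.gettempdir()
```

Because the fallback goes through `tempfile`, setting an environment variable such as `TMPDIR` before starting Python changes the default without editing the config file.
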
diff --git a/doc/conf.py b/doc/conf.py
@@ -46,9 +46,9 @@
 # built documents.
 #
 # The short X.Y version.
-version = '1.9.0-dev'
+version = '1.9.1'
 # The full version, including alpha/beta/rc tags.
-release = '1.9.0-dev'
+release = '1.9.1'
 
 # The language for content autogenerated by Sphinx. Refer to documentation
 # for a list of supported languages.

diff --git a/doc/scripts/align_seqs.rst b/doc/scripts/align_seqs.rst
@@ -37,7 +37,7 @@ This script aligns the sequences in a FASTA file to each other or to a template
 	-a, `-`-pairwise_alignment_method
 		Method for performing pairwise alignment in PyNAST. Valid choices are muscle, pair_hmm, clustal, blast, uclust, mafft [default: uclust]
 	-t, `-`-template_fp
-		Filepath for template alignment [default: /Users/jairideout/.virtualenvs/qiime/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.fasta]
+		Filepath for template alignment [default: /Users/caporaso/.virtualenvs/qiime/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta]
 	-e, `-`-min_length
 		Minimum sequence length to include in alignment [default: 75% of the median input sequence length]
 	-p, `-`-min_percent_id

diff --git a/doc/scripts/assign_taxonomy.rst b/doc/scripts/assign_taxonomy.rst
@@ -33,9 +33,9 @@ Reference data sets and id-to-taxonomy maps for 16S rRNA sequences can be found
 	**[OPTIONAL]**
 
 	-t, `-`-id_to_taxonomy_fp
-		Path to tab-delimited file mapping sequences to assigned taxonomy. Each assigned taxonomy is provided as a semicolon-separated list. For assignment with rdp, each assigned taxonomy must be exactly 6 levels deep. [default: /Users/jairideout/.virtualenvs/qiime/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt]
+		Path to tab-delimited file mapping sequences to assigned taxonomy. Each assigned taxonomy is provided as a semicolon-separated list. For assignment with rdp, each assigned taxonomy must be exactly 6 levels deep. [default: /Users/caporaso/.virtualenvs/qiime/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt]
 	-r, `-`-reference_seqs_fp
-		Path to reference sequences.  For assignment with blast, these are used to generate a blast database. For assignment with rdp, they are used as training sequences for the classifier. [default: /Users/jairideout/.virtualenvs/qiime/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta]
+		Path to reference sequences.  For assignment with blast, these are used to generate a blast database. For assignment with rdp, they are used as training sequences for the classifier. [default: /Users/caporaso/.virtualenvs/qiime/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta]
 	-p, `-`-training_data_properties_fp
 		Path to ".properties" file in pre-compiled training data for the RDP Classifier.  This option is overridden by the -t and -r options. [default: None]
 	`-`-read_1_seqs_fp

diff --git a/doc/scripts/collapse_samples.rst b/doc/scripts/collapse_samples.rst
@@ -33,7 +33,7 @@ Collapse samples in a BIOM table and mapping file. Values in the BIOM table are
 	**[OPTIONAL]**
 
 	`-`-collapse_mode
-		The mechanism for collapsing counts within groups; valid options are: mean, sum, random, median, first
+		The mechanism for collapsing counts within groups; valid options are: mean, sum, random, median, first. [default: sum]
 	`-`-normalize
 		Normalize observation counts to relative abundances, so the counts within each sample sum to 1.0. [default: False]
 

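The ``--collapse_mode`` options listed above can be sketched for a single observation as follows. This is an illustration only: `collapse_counts` is a hypothetical helper, and QIIME's own behavior for ``random`` and ``first`` (for example, whether a whole sample is chosen once per group) may differ from this per-observation version.

```python
import random
import statistics


def collapse_counts(counts, mode="sum", rng=None):
    """Collapse one observation's counts across the samples in a group.

    `counts` is a non-empty list of numbers; `mode` mirrors the documented
    option names. `rng` is an optional random.Random for reproducibility.
    """
    if not counts:
        raise ValueError("cannot collapse an empty group")
    if mode == "sum":
        return sum(counts)
    if mode == "mean":
        return statistics.mean(counts)
    if mode == "median":
        return statistics.median(counts)
    if mode == "first":
        return counts[0]
    if mode == "random":
        return (rng or random).choice(counts)
    raise ValueError("unknown collapse mode: %r" % mode)


group = [2, 4, 9]  # counts for one OTU in three samples of the same group
total = collapse_counts(group)  # default mode is "sum"
```
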
diff --git a/doc/scripts/denoiser_worker.rst b/doc/scripts/denoiser_worker.rst
@@ -32,7 +32,7 @@ A worker waits for data and does flowgram alignments once it gets it.
 	**[OPTIONAL]**
 
 	-e, `-`-error_profile
-		Path to error profile [DEFAULT: /Users/jairideout/dev/qiime/qiime/support_files/denoiser/Data/FLX_error_profile.dat]
+		Path to error profile [DEFAULT: /Users/caporaso/Dropbox/code/Qiime/qiime/support_files/denoiser/Data/FLX_error_profile.dat]
 	-c, `-`-counter
 		Round counter to start this worker with  [default: 0]
 

diff --git a/doc/scripts/exclude_seqs_by_blast.rst b/doc/scripts/exclude_seqs_by_blast.rst
@@ -48,7 +48,7 @@ WARNING: You cannot use this script if there are spaces in the path to the datab
 	`-`-blastmatroot
 		Path to a folder containing blast matrices [default: None].
 	`-`-working_dir
-		Working dir for BLAST [default: /tmp/].
+		Working dir for BLAST [default: /Users/caporaso/temp/].
 	-m, `-`-max_hits
 		Max hits parameter for BLAST. CAUTION: Because filtering on alignment percentage occurs after BLAST, a max hits value of 1 in combination with an alignment percent filter could miss valid contaminants. [default: 100]
 	-w, `-`-word_size

diff --git a/doc/scripts/filter_fasta.rst b/doc/scripts/filter_fasta.rst
@@ -27,23 +27,23 @@
 	**[OPTIONAL]**
 
 	-m, `-`-otu_map
-		An OTU map where sequence ids are those which should be retained
+		An OTU map where sequence ids are those which should be retained.
 	-s, `-`-seq_id_fp
-		A list of sequence identifiers (or tab-delimited lines with a seq identifier in the first field) which should be retained
+		A list of sequence identifiers (or tab-delimited lines with a seq identifier in the first field) which should be retained.
 	-b, `-`-biom_fp
-		A biom file where otu identifiers should be retained
+		A biom file where otu identifiers should be retained.
 	-a, `-`-subject_fasta_fp
 		A fasta file where the seq ids should be retained.
 	-p, `-`-seq_id_prefix
-		Keep seqs where seq_id starts with this prefix
+		Keep seqs where seq_id starts with this prefix.
 	`-`-sample_id_fp
-		Keep seqs where seq_id starts with a sample id listed in this file
+		Keep seqs where seq_id starts with a sample id listed in this file. Must be newline delimited and may not contain a header.
 	-n, `-`-negate
-		Discard passed seq ids rather than keep passed seq ids [default: False]
+		Discard passed seq ids rather than keep passed seq ids. [default: False]
 	`-`-mapping_fp
-		Mapping file path (for use with --valid_states) [default: None]
+		Mapping file path (for use with --valid_states). [default: None]
 	`-`-valid_states
-		Description of sample ids to retain (for use with --mapping_fp) [default: None]
+		Description of sample ids to retain (for use with --mapping_fp). [default: None]
 
 
 **Output:**

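The ``--sample_id_fp`` behavior documented above (keep sequences whose id starts with a listed sample id, read from a newline-delimited file without a header) can be sketched as below. The function names are hypothetical and this is not the code in ``filter_fasta.py``. Note that plain prefix matching means a sample id ``S1`` also matches a sequence id such as ``S10_5``; a stricter variant could match on the sample id followed by an underscore.

```python
def read_sample_ids(lines):
    """Collect sample ids from newline-delimited lines, skipping blank lines."""
    return {line.strip() for line in lines if line.strip()}


def filter_fasta_by_sample_id(fasta_lines, sample_ids, negate=False):
    """Yield FASTA lines for records whose seq id starts with a sample id.

    With negate=True, matching records are discarded instead of kept.
    """
    prefixes = tuple(sample_ids)
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            # str.startswith accepts a tuple and returns False for an empty one.
            keep = seq_id.startswith(prefixes) != negate
        if keep:
            yield line


fasta = [">S1_0 orig\n", "ACGT\n", ">S2_1\n", "GGCC\n", ">S3_2\n", "TTAA\n"]
ids = read_sample_ids(["S1\n", "S3\n", "\n"])
kept = list(filter_fasta_by_sample_id(fasta, ids))
```
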
diff --git a/doc/scripts/jackknifed_beta_diversity.rst b/doc/scripts/jackknifed_beta_diversity.rst
@@ -2,7 +2,7 @@
 
 .. index:: jackknifed_beta_diversity.py
 
-*jackknifed_beta_diversity.py* -- A workflow script for performing jackknifed UPGMA clustering and build jackknifed 2d and 3D PCoA plots.
+*jackknifed_beta_diversity.py* -- A workflow script for performing jackknifed UPGMA clustering and building jackknifed Emperor PCoA plots.
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 **Description:**
@@ -48,12 +48,12 @@ To directly measure the robustness of individual UPGMA clusters and clusters in
 
 **Output:**
 
-This script results in several distance matrices (from `beta_diversity.py <./beta_diversity.html>`_), several rarified OTU tables (from `multiple_rarefactions_even_depth.py <./multiple_rarefactions_even_depth.html>`_), several UPGMA trees (from `upgma_cluster.py <./upgma_cluster.html>`_), a supporting file and newick tree with support values (from `tree_compare.py <./tree_compare.html>`_), and 2D and 3D PCoA plots.
+This script results in several distance matrices (from `beta_diversity.py <./beta_diversity.html>`_), several rarefied OTU tables (from `multiple_rarefactions_even_depth.py <./multiple_rarefactions_even_depth.html>`_), several UPGMA trees (from `upgma_cluster.py <./upgma_cluster.html>`_), a supporting file and newick tree with support values (from `tree_compare.py <./tree_compare.html>`_), and Emperor PCoA plots.
 
 
 **Example:**
 
-These steps are performed by the following command: Compute beta diversity distance matrix from otu table (and tree, if applicable); build rarefied OTU tables by evenly sampling to the specified depth (-e); build UPGMA tree from full distance matrix; compute distance matrices for rarefied OTU tables; build UPGMA trees from rarefied OTU table distance matrices; build a consensus tree from the rarefied UPGMA trees; compare rarefied OTU table distance matrix UPGMA trees to either (full or consensus) tree for jackknife support of tree nodes; perform principal coordinates analysis on distance matrices generated from rarefied OTU tables; generate 2D and 3D PCoA plots with jackknifed support.
+These steps are performed by the following command: Compute beta diversity distance matrix from otu table (and tree, if applicable); build rarefied OTU tables by evenly sampling to the specified depth (-e); build UPGMA tree from full distance matrix; compute distance matrices for rarefied OTU tables; build UPGMA trees from rarefied OTU table distance matrices; build a consensus tree from the rarefied UPGMA trees; compare rarefied OTU table distance matrix UPGMA trees to either (full or consensus) tree for jackknife support of tree nodes; perform principal coordinates analysis on distance matrices generated from rarefied OTU tables; generate Emperor PCoA plots with jackknifed support.
 
 
 

diff --git a/doc/scripts/normalize_table.rst b/doc/scripts/normalize_table.rst
@@ -27,22 +27,37 @@ it be 0.01 instead of 1?), for more information read Costea, P. et al. (2014)
 DESeq/DESeq2 can also have a very slow runtime, especially for larger datasets.
 In this script, we implement DESeq2's variance stabilization technique. If you do use these
 alternatives to rarefying, we would recommend metagenomeSeq's CSS (cumulative sum
-scaling) transformation for those metrics that are abundance-based.  It is not 
+scaling) transformation for those metrics that are abundance-based.  It is not
 recommended to use these new methods with presence/absence metrics, for example
-binary Jaccard or unweighted UniFrac. 
+binary Jaccard or unweighted UniFrac.
 
 For more on metagenomeSeq's CSS, please see Paulson, JN, et al. 'Differential
 abundance analysis for microbial marker-gene surveys' Nature Methods 2013.  For DESeq
 please see Anders S, Huber W. 'Differential expression analysis for sequence
-count data.' Genome Biology 2010.  For DESeq2 please read Love, MI et al. 
-'Moderated estimation of fold change and dispersion for RNA-Seq data 
+count data.' Genome Biology 2010.  For DESeq2 please read Love, MI et al.
+'Moderated estimation of fold change and dispersion for RNA-Seq data
 with DESeq2,' Genome Biology 2014.  If you use these methods, please CITE the
 appropriate reference as well as QIIME.  For any of these methods, clustering by
 sequence depth MUST BE CHECKED FOR as a confounding variable, e.g. by coloring
 by sequences/sample on a PCoA plot and testing for correlations between
-taxa abundances and sequencing depth with e.g. adonis in `compare_categories.py <./compare_categories.html>`_, 
+taxa abundances and sequencing depth with e.g. adonis in `compare_categories.py <./compare_categories.html>`_,
 or `observation_metadata_correlation.py <./observation_metadata_correlation.html>`_.
 
+Note: If the input BIOM table contains observation metadata (e.g., taxonomy
+metadata for each OTU), this metadata will not be included in the output
+normalized BIOM table when using DESeq2. When using CSS the taxonomy metadata
+will be included in the output normalized table but it may not be in the same
+format as the input table (e.g., "NA" will be added for missing taxonomic
+levels). This discrepancy occurs because the underlying R packages used to
+perform the normalization store taxonomy metadata in a different format.
+
+As a workaround, the "biom add-metadata" command can be used to add the
+original observation metadata to the output normalized table if desired. For
+example, to include the original taxonomy metadata on the output normalized
+table, "biom add-metadata" can be used with the representative sequence
+taxonomic assignment file output by `assign_taxonomy.py <./assign_taxonomy.html>`_.
+
+
 
 
 **Usage:** :file:`normalize_table.py [options]`

diff --git a/doc/scripts/parallel_align_seqs_pynast.rst b/doc/scripts/parallel_align_seqs_pynast.rst
@@ -27,7 +27,7 @@ A wrapper for the `align_seqs.py <./align_seqs.html>`_ PyNAST option, intended t
 	**[OPTIONAL]**
 
 	-t, `-`-template_fp
-		Filepath for template alignment [default: /Users/jairideout/.virtualenvs/qiime/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.fasta]
+		Filepath for template alignment [default: /Users/caporaso/.virtualenvs/qiime/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta]
 	-a, `-`-pairwise_alignment_method
 		Method to use for pairwise alignments [default: uclust]
 	-d, `-`-blast_db

diff --git a/doc/scripts/parallel_assign_taxonomy_blast.rst b/doc/scripts/parallel_assign_taxonomy_blast.rst
@@ -27,7 +27,7 @@ This script performs like the `assign_taxonomy.py <./assign_taxonomy.html>`_ scr
 	**[OPTIONAL]**
 
 	-r, `-`-reference_seqs_fp
-		Ref seqs to blast against.  Must provide either --blast_db or --reference_seqs_db for assignment with blast [default: /Users/jairideout/.virtualenvs/qiime/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta]
+		Ref seqs to blast against.  Must provide either --blast_db or --reference_seqs_db for assignment with blast [default: /Users/caporaso/.virtualenvs/qiime/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta]
 	-b, `-`-blast_db
 		Database to blast against.  Must provide either --blast_db or --reference_seqs_db for assignment with blast [default: None]
 	-e, `-`-e_value
@@ -51,7 +51,7 @@ This script performs like the `assign_taxonomy.py <./assign_taxonomy.html>`_ scr
 	-Z, `-`-seconds_to_sleep
 		Number of seconds to sleep between checks for run  completion when polling runs [default: 1]
 	-t, `-`-id_to_taxonomy_fp
-		Full path to id_to_taxonomy mapping file [default: /Users/jairideout/.virtualenvs/qiime/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt]
+		Full path to id_to_taxonomy mapping file [default: /Users/caporaso/.virtualenvs/qiime/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt]
 
 
 **Output:**