2.3.5 + docs update

mikolmogorov · mikolmogorov · commit 15ae84e34437 · 2018-08-07T15:52:01.000-07:00
diff --git a/README.md b/README.md
@@ -1,7 +1,9 @@
 Flye assembler (successor of ABruijn)
 =====================================
 
-### Version: 2.3.5b
+[![BioConda Install](https://img.shields.io/conda/dn/bioconda/flye.svg?style=flag&label=BioConda%20install)](https://anaconda.org/bioconda/flye)
+
+### Version: 2.3.5
 
 Flye is a de novo assembler for long and noisy reads, such as
 those produced by PacBio and Oxford Nanopore Technologies.
diff --git a/docs/INSTALL.md b/docs/INSTALL.md
@@ -6,25 +6,35 @@ Availability
 
 Flye is available for Linux and MacOS platforms.
 
+Bioconda Releases
+-----------------
+
+You can get the latest stable release through Bioconda:
+
+    conda install flye
+
+Alternatively, you can get a release verson from the github "releases" page
+
+
 Requirements
 ------------
 
 * C++ compiler with C++11 support (GCC 4.8+ / Clang 3.3+ / Apple Clang 5.0+)
 * GNU make
 * Python 2.7
 * Git
-* Basic OS development headers (zlib, etc.)
+* Core OS development headers (zlib, etc)
+
 
-Get the latest version (recommended)
-------------------------------------
+Get the latest source version
+-----------------------------
 
 To get and compile the latest git version, run:
 
     git clone https://github.com/fenderglass/Flye
 	cd Flye
     python setup.py build
 
-Alternatively, you can get a release verson from the "releases" page.
 
 After building, Flye could be invoked with the following command:
 
diff --git a/docs/NEWS.md b/docs/NEWS.md
@@ -1,3 +1,11 @@
+Flye 2.3.5 (7 August 2018)
+==========================
+* New solid kmer alignment implementation with improved specificity
+* Better corrected reads support
+* Minimum overlap is now selected within a wider range for better support of datasets with shorter read length
+* Assembly of large (human size) genomes is now faster
+* Various bugfixes and stability improvements
+
 Flye 2.3.4 (19 May 2018)
 ========================
 * A fix for assemblies with low reads count
diff --git a/docs/USAGE.md b/docs/USAGE.md
@@ -58,7 +58,7 @@ from PacBio and ONT are supported. The expected error rates are
 <30% for raw and <2% for corrected reads. Additionally,
 ```--subassemblies``` option performs a consensus assembly of multiple
 sets of high-quality contigs. You may specify multiple
-fles with reads (separated by spaces). Mixing different read
+files with reads (separated by spaces). Mixing different read
 types is not yet supported.
 
 You must provide an estimate of the genome size as input,
@@ -117,8 +117,8 @@ ONT data than with PacBio data, especially in homopolymer regions.
 
 ### Error-corrected reads input
 
-While Flye was designed for assembly of raw reads (and this is the recommended option),
-it also supports error-corrected PacBio/ONT reads as input (use the correpsonding option).
+While Flye was designed for assembly of raw reads (and this is the recommended way),
+it also supports error-corrected PacBio/ONT reads as input (use the ```corr``` option).
 The parameters are optimized for error rates <2%. If you are getting highly 
 fragmented assembly - most likely error rates in your reads are higher. In this case,
 consider to assemble using the raw reads instead.
@@ -171,10 +171,15 @@ errors (due to improvements on how reads may align to the corrected assembly;
 especially for ONT datasets). If the parameter is set to 0, the polishing will
 not be performed.
 
-### Resuming existing jobs
+### Starting from a particular assembly stage
 
-Use --resume to resume a previous run of the assembler that may have terminated
-prematurely. The assembly will continue from the last previously completed step.
+Use ```--resume``` to resume a previous run of the assembler that may have terminated
+prematurely (using the same output directory). 
+The assembly will continue from the last previously completed step.
+
+You might also resume from a particular stage with ```--resume-from stage_name```,
+where ```stage_name``` is a choice of ```assembly, consensus, repeat, polishing```.
+For example, you might supply different sets of reads for different stages.
 
 ## <a name="graph"></a> Assembly graph
 
@@ -256,16 +261,16 @@ for more detailed information. The assembly pipeline is organized as follows:
 
 * Kmer counting / erroneous kmer pre-filtering
 * Solid kmer selection (kmers with sufficient frequency, which are unlikely to be erroneous)
-* Finding read overlaps based on the A-Bruijn graph
-* Detection of chimeric sequences
-* Contig assembly by read extension
+* Contig extension. The algorithm starts from a single read and extends it
+  with a next overlapping read (overlaps are dynamically detected using the selected
+  solid k-mers).
 
-The resulting contig assembly is now simply a concatenation of read parts 
-and is error-prone. Flye then aligns the reads on the draft contigs using minimap2 and
-calls a rough consensus. Afterwards, the algorithm performs additional repeat analysis
-as follows:
+Note that we do not attempt to resolve repeats at this stage, thus
+the reconstructed contigs might contain misassemblies. 
+Flye then aligns the reads on these draft contigs using minimap2 and
+calls a consensus. Afterwards, Flye performs repeat analysis as follows:
 
-* Repeat graph is reconstructed from the assembled sequence
+* Repeat graph is constructed from the (possibly misassembled) contigs
 * In this graph all repeats longer than minimum overlap are collapsed
 * The algorithm resolves repeats using the read information and graph structure
 * The unbranching paths in the graph are output as contigs
diff --git a/flye/__version__.py b/flye/__version__.py
@@ -1 +1 @@
-__version__ = "2.3.5b"
+__version__ = "2.3.5"
diff --git a/flye/main.py b/flye/main.py
@@ -386,7 +386,7 @@ def _epilog():
             "<30% for raw and <2% for corrected reads. Additionally,\n"
             "--subassemblies option performs a consensus assembly of multiple\n"
             "sets of high-quality contigs. You may specify multiple\n"
-            "fles with reads (separated by spaces). Mixing different read\n"
+            "files with reads (separated by spaces). Mixing different read\n"
             "types is not yet supported.\n\n"
             "You must provide an estimate of the genome size as input,\n"
             "which is used for solid k-mers selection. The estimate could\n"
