fix: ensure trailing newline is included when parsing GFF3 region #1573
Resolves #1572
When parsing GFF3, we split the file into blocks at `##sequence-region` pragmas. As per the specification, there can be multiple of them, and we want to know about this.

The `bio` crate's `GffReader` does not support regions, so we do the splitting manually and then pass each region's slice to `bio`'s `GffReader`. However, because we mistakenly trimmed trailing newlines from the regions on our side, the `bio` GFF3 parser would fail when it encountered a commented line as the last line of a region. The error and repro are described in #1572.

In this PR I:
- change the `.trim()` call such that this newline isn't removed

This way `bio` can understand our GFF3 blocks even when they have comments.
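
For readers unfamiliar with the splitting step, here is a minimal std-only sketch of the idea. It is not the code from this PR: the function name `split_regions` is hypothetical, and it assumes that every region starts at a line beginning with `##sequence-region` and that anything before the first pragma can be ignored. The point it illustrates is that each slice keeps its trailing newline because nothing calls `.trim()` on it.

```rust
/// Hypothetical helper (not the actual function in this codebase):
/// split GFF3 text into blocks, each starting at a `##sequence-region`
/// pragma line and running up to the next pragma or the end of input.
/// Lines before the first pragma (e.g. `##gff-version 3`) are skipped.
fn split_regions(text: &str) -> Vec<&str> {
    // Collect the byte offset of every line that opens a region.
    let mut starts = Vec::new();
    let mut offset = 0;
    for line in text.split_inclusive('\n') {
        if line.starts_with("##sequence-region") {
            starts.push(offset);
        }
        // `split_inclusive` keeps the '\n', so lengths add up to offsets.
        offset += line.len();
    }

    // Slice from each region start to the next one (or to the end).
    let mut regions = Vec::new();
    for (i, &start) in starts.iter().enumerate() {
        let end = starts.get(i + 1).copied().unwrap_or(text.len());
        // Deliberately no `.trim()` here: the final '\n' must survive,
        // including when the last line of the region is a comment.
        regions.push(&text[start..end]);
    }
    regions
}
```

Each returned slice could then be handed to a line-oriented reader (for example via `region.as_bytes()`). Calling `.trim()` on a slice before doing so would strip the final newline, which is the situation described in #1572.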