Releases: uniVocity/univocity-parsers
Releasing **BIG** minor version 2.3.0
Enhancements
- Introduced the RetryableErrorHandler to prevent discarding rows on `DataProcessingException` and allow the user to retry and recover the record.
- Headers parsed from the input are now available after `parser.beginReading` and `processor.processStarted`. There's no more need to parse the first valid row prior to obtaining the headers.
- Introduced the @FixedWidth annotation to allow users to configure fixed-width length and other properties via annotations.
- Introduced the @Copy annotation to allow creating meta-annotations with properties that can be copied over its component annotations - this is very cool, check the javadocs.
- Added a `skipBitsAsWhitespace` configuration option to all parsers and writers that, when disabled, will prevent characters `\0` and `\1` from being skipped as whitespace. You are likely to need this if you are working with database dumps that produce (or expect) such characters to represent a BIT value.
- Introduced a getInputDimension() method to all routine implementations (i.e. CsvRoutines, TsvRoutines and FixedWidthRoutines)
- Improved the CSV format detection algorithm with better handling of single quotes inside double quotes and better statistics processing for dubious cases (also fixes the build, which failed when executed under Java 8)
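To illustrate why the `skipBitsAsWhitespace` option matters, here is a minimal JDK-only sketch. It is not the library's implementation: it assumes whitespace means any character at or below the space character (which is how `String.trim()` behaves), and `trimKeepingBits` is a hypothetical helper that mimics what disabling the option is described to do.

```java
class BitWhitespaceSketch {

    // Assumption for this sketch: a character is "whitespace" when it is <= ' ',
    // except for '\0' and '\1', which are kept because they may encode a BIT value.
    static boolean isSkippable(char ch) {
        return ch <= ' ' && ch != '\0' && ch != '\1';
    }

    // Hypothetical helper: trims skippable characters from both ends of the value.
    static String trimKeepingBits(String value) {
        int start = 0;
        int end = value.length();
        while (start < end && isSkippable(value.charAt(start))) {
            start++;
        }
        while (end > start && isSkippable(value.charAt(end - 1))) {
            end--;
        }
        return value.substring(start, end);
    }

    public static void main(String[] args) {
        String bit = " \1 "; // a BIT value of 1 surrounded by spaces
        System.out.println(bit.trim().length());           // prints 0: trim() drops '\1' as well
        System.out.println(trimKeepingBits(bit).length()); // prints 1: the bit character survives
    }
}
```

With plain trimming, a column holding only `\0` or `\1` ends up empty, which is the data loss the option is meant to avoid.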
Bugfixes
- Fixed unexpected `null` at the end of the last fixed width record when the file was not terminated by a line ending: #128
- Setting `writerSettings.setEmptyValue("\"\"")` (or anything with quotes) for writing CSV files should work as expected and produce `""` instead of escaping the quotes.
Next maintenance version 2.2.3
Bugfixes
- Fixed incorrect handling of quote escape character NOT followed by a quote in CSV. The parser was swallowing the escape character if it wasn't itself escaped: commit
- Fixed incorrect handling of null/empty values when writing blank string (CsvWriter): #123
- Fixed regression introduced in version 2.2.2 that could cause `NullPointerException`s when initializing conversions for annotated classes: #121
Enhancements
- Adding option to truncate long strings when reading from a Record instance: commit
Next maintenance version 2.2.2
Bugfixes
- Inconsistent behavior of `maxCharsPerColumn` in CSV parser: (#113) - optimization introduced in version 2.2.1 allowed unquoted CSV values to be as long as the input buffer size, regardless of the setting.
- Annotations with `defaultNullRead` won't make the parser set the value provided into a java bean if the targeted column is not present in the input: (#116)
- Annotated fields in inherited classes were being ignored.
- `AbstractConcurrentProcessor` was not shutting down its internal ExecutorService: 100aa34
- Fixed incorrect behavior of parser when processing a combination of: user-provided headers + field selection + column reordering disabled: c6241df
WAY FASTER Maintenance version 2.2.1
New minor release version 2.2.0 with many improvements
Enhancements & new features
- Maximum number of characters per column can now be set to unlimited. Just set `parserSettings.setMaxCharsPerColumn(-1)` and the internal buffer will auto-expand. (#96)
- Length of values in error messages produced by the parser/writer can now be restricted and/or omitted. Use `settings.setErrorContentLength(int);` to configure this. If set to 0 then no values will be printed out in the exception messages (#98)
- CSV parser can now be configured to keep the quotes around quoted fields. Call `parserSettings.setKeepQuotes(true);` to enable it. (#95)
- ConcurrentRowProcessor now accepts a `limit` parameter in its constructor to restrict the number of rows enqueued for consumption by other processors
- Great performance improvements for the CSV parser when processing quoted values (around 30% faster). Performance has improved a bit for all other parsers as well.
- Support for meta-annotations introduced. Users can now create annotations pre-configured to their requirements and re-use them in every class/field they need (see #103).
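The idea behind the `limit` parameter of ConcurrentRowProcessor can be sketched with a bounded queue from the JDK. This is only an illustration of the concept, not the library's code: `BoundedRowQueueSketch` and `process` are hypothetical names, and the sketch assumes a single producer and a single consumer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedRowQueueSketch {

    // Marker object that tells the consumer no more rows will arrive.
    private static final String[] END = new String[0];

    // Feeds rows through a queue that holds at most `limit` rows (limit >= 1)
    // and returns them in the order the consumer thread received them.
    static List<String[]> process(List<String[]> rows, int limit) {
        BlockingQueue<String[]> queue = new ArrayBlockingQueue<>(limit);
        List<String[]> consumed = new ArrayList<>();

        Thread consumer = new Thread(() -> {
            try {
                for (String[] row = queue.take(); row != END; row = queue.take()) {
                    consumed.add(row);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (String[] row : rows) {
                queue.put(row); // blocks while `limit` rows are already waiting
            }
            queue.put(END);
            consumer.join(); // after join(), the consumer's writes are visible here
        } catch (InterruptedException e) {
            consumer.interrupt();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while feeding rows", e);
        }
        return consumed;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            rows.add(new String[]{"row", String.valueOf(i)});
        }
        System.out.println(process(rows, 10).size()); // prints 1000
    }
}
```

Because `put` blocks once the queue is full, a fast producer cannot pile up more than `limit` rows in memory while a slower consumer catches up.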
Bugfixes
Next maintenance version 2.1.2
Bugfixes included in this version:
- When using `Record`s, having no headers defined causes `ArrayIndexOutOfBoundsException` when reading column values by index (#93).
- Parsers fail with `ArrayIndexOutOfBoundsException` if the number of columns parsed equals the `maxColumns` setting (#89).
- When parsing CSV, if the last value is quoted and there's no newline character after the last record, the parser will produce `nullValue` instead of `emptyValue` (#92).
- Unable to parse fixed-width without line breaks (#90).
- Unable to write fixed-width without line breaks. Added the `writeLineSeparatorAfterRecord` option in `FixedWidthWriterSettings` to properly support this requirement (#91).
Released maintenance 2.1.1 with bugfixes
This release includes performance improvements parsing CSV and TSV with `ignoreTrailingWhitespaces` enabled (which is the default), and a couple of bug fixes.
Released version 2.1.0
- Performance improvements for parsing/writing CSV and TSV. CSV writing and parsing got 30-40% faster.
- Deprecated methods `setParseUnescapedQuotes` and `setParseUnescapedQuotesUntilDelimiter` of class CsvParserSettings in favor of the new `setUnescapedQuoteHandling` method that takes values from the UnescapedQuoteHandling enumeration.
- Default behavior of the CSV parser when unescaped quotes are found on the input changed to parse until a delimiter character is found, i.e. `UnescapedQuoteHandling.STOP_AT_DELIMITER`. The old default of trying to find a closing quote (i.e. `UnescapedQuoteHandling.STOP_AT_CLOSING_QUOTE`) can be problematic when no closing quote is found, making the parser accumulate all characters into the same value, until the end of the input.
Adjustments to CSV parsing with unescaped quotes
The new option `parseUnescapedQuotesUntilDelimiter=true` was not handling some situations with CSV + unescaped quotes such as
"a"b,c,d
Which should produce values:
"a"b
c
d
This is now fixed and the expected output is produced. More details on #60
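The expected result can be reproduced with a small JDK-only sketch of the "parse until the delimiter" idea. It is a simplified illustration, not the library's parser: `UnescapedQuoteSketch` and `split` are hypothetical names, only a single line is handled, and an unterminated quote simply keeps the rest of the line verbatim (an arbitrary choice made for this sketch).

```java
import java.util.ArrayList;
import java.util.List;

class UnescapedQuoteSketch {

    // Splits one CSV line. A quoted value followed by anything other than a
    // delimiter is kept verbatim (quotes included) up to the next delimiter.
    static List<String> split(String line, char delimiter) {
        List<String> values = new ArrayList<>();
        int pos = 0;
        while (pos <= line.length()) {
            int start = pos;
            StringBuilder value = new StringBuilder();
            boolean wellFormed = false;
            if (pos < line.length() && line.charAt(pos) == '"') {
                pos++; // skip the opening quote
                while (pos < line.length()) {
                    char ch = line.charAt(pos);
                    if (ch != '"') {
                        value.append(ch);
                        pos++;
                    } else if (pos + 1 < line.length() && line.charAt(pos + 1) == '"') {
                        value.append('"'); // doubled quote: an escaped quote
                        pos += 2;
                    } else {
                        // a closing quote only if a delimiter or the end of the line follows
                        wellFormed = pos + 1 == line.length() || line.charAt(pos + 1) == delimiter;
                        pos++;
                        break;
                    }
                }
            }
            if (!wellFormed) {
                // Unquoted value or unescaped quote: take the raw text up to the delimiter.
                int end = line.indexOf(delimiter, pos);
                if (end < 0) {
                    end = line.length();
                }
                value.setLength(0);
                value.append(line, start, end);
                pos = end;
            }
            values.add(value.toString());
            pos++; // skip the delimiter, or step past the end to finish
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(split("\"a\"b,c,d", ',')); // prints ["a"b, c, d]
    }
}
```

A properly quoted value such as `"a,b"` still has its quotes removed and its embedded delimiter preserved; only the malformed case falls back to the raw text.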
Fixes to AbstractWriter
Released to fix possible data corruption errors that can occur when writing any of the formats supported by the library. Affects version 2.0.0 only.
Details on this pull request.