Releases: uniVocity/univocity-parsers

Releasing **BIG** minor version 2.3.0

15 Dec 06:51

Enhancements

  • Introduced the RetryableErrorHandler to prevent discarding rows on DataProcessingException and allow the user to retry and recover the record.
  • Headers parsed from the input are now available after parser.beginReading and processor.processStarted. There's no more need to parse the first valid row prior to obtaining the headers.
  • Introduced the @FixedWidth annotation to allow users to configure fixed-width length and other properties via annotations.
  • Introduced the @Copy annotation to allow creating meta-annotations with properties that can be copied over its component annotations - this is very cool, check the javadocs.
  • Added a skipBitsAsWhitespace configuration option to all parsers and writers that when disabled will prevent characters \0 and \1 from being skipped as whitespace. You are likely to need this if you are working with database dumps that produce (or expect) such characters to represent a BIT value.
  • Introduced a getInputDimension() method to all routine implementations (i.e. CsvRoutines, TsvRoutines and FixedWidthRoutines)
  • Improved the CSV format detection algorithm with better handling of single quotes inside double quotes and better statistics processing for dubious cases (also fixes the build, which failed when executed under Java 8)
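
To see why the skipBitsAsWhitespace option above matters: a whitespace check of the form "anything up to the space character" also swallows \0 and \1. The sketch below uses only the JDK to show the effect. BitTrimDemo and trimKeepingBits are hypothetical names made up for this illustration; they are not part of univocity-parsers and this is not the library's implementation.

```java
// Illustration only: hypothetical helper, not the library's code.
class BitTrimDemo {

    // Trims leading/trailing characters <= ' ' but keeps \0 and \1,
    // which some database dumps use to represent BIT values.
    static String trimKeepingBits(String value) {
        int start = 0;
        int end = value.length();
        while (start < end && isSkippable(value.charAt(start))) {
            start++;
        }
        while (end > start && isSkippable(value.charAt(end - 1))) {
            end--;
        }
        return value.substring(start, end);
    }

    private static boolean isSkippable(char ch) {
        return ch <= ' ' && ch != '\u0000' && ch != '\u0001';
    }

    public static void main(String[] args) {
        String bit = " \u0001 ";
        // String.trim() strips every character <= U+0020, so the BIT value is lost.
        System.out.println(bit.trim().length());           // 0
        // The bit-aware trim keeps the single \1 character.
        System.out.println(trimKeepingBits(bit).length()); // 1
    }
}
```

If your input never contains such characters, the default behaviour is probably fine and nothing needs to change.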

Bugfixes

  • Fixed unexpected null at the end of the last fixed width record when the file was not terminated by a line ending: #128
  • Setting writerSettings.setEmptyValue("\"\"") (or anything with quotes) for writing CSV files should now work as expected and produce "" instead of escaping the quotes.

Next maintenance version 2.2.3

30 Oct 02:29

Bugfixes

  • Fixed incorrect handling of quote escape character NOT followed by a quote in CSV. The parser was swallowing the escape character if it wasn't itself escaped: commit
  • Fixed incorrect handling of null/empty values when writing blank string (CsvWriter): #123
  • Fixed regression introduced in version 2.2.2 that could cause NullPointerExceptions when initializing conversions for annotated classes : #121

Enhancements

  • Adding option to truncate long strings when reading from a Record instance: commit

Next maintenance version 2.2.2

19 Sep 09:55

Bugfixes

  • Inconsistent behavior of maxCharsPerColumn in CSV parser: (#113) - optimization introduced in version 2.2.1 allowed unquoted CSV values to be as long as the input buffer size, regardless of the setting.
  • Annotations with defaultNullRead won't make the parser set the value provided into a java bean if the targeted column is not present in the input : (#116)
  • Annotated fields in inherited classes were being ignored.
  • AbstractConcurrentProcessor was not shutting down its internal ExecutorService: 100aa34
  • Fixed incorrect behavior of parser when processing a combination of: user-provided headers + field selection + column reordering disabled: c6241df

WAY FASTER Maintenance version 2.2.1

22 Aug 03:41

Enhancements & new features

  • CSV parser and writer performance greatly improved (something around 30%). You'll only believe it if you see it.

Bugfixes

  • Trailing comma in headers causes NullPointerException when using Records: (#109)
  • ParsingContext.getCurrentParsedContent() returns null: (#106)

New minor release version 2.2.0 with many improvements

17 Jul 23:37

Enhancements & new features

  • Maximum number of characters per column can now be set to unlimited. Just set parserSettings.setMaxCharsPerColumn(-1) and the internal buffer will auto-expand. (#96)
  • Length of values in error messages produced by the parser/writer can now be restricted and/or omitted. Use settings.setErrorContentLength(int); to configure this. If set to 0 then no values will be printed out in the exception messages (#98)
  • CSV parser can now be configured to keep the quotes around quoted fields. Call parserSettings.setKeepQuotes(true); to enable it. (#95)
  • ConcurrentRowProcessor now accepts a limit parameter in its constructor to restrict the amount of rows enqueued for consumption of other processors
  • Great performance improvements for the CSV parser when processing quoted values (around 30% faster). Performance has improved a bit for all other parsers as well.
  • Support for meta-annotations introduced. Users can now create annotations pre-configured to their requirements and re-use them in every class/field they need (see #103).
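
The limit parameter of ConcurrentRowProcessor mentioned above caps how many rows can sit waiting for the consumer. As a rough picture of that idea, here is a JDK-only sketch with a bounded queue. It is an assumption about the general technique, not a copy of how the library does it, and BoundedRowQueueDemo and fillUntilFull are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustration only: hypothetical class, not the library's code.
class BoundedRowQueueDemo {

    // Returns how many rows were accepted before the limit was reached.
    static int fillUntilFull(int limit, int rowsAvailable) {
        BlockingQueue<String[]> queue = new ArrayBlockingQueue<>(limit);
        int accepted = 0;
        for (int i = 0; i < rowsAvailable; i++) {
            // offer() returns false when the queue is full;
            // put() would make the producer wait for the consumer instead.
            if (!queue.offer(new String[]{"row", String.valueOf(i)})) {
                break;
            }
            accepted++;
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(fillUntilFull(3, 10)); // 3
    }
}
```

With a blocking put() in place of offer(), a fast parser would simply pause until the slower processor catches up, which keeps memory usage bounded.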

Bugfixes

  • CSV parser doesn't handle quote characters set to white space characters such as \t: (#100)
  • Cell with only whitespace chars (NullValue) mistakenly treated as EmptyValue in last column: (#97)

Next maintenance version 2.1.2

09 Jun 07:41

Bugfixes included in this version:

  • When using Records, having no headers defined causes ArrayIndexOutOfBoundsException when reading column values by index (#93).
  • Parsers fail with ArrayIndexOutOfBoundsException if number of columns parsed equals the maxColumns setting (#89).
  • When parsing CSV, when the last value is quoted, and there's no newline character after the last record, the parser will produce nullValue instead of emptyValue (#92).
  • Unable to parse fixed-width without line breaks (#90).
  • Unable to write fixed-width without line breaks. Added the writeLineSeparatorAfterRecord option in FixedWidthWriterSettings to properly support this requirement (#91).
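
For readers unsure what "fixed-width without line breaks" means in the two items above: records are laid out back to back, and only the field lengths tell you where one ends and the next begins. The JDK-only sketch below shows the idea. FixedWidthNoBreaksDemo and its parse method are hypothetical names for this illustration, and the space padding is an assumption. This is not the library's parser.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: hypothetical class, not the library's code.
class FixedWidthNoBreaksDemo {

    // Splits the input into records using only the field lengths;
    // no line separator is expected between records.
    static List<String[]> parse(String input, int... fieldLengths) {
        if (fieldLengths.length == 0) {
            throw new IllegalArgumentException("At least one field length is required");
        }
        int recordLength = 0;
        for (int length : fieldLengths) {
            if (length <= 0) {
                throw new IllegalArgumentException("Field lengths must be positive");
            }
            recordLength += length;
        }
        List<String[]> records = new ArrayList<>();
        for (int offset = 0; offset < input.length(); offset += recordLength) {
            String[] record = new String[fieldLengths.length];
            int position = offset;
            for (int i = 0; i < fieldLengths.length; i++) {
                // Fields missing from a short final record stay null.
                if (position < input.length()) {
                    int end = Math.min(position + fieldLengths[i], input.length());
                    record[i] = input.substring(position, end).trim();
                }
                position += fieldLengths[i];
            }
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        // Two records of 7 characters each, with no line break in between.
        for (String[] record : parse("ann 042bob 007", 4, 3)) {
            System.out.println(String.join("|", record));
        }
        // prints:
        // ann|042
        // bob|007
    }
}
```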

Released maintenance 2.1.1 with bugfixes

10 May 07:21

This release includes performance improvements parsing CSV and TSV with ignoreTrailingWhitespaces enabled (which is the default), and a couple of bug fixes:

  • Fix incorrect column selection output with mismatching headers: #86
  • ArrayIndexOutOfBoundsException when using selectFields(Enum...) and Record.get...(Enum): #85

Released version 2.1.0

02 May 06:42
  1. Performance improvements for parsing/writing CSV and TSV. CSV writing and parsing got 30-40% faster.
  2. Deprecated methods setParseUnescapedQuotes and setParseUnescapedQuotesUntilDelimiter of class CsvParserSettings in favor of the new setUnescapedQuoteHandling method, which takes values from the UnescapedQuoteHandling enumeration.
  3. Default behavior of the CSV parser when unescaped quotes are found on the input changed to parse until a delimiter character is found, i.e. UnescapedQuoteHandling.STOP_AT_DELIMITER. The old default of trying to find a closing quote (i.e. UnescapedQuoteHandling.STOP_AT_CLOSING_QUOTE) can be problematic when no closing quote is found, making the parser accumulate all characters into the same value, until the end of the input.

Adjustments to CSV parsing with unescaped quotes

04 Apr 05:08

New option parseUnescapedQuotesUntilDelimiter=true was not handling some situations with CSV + unescaped quotes such as

    "a"b,c,d

Which should produce values:

     "a"b
     c 
     d

This is now fixed and the expected output is produced. More details on #60
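
To make the "parse until a delimiter" rule concrete, here is a toy, JDK-only approximation of it. It is a sketch under assumptions (comma delimiter, double-quote as both quote and escape, single line of input), not the library's parser, and UnescapedQuoteDemo and parseLine are hypothetical names. When a quote inside a quoted value is not followed by another quote, a delimiter or the end of the line, the whole field is re-read as raw text up to the next delimiter.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: hypothetical toy parser, not the library's code.
class UnescapedQuoteDemo {

    static List<String> parseLine(String line) {
        List<String> values = new ArrayList<>();
        int pos = 0;
        while (pos <= line.length()) {
            int start = pos;
            StringBuilder value = new StringBuilder();
            boolean quoted = pos < line.length() && line.charAt(pos) == '"';
            boolean unescapedQuote = false;
            if (quoted) {
                pos++; // skip the opening quote
                while (pos < line.length()) {
                    char ch = line.charAt(pos);
                    if (ch != '"') {
                        value.append(ch);
                        pos++;
                    } else if (pos + 1 < line.length() && line.charAt(pos + 1) == '"') {
                        value.append('"'); // escaped quote
                        pos += 2;
                    } else if (pos + 1 == line.length() || line.charAt(pos + 1) == ',') {
                        pos++; // proper closing quote
                        break;
                    } else {
                        unescapedQuote = true; // quote followed by ordinary text
                        break;
                    }
                }
            }
            if (!quoted || unescapedQuote) {
                // Take the raw text from the start of the field up to the next delimiter.
                int end = line.indexOf(',', pos);
                if (end < 0) {
                    end = line.length();
                }
                value.setLength(0);
                value.append(line, start, end);
                pos = end;
            }
            values.add(value.toString());
            pos++; // step over the delimiter (or past the end of the line)
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("\"a\"b,c,d")); // ["a"b, c, d]
        System.out.println(parseLine("\"x,y\",z"));  // [x,y, z]
    }
}
```

Stopping at the delimiter keeps the damage local to one field. Hunting for a closing quote instead could, in the worst case, pull the rest of the input into a single value.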

Fixes to AbstractWriter

02 Apr 06:16

Released to fix possible data corruption errors that can occur when writing any of the formats supported by the library. Affects version 2.0.0 only.

Details on this pull request.