Version 2.2.0

jbaker-dstl released this 02 Jun 17:11

The following is a summary of the new features and changes in Baleen
2.2.0. It is not exhaustive; please refer to the diff and commit logs
for full details.

New core features

  • All entities now have a sub-type
  • Added gender to Person
  • Baleen Jobs framework
  • Plankton visual pipeline tool

New collection readers and improvements to existing collection readers

  • EmailReader
  • FolderReader now accepts a regular expression to filter against, rather than a file extension
  • MucReader
  • ReutersReader
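
For anyone migrating a FolderReader filter from a file extension to a regular expression, the following is a minimal sketch of how an extension maps onto an equivalent pattern. It is not Baleen code: the helper names `extension_to_regex` and `filter_names` are hypothetical, and it assumes the pattern is matched against the whole file name, case-insensitively. Check the FolderReader documentation for the actual matching rules and parameter name.

```python
import re


def extension_to_regex(extension: str) -> str:
    """Build a pattern matching any file name that ends in the given extension.

    Hypothetical helper for illustration only; not part of Baleen.
    """
    return r".*\." + re.escape(extension.lstrip("."))


def filter_names(names, pattern):
    """Keep only the names that the pattern matches in full (assumed behaviour)."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return [name for name in names if compiled.fullmatch(name)]


names = ["report.txt", "notes.TXT", "image.png", "archive.txt.bak"]

# Equivalent of the old "txt" extension filter.
txt_only = filter_names(names, extension_to_regex("txt"))

# Something an extension filter could not express: two extensions at once.
txt_or_png = filter_names(names, r".*\.(txt|png)")

print(txt_only)
print(txt_or_png)
```

The second pattern shows the main benefit of the change: a single regular expression can select several file types, or match on other parts of the name, where an extension filter could only select one.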

New annotators and improvements to existing annotators

  • Added nautical miles to Distance regex
  • CorefBrackets cleaner (replaces CorefLocationCoordinate cleaner)
  • Coreference annotators and sieves
  • Improvements to LatLon annotator
  • Interaction annotators
  • Keyword extraction annotators (RakeKeywords and CommonKeywords)
  • Relationship annotators
    • NPVNP
    • SimpleInteraction
    • UbmreConstituent
    • UbmreDependency
  • Rewrite of MoneyRegex to fix issues with previous version
  • USTelephone

New consumers and improvements to existing consumers

  • CSV Consumers
  • Elasticsearch upgraded to Elasticsearch 2
  • ElasticsearchRest
  • MongoPatternSaver
  • Print consumers to output information to the console

New jobs

  • Interactions jobs
  • MongoStats

New resources

  • SharedStopwordResource
  • SharedWordNetResource

This release also includes bug fixes, improved unit testing, updated
dependencies and reductions in technical debt.

Please be aware that some aspects of this release may not be backwards
compatible with previous versions.