Version 2.2.0
This is a summary of the new features and changes in Baleen 2.2.0. It is
not exhaustive; refer to the diff and commit logs for full details.
New core features
- All entities now have a sub-type
- Added gender to Person
- Baleen Jobs framework
- Plankton visual pipeline tool
New collection readers and improvements to existing collection readers
- EmailReader
- FolderReader now accepts a regular expression to filter against, rather than a file extension
- MucReader
- ReutersReader
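
The FolderReader change listed above replaces an extension check with a regular-expression filter. As a rough illustration of what that allows, the sketch below compares the two approaches. It is not Baleen's code: the function names and sample filenames are hypothetical, and it assumes the pattern must match the whole filename. Check the FolderReader documentation for how the pattern is actually applied.

```python
import re

def filter_by_extension(filenames, extension):
    """Extension-style filter: keep names ending in the given extension."""
    return [name for name in filenames if name.endswith("." + extension)]

def filter_by_regex(filenames, pattern):
    """Regex-style filter: keep names the pattern matches in full (assumed semantics)."""
    compiled = re.compile(pattern)
    return [name for name in filenames if compiled.fullmatch(name)]

# Hypothetical folder contents
files = ["report_01.txt", "report_02.docx", "notes.txt", "image.png"]

# Several extensions in a single filter
text_like = filter_by_regex(files, r".*\.(txt|docx)")

# A filename prefix combined with an extension
reports = filter_by_regex(files, r"report_\d+\.txt")
```

A plain extension filter can still be written as a pattern such as `.*\.txt`, so existing configurations should be expressible in the new form.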
New annotators and improvements to existing annotators
- Added nautical miles to Distance regex
- CorefBrackets cleaner (replaces CorefLocationCoordinate cleaner)
- Coreference annotators and sieves
- Improvements to LatLon annotator
- Interaction annotators
- Keyword extraction annotators (RakeKeywords and CommonKeywords)
- Relationship annotators
  - NPVNP
  - SimpleInteraction
  - UbmreConstituent
  - UbmreDependency
- Rewrite of MoneyRegex to fix issues with previous version
- USTelephone
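
To illustrate the kind of match the nautical miles addition to the Distance regex covers, here is a minimal sketch. The pattern, the accepted unit spellings, and the function name are assumptions for illustration only and are not taken from Baleen's Distance annotator. The conversion factor of 1852 metres per nautical mile is the international definition.

```python
import re

METRES_PER_NAUTICAL_MILE = 1852  # exact, by international definition

# Hypothetical pattern: a number followed by "nmi" or "nautical mile(s)"
NAUTICAL_MILES = re.compile(
    r"\b(\d+(?:\.\d+)?)\s*(?:nmi|nautical\s+miles?)\b",
    re.IGNORECASE,
)

def find_nautical_distances(text):
    """Return (matched text, distance in metres) for each mention found."""
    return [
        (m.group(0), float(m.group(1)) * METRES_PER_NAUTICAL_MILE)
        for m in NAUTICAL_MILES.finditer(text)
    ]

mentions = find_nautical_distances(
    "The vessel was 12 nautical miles offshore, 3.5 nmi from port."
)
```

The sketch deliberately leaves out the bare abbreviation "nm", which is ambiguous with nanometres.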
New consumers and improvements to existing consumers
- CSV Consumers
- Elasticsearch consumer upgraded to Elasticsearch 2
- ElasticsearchRest
- MongoPatternSaver
- Print consumers to output information to the console
New jobs
- Interactions jobs
- MongoStats
New resources
- SharedStopwordResource
- SharedWordNetResource
Bug fixes, improved unit testing, updated dependencies and reductions to
technical debt
Please be aware that some aspects of this release may not be backwards
compatible with previous versions.