Releases: dkpro/dkpro-core
dkpro-core-2.5.0
DKPro Core is a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.
https://dkpro.github.io/dkpro-core
This is a feature release.
What's Changed
- ⭐️ #1570 - Support xml id on certain TEI elements by @reckart in #1572
- ⭐️ #1579 - Allow defining features as IRI features so they are not rendered as literal strings by @reckart in #1581
- 🦟 #1575 - Relation offsets not set in WebAnnoTsv3XReader by @reckart in #1576
- 🦟 #1584 - Strip out BOM when reading text files by @reckart in #1585
- 🦟 #811 - LanguageToolSegmenter chokes on "丁肇中" by @reckart in #1590
- ⚙️ #1571 - Upgrade dependencies by @reckart in #1573, #1577, #1580, #1592, #1593, #1594
Full Changelog: dkpro-core-2.4.0...dkpro-core-2.5.0
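As an illustration of the byte order mark fix (#1584), here is a minimal sketch of what stripping a leading byte order mark from decoded text can look like. It is not the DKPro Core implementation; the class name `BomStripper` and the method `stripBom` are hypothetical and exist only for this example.

```java
import java.nio.charset.StandardCharsets;

public class BomStripper {
    // U+FEFF is the character a UTF-8 or UTF-16 byte order mark decodes to.
    private static final char BOM = '\uFEFF';

    /** Returns the text without a leading byte order mark, if one is present. */
    public static String stripBom(String text) {
        if (text != null && !text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        // UTF-8 encoded byte order mark (EF BB BF) followed by "Hello".
        byte[] raw = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'H', 'e', 'l', 'l', 'o'};
        // Java's UTF-8 decoder keeps the mark as a regular character.
        String decoded = new String(raw, StandardCharsets.UTF_8);
        System.out.println(decoded.length());           // 6
        System.out.println(stripBom(decoded).length()); // 5
    }
}
```

Without this step, the invisible U+FEFF character becomes part of the document text and shifts all annotation offsets by one.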
When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.
If you have previously imported dkpro-core-asl/dkpro-core-gpl into your project, please switch to importing dkpro-core-bom-asl/dkpro-core-bom-gpl instead.
DKPro Core 2.4.0
DKPro Core is a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.
https://dkpro.github.io/dkpro-core
This is a feature release.
What's Changed
- #1511 - Do not initialize POS mapping loader if mapping is disabled by @reckart in #1557
- #1565 - Option to replace illegal characters in XMI files by @reckart in #1566
- #1560 - Upgrade dependencies by @reckart in #1567
- #1446 - Add support for BioC by @reckart in #1447
- #1560 - Upgrade dependencies by @reckart in #1568, #1564
Full Changelog: dkpro-core-2.3.1...dkpro-core-2.4.0
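To show why an option for replacing illegal characters in XMI files (#1565) is useful, the following sketch checks code points against the `Char` production of the XML 1.0 specification and substitutes a replacement character for anything outside it. This is an assumption-laden illustration, not the code used by DKPro Core; the class `XmlCharSanitizer` and its methods are hypothetical names.

```java
public class XmlCharSanitizer {
    /** True if the code point is allowed by the Char production of XML 1.0. */
    public static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Replaces every code point that XML 1.0 cannot represent. */
    public static String replaceIllegal(String text, char replacement) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (isValidXml10(cp)) {
                sb.appendCodePoint(cp);
            } else {
                sb.append(replacement);
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+0001 is a control character that XML 1.0 does not allow.
        String dirty = "a" + (char) 0x1 + "b";
        System.out.println(replaceIllegal(dirty, ' ')); // a b
    }
}
```

Control characters such as U+0001 typically come from PDF extraction or legacy encodings, and an XML 1.0 parser rejects a file containing them.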
When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.
DKPro Core 2.3.1
DKPro Core is a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.
https://dkpro.github.io/dkpro-core
This is a bugfix release.
What's Changed
Full Changelog: dkpro-core-2.3.0...dkpro-core-2.3.1
When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.
DKPro Core 2.3.0
We are pleased to announce the release of DKPro Core 2.3.0, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.
https://dkpro.github.io/dkpro-core
This release updates many dependencies, removes several modules which are no longer viable and fixes a few bugs.
What's Changed
- 🩹 #1500 - Upgrade dependencies by @reckart in #1501, #1527, #1535, #1540, #1544, #1546, #1558
- ⚙️ #1537 - Centralize creation of XML factories by @reckart in #1538
- ⚙️ #1504 - Author tags by @reckart in #1543
- ⚙️ #1508 - Move model-downloading code out of resources-api by @reckart in #1552
- 💀 #1541 - Drop maui module by @reckart in #1542
- 💀 #1548 - Drop Cermine by @reckart in #1549
- 💀 #1545 - Drop ARK module by @reckart in #1547
- 💀 #1550 - Drop classic Stanford NLP integration by @reckart in #1551
- 💀 #1553 - Remove SemanticFieldAnnotator by @reckart in #1554
- 💀 #1555 - Remove constraints parameter on token merger by @reckart in #1556
- 🦟 #1520 - Danish UD17 Tag Mapping Typo by @reckart in #1559
- 🦟 #1498 - NPE in TeiWriter when NamedEntity value feature is null by @reckart in #1499
Full Changelog: rel/dkpro-core-2.2.0...dkpro-core-2.3.0
Thanks to all contributors!
When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.
DKPro Core 2.2.0
We are pleased to announce the release of DKPro Core 2.2.0, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework version 3.
https://dkpro.github.io/dkpro-core
This is a feature release.
Notable changes since DKPro Core 2.1.0
- io-brat: Fixed NPE when WebAnno-style slot feature does not have a role label
- io-xmi: Added support for binary TSI
- io-nif: Improved entity linking support
- io-conll-u: Set div type on paragraphs
- documentation: Make data format examples more easily copy/pastable
- Updated various dependencies
A more detailed overview of the changes in this release can be found at [2].
Thanks to all contributors!
When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.
[1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-2.2.0
[2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A2.2.0
DKPro Core 2.1.0
We are pleased to announce the release of DKPro Core 2.1.0, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework version 3.
https://dkpro.github.io/dkpro-core
This is a feature release.
Notable changes since DKPro Core 2.0.0
- Added option to export XMI using XML 1.1 to avoid issues with certain characters
- Added option to CoNLL readers to trim off whitespace from field values to avoid users having issues with incidental space characters (default is on)
- Added support for annotator notes in brat format
- Improved speed for writing WebAnno TSV format (backported from WebAnno)
- Fixed a couple of issues with the CoNLL 2012 format
- Fixed default extension for CoNLL-U writer
- Fixed problem in CoNLL-U writer when text contains line breaks
- Fixed problem that LanguageToolChecker did not fill in suggestions
- Fixed setting div type on paragraphs created by CoNLL-U reader
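The whitespace-trimming option for CoNLL readers mentioned in the list above can be pictured with the following sketch, which splits a tab-separated line and optionally trims each field. It is only an illustration under assumptions: the class `ConllFieldTrimmer`, the method `splitFields`, and the `trim` flag are hypothetical and do not reflect the actual reader parameters.

```java
import java.util.Arrays;

public class ConllFieldTrimmer {
    /**
     * Splits a CoNLL-style line on tabs. With trim enabled, incidental
     * spaces around each field value are removed.
     */
    public static String[] splitFields(String line, boolean trim) {
        // The limit -1 keeps trailing empty fields instead of dropping them.
        String[] fields = line.split("\t", -1);
        if (trim) {
            for (int i = 0; i < fields.length; i++) {
                fields[i] = fields[i].trim();
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "1\tHello \t hello\tNOUN";
        System.out.println(Arrays.toString(splitFields(line, true)));
    }
}
```

Without trimming, a value such as `"Hello "` would differ from `"Hello"`, which can silently break tag lookups or comparisons downstream.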
A more detailed overview of the changes in this release can be found at [2].
Thanks to all contributors!
When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.
[1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-2.1.0
[2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A2.1.0
DKPro Core 1.12.0
We are pleased to announce the release of DKPro Core 1.12.0, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework version 2.
https://dkpro.github.io/dkpro-core
This is a feature release.
Important upgrade notice
If you are upgrading from DKPro Core 1.10.x or earlier, please read the DKPro Core 1.11.0 upgrade notice [1].
Notable changes since DKPro Core 1.11.1
- Added option to export XMI using XML 1.1 to avoid issues with certain characters
- Added option to CoNLL readers to trim off whitespace from field values to avoid users having issues with incidental space characters (default is on)
- Added support for annotator notes in brat format
- Improved speed for writing WebAnno TSV format (backported from WebAnno)
- Fixed a couple of issues with the CoNLL 2012 format
- Fixed default extension for CoNLL-U writer
- Fixed problem in CoNLL-U writer when text contains line breaks
- Fixed problem that LanguageToolChecker did not fill in suggestions
A more detailed overview of the changes in this release can be found at [2].
Thanks to all contributors!
When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.
[1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-1.12.0
[2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A1.12.0
DKPro Core 2.0.0
We are pleased to announce the release of DKPro Core 2.0.0, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.
https://dkpro.github.io/dkpro-core
This is a feature release.
Important upgrade notice
This version requires UIMA v3.
If you are upgrading from DKPro Core 1.10.x or earlier, please read the DKPro Core 1.11.0 upgrade notice [1].
Notable changes since DKPro Core 1.11.1
- Switched to UIMAv3
- Added filling in suggestions to LanguageToolChecker
- Added support for notes to BratReader
- Added basic read support for Perseus XML format
- Improved error message when StanfordNamedEntityRecognizerTrainer is called without training data
- Moved StopwordRemover to tokit module and removed stopwordremover module
- Renamed lancaster module to smile
- Removed Tag type from syntax module
- ... and a few additional under-the-hood changes
A more detailed overview of the changes in this release can be found at [2].
Thanks for contributions go to: @alaindesilets, @mischor
When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.
[1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-2.0.0
[2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A2.0.0
DKPro Core 1.11.1
We are pleased to announce the release of DKPro Core 1.11.1, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.
https://dkpro.github.io/dkpro-core
This is a bugfix release.
Important upgrade notice
If you are upgrading from DKPro Core 1.10.x or earlier, please read the DKPro Core 1.11.0 upgrade notice [1].
Notable changes since DKPro Core 1.11.0
- Fixed trimming of whitespace at the start and end of annotations
- Fixed encoding of named entity categories in LIF format
- Fixed unescaping of URI-encoded characters when writing files
- Added parameter to control whitespace normalization in HtmlDocumentReader
- Added parameters to control indentation and output method in XmlDocumentWriter
- Improved exception in Stanford CoreNLP NER trainer when no documents have been processed
A more detailed overview of the changes in this release can be found at [2].
Thanks for contributions go to: @az79nefy, @ramonziai, @manuelciosici, @Horsmann, @tilmanbeck, @alaindesilets, @jcklie
When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.
[1] https://github.com/dkpro/dkpro-core/releases/tag/dkpro-core-1.11.0
[2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A1.11.1
DKPro Core 1.11.0
We are pleased to announce the release of DKPro Core 1.11.0, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.
https://dkpro.github.io/dkpro-core
This is a feature release.
Important upgrade notice
- Changed groupIds and artifactIds. The group ID is now org.dkpro.core and the artifact IDs are dkpro-core-...-(asl/gpl)
- Changed package names. All packages now start with org.dkpro.core..., except the packages of UIMA types, which remain unchanged for data compatibility.
Notable changes since DKPro Core 1.10.0
- Changed parts of the brat data conversion code such that it can be more easily used outside a UIMA component
- Changed type mapping such that out-of-tagset types map to the generic type (e.g. an unknown POS tag maps to POS, not to POS_X)
- Changed name of NYTCollectionReader to NitfReader
- Added types to encode XML document structure in CAS
- Added new XmlDocumentReader/Writer components using these types
- Added basic reader for Annotated Gigaword corpus (only reads text so far) (thanks @az79nefy)
- Added basic support for PubAnnotation JSON format
- Added Maui component for keyword assignment
- Added parameter to SfstAnnotator to enable lower-case lookup of first word in a sentence (thanks @rziai)
- Added "order" feature to Token type
- Added support for CoNLL-U document and paragraph IDs (thanks @manuelciosici)
- Added support for CoNLL-U sentence IDs and text
- Added standardized parameter to disable type mapping
- Added support for TCF orthography layer using SofaChangeAnnotations
- Added segmenter for Chinese using jieba (thanks @Horsmann)
- Added MyStem for Russian
- Added links to OpenMinTeD categories in type system documentation
- Added support for reading/writing the CoreNLP CoNLL flavor
- Added parameter to configure the Tika buffer size (useful for large documents)
- Updated to OpenNLP 1.9.1
- Updated to CoreNLP 3.9.2
- Updated to ICU4J 64.2
- Updated to Tika 1.19.1
- Updated to LanguageTool 4.3
- Updated to PDFBox 2.0.12
- Updated IllinoisNLP components
- Updated TreeTagger models/binaries in build.xml script (thanks @tilmanbeck)
- Updated LIF dependencies
- Updated dataset descriptions
- Updated various general dependencies (e.g. Apache Commons etc.)
- Improved robustness of checksum verification for text files used in datasets (e.g. license files)
- Improved error messages in WebAnno TSV3 module
- Fixed crash in WebannoTsv3XWriter when annotations do not start/end at token boundaries
- Fixed bug in WebAnno TSV3 support causing span annotations with slot features to disappear
- Fixed trimming of whitespace in TeiReader
- Fixed bug in NifWriter causing named entity identifier not to be written
- Fixed crash in BratReader when reading discontinuous segments
- Fixed problem in BratWriter when dealing with slot features
- Fixed metadata of CoNLL2012Writer
- Fixed potential problem of datasets being written outside their target directory
- Dropped the GrAF I/O module since the upstream libraries are outdated and no longer maintained
A more detailed overview of the changes in this release can be found here.
Thanks for contributions go to: @az79nefy, @ramonziai, @manuelciosici, @Horsmann, @tilmanbeck
When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.