NLP-TAB is a web-based system that lets researchers and developers of Natural Language Processing (NLP) systems compare the output of several disparate NLP systems with each other or with a manually created reference standard. The comparison is performed by running the NLP systems on a single corpus of text and then statistically analyzing co-occurrences between the annotations they generate. Analysis results are stored and indexed in Elasticsearch and displayed to end users through a custom web-based interface.
The Documents section lets you explore the documents that were run through each of the analyzed systems. You can filter to find specific text in documents; below are a few examples on our demo server:
Type system analysis compares the annotation types generated by the different NLP systems. It first counts how often pairs of annotations from different NLP systems cover approximately the same text and how often they cover completely different text. This co-occurrence information is used to generate 2×2 tables for all pairs of annotation types, from which the degree of dependence between annotation types is calculated using common metrics. At present these are the F-score, the Jaccard coefficient, and the Matthews correlation coefficient. Pairs of annotation types with higher scores are more likely to be functionally equivalent.
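As an illustration of the metrics named above, here is a minimal Python sketch that scores one pair of annotation types from its 2×2 co-occurrence table. It is not NLP-TAB's own implementation: the function name `pair_metrics` and the cell names (`both`, `only_a`, `only_b`, `neither`) are assumptions made for this example, and how NLP-TAB decides that two annotations cover "approximately the same text" is not shown.

```python
import math


def pair_metrics(both, only_a, only_b, neither):
    """Score one pair of annotation types from a 2x2 co-occurrence table.

    Hypothetical cell names (not taken from NLP-TAB):
      both    -- spans covered by type A and by type B
      only_a  -- spans covered by type A only
      only_b  -- spans covered by type B only
      neither -- candidate spans covered by neither type
    """
    # F-score (F1) and Jaccard ignore the "neither" cell.
    f_denominator = 2 * both + only_a + only_b
    f_score = 2 * both / f_denominator if f_denominator else 0.0

    j_denominator = both + only_a + only_b
    jaccard = both / j_denominator if j_denominator else 0.0

    # The Matthews correlation coefficient uses all four cells.
    m_denominator = math.sqrt(
        (both + only_a) * (both + only_b)
        * (neither + only_a) * (neither + only_b)
    )
    if m_denominator:
        matthews = (both * neither - only_a * only_b) / m_denominator
    else:
        matthews = 0.0  # undefined when a row or column is empty

    return {"f_score": f_score, "jaccard": jaccard, "matthews": matthews}


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(pair_metrics(both=80, only_a=20, only_b=10, neither=890))
```

All three scores reach 1.0 when the two types always co-occur, which is why high-scoring pairs are candidates for being functionally equivalent.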
The type systems screen lets users explore the type systems that have been uploaded to the system and browse the individual types in each one. Information included on the type systems page:
NLP-TAB uses an Elasticsearch backend to store Common Annotation Structure (CAS) information produced by each NLP system being compared for each document in the collection. A read-only API to the backend is accessible at athena.ahc.umn.edu/elasticsearch. For more information on Elasticsearch, you can visit its website at elasticsearch.org.
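A read-only backend like this can typically be queried with Elasticsearch's URI search (`_search?q=...`). The Python sketch below only builds such a URL and sends nothing. The `http://` scheme, the index name `documents`, and the field name `text` are placeholders assumed for illustration; the actual index and field names used by NLP-TAB are not documented here.

```python
from urllib.parse import urlencode


def build_search_url(base_url, index, field, term, size=5):
    """Build an Elasticsearch URI-search URL for one field and one term.

    The index and field names passed in are up to the caller; the ones in
    the example below are hypothetical, not NLP-TAB's real names.
    """
    # The q parameter uses Lucene query-string syntax ("field:term"),
    # so terms containing special characters would need escaping.
    params = urlencode({"q": f"{field}:{term}", "size": size})
    return f"{base_url.rstrip('/')}/{index}/_search?{params}"


if __name__ == "__main__":
    # Assumed scheme, index, and field; adjust to the real deployment.
    print(build_search_url(
        "http://athena.ahc.umn.edu/elasticsearch", "documents", "text", "fever"
    ))
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the response is a JSON document whose `hits` section lists the matching records.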
- An Elasticsearch server running version 2.1.0
- JDK 1.8
- Maven
To build the NLP-TAB Elasticsearch plugin, run the following command in the NLP-TAB project directory:
mvn clean package
This builds an Elasticsearch plugin zip file, nlptab-{version}.zip, in target/releases. To install the plugin into your Elasticsearch server, run the following from the Elasticsearch installation directory:
bin/plugin install file:/path-to/target/releases/nlptab-{version}.zip
NLP-TAB is developed by the University of Minnesota Institute for Health Informatics NLP/IE Group and the Open Health NLP Consortium.
Funding for this work was provided by:
- 1 R01 LM011364-01 NIH-NLM
- 1 R01 GM102282-01A1 NIH-NIGMS
- U54 RR026066-01A2 NIH-NCRR