(C) 2019 by Damir Cavar, Oren Baldinger, Maanvitha Gongalla, Anurag Kumar, Murali Kammili, Boli Fang
Brought to you by the NLP-Lab.org!
Xrenner wrapper for JSON-NLP. Xrenner specializes in coreference and anaphora resolution, and its output is more richly annotated than a bare coreference chain.
Xrenner requires a dependency parse in CoNLL-U format. This can come from CoreNLP, or from another parser that provides Universal Dependencies in CoNLL-U format. There are two ways to accomplish this:
The XrennerPipeline class will take care of the details; however, it requires an available CoreNLP server.
The easiest way to create one is with Docker:
docker pull nlpbox/corenlp
docker run -p 9000:9000 -ti nlpbox/corenlp
To test this, open a new terminal tab and run:
wget -q --post-data "Although they didn't like it, they accepted the offer." 'localhost:9000/?properties={"annotators":"depparse","outputFormat":"conll"}' -O /dev/stdout
You then need to create a .env file in the root of the project; follow the example in sample_env.
The default entry that corresponds to the Docker command above is:
CORENLP_SERVER=http://localhost:9000
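The wget test above can also be reproduced from Python. This is only a minimal sketch using the standard library: the helper names corenlp_request and fetch_parse are hypothetical, and it assumes CORENLP_SERVER has been exported as an environment variable (it does not read the .env file itself).

```python
import json
import os
import urllib.parse
import urllib.request


def corenlp_request(text, server=None):
    """Build a POST request asking CoreNLP for a dependency parse in CoNLL output."""
    # Assumption: CORENLP_SERVER is set in the environment; otherwise fall back
    # to the address used by the Docker command above.
    server = server or os.environ.get("CORENLP_SERVER", "http://localhost:9000")
    props = json.dumps({"annotators": "depparse", "outputFormat": "conll"})
    # urlencode percent-encodes the JSON so the URL is valid.
    url = server.rstrip("/") + "/?" + urllib.parse.urlencode({"properties": props})
    return urllib.request.Request(url, data=text.encode("utf-8"), method="POST")


def fetch_parse(text, server=None):
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(corenlp_request(text, server), timeout=30) as resp:
        return resp.read().decode("utf-8")
```

With the Docker container running, fetch_parse("Although they didn't like it, they accepted the offer.") should return the same parse that the wget command prints.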
If you already have a dependency parse, use the XrennerPipeline.process_conll function, with your CoNLL data passed as a string via the conll argument.
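As a rough illustration, a call might look like the sketch below. The token rows are a hand-written example parse, and build_conll and run_xrenner are hypothetical helpers. The sketch assumes that XrennerPipeline can be imported from the top level of the xrennerjsonnlp package and that process_conll can be called directly on the class; check both against the source before relying on them.

```python
# Hypothetical sample: a hand-written 10-column parse of "He bought milk."
ROWS = [
    ("1", "He", "he", "PRON", "PRP", "_", "2", "nsubj", "_", "_"),
    ("2", "bought", "buy", "VERB", "VBD", "_", "0", "root", "_", "_"),
    ("3", "milk", "milk", "NOUN", "NN", "_", "2", "obj", "_", "_"),
    ("4", ".", ".", "PUNCT", ".", "_", "2", "punct", "_", "_"),
]


def build_conll(rows):
    """Join token rows into one sentence: tab-separated columns, then a blank line."""
    return "\n".join("\t".join(row) for row in rows) + "\n\n"


def run_xrenner(conll):
    """Pass a CoNLL string to Xrenner through the conll argument."""
    # Assumption: this import path and the class-level call are not confirmed here.
    from xrennerjsonnlp import XrennerPipeline
    return XrennerPipeline.process_conll(conll=conll)
```

Calling run_xrenner(build_conll(ROWS)) requires the package to be installed; build_conll alone has no dependencies.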
You may find the pyjsonnlp.conversion.to_conllu function helpful for converting JSON-NLP, for example from spaCy, to CoNLL-U.
The JSON-NLP repository provides a Microservice class, with a pre-built implementation of Flask. To run it, execute:
python xrennerjsonnlp/server.py
Since server.py extends the Flask app, a WSGI file would contain:
from xrennerjsonnlp.server import app as application
Text is provided to the microservice with the text parameter, via either GET or POST. If you pass url as a parameter, the microservice will scrape that URL and process the text of the website.
Here is an example GET call:
http://localhost:5000?text=John went to the store. He bought some milk.
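The spaces in that URL need to be percent-encoded when the call is made from code. A minimal sketch with the standard library follows; text_url and get_annotations are hypothetical helper names, and the base address assumes the server is running locally on Flask's default port as in the example above.

```python
import urllib.parse
import urllib.request


def text_url(text, base="http://localhost:5000"):
    """Return the GET URL for the microservice with the text parameter encoded."""
    return base.rstrip("/") + "/?" + urllib.parse.urlencode({"text": text})


def get_annotations(text, base="http://localhost:5000"):
    """Fetch the microservice response for the given text and return the raw body."""
    with urllib.request.urlopen(text_url(text, base), timeout=60) as resp:
        return resp.read().decode("utf-8")
```

For instance, text_url("John went to the store.") produces a URL in which every space is replaced by a plus sign.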
The process_conll endpoint mentioned above is available at the /process_conll URI. Instead of passing text, pass conll. A POST operation will be easier than GET in this situation, because CoNLL data contains tabs and newlines.
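A POST to this endpoint could be sketched as follows. The helper names conll_request and post_conll are hypothetical, and the sketch assumes the microservice reads conll from a form-encoded request body; if it expects a different encoding, the body and the Content-Type header would need to change.

```python
import urllib.parse
import urllib.request


def conll_request(conll, base="http://localhost:5000"):
    """Build a POST request that carries CoNLL data in the conll parameter."""
    # Assumption: the service accepts a form-encoded body.
    body = urllib.parse.urlencode({"conll": conll}).encode("utf-8")
    return urllib.request.Request(
        base.rstrip("/") + "/process_conll",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def post_conll(conll, base="http://localhost:5000"):
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(conll_request(conll, base), timeout=60) as resp:
        return resp.read().decode("utf-8")
```

Encoding the payload this way keeps the tabs and newlines of the CoNLL data intact in transit.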