Add test coverage
wetneb committed Jan 22, 2019
1 parent eeebc3a commit f9385d8
Showing 4 changed files with 86 additions and 39 deletions.
7 changes: 4 additions & 3 deletions .travis.yml
@@ -5,8 +5,9 @@ python:
- "3.5"
- "3.6"
install:
- pip install -r requirements.txt
# command to run tests
- pip install pytest-cov coveralls -r requirements.txt
script:
- pytest
- pytest --cov=pynif

after_success:
- coveralls
2 changes: 1 addition & 1 deletion README.md
@@ -1,4 +1,4 @@
# pynif [![Build Status](https://travis-ci.org/wetneb/pynif.svg?branch=master)](https://travis-ci.org/wetneb/pynif)
# pynif [![Build Status](https://travis-ci.org/wetneb/pynif.svg?branch=master)](https://travis-ci.org/wetneb/pynif) [![Coverage Status](https://coveralls.io/repos/github/wetneb/pynif/badge.svg?branch=master)](https://coveralls.io/github/wetneb/pynif?branch=master)

The [NLP Interchange Format (NIF)](http://persistence.uni-leipzig.org/nlp2rdf/) is an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations. It offers a standard representation of annotated texts for tasks such as [Named Entity Recognition](https://en.wikipedia.org/wiki/Named-entity_recognition) or [Entity Linking](https://en.wikipedia.org/wiki/Entity_linking). It is used by [GERBIL](https://github.com/dice-group/gerbil) to run reproducible evaluations of annotators.

115 changes: 81 additions & 34 deletions README.rst
@@ -1,15 +1,19 @@
pynif
=====

`Build Status <https://travis-ci.org/wetneb/pynif>`__

What is NIF (NLP Interchange Format) ?
--------------------------------------

The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to
achieve interoperability between Natural Language Processing (NLP)
tools, language resources and annotations. NIF consists of
specifications, ontologies and software (overview).
pynif `Build Status <https://travis-ci.org/wetneb/pynif>`__ `Coverage Status <https://coveralls.io/github/wetneb/pynif?branch=master>`__
========================================================================================================================================

The `NLP Interchange Format
(NIF) <http://persistence.uni-leipzig.org/nlp2rdf/>`__ is an
RDF/OWL-based format that aims to achieve interoperability between
Natural Language Processing (NLP) tools, language resources and
annotations. It offers a standard representation of annotated texts for
tasks such as `Named Entity
Recognition <https://en.wikipedia.org/wiki/Named-entity_recognition>`__
or `Entity Linking <https://en.wikipedia.org/wiki/Entity_linking>`__. It
is used by `GERBIL <https://github.com/dice-group/gerbil>`__ to run
reproducible evaluations of annotators.

This Python library can be used to serialize and deserialize annotated
corpora in NIF.

Documentation
-------------
@@ -19,20 +23,27 @@ Documentation
Supported NIF versions
----------------------

- 2.1
NIF 2.1, serialized in `any of the formats supported by
rdflib <https://rdflib.readthedocs.io/en/stable/plugin_parsers.html>`__

Supported RDF formats
---------------------
Overview
--------

- `All the formats supported by
rdflib <https://rdflib.readthedocs.io/en/stable/plugin_parsers.html>`__
This library revolves around three core classes:

- a ``NIFContext`` is a document (a string);
- a ``NIFPhrase`` is the annotation of a snippet of text (usually a
  phrase) in a document;
- a ``NIFCollection`` is a set of documents, which constitutes a
  collection.

In NIF, each of these objects is identified by a URI, and their
attributes and relations are encoded by RDF triples between these URIs.
This library abstracts away the encoding by letting you manipulate
collections, contexts and phrases as plain Python objects.
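
To make the "URIs plus triples" idea concrete, here is a minimal sketch that uses plain Python tuples in place of RDF triples. It is not how pynif is implemented: the helper ``phrase_triples`` is hypothetical, and the abbreviated predicate names and the ``#offset_<begin>_<end>`` URI pattern are simply borrowed from the Turtle serialization shown in the quick start.

```python
# Illustrative only: tuples stand in for RDF triples; this is NOT pynif's internals.

def phrase_triples(context_uri, text, begin, end):
    """Return (subject, predicate, object) tuples describing one phrase.

    Hypothetical helper: the phrase URI is derived from the context URI
    and the character offsets, mirroring the URIs seen in the README output.
    """
    phrase_uri = f"{context_uri}#offset_{begin}_{end}"
    return [
        (phrase_uri, "rdf:type", "nif:Phrase"),
        (phrase_uri, "nif:referenceContext", context_uri),
        (phrase_uri, "nif:beginIndex", begin),
        (phrase_uri, "nif:endIndex", end),
        (phrase_uri, "nif:anchorOf", text[begin:end]),
    ]


triples = phrase_triples(
    "http://freme-project.eu/doc32",
    "Diego Maradona is from Argentina.",
    0,
    14,
)
for triple in triples:
    print(triple)
```

Every tuple shares the phrase URI as its subject, which is the sense in which the object is "identified by a URI"; the library spares you from assembling such statements by hand.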

Usage
-----
Quick start
-----------

0) Import and create a collection

::
.. code:: python
from pynif import NIFCollection
@@ -42,15 +53,15 @@ Usage
1) Create a context

::
.. code:: python
context = collection.add_context(
uri="http://freme-project.eu/doc32",
mention="Diego Maradona is from Argentina.")
2) Create entries for the entities

::
.. code:: python
context.add_phrase(
beginIndex=0,
@@ -72,16 +83,57 @@ Usage
3) Finally, get the output with the format that you need

::
.. code:: python
generated_nif = collection.dumps(format='turtle')
print(generated_nif)
You can then parse it back:

::

parsed_collection = NIFCollection.loads(generated_nif)
You will obtain the NIF representation as a string:

.. code:: turtle
<http://freme-project.eu> a nif:ContextCollection ;
nif:hasContext <http://freme-project.eu/doc32> ;
ns1:conformsTo <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/2.1> .
<http://freme-project.eu/doc32> a nif:Context,
nif:OffsetBasedString ;
nif:beginIndex "0"^^xsd:nonNegativeInteger ;
nif:endIndex "33"^^xsd:nonNegativeInteger ;
nif:isString "Diego Maradona is from Argentina." .
<http://freme-project.eu/doc32#offset_0_14> a nif:OffsetBasedString,
nif:Phrase ;
nif:anchorOf "Diego Maradona" ;
nif:beginIndex "0"^^xsd:nonNegativeInteger ;
nif:endIndex "14"^^xsd:nonNegativeInteger ;
nif:referenceContext <http://freme-project.eu/doc32> ;
nif:taMsClassRef <http://dbpedia.org/ontology/SoccerManager> ;
itsrdf:taAnnotatorsRef <http://freme-project.eu/tools/freme-ner> ;
itsrdf:taClassRef <http://dbpedia.org/ontology/Person>,
<http://dbpedia.org/ontology/SportsManager>,
<http://nerd.eurecom.fr/ontology#Person> ;
itsrdf:taConfidence 9.869993e-01 ;
itsrdf:taIdentRef <http://dbpedia.org/resource/Diego_Maradona> .
<http://freme-project.eu/doc32#offset_23_32> a nif:OffsetBasedString,
nif:Phrase ;
nif:anchorOf "Argentina" ;
nif:beginIndex "23"^^xsd:nonNegativeInteger ;
nif:endIndex "32"^^xsd:nonNegativeInteger ;
nif:referenceContext <http://freme-project.eu/doc32> ;
nif:taMsClassRef <http://dbpedia.org/resource/Argentina> ;
itsrdf:taAnnotatorsRef <http://freme-project.eu/tools/freme-ner> ;
itsrdf:taClassRef <http://dbpedia.org/ontology/Place>,
<http://dbpedia.org/ontology/PopulatedPlace>,
<http://nerd.eurecom.fr/ontology#Location> ;
itsrdf:taConfidence 9.804964e-01 .
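
The offsets in this output appear to be zero-based with an exclusive end index, so each ``nif:anchorOf`` value should equal the corresponding slice of ``nif:isString``. The sketch below checks that with the standard library only; ``check_offsets`` is a hypothetical helper and not part of pynif, and the values are copied from the serialization above.

```python
# Values copied from the Turtle output above.
text = "Diego Maradona is from Argentina."
phrases = [
    (0, 14, "Diego Maradona"),
    (23, 32, "Argentina"),
]


def check_offsets(text, phrases):
    """Return the phrases whose anchor does not match text[begin:end].

    Hypothetical helper for illustration; an empty list means every
    anchor agrees with its offsets.
    """
    return [
        (begin, end, anchor)
        for begin, end, anchor in phrases
        if text[begin:end] != anchor
    ]


mismatches = check_offsets(text, phrases)
print(len(text), mismatches)  # prints: 33 []
```

The context's ``nif:endIndex`` of 33 is the length of the string, and both phrases slice out exactly their anchor text.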
4) You can then parse it back:

.. code:: python
parsed_collection = NIFCollection.loads(generated_nif, format='turtle')
for context in parsed_collection.contexts:
for phrase in context.phrases:
@@ -92,9 +144,4 @@ Issues

If you have any problems with or questions about this library, please
contact us through a `GitHub
issue <https://github.com/NLP2RDF/pyNIF-lib/issues>`__.

Maintainers
-----------


issue <https://github.com/wetneb/pynif/issues>`__.
1 change: 0 additions & 1 deletion requirements.txt
@@ -1,2 +1 @@
pytest
rdflib
