Move NLTK migration info from README to docs
Part of #18
goodmami committed Nov 24, 2020
1 parent 3b3f7a4 commit 2665c4c
Showing 3 changed files with 112 additions and 16 deletions.
16 changes: 0 additions & 16 deletions README.md
@@ -29,9 +29,6 @@ default. Instead, all wordnets are searched unless one (or more) are
specified:

```python
>>> from nltk.corpus import wordnet as nltk_wn
>>> nltk_wn.synsets('chat') # only English
>>> nltk_wn.synsets('chat', lang='fra') # only French
>>> import wn
>>> wn.synsets('chat') # all installed wordnets
>>> wn.synsets('chat', lgcode='en') # limit to one language
@@ -111,16 +108,3 @@ can also be downloaded and installed independently:
| Wordnet Bahasa | `zsmwn` | `1.3+omw` | Malaysian [zsm] |

The project index list is defined in [wn/index.toml](wn/index.toml).

## Migrating from the NLTK's wordnet Module

Some operations keep a compatible API with the NLTK's `wordnet`
module, but most will need some translation.

| Operation | `nltk.corpus.wordnet as wn` | `pwn = wn.Wordnet('pwn', '3.0')` |
| --------------------------- | ----------------------------- | -------------------------------- |
| Lookup Synsets by word form | `wn.synsets("chat")` | `pwn.synsets("chat")` |
| | `wn.synsets("chat", pos="v")` | `pwn.synsets("chat", pos="v")` |
| Lookup Synsets by POS | `wn.all_synsets(pos="v")` | `pwn.synsets(pos="v")` |

(this table is incomplete)
111 changes: 111 additions & 0 deletions docs/guides/nltk-migration.rst
@@ -0,0 +1,111 @@
Migrating from the NLTK
=======================

This guide is for users of the `NLTK <https://www.nltk.org/>`_\ 's
``nltk.corpus.wordnet`` module who are migrating to Wn. It is not
guaranteed that Wn will produce the same results as the NLTK's module,
but with some care its behavior can be very similar.

Overview
--------

One important difference is that Wn searches all wordnets in the
database by default, whereas the NLTK searches only the English
wordnet unless another language is requested.

>>> from nltk.corpus import wordnet as nltk_wn
>>> nltk_wn.synsets('chat') # only English
>>> nltk_wn.synsets('chat', lang='fra') # only French
>>> import wn
>>> wn.synsets('chat') # all wordnets
>>> wn.synsets('chat', lang='fr') # only French

With Wn it helps to create a :class:`wn.Wordnet` object to pre-filter
the results by language or lexicon.

>>> pwn = wn.Wordnet('pwn', '3.0')
>>> pwn.synsets('chat') # only Princeton WordNet 3.0

Equivalent Operations
---------------------

The following table lists equivalent API calls for the NLTK's wordnet
module and Wn assuming the respective modules have been instantiated
(in separate Python sessions) as follows:

NLTK:

>>> from nltk.corpus import wordnet as wn
>>> ss = wn.synsets("chat", pos="v")[0]

Wn:

>>> import wn
>>> pwn = wn.Wordnet('pwn', '3.0')
>>> ss = pwn.synsets("chat", pos="v")[0]

.. default-role:: python

Primary Queries
'''''''''''''''

============================= =========================================
NLTK                          Wn
============================= =========================================
`wn.langs()`                  `[lex.language for lex in wn.lexicons()]`
`wn.lemmas("chat")`           --
--                            `pwn.words("chat")`
--                            `pwn.senses("chat")`
`wn.synsets("chat")`          `pwn.synsets("chat")`
`wn.synsets("chat", pos="v")` `pwn.synsets("chat", pos="v")`
`wn.all_synsets()`            `pwn.synsets()`
`wn.all_synsets(pos="v")`     `pwn.synsets(pos="v")`
`wn.all_lemma_names()`        `[w.lemma() for w in pwn.words()]`
============================= =========================================
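
On the Wn side, the ``wn.langs()`` row returns one entry per lexicon, so a language code repeats when several installed lexicons share a language. The following is a minimal sketch, not part of either library, that collapses the repeats. ``installed_languages`` is a hypothetical helper, the lexicon ids are made up, and the ``SimpleNamespace`` objects only stand in for the lexicon objects returned by ``wn.lexicons()``, assuming nothing beyond the ``language`` attribute used in the table.

```python
from types import SimpleNamespace


def installed_languages(lexicons):
    """Return the sorted, de-duplicated language codes of some lexicons."""
    return sorted({lex.language for lex in lexicons})


# Stand-ins for lexicon objects (hypothetical ids). With Wn installed you
# would pass wn.lexicons() instead; the only assumption is that each
# object exposes a .language attribute, as in the table above.
fake_lexicons = [
    SimpleNamespace(id="lexicon-a", language="en"),
    SimpleNamespace(id="lexicon-b", language="en"),
    SimpleNamespace(id="lexicon-c", language="fr"),
]

print(installed_languages(fake_lexicons))
```

Sorting is a design choice made here for reproducible output; drop it if installation order matters.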

Synsets -- Basic
''''''''''''''''

=================== =================
NLTK                Wn
=================== =================
`ss.lemmas()`       --
--                  `ss.senses()`
--                  `ss.words()`
`ss.lemma_names()`  `ss.lemmas()`
`ss.definition()`   `ss.definition()`
`ss.examples()`     `ss.examples()`
`ss.pos()`          `ss.pos`
=================== =================
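
Two rows above differ only in shape: lemma strings come from NLTK's ``lemma_names()`` but from Wn's ``lemmas()``, and ``pos`` is a method in the NLTK but an attribute in Wn. During a gradual migration, small duck-typed helpers can hide that difference. This is only a sketch: ``lemma_names`` and ``part_of_speech`` are hypothetical helpers, and the two ``Fake...`` classes are stand-ins so the example runs without either library installed.

```python
def lemma_names(synset):
    """Return lemma strings from either library's synset object."""
    if hasattr(synset, "lemma_names"):  # NLTK-style synset
        return list(synset.lemma_names())
    return list(synset.lemmas())  # Wn-style synset, per the table above


def part_of_speech(synset):
    """Return the part of speech whether pos is a method or an attribute."""
    pos = synset.pos
    return pos() if callable(pos) else pos


class FakeNltkSynset:
    """Stand-in mimicking the NLTK column of the table (not real NLTK)."""

    def lemma_names(self):
        return ["chat", "confabulate"]

    def pos(self):
        return "v"


class FakeWnSynset:
    """Stand-in mimicking the Wn column of the table (not real Wn)."""

    pos = "v"

    def lemmas(self):
        return ["chat", "confabulate"]


for ss in (FakeNltkSynset(), FakeWnSynset()):
    print(lemma_names(ss), part_of_speech(ss))
```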

Synsets -- Relations
''''''''''''''''''''

===================================== ========================
NLTK                                  Wn
===================================== ========================
`ss.hypernyms()`                      `ss.hypernyms()`
`ss.hyponyms()`                       `ss.hyponyms()`
`ss.holonyms()`                       `ss.holonyms()`
`ss.meronyms()`                       `ss.meronyms()`
`ss.closure(lambda x: x.hypernyms())` `ss.closure("hypernym")`
===================================== ========================
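
Both ``closure`` calls in the last row compute a transitive closure over a single relation. As an illustration only, the sketch below applies the same idea to a toy dictionary. The ``HYPERNYMS`` data and this ``closure`` function are hypothetical; they do not reproduce either library's implementation or traversal order.

```python
from collections import deque

# Toy relation: each key maps to its direct "hypernyms" (made-up data).
HYPERNYMS = {
    "chat": ["talk"],
    "talk": ["communicate"],
    "communicate": ["interact"],
    "interact": ["act"],
    "act": [],
}


def closure(start, relation):
    """Yield every node reachable from start via relation, breadth-first."""
    seen = {start}
    queue = deque(relation.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited; also guards against cycles
        seen.add(node)
        yield node
        queue.extend(relation.get(node, []))


print(list(closure("chat", HYPERNYMS)))
```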

Synsets -- Taxonomic Structure
''''''''''''''''''''''''''''''

================================ ========================================================
NLTK                             Wn
================================ ========================================================
`ss.min_depth()`                 `ss.min_depth()`
`ss.max_depth()`                 `ss.max_depth()`
`ss.hypernym_paths()`            `[list(reversed(path)) for path in ss.hypernym_paths()]`
`ss.common_hypernyms(ss)`        `ss.common_hypernyms(ss)`
`ss.lowest_common_hypernyms(ss)` `ss.lowest_common_hypernyms(ss)`
`ss.shortest_path_distance(ss)`  `len(ss.shortest_path(ss))`
================================ ========================================================
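
The ``hypernym_paths()`` row implies that the two libraries list each path in opposite directions, which is why the Wn expression reverses every path. The sketch below shows that transformation on plain lists. ``to_nltk_order`` is a hypothetical helper, the path contents are made up, and the direction of the sample path is an assumption drawn from the table rather than from either library's documentation.

```python
def to_nltk_order(paths):
    """Reverse each path, mirroring the list comprehension in the table."""
    return [list(reversed(path)) for path in paths]


# Hypothetical path written from a node up toward its root.
wn_style_paths = [["talk", "communicate", "interact", "act"]]

print(to_nltk_order(wn_style_paths))
```

Code that compares paths element by element, or that reads the first or last element as the root, is where this ordering difference would matter during migration.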

.. reset default role
.. default-role::

(these tables are incomplete)

1 change: 1 addition & 0 deletions docs/index.rst
@@ -47,6 +47,7 @@ Contents
guides/basic.rst
guides/interlingual.rst
guides/wordnet.rst
guides/nltk-migration.rst

.. toctree::
:caption: API Reference
