
x-tagger: A Natural Language Processing Toolkit for Sequence Labeling in Its Simplest Form.


x-tagger is a Natural Language Processing toolkit for sequence labeling in its simplest form. x-tagger supports basic sequence labeling tasks like part-of-speech tagging and named entity recognition. Alongside its pure Hidden Markov Model implementation, it provides neural PyTorch models at the highest level of abstraction. It also lets you work with all kinds of data: pandas DataFrames, NLTK tagged corpora, .txt files, torchtext iterators, and 🤗 datasets. Thanks to its data transformations, abstractions, and wrappers, you can train almost any kind of sequence labeling model.
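For example, a tagged corpus in the list-of-tagged-sentences form that x-tagger works with can be prepared with NLTK and scikit-learn alone (a minimal sketch; only standard NLTK and scikit-learn calls are used here):

```python
import nltk
from sklearn.model_selection import train_test_split

# A tagged corpus: a list of sentences, each a list of (word, tag) pairs.
nltk.download("treebank")
data = list(nltk.corpus.treebank.tagged_sents())

# Hold out 20% of the sentences for evaluation.
train_set, test_set = train_test_split(data, test_size=0.2)
```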

x-tagger has built-in models like Hidden Markov Models (bigram, trigram, deleted interpolation, morphological analyzer, prior support), Long Short-Term Memory networks (unidirectional, bidirectional), and BERT. You can train those models and run inference via .fit(), and x-tagger ships nearly 8 different built-in metrics as well. If you want to write custom metrics, x-tagger provides a base class for all kinds of metrics!
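A custom metric might be written by subclassing that base class, roughly as follows (a sketch: the BaseMetric name and the call signature are assumptions, not the library's confirmed API; check the documentation for the actual hook):

```python
from xtagger import BaseMetric  # assumed import path and class name

class TokenAccuracy(BaseMetric):
    """Fraction of tokens whose predicted tag matches the gold tag."""

    def __call__(self, y_true, y_pred):
        # y_true / y_pred: flat lists of gold and predicted tags.
        correct = sum(t == p for t, p in zip(y_true, y_pred))
        return correct / max(len(y_true), 1)
```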

For gradient-based models, we provide a model monitoring and checkpointing class that saves the best model and loads it back in 2-3 lines of code.
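Usage might look like the following (a sketch; the Checkpointing class name and every argument shown are assumptions based on the description above):

```python
from xtagger import Checkpointing  # assumed class name

# Monitor evaluation accuracy and keep only the best-scoring checkpoint.
checkpointing = Checkpointing(
    model_path="./checkpoints/",  # where checkpoints are written (illustrative)
    model_name="tagger.pt",
    monitor="eval_acc",           # metric to watch during training (illustrative)
    mode="maximize",              # higher eval_acc is better
)
```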

So, what if you want a sequence labeling model that x-tagger does not have? x-tagger provides a PyTorch sequence labeling wrapper module for your model. Once you have written your custom PyTorch model, the PyTorchTagTrainer module does everything else!
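Wrapping a custom model might look like this (a sketch: the model itself is plain PyTorch, but the PyTorchTagTrainer constructor arguments are assumptions; only the class name comes from the paragraph above):

```python
import torch.nn as nn
from xtagger import PyTorchTagTrainer  # wrapper named above; arguments below are assumptions

class GRUTagger(nn.Module):
    """Any per-token classifier works: token ids in, tag logits out."""

    def __init__(self, vocab_size, tagset_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, tagset_size)

    def forward(self, x):
        out, _ = self.rnn(self.embedding(x))  # (batch, seq_len, hidden_dim)
        return self.fc(out)                   # (batch, seq_len, tagset_size)

# All keyword arguments below are illustrative assumptions.
trainer = PyTorchTagTrainer(
    model=GRUTagger(vocab_size=10_000, tagset_size=12),
    train_set=train_set,  # tagged sentences, as prepared earlier
    test_set=test_set,
)
trainer.fit()
```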

Reminder: x-tagger is currently in beta release and is a one-person project.

Update: I am planning to release a new version of this library, expanding it and decoupling it from torchtext.

Getting Started

Installation

  • Using pip:

    pip install x-tagger

  • From source:

    pip install git+https://github.com/safakkbilici/x-tagger

Documentation

See the documentation for detailed usage and the API reference.

Examples

(Image: a worked x-tagger example, rendered with Carbon.)
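In text form, a minimal end-to-end run might look like this (a sketch; the HiddenMarkovModel constructor, its extend_to argument, and the evaluate method are assumptions about the API rather than confirmed signatures):

```python
import nltk
from sklearn.model_selection import train_test_split
from xtagger import HiddenMarkovModel  # argument names below are assumptions

nltk.download("treebank")
data = list(nltk.corpus.treebank.tagged_sents())
train_set, test_set = train_test_split(data, test_size=0.2)

hmm = HiddenMarkovModel(extend_to="bigram")  # or "trigram" with deleted interpolation
hmm.fit(train_set)      # estimate transition and emission probabilities
hmm.evaluate(test_set)  # report the built-in metrics
```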