natural-language-processing.md

https://github.com/rylans/getlang

Fast natural language detection package.


https://github.com/ThePaw/go-eco

Similarity, dissimilarity and distance matrices; diversity, equitability and inequality measures; species richness estimators; coenocline models.


https://github.com/nicksnyder/go-i18n/

Package and an accompanying tool to work with localized text.


https://github.com/dveselov/mystem

CGo bindings to Yandex.Mystem, a Russian morphological analyzer.


https://github.com/nuance/go-nlp

Utilities for working with discrete probability distributions and other tools useful for doing NLP work.


https://github.com/mozillazg/go-pinyin

CN Hanzi to Hanyu Pinyin converter.


https://github.com/agonopol/go-stem

Implementation of the Porter stemming algorithm.


https://github.com/mozillazg/go-unidecode

ASCII transliterations of Unicode text.


https://github.com/danieldk/go2vec

Reader and utility functions for word2vec embeddings.


https://github.com/yanyiwu/gojieba

Go implementation of jieba, a Chinese word-splitting algorithm.


https://github.com/rjohnsondev/golibstemmer

Go bindings for the Snowball libstemmer library, including Porter 2.


https://github.com/xujiajun/gotokenizer

A tokenizer for Go based on a dictionary and bigram language models. (Currently supports only Chinese segmentation.)


https://github.com/fiam/gounidecode

Unicode transliterator (also known as unidecode) for Go.


https://github.com/go-ego/gse

Efficient text segmentation in Go; supports English, Chinese, Japanese, and other languages.


https://github.com/goodsign/icu

Cgo binding for the detection and conversion functions of the icu4c C library. Guaranteed compatibility with version 50.1.


https://github.com/ikawaha/kagome

JP morphological analyzer written in pure Go.


https://github.com/goodsign/libtextcat

Cgo binding for the libtextcat C library. Guaranteed compatibility with version 2.2.


https://github.com/awsong/MMSEGO

Go implementation of MMSEG, a Chinese word-splitting algorithm.


https://github.com/Shixzie/nlp

Extract values from strings and fill your structs with nlp.


https://github.com/james-bowman/nlp

Go Natural Language Processing library supporting LSA (Latent Semantic Analysis).


https://github.com/rookii/paicehusk

Go implementation of the Paice/Husk stemming algorithm.


https://github.com/striker2000/petrovich

Petrovich is a library that inflects Russian names into a given grammatical case.


https://github.com/a2800276/porter

This is a fairly straightforward port of Martin Porter's C implementation of the Porter stemming algorithm.


https://github.com/zhenjl/porter2

Really fast Porter 2 stemmer.


https://github.com/jdkato/prose

Library for text processing that supports tokenization, part-of-speech tagging, named-entity extraction, and more.


https://github.com/Obaied/RAKE.go

Go port of the Rapid Automatic Keyword Extraction Algorithm (RAKE).


https://github.com/blevesearch/segment

Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex 29.


https://github.com/neurosnap/sentences

Sentence tokenizer: converts text into a list of sentences.


https://github.com/osamingo/shamoji

shamoji is a word-filtering package written in Go.


https://github.com/goodsign/snowball

Snowball stemmer port (cgo wrapper) for Go. Provides word stem extraction using the native Snowball library.


https://github.com/dchest/stemmer

Stemmer packages for the Go programming language. Includes English and German stemmers.


https://github.com/pebbe/textcat

Go package for n-gram-based text categorization, with support for UTF-8 and raw text.


https://github.com/abadojack/whatlanggo

Natural language detection package for Go. Supports 84 languages and 24 scripts (writing systems such as Latin and Cyrillic).


https://github.com/olebedev/when

Natural-language date/time parser for English and Russian with pluggable rules.


https://github.com/arbox/nlp-with-ruby

Awesome List for Practical Natural Language Processing done in Ruby.


http://kschiess.github.io/parslet/

A small Ruby library for constructing parsers in the PEG (Parsing Expression Grammar) fashion.


https://github.com/watsonbox/pocketsphinx-ruby

Ruby speech recognition with Pocketsphinx.


https://github.com/diasks2/pragmatic_segmenter

Pragmatic Segmenter is a rule-based sentence boundary detection gem that works out-of-the-box across many languages.


https://github.com/diasks2/ruby-nlp

Collection of links to Ruby Natural Language Processing (NLP) libraries, tools and software.


https://github.com/threedaymonk/text

A collection of text algorithms including Levenshtein distance, Metaphone, Soundex 2, Porter stemming & White similarity.


https://github.com/louismullie/treat

Treat is a toolkit for natural language processing and computational linguistics in Ruby.


https://github.com/cjheath/treetop

PEG (Parsing Expression Grammar) parser.


https://github.com/abitdodgy/words_counted

A highly customisable Ruby text analyser and word counter.


https://github.com/RaRe-Technologies/gensim

Topic Modelling for Humans.


https://github.com/saffsd/langid.py

Stand-alone language identification system.


http://www.nltk.org/

A leading platform for building Python programs to work with human language data.


https://github.com/clips/pattern

A web mining module for Python.


https://github.com/aboSamoor/polyglot

Natural language pipeline supporting hundreds of languages.


https://github.com/facebookresearch/pytext

A natural language modeling framework based on PyTorch.


https://github.com/PetrochukM/PyTorch-NLP

A toolkit enabling rapid deep learning NLP prototyping for research.


https://spacy.io/

A library for industrial-strength natural language processing in Python and Cython.


https://github.com/stanfordnlp/stanfordnlp

The Stanford NLP Group's official Python library, supporting 50+ languages.


https://github.com/fxsjy/jieba

The most popular Chinese text segmentation library.


https://github.com/lancopku/pkuseg-python

A toolkit for Chinese word segmentation in various domains.


https://github.com/isnowfy/snownlp

A library for processing Chinese text.


https://github.com/fighting41love/funNLP

A collection of tools and datasets for Chinese NLP.


https://github.com/wooorm/retext

An extensible natural language system.


https://github.com/wooorm/franc

Detect the language of text.


https://github.com/sindresorhus/leven

Measure the difference between two strings using the Levenshtein distance algorithm.


https://github.com/NaturalNode/natural

Natural language facility.