Releases: PyThaiNLP/pythainlp

PyThaiNLP v3.0.5 Released!

14 Feb 10:44
e9b8962

PyThaiNLP v3.0.5 is a bug fix release of PyThaiNLP 3.0.4.

Bug Fixed

You can install it with pip install pythainlp or upgrade with pip install -U pythainlp.

Documentation: https://pythainlp.github.io/docs/3.0/index.html

Report bug: https://github.com/PyThaiNLP/pythainlp/issues

See PyThaiNLP 3.0 change log #545

Contributors

Thanks to all the contributors.

PyThaiNLP v3.0.4 Released!

14 Feb 09:47
abbd4e4

PyThaiNLP v3.0.4 is a bug fix release of PyThaiNLP 3.0.3.

Bug Fixed

  • Removed pythainlp.tag.named_entity.ThaiNameTagger to fix an issue with importing pycrfsuite. cc628d8

You can install it with pip install pythainlp or upgrade with pip install -U pythainlp.

Documentation: https://pythainlp.github.io/docs/3.0/index.html

Report bug: https://github.com/PyThaiNLP/pythainlp/issues

See PyThaiNLP 3.0 change log #545

Contributors

Thanks to all the contributors.

PyThaiNLP v3.0.3 Released!

09 Feb 12:14
b403435

PyThaiNLP v3.0.3 is a bug fix release of PyThaiNLP 3.0.2.

Bug Fixed

  • Fixed TypeError in pythainlp.spell.symspellpy #650

You can install it with pip install pythainlp or upgrade with pip install -U pythainlp.

Documentation: https://pythainlp.github.io/docs/3.0/index.html

Report bug: https://github.com/PyThaiNLP/pythainlp/issues

See PyThaiNLP 3.0 change log #545

Contributors

Thanks to all the contributors.

PyThaiNLP v3.0.2 Released!

09 Feb 07:37
497d18d

PyThaiNLP v3.0.2 is a bug fix release of PyThaiNLP 3.0.1.

Bug Fixed

  • Fixed some incorrect code from #645.

You can install it with pip install pythainlp or upgrade with pip install -U pythainlp.

Documentation: https://pythainlp.github.io/docs/3.0/index.html

Report bug: https://github.com/PyThaiNLP/pythainlp/issues

See PyThaiNLP 3.0 change log #545

Contributors

Thanks to all the contributors.

PyThaiNLP v3.0.1 Released!

09 Feb 07:12
3b4abdf

PyThaiNLP v3.0.1 is a bug fix release of PyThaiNLP 3.0.

Bug Fixed

  • Removed the warning message in pythainlp.tag.thainer. Fixes #644
  • Added the PYTHAINLP_READ_MODE environment variable, which configures PyThaiNLP to run in read-only mode. Fixes #645
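
As a rough sketch of how the read-only switch might be used, the snippet below sets the variable from Python before the library is imported. The value "1" and the helper name enable_read_only_mode are assumptions that these notes do not state; check the PyThaiNLP documentation for the accepted values.

```python
import os


def enable_read_only_mode(value="1"):
    """Set PYTHAINLP_READ_MODE for the current process.

    Assumption: "1" switches read-only mode on. The release notes only
    name the variable, not its accepted values.
    """
    os.environ["PYTHAINLP_READ_MODE"] = value
    return os.environ["PYTHAINLP_READ_MODE"]


# Call this before `import pythainlp`, in case the library reads the
# variable at import time.
enable_read_only_mode()
```

The same effect can be had by exporting the variable in the shell that starts the Python process.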

You can install it with pip install pythainlp or upgrade with pip install -U pythainlp.

Documentation: https://pythainlp.github.io/docs/3.0/index.html

Report bug: https://github.com/PyThaiNLP/pythainlp/issues

See PyThaiNLP 3.0 change log #545

Contributors

Thanks to all the contributors.

PyThaiNLP v3.0.0 Released!

29 Jan 17:04
66373c8

After a long period of development, we have released PyThaiNLP 3.0. It has many improvements and new features to help with Thai language processing tasks.

You can install it with pip install pythainlp or upgrade with pip install -U pythainlp.

Documentation: https://pythainlp.github.io/docs/3.0/index.html

Report bug: https://github.com/PyThaiNLP/pythainlp/issues

See PyThaiNLP 3.0 change log #545

If you want to contribute to PyThaiNLP, you can read Contributing to PyThaiNLP.

News

Starting with PyThaiNLP 3.0, we are ending support for Python 3.6. Python 3.6 users can use PyThaiNLP 2.3.2.

We have updated the Thai word dictionary and rules for newmm. If you use newmm for word tokenization in your model, we recommend retraining it.

What is new?

Deprecation and other API changes

  • Deprecated syllable_tokenize; use subword_tokenize instead.
  • pythainlp.tag.named_entity.ThaiNameTagger has moved to pythainlp.tag.thainer.ThaiNameTagger. The old class will be deprecated in PyThaiNLP version 3.1.
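
For code that has to run against both the old and the new module layout, one possible approach is to try the new import path first and fall back to the old one. The sketch below is only an illustration: the helper name find_name_tagger and the CANDIDATE_MODULES tuple are hypothetical, and only the two module paths come from the notes above.

```python
import importlib

# Module paths taken from the release notes, newest location first.
CANDIDATE_MODULES = (
    "pythainlp.tag.thainer",
    "pythainlp.tag.named_entity",
)


def find_name_tagger(candidates=CANDIDATE_MODULES):
    """Return the first importable ThaiNameTagger class, or None.

    Hypothetical helper: it only probes the given module paths and makes
    no assumption about how the class is constructed or used.
    """
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Module is missing in this version (or PyThaiNLP is not installed).
            continue
        tagger = getattr(module, "ThaiNameTagger", None)
        if tagger is not None:
            return tagger
    return None
```

If the function returns None, neither location is available in the installed version, and the caller can report that instead of failing at import time.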

Augment

  • Add Thai Text Augmentation

Corpus

  • Fix lots of misspellings in the dictionary (words_th.txt)
  • Add get_corpus_default_db and the thainer 1.5 model. You can now add a corpus in default_db.json, and the latest thainer model no longer needs to be loaded from the Internet.

Tag

  • Add TLTK (pos_tag and ner): adds a TLTK wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.
  • Add NER class for named-entity recognition tasks.

Translate

  • Add pythainlp.translate.Translate Class
  • Add Chinese-Thai Machine Translation
  • Add Thai-French Machine Translation

Tokenization

  • Tokenize repeating dots and commas from numbers (fix #461)
  • Fix token_max_len bug that makes it always zero
  • Retrained sentenceseg_crfcut.model for PyThaiNLP 2.4
  • Add SEFR CUT to pythainlp
  • Add TLTK (sentence_tokenize and word_tokenize): adds a TLTK wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.
  • Add nlpo3
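
To illustrate what the number-tokenization item in the list above is about, here is a small regex sketch. It is not the tokenizer's actual code; the pattern and the function name split_numbers are assumptions chosen to show the idea that a single dot or comma between digits stays inside the number, while repeated dots or commas become separate tokens.

```python
import re

# Digits, optionally followed by groups of ONE separator plus more digits.
# Repeated separators ("..", ",,") therefore cannot be part of a number.
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def split_numbers(text):
    """Split text into number tokens and the pieces between them."""
    tokens = []
    pos = 0
    for match in NUMBER.finditer(text):
        if match.start() > pos:
            tokens.append(text[pos:match.start()])  # text before the number
        tokens.append(match.group())
        pos = match.end()
    if pos < len(text):
        tokens.append(text[pos:])  # trailing text after the last number
    return tokens


print(split_numbers("1,234.56"))  # ['1,234.56']
print(split_numbers("1...2"))     # ['1', '...', '2']
```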

Transliterate

  • Refactor Royin Transliterate: Avoid embedded if blocks and simplified consonant replacing operations
  • Manually merge update-royin branch with dev branch to add O-ANG rule
  • Add TLTK (g2p and ipa): adds a TLTK wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.
  • Add pythainlp.transliterate.puan

Word Vector

  • Fix token_max_len bug that makes it always zero
  • Add pythainlp.word_vector.WordVector

Spell

  • Add more spelling engines
  • Add TLTK (spell): adds a TLTK wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.

Generate

  • Add pythainlp.generate to generate a text.

Tool

  • Add misspell module

Other

  • Add TLTK: adds a TLTK wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.
  • Update requirements from ssg 0.0.6 to ssg 0.0.8
  • Spoonerism: add support for words with more than three syllables
  • Add maiyamok, a function that preprocesses MaiYaMok (ๆ) in a Thai sentence.
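
MaiYaMok (ๆ) marks that the preceding word is repeated, so preprocessing it usually means expanding the repetition. The following is a minimal sketch of that idea on an already tokenized sentence. It is not PyThaiNLP's implementation, and the function name expand_maiyamok and its list-in, list-out interface are assumptions; the real maiyamok function may take different input and handle more cases.

```python
MAI_YAMOK = "\u0e46"  # the Thai repetition mark, ๆ


def expand_maiyamok(tokens):
    """Replace each repetition mark with a copy of the word it follows."""
    expanded = []
    for token in tokens:
        if token == MAI_YAMOK and expanded:
            # Stand-alone mark: repeat the previous word.
            expanded.append(expanded[-1])
        elif token.endswith(MAI_YAMOK) and len(token) > 1:
            # Mark attached to a word: emit the word twice.
            word = token[:-1]
            expanded.extend([word, word])
        else:
            expanded.append(token)
    return expanded


print(expand_maiyamok(["เด็ก", "ๆ", "ชอบ"]))  # ['เด็ก', 'เด็ก', 'ชอบ']
```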

Contributors

Thanks to all the contributors.

If you want to contribute to PyThaiNLP, you can read Contributing to PyThaiNLP.

This year is PyThaiNLP's sixth year, and PyThaiNLP has more than one million downloads. I started developing PyThaiNLP to help me with Thai language processing tasks. Today, PyThaiNLP is used in research and other work worldwide. PyThaiNLP could not have grown without its contributors, sponsors, and users.

Thank you for all your support.

Thank you for using PyThaiNLP.

Wannaphong Phatthiyaphaibun

PyThaiNLP Founder

27 January 2022

PyThaiNLP v3.0.0-beta0

20 Jan 13:38
fae2bf6
Pre-release

PyThaiNLP 3.0 has many improvements and new features to help you with Thai language processing tasks. This release, PyThaiNLP v3.0.0-beta0, is the first beta release of PyThaiNLP 3.0.

You can install it with pip install pythainlp==3.0.0b0.

Documentation: https://pythainlp.github.io/dev-docs/index.html
Report bug: https://github.com/PyThaiNLP/pythainlp/issues

See PyThaiNLP 3.0 change log #545

If you want to contribute to PyThaiNLP, you can read Contributing to PyThaiNLP.

News

Starting with PyThaiNLP 3.0, we are ending support for Python 3.6. Python 3.6 users can use PyThaiNLP 2.3.2.
We have updated the dictionary and rules for newmm. If you use newmm for word tokenization in your model, we recommend retraining it.

What is new?

Deprecation and other API changes

  • Deprecated syllable_tokenize; use subword_tokenize instead.
  • pythainlp.tag.named_entity.ThaiNameTagger has moved to pythainlp.tag.thainer.ThaiNameTagger. The old class will be deprecated in PyThaiNLP version 3.1.

Augment

  • Add Thai Text Augmentation

Corpus

  • Fix lots of misspellings in the dictionary (words_th.txt)
  • Add get_corpus_default_db and the thainer 1.5 model. You can now add a corpus in default_db.json, and the latest thainer model no longer needs to be loaded from the Internet.

Tag

  • Add tltk (pos_tag and ner): adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.
  • Add NER class for named-entity recognition tasks.

Translate

  • Add pythainlp.translate.Translate Class
  • Add Chinese-Thai Machine Translation

Tokenization

  • Tokenize repeating dots and commas from numbers (fix #461)
  • Fix token_max_len bug that makes it always zero
  • Retrained sentenceseg_crfcut.model for PyThaiNLP 2.4
  • Add SEFR CUT to pythainlp
  • Add tltk (sentence_tokenize and word_tokenize): adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.
  • Add nlpo3

Transliterate

  • Refactor Royin Transliterate: Avoid embedded if blocks and simplified consonant replacing operations
  • Manually merge update-royin branch with dev branch to add O-ANG rule
  • Add tltk (g2p and ipa): adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.
  • Add pythainlp.transliterate.puan

Word Vector

  • Fix token_max_len bug that makes it always zero
  • Add pythainlp.word_vector.WordVector

Spell

  • Add more spelling engines
  • Add tltk (spell): adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.

Generate

  • Add pythainlp.generate

Tool

  • Add misspell module

Other

  • Add tltk: adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.
  • Update requirements from ssg 0.0.6 to ssg 0.0.8
  • Spoonerism: add support for words with more than 3 syllables
  • Add maiyamok, a function that preprocesses MaiYaMok (ๆ) in a Thai sentence.

Contributors

Thanks to all the contributors.

If you want to contribute to PyThaiNLP, you can read Contributing to PyThaiNLP.

#PyThaiNLP #ThaiNLP

PyThaiNLP v3.0.0-dev0

27 Dec 11:21
3414373
Pre-release

PyThaiNLP v3.0.0-dev0 is the first development release of PyThaiNLP 3.0 (for development only).

Docs: https://pythainlp.github.io/dev-docs/index.html
Report bug: https://github.com/PyThaiNLP/pythainlp/issues
GitHub: https://github.com/PyThaiNLP/pythainlp

News

Starting with PyThaiNLP 2.4, we are ending support for Python 3.6. Python 3.6 users can use PyThaiNLP 2.3.1.
We have updated the dictionary and rules for newmm. If you use newmm for word tokenization in your model, we recommend retraining it.

What is new?

Deprecation and other API changes

  • #550 Deprecated syllable_tokenize; use subword_tokenize instead.
  • 701fb3a pythainlp.tag.named_entity.ThaiNameTagger has moved to pythainlp.tag.thainer.ThaiNameTagger. The old class will be deprecated in PyThaiNLP version 2.5.

Augment

  • #580 Add Thai Text Augmentation

Corpus

  • #557 Fix lots of misspellings in the dictionary (words_th.txt)
  • #576 Add get_corpus_default_db and the thainer 1.5 model. You can now add a corpus in default_db.json, and the latest thainer model no longer needs to be loaded from the Internet.

Tag

  • #599 Add tltk (pos_tag and ner): adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.
  • #600 Add NER class for named-entity recognition tasks.

Translate

  • #589 Add pythainlp.translate.Translate Class
  • #588 Add Chinese-Thai Machine Translation

Tokenization

  • #562 Tokenize repeating dots and commas from numbers (fix #461)
  • #585 Fix token_max_len bug that makes it always zero
  • #594 Retrained sentenceseg_crfcut.model for PyThaiNLP 2.4
  • 3144110 Add SEFR CUT to pythainlp
  • #599 Add tltk (sentence_tokenize and word_tokenize): adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.
  • #622 Add nlpo3

Transliterate

  • #566 Refactor Royin Transliterate: Avoid embedded if blocks and simplified consonant replacing operations
  • #585 Manually merge update-royin branch with dev branch to add O-ANG rule
  • #599 Add tltk (g2p and ipa): adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.
  • #624 Add pythainlp.transliterate.puan

Word Vector

  • #573 Fix token_max_len bug that makes it always zero
  • #583 Add pythainlp.word_vector.WordVector

Spell

  • #591 Add more spelling engines
  • #599 Add tltk (spell): adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.

Generate

  • #579 Add pythainlp.generate

Tool

  • #614 Add misspell module

Other

  • #599 Add tltk: adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.
  • e357cf8 Update requirements from ssg 0.0.6 to ssg 0.0.8
  • #631 Spoonerism: add support for words with more than 3 syllables
  • #623 Add maiyamok, a function that preprocesses MaiYaMok (ๆ) in a Thai sentence.

PyThaiNLP v2.3.2 Released!

25 Aug 02:45
ef3503a

PyThaiNLP v2.3.2 is a bug fix release of PyThaiNLP 2.3.

Bug Fixed

  • Fixed clause_tokenize returning an empty list. #609

Documentation: https://pythainlp.github.io/docs/2.3/index.html
Report bug: https://github.com/PyThaiNLP/pythainlp/issues

You can install or upgrade using pip install -U pythainlp

See PyThaiNLP 2.3 change log #445

PyThaiNLP v2.4.0-dev0

01 Aug 06:59
648a608
Pre-release

PyThaiNLP v2.4.0-dev0 is the first development release of PyThaiNLP 2.4 (for development only).

Documentation: https://pythainlp.github.io/dev-docs/index.html
Report bug: https://github.com/PyThaiNLP/pythainlp/issues

See PyThaiNLP 2.4 change log #545

News

Starting with PyThaiNLP 2.4, we are ending support for Python 3.6. Python 3.6 users can use PyThaiNLP 2.3.1.
We have updated the dictionary and rules for newmm. If you use newmm for word tokenization in your model, we recommend retraining it.

Deprecation and other API changes

  • #550 Deprecated syllable_tokenize; use subword_tokenize instead.
  • 701fb3a pythainlp.tag.named_entity.ThaiNameTagger has moved to pythainlp.tag.thainer.ThaiNameTagger. The old class will be deprecated in PyThaiNLP version 2.5.

Augment

  • #580 Add Thai Text Augmentation

Corpus

  • #557 Fix lots of misspellings in the dictionary (words_th.txt)
  • #576 Add get_corpus_default_db and the thainer 1.5 model. You can now add a corpus in default_db.json, and the latest thainer model no longer needs to be loaded from the Internet.

Tag

  • #599 Add tltk (pos_tag and ner): adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.
  • #600 Add NER class for named-entity recognition tasks.

Translate

  • #589 Add pythainlp.translate.Translate Class
  • #588 Add Chinese-Thai Machine Translation

Tokenization

  • #562 Tokenize repeating dots and commas from numbers (fix #461)
  • #585 Fix token_max_len bug that makes it always zero
  • #594 Retrained sentenceseg_crfcut.model for PyThaiNLP 2.4
  • 3144110 Add SEFR CUT to pythainlp
  • #599 Add tltk (sentence_tokenize and word_tokenize): adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.

Transliterate

  • #566 Refactor Royin Transliterate: Avoid embedded if blocks and simplified consonant replacing operations
  • #585 Manually merge update-royin branch with dev branch to add O-ANG rule
  • #599 Add tltk (g2p and ipa): adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.

Word Vector

  • #573 Fix token_max_len bug that makes it always zero
  • #583 Add pythainlp.word_vector.WordVector

Spell

  • #591 Add more spelling engines
  • #599 Add tltk (spell): adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.

Generate

  • #579 Add pythainlp.generate

Other

  • #599 Add tltk: adds a tltk wrapper to pythainlp functions, e.g. ner, word_tokenize, and more.