Releases: taishi-i/toiro

toiro 0.0.9

31 Jul 15:07

toiro 0.0.9 incorporates the following changes:

  • fix a scikit-learn installation error
  • fix wheels in PyPI
  • update README.md

toiro 0.0.8

02 Nov 18:14

toiro 0.0.8 incorporates the following changes:

  • add chABSA_dataset to the download_corpus method
    from toiro import datadownloader
    datadownloader.download_corpus('chABSA_dataset')
    train_df, dev_df, test_df = datadownloader.load_corpus('chABSA_dataset')
  • add Python 3.8 to Travis CI and GitHub Actions
  • fix preprocess.py and test_datadownloader.py

toiro 0.0.7

08 Sep 08:32

toiro 0.0.7 incorporates the following changes:

toiro 0.0.6

23 Aug 10:31

toiro 0.0.6 incorporates the following changes:

  • fix a generator error in tokenizer_janome.py caused by the update to janome v0.4.0 (e2b3e73)
  • fix the failed "Build and publish" run for v0.0.5
  • add 05_svm_vs_bert_benchmarking_application_tasks_ja.ipynb to examples

toiro 0.0.4

16 Aug 16:04

toiro 0.0.4 incorporates the following changes:

  • add disable_tokenizers function to tokenizers.compare
  • fix a bug in the initial release.
  • fix an error on long input text in Jumanpp
  • add 01_getting_started_ja.ipynb to README.md
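
The release note names a disable_tokenizers option but does not show its signature. The sketch below is not toiro's API. It only illustrates, with hypothetical names (select_tokenizers, the toy tokenizer functions), how a disable list could filter a set of tokenizers before a comparison runs.

```python
from typing import Callable, Dict, Iterable, List, Optional

Tokenizer = Callable[[str], List[str]]


def select_tokenizers(
    available: Dict[str, Tokenizer],
    disable_tokenizers: Optional[Iterable[str]] = None,
) -> Dict[str, Tokenizer]:
    """Return the tokenizers that are not listed in disable_tokenizers."""
    disabled = set(disable_tokenizers or [])
    # Reject misspelled names instead of silently ignoring them.
    unknown = disabled - set(available)
    if unknown:
        raise ValueError(f"unknown tokenizers: {sorted(unknown)}")
    return {name: fn for name, fn in available.items() if name not in disabled}


# Toy stand-ins for real tokenizers (assumptions for illustration only).
available = {
    "whitespace": lambda text: text.split(),
    "chars": lambda text: [ch for ch in text if not ch.isspace()],
}

selected = select_tokenizers(available, disable_tokenizers=["chars"])
print(sorted(selected))
```

Raising on unknown names is a design choice in this sketch; a library could equally ignore them.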

toiro 0.0.3

14 Aug 13:15

toiro 0.0.3 incorporates the following changes:

  • fix a bug in the initial release.
  • fix typo: SVMClassifitionModel to SVMClassificationModel
  • fix docker example in README.md

toiro 0.0.2

13 Aug 14:22

This is the first release of this library.

Toiro is a tool for comparing Japanese tokenizers.

  • Compare the processing speed of tokenizers
  • Compare the word segmentation produced by each tokenizer
  • Compare the performance of tokenizers by benchmarking application tasks (e.g., text classification)
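
As a rough illustration of the first two comparisons above, here is a minimal sketch that times several tokenizers on the same texts and collects their segmentations. It does not use toiro itself: compare_tokenizers and the two toy tokenizers are hypothetical stand-ins, and the sample text is pre-segmented with spaces so that it runs without any Japanese tokenizer installed.

```python
import time
from typing import Callable, Dict, List

Tokenizer = Callable[[str], List[str]]


def tokenize_whitespace(text: str) -> List[str]:
    """Toy tokenizer: split on whitespace."""
    return text.split()


def tokenize_chars(text: str) -> List[str]:
    """Toy tokenizer: one token per non-space character."""
    return [ch for ch in text if not ch.isspace()]


def compare_tokenizers(
    texts: List[str], tokenizers: Dict[str, Tokenizer]
) -> Dict[str, dict]:
    """Time each tokenizer over all texts and keep its output for the first text."""
    report = {}
    for name, tokenize in tokenizers.items():
        start = time.perf_counter()
        outputs = [tokenize(text) for text in texts]
        elapsed = time.perf_counter() - start
        report[name] = {
            "seconds": elapsed,
            "tokens": outputs[0] if outputs else [],
        }
    return report


texts = ["すもも も もも も もも の うち"] * 1000
report = compare_tokenizers(
    texts, {"whitespace": tokenize_whitespace, "chars": tokenize_chars}
)
for name, result in report.items():
    print(name, f"{result['seconds']:.4f}s", result["tokens"])
```

Elapsed times vary by machine, so only the segmentations are comparable across runs.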

It also provides useful functions for natural language processing in Japanese.

  • Data downloader for Japanese text corpora
  • Preprocessor for these corpora
  • Text classifier for Japanese text (e.g., SVM, BERT)