Releases: meilisearch/charabia
Charabia v0.5.0
Changes
New Language support
- Hebrew: Remove diacritics via a specialized Normalizer (#101) @benny-n
- Japanese: Use Lindera as the specialized Segmenter for Japanese (#105) @loiclec
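To illustrate what removing Hebrew diacritics means in practice, here is a minimal standard-library sketch. It is not Charabia's Normalizer: the function names are hypothetical, and the set of code points treated as diacritics (cantillation marks and vowel points from the Unicode Hebrew block) is an assumption about what such a normalizer would strip.

```rust
/// Returns true for Hebrew cantillation marks and vowel points (niqqud).
/// Assumption: punctuation such as maqaf (U+05BE) and sof pasuq (U+05C3)
/// is kept, so those code points are excluded from the ranges.
fn is_hebrew_diacritic(c: char) -> bool {
    matches!(
        c,
        '\u{0591}'..='\u{05BD}'
            | '\u{05BF}'
            | '\u{05C1}'..='\u{05C2}'
            | '\u{05C4}'..='\u{05C5}'
            | '\u{05C7}'
    )
}

/// Hypothetical helper (not Charabia's API): drops Hebrew diacritics.
fn strip_hebrew_diacritics(text: &str) -> String {
    text.chars().filter(|c| !is_hebrew_diacritic(*c)).collect()
}

fn main() {
    // Pointed spelling: shin, shin dot, qamats, lamed, vav, holam, final mem.
    let pointed = "\u{05E9}\u{05C1}\u{05B8}\u{05DC}\u{05D5}\u{05B9}\u{05DD}";
    let plain = strip_hebrew_diacritics(pointed);
    // Only the four base letters remain.
    assert_eq!(plain, "\u{05E9}\u{05DC}\u{05D5}\u{05DD}");
    println!("{plain}");
}
```

With this kind of normalization, a query typed without vowel points can match text that was indexed with them.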
Chores
- Guide feature proposals to the product discussions space (#96) @gmourier
- Enhance benchmarks (#103) @ManyTheFish
Breaking changes ⚠️
- Fix a typo in a function name: original_lenghts becomes original_lengths (#97) @ManyTheFish
Thanks again to @ManyTheFish, @benny-n, @gmourier, and @loiclec! 🎉
Charabia v0.4.0
Breaking changes ⚠️
- Completely refactor the API to make it easier to contribute to and more practical to use
- Rename repository to Charabia
Discover our newly created crates released in beta on crates.io!
Thanks again to @ManyTheFish, @Kerollmops, and @curquiza! 🎉
Tokenizer v0.2.9
Tokenizer v0.2.8
Changes
- Changes related to the rebranding (#66)
- Update LICENSE (#67) @curquiza
- Small fix in benches/ (#71) @Thearas
- Set up the Lindera tokenizer for Japanese (ja) support (related to #49) (#70) @miiton
- Benchmark and optimize Japanese (#73) @ManyTheFish
- Decompose Japanese compound words (#75) @mosuka
- Update the dependencies (#80) @Kerollmops
Thanks again to @Kerollmops, @ManyTheFish, @Thearas, @curquiza, @miiton and @mosuka! 🎉
Tokenizer v0.2.7
Tokenizer v0.2.6
Changes
- Test Meilisearch issue 1714 (#58) @ManyTheFish
- Exclude Hangul from is_cjk (#60) @datamaker
- Add mapping between bytes in original word and normalized word (#59) @Samyak2
Thanks again to @ManyTheFish, @Samyak2, @datamaker and JB! 🎉
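The byte mapping between an original word and its normalized form can be pictured with the sketch below. It is an illustration under stated assumptions, not the crate's implementation: the `fold` normalizer, the `normalize_with_map` and `original_len` names, and the `(original bytes, normalized bytes)` pair layout are all hypothetical.

```rust
/// Hypothetical normalizer for illustration: folds two accented letters to ASCII.
fn fold(c: char) -> char {
    match c {
        '\u{00E9}' => 'e', // e with acute accent
        '\u{00E0}' => 'a', // a with grave accent
        other => other,
    }
}

/// Normalizes `word` and records, for each original character, the pair
/// (byte length in the original, byte length in the normalized form).
fn normalize_with_map(word: &str) -> (String, Vec<(usize, usize)>) {
    let mut normalized = String::new();
    let mut map = Vec::new();
    for c in word.chars() {
        let folded = fold(c);
        normalized.push(folded);
        map.push((c.len_utf8(), folded.len_utf8()));
    }
    (normalized, map)
}

/// Converts a byte length measured in the normalized word back to the
/// byte length of the matching prefix of the original word.
fn original_len(map: &[(usize, usize)], normalized_bytes: usize) -> usize {
    let mut seen = 0;
    let mut original = 0;
    for &(orig, norm) in map {
        if seen >= normalized_bytes {
            break;
        }
        seen += norm;
        original += orig;
    }
    original
}

fn main() {
    // "ete" spelled with two accented letters: 5 bytes before folding, 3 after.
    let word = "\u{00E9}t\u{00E9}";
    let (normalized, map) = normalize_with_map(word);
    assert_eq!(normalized, "ete");

    // The first two normalized bytes ("et") cover three original bytes.
    let prefix = original_len(&map, 2);
    assert_eq!(prefix, 3);
    assert_eq!(&word[..prefix], "\u{00E9}t");
}
```

A mapping of this kind lets a caller that matched a prefix in the normalized text slice the original text at a valid character boundary.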
Tokenizer v0.2.5
Changes
- Change ZeroRemover into ControlCharacterRemover (#55) @ManyTheFish
- Add a rustfmt config file into the project (#57) @Kerollmops
Thanks again to @Kerollmops, @ManyTheFish, and @curquiza! 🎉
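The rename to ControlCharacterRemover suggests that the normalizer strips control characters in general rather than a single character. The following sketch shows that idea with the standard library only; `remove_control_characters` is a hypothetical helper, and the exact set of characters the crate removes is an assumption.

```rust
/// Hypothetical helper (not the crate's API): drops every character in the
/// Unicode control category (Cc), which includes NUL, from a token.
fn remove_control_characters(token: &str) -> String {
    token.chars().filter(|c| !c.is_control()).collect()
}

fn main() {
    // NUL (U+0000) and BEL (U+0007) are both control characters.
    let cleaned = remove_control_characters("to\0ken\u{0007}");
    assert_eq!(cleaned, "token");
}
```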
Tokenizer v0.2.4
Changes
- Introduce a new default normalizer that removes zeroes from tokens (#52) @Kerollmops
Thanks again to @Kerollmops! 🎉
Tokenizer v0.2.3
Changes
- Make legacy tokenizer handle unicode separators (#47) @ManyTheFish
Thanks again to @ManyTheFish! 🎉
Tokenizer v0.2.2
Changes
- Fix non-breaking space separator (#44) @shekhirin
Thanks again to @LegendreM and @shekhirin! 🎉
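For context on the separator fixes in v0.2.2 and v0.2.3, the sketch below shows a non-breaking space (U+00A0) being treated as a word separator. It relies on Rust's `str::split_whitespace`, which splits on any character with the Unicode White_Space property; `split_words` is a hypothetical helper and does not reflect how the tokenizer itself detects separators.

```rust
/// Hypothetical helper: splits text on any Unicode whitespace character.
fn split_words(text: &str) -> Vec<&str> {
    text.split_whitespace().collect()
}

fn main() {
    // U+00A0 is the non-breaking space; it separates words like a regular space.
    let words = split_words("hello\u{00A0}big world");
    assert_eq!(words, vec!["hello", "big", "world"]);
}
```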