pycorrector is a toolkit for text error correction. It applies KenLM, T5, MacBERT, ChatGLM3, Qwen2.5, and other models to error-correction tasks and works out of the box.
Updated Oct 28, 2024 - Python
State-of-the-art (ranked #1 Aug 2022) German Speech Recognition in 284 lines of C++. This is a 100% private 100% offline 100% free CLI tool.
Training an n-gram-based language model using the KenLM toolkit for Deep Speech 2
CTC+Beam_Search+kenlm: a decoding system that uses Chinese characters as the modeling units of the acoustic model
wav2vec 2.0 recognition pipeline
Complete instructions for training a Persian spell checker and a language model, based on SymSpell and KenLM respectively, using a Wikipedia dataset.
Optical Character Recognition + Instance Segmentation for Russian and English
Romanian Automatic Speech Recognition from the ROBIN project
🎲 KenLM extension for spaCy 2.0.
A Java JNI wrapper for KenLM: Faster and Smaller Language Model Queries
INACTIVE - http://mzl.la/ghe-archive - Generate language models from OSCAR corpora
Neural Grammatical Error Correction for Romanian using Transformer
An AI tool that automatically generates captions and transcripts for YouTube videos in 67 languages and summarized text in 133 languages.
Create and adapt n-gram and JSGF language models, e.g. for Kaldi-ASR nnet3 chain models from Zamia-Speech
End-to-End Automatic Speech Recognition on PyTorch with CTC Decoder and KenLM
Automatic Speech Recognition using Conformer with Speech Sentiment Analysis & Text Summarizer
Scripts to train n-gram language models on Wikipedia articles
This repo shows how to fine-tune the wav2vec 2.0 model, along with its prerequisites.
Basic setup for using the KenLM library in C++