A Language Detection (LID) library written in Python, able to detect language from text, image or audio content.
Back-end: Python (numpy, pandas, etc.)
Server: Flask
Models: scikit-learn (sklearn), TensorFlow, Keras
Front-end: HTML, CSS (Bootstrap), vanilla JavaScript
Install LingoDect with pip
Clone this repository, and from the repository root run the following command:
    pip install -e .

You should now have lingodect installed as a library and can use it for development and testing.

Clone the project:

    git clone https://github.com/hadarsharon/lingodect

Go to the project directory:

    cd lingodect

Install dependencies:

    pip install -r requirements.txt

Start the server:

    python app.py

Alternatively, you can run commands via the CLI (run with the -h flag for help and information about available commands):

    python cli.py -h

To run tests, run the following command from the project root:

    pytest

The library currently supports all textual input that can be written directly to it (whether as a string via the CLI or
in a text box via the web application GUI), or via plaintext files such as .txt files.
Audio (speech) and image (handwriting) inputs are supported in most common
formats (.wav, .flac, .jpg, .png, etc.).
If your input file format is not supported, convert it first: .wav is a safe bet for audio and .png is a safe bet
for images.
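The fallback advice above can be sketched as a small pre-check that decides whether a file needs converting before it is passed to the library. This is an illustrative helper only and not part of LingoDect: the name `conversion_target` and both lookup tables are assumptions, and the tables list only the extensions named in this README, so the set of formats actually supported may be wider.

```python
from pathlib import Path

# Extensions named in this README; actual support may be broader.
KNOWN_FORMATS = {
    ".txt": "text",
    ".wav": "audio",
    ".flac": "audio",
    ".jpg": "image",
    ".png": "image",
}

# The "safe bet" format recommended for each input type.
SAFE_FORMAT = {"audio": ".wav", "image": ".png"}


def conversion_target(path, kind):
    """Return the path to convert to, or None if the file can be used as-is.

    `kind` is "audio" or "image" and must be supplied by the caller, because
    an unrecognised extension says nothing about the content type.
    (Hypothetical helper, not a LingoDect function.)
    """
    path = Path(path)
    if KNOWN_FORMATS.get(path.suffix.lower()) == kind:
        return None  # already in a format listed as supported
    return path.with_suffix(SAFE_FORMAT[kind])
```

The helper only computes the target file name. The conversion itself would still be done with an external audio or image tool of your choice.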
The library models are currently trained on over 100 languages, so there is a good chance that the language you want to detect is among the supported languages.