Indic Translation App

This project tackles the challenge of on-demand machine translation for multiple Indic languages in a resource-constrained environment. By leveraging the Gemma2 2B LLM and employing adapter switching (LoRA switch), the model can efficiently translate between various Indian languages without the need for separate models for each one, reducing computational demands.

The project specifically addresses the difficulties of training machine translation from English to nine Indic languages—Tamil, Hindi, Kannada, Malayalam, Telugu, Bengali, Marathi, Gujarati, and Odia—by utilizing adapter switching to enhance performance and manage the complexity of multilingual translation.

Analyzing the translation of source segments into target segments using an agentic approach. The system leverages the Groq API to ensure that the source segment is accurately translated into its corresponding target segment. The process involves validating the translation quality by comparing both segments and ensuring semantic and contextual alignment.

Key features include:

Machine translation done with the fine tuned Gemma2-2B model
Agentic methods Analysis for quality assurance and check for correctness in the translation process.
Use of Groq API to analysis

huggingface model

base_model: Hemanth-thunder/gemma-2-2b-bnb-4bit
peft_adapters: Hemanth-thunder/indic_mt_fine_tuned_peft_adapter

Demo

indic_adaptive_machine_translation_with_gemma2_2b_demo.mp4

In the root directory, create a new file called .env.

Create a .env file:

Generate your hugging face token and Groq Api(Agent) from the relevant service and Store the API key in the .env file:

HUGGING_FACE_TOKEN=your_api_key_here
AGENT_GROQ=your_api_key_here

Prerequisites

This repository holds the code for an Indic language translation application, packaged with Docker for easy deployment.

Make sure you have Docker installed on your machine.

Steps to Build and Run the App

Clone the repository:

git clone https://github.com/Hemanthkumar2112/Indic-machine-translation-gemma2-2B
cd Indic-machine-translation-gemma2-2B

Build the Docker image:
```
docker build -t indic_translation .
```

Run the Docker container:

docker run --gpus all -p 8080:8080 indic_translation

Access the application:
```
http://localhost:8080
```

logging

docker logs -f container_id --tail 1000

Citation

@article{
  title = {Efficient Multilingual Machine Translation for Indic Languages: Leveraging Gemma2-2B LLM with Agent},
  author = {Hemanth-thunder},
  year={2024}
}

License This project is licensed under the MIT License - see the LICENSE file for details.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Indic Translation App

huggingface model

Demo

Prerequisites

Steps to Build and Run the App

Citation

Files

README.md

Latest commit

History

README.md

File metadata and controls

Indic Translation App

huggingface model

Demo

Prerequisites

Steps to Build and Run the App

Citation