Use TranslationCORRECT Now!

The application is deployed here.

Full Repo Run Instructions

PIP & Packages

Important: Make sure pip is updated to the latest version. On Windows, use the following command (on Linux/macOS, use python or python3 instead of python.exe):

python.exe -m pip install --upgrade pip

Create your own virtual environment in the root folder of the repo using:

pip install virtualenv

python -m venv {name your env}

e.g.

python -m venv myenv

Activate the environment using the following for Windows:

{the name of your env}\Scripts\activate

e.g.

myenv\Scripts\activate

For Unix-based systems (Linux/macOS), use bin instead of Scripts and load the script with source:

source {the name of your env}/bin/activate

e.g.

source myenv/bin/activate

Now install all required packages into your newly created environment:

pip install -r requirements.txt
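If you are unsure whether the environment is actually active before installing packages, a small check like the one below can help. It is only a sketch that relies on the standard library: inside a venv, sys.prefix points at the environment while sys.base_prefix still points at the base Python installation. The function name in_virtualenv is our own and is not part of the repo.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # In a virtual environment sys.prefix is the environment folder,
    # while sys.base_prefix remains the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter in use:", sys.executable)
```

If the first line prints False, activate the environment again before running pip install -r requirements.txt.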

To Run Backend

Make sure your virtual environment is active, then run these commands:

cd .\backend\

uvicorn main:app --reload --host 0.0.0.0 --port 63030

You can replace 63030 with any port that is free on your machine.

You should see something like this as output:

INFO:     Will watch for changes in these directories: ['C:\\Users\\mekae\\TranslationCorrect\\backend']
INFO:     Uvicorn running on http://0.0.0.0:63030 (Press CTRL+C to quit)
INFO:     Started reloader process [68584] using WatchFiles
INFO:     Started server process [63380]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
WARNING:  WatchFiles detected changes in 'main.py'. Reloading...
INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [63380]
INFO:     Started server process [51168]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
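If uvicorn fails to start because the port is already in use, you can check a candidate port before passing it to --port. The snippet below is a minimal sketch using only the standard library; the helper name port_is_free is hypothetical and not part of the repo.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission error.
            return False
        return True


if __name__ == "__main__":
    # 63030 is the port used in the uvicorn command above.
    print("port 63030 free:", port_is_free(63030))
```

The result only reflects the moment of the check, so another process could still claim the port before the backend starts.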

To Run Frontend

You need Node.js and npm installed on your machine; React and the other frontend dependencies are installed by the npm install step.

In a new terminal, from the repository's root folder, do the following:

cd .\frontend\

You should be able to check the installed versions using node -v and npm -v:

PS C:\Users\mekae\TranslationCorrect\frontend> node -v
v20.13.1 
PS C:\Users\mekae\TranslationCorrect\frontend> npm -v       
10.8.0

Then you should be able to run:

npm install

and then, to start the frontend:

npm run dev

  VITE v5.2.13  ready in 244 ms

  ➜  Local:   http://localhost:5173/
  ➜  Network: use --host to expose
  ➜  press h + enter to show help

Ctrl+click the localhost link to open it in the browser.

Translation should now work on your typed input. If it does not, check the console log in the browser for hints about what went wrong.

Project Description

Our proposed system, TranslationCorrect, is designed to function as a robust and comprehensive NMT system with iterative improvement. It enables users to generate hypothesis translations, detect potential errors within them, and provide corrections.

The system's detailed architecture is composed of three main components:

I. Neural Machine Translation: The system allows users to input source text and generate high-quality hypothesis translations as output.

II. Fine-Grained Error Detection: Fine-grained error detection is performed on hypothesis translations, and a comprehensive analysis of potential translation errors is displayed to the user.

III. Error Correction UI: Users can make detailed edits, including error annotation scoring, annotation insertion, and text modifications (additions and deletions) to the hypothesis translation. Edits are tracked systematically to prioritize organization and clarity for the user. The edits are then collected and submitted to the Fine-Grained Error Detection component for iterative improvement in its error detection capabilities.

The three components work closely together, creating a seamless experience for obtaining accurate machine translations. The backend pipeline data flow is illustrated as follows:

A UI that facilitates effective translation correction through features such as error categorization and classification, text extraction, and hovering tooltips.
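To make the interplay of the three components concrete, here is a rough sketch of how a hypothesis, its detected error spans, and the user's tracked edits might be held together. Every name in it (ErrorSpan, Session, run_pipeline, replace_span) and the shape of the score field are assumptions for illustration only; the actual data structures in the backend and frontend may look quite different.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ErrorSpan:
    start: int      # character offset into the hypothesis (inclusive)
    end: int        # character offset into the hypothesis (exclusive)
    category: str   # e.g. "Errors In Numbers"
    score: float    # annotation score; the scale here is hypothetical


@dataclass
class Session:
    source: str
    hypothesis: str
    spans: List[ErrorSpan] = field(default_factory=list)
    edits: List[str] = field(default_factory=list)  # log of user edits


def run_pipeline(
    source: str,
    translate: Callable[[str], str],
    detect: Callable[[str, str], List[ErrorSpan]],
) -> Session:
    """Component I (translate) followed by component II (detect)."""
    hypothesis = translate(source)
    return Session(source, hypothesis, detect(source, hypothesis))


def replace_span(session: Session, span: ErrorSpan, new_text: str) -> None:
    """Component III: apply one user correction and record it."""
    old_text = session.hypothesis[span.start:span.end]
    session.hypothesis = (
        session.hypothesis[:span.start] + new_text + session.hypothesis[span.end:]
    )
    session.spans.remove(span)
    # Shift spans located after the edit so their offsets stay valid.
    delta = len(new_text) - (span.end - span.start)
    for other in session.spans:
        if other.start >= span.end:
            other.start += delta
            other.end += delta
    session.edits.append(f"replace {old_text!r} -> {new_text!r} [{span.category}]")


if __name__ == "__main__":
    # Stub translator and detector stand in for the real models.
    session = run_pipeline(
        "I have 3 cats.",
        translate=lambda src: "我有5只猫。",
        detect=lambda src, hyp: [ErrorSpan(2, 3, "Errors In Numbers", 1.0)],
    )
    replace_span(session, session.spans[0], "3")
    print(session.hypothesis)
    print(session.edits)
```

In this sketch the edit log is what would be submitted back to the error detection component for iterative improvement.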

Data Usage & Training Details/Resources

MQM Error Dataset: To simulate human user activities, we generated MQM data using OpenAI's o1 model. We designed prompts guiding o1 to self-generate MQM data, focusing on the English-Chinese (en-zh) language pair. The generated data was then evaluated with Unbabel's CometKiwi model, which yielded an MQM score for each data instance. After removing duplicates and invalid outputs, we obtained a total of 2,899 MQM data samples, which we used for evaluation (reported in our paper).

Error Categories: We have categorized potential errors into six categories:
Addition Of Text
Negation Errors
Mask In-Fill
Errors In Numbers
Named Entity Errors
Hallucination

The first five categories are major error types identified in Unbabel's xCOMET evaluation results. Hallucination was added as a sixth category because it recurred throughout our studies.
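For code that needs to refer to these categories, one option is to pin the six labels down in a single enumeration. The following is only a sketch: the class name ErrorCategory and the helper parse_category are hypothetical, and the repo may represent categories differently (for example as plain strings).

```python
from enum import Enum


class ErrorCategory(Enum):
    # Values are the six labels listed above.
    ADDITION_OF_TEXT = "Addition Of Text"
    NEGATION_ERRORS = "Negation Errors"
    MASK_IN_FILL = "Mask In-Fill"
    ERRORS_IN_NUMBERS = "Errors In Numbers"
    NAMED_ENTITY_ERRORS = "Named Entity Errors"
    HALLUCINATION = "Hallucination"


def parse_category(label: str) -> ErrorCategory:
    """Map a display label to its category, ignoring case and outer whitespace."""
    wanted = label.strip().lower()
    for category in ErrorCategory:
        if category.value.lower() == wanted:
            return category
    raise ValueError(f"unknown error category: {label!r}")


if __name__ == "__main__":
    print(parse_category("negation errors"))
```

Rejecting unknown labels with a ValueError keeps misspelled categories from silently entering annotation data.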
