This repository contains the code and resources for our tutorial presented at AMTA 2024.

surrey-nlp/AMTA-EditDistances-tutorial

AMTA 2024 Tutorial: Edit Distances and their application to downstream tasks, in research and commercial contexts.

Overview

This repository contains the code and resources for our tutorial presented at the 16th Biennial Conference of the Association for Machine Translation in the Americas (AMTA 2024). The tutorial covers the theoretical foundations of edit distances, their applications in Natural Language Processing (NLP), and the challenges and limitations when applying these metrics to tasks like Machine Translation (MT), Quality Estimation (QE), and Automatic Post-Editing (APE).

Tutorial Outline

The tutorial is structured into four parts:

  1. Part 1: Edit distances and their different implementations and applications

  2. Part 2: Analysing an incrementally-complex sequence of edits

  3. Part 3: Building a Computational Perspective

  4. Part 4: Implications for research and commercial applications of edit distances

Repository Structure

  • code/: Contains the Python notebook used during the tutorial, including:

    • Examples for calculating Levenshtein, Damerau-Levenshtein, LCS, and N-gram distances.
    • Examples for computing TER (using multiple implementations), BLEU, and chrF scores.
    • Scripts for visualizing the results of the different metrics using plots and charts.
  • data/: The example dataset used in the tutorial, with the reference and hypothesis translations stored in two separate files.

  • AMTA-ED-Tutorial-2024.pdf: The slide deck used for the presentation.
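
For readers who want a feel for the kind of computation the notebook covers before opening it, here is a minimal sketch of Levenshtein distance in plain Python. It is not taken from the notebook: the function names `levenshtein` and `word_edit_rate` are hypothetical, and the word-level rate is only a simplified, TER-like illustration (it ignores the block shifts that TER allows and applies no tokenization beyond whitespace splitting).

```python
def levenshtein(source, target):
    """Minimum number of insertions, deletions and substitutions
    needed to turn `source` into `target`.

    Works on any two sequences: strings (character level)
    or lists of tokens (word level).
    """
    # prev[j] holds the distance between the first i-1 items of source
    # and the first j items of target; start with the empty prefix.
    prev = list(range(len(target) + 1))
    for i, s_item in enumerate(source, start=1):
        curr = [i]  # turning i source items into nothing costs i deletions
        for j, t_item in enumerate(target, start=1):
            substitution_cost = 0 if s_item == t_item else 1
            curr.append(min(
                prev[j] + 1,                      # delete s_item
                curr[j - 1] + 1,                  # insert t_item
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def word_edit_rate(hypothesis, reference):
    """Word-level edit distance divided by reference length.

    Assumption: a simplified, TER-like rate without shift operations;
    it is not a substitute for a full TER implementation.
    """
    hyp_tokens = hypothesis.split()
    ref_tokens = reference.split()
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return levenshtein(hyp_tokens, ref_tokens) / len(ref_tokens)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(word_edit_rate("the cat sat on mat", "the cat sat on the mat"))
```

The same function handles characters and words because it only compares items for equality, which makes it easy to contrast character-level and word-level distances on the same sentence pair.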

Execute on Google Colab

Click here to open the code in Google Colab: Colab