Open ECG Digitizer

This repository provides a highly configurable tool for digitizing 12-lead ECGs, extracting raw time series data from scanned images or photographs (e.g., taken with a phone). It supports any subset of the 12 standard leads and is robust to perspective distortions and image quality variations.

Figure: segmented and perspective-corrected mobile phone photo.

Features

Extracts raw time series data from 12-lead ECG images
Supports both scanned and photographed ECGs (with perspective correction)
Works with any subset of leads
Easily configurable via YAML config files

File structure and module overview

Each component of the ECG digitization pipeline is a separate module under `src/model`.

The table below lists the modules in approximate execution order, with their purpose and notes relevant to debugging:

Module	Description
`src/model/unet.py`	Semantic segmentation network - a U-Net model trained to identify ECG traces, grids, and background. Retrain or fine-tune it using `src/train.py` if it underperforms. To mimic your own data, modify the on-the-fly transforms in `src/transform/vision.py`.
(`src/model/dewarper.py`)	Experimental full dewarping - for folded or curved ECG paper. Not formally evaluated. Not recommended for flat papers, as perspective correction is more robust. Not enabled in the provided configuration YAML files.
`src/model/perspective_detector.py`	Perspective correction - estimates and corrects projective distortions. Handles up to ~45° rotation.
`src/model/cropper.py`	Cropping and bounding box extraction - used to crop the image based on the location of the ECG leads.
`src/model/pixel_size_finder.py`	Grid size estimation - autocorrelation-based template matching. If the estimate is poor, configure the grid parameters (minor/major ratio, expected line counts) in your inference YAML.
`src/model/lead_identifier.py`	Layout identification - matches cropped regions to known ECG lead layouts using predefined templates. Update or prune templates in `src/config/lead_layouts_*.yml`.
`src/model/signal_extractor.py`	Segmentation-to-trace conversion - converts segmented images into digitized voltage–time signals. May set parts of a signal to NaN where traces overlap.
`src/model/inference_wrapper.py`	Main orchestration script - connects all components.
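
To make the grid size estimation step more concrete, the following is a minimal sketch of how the spacing of a periodic grid can be recovered from an autocorrelation. It is not the code in `src/model/pixel_size_finder.py`: the function name `estimate_grid_period`, the `min_lag` parameter, and the synthetic grid are assumptions made for illustration, and the template matching part of the real module is not shown.

```python
import numpy as np


def estimate_grid_period(profile: np.ndarray, min_lag: int = 2) -> int:
    """Return the lag (in pixels) of the strongest autocorrelation peak.

    Hypothetical helper, not part of the repository.
    """
    x = profile.astype(float) - profile.mean()
    # Full autocorrelation; keep the non-negative lags only.
    acf = np.correlate(x, x, mode="full")[x.size - 1:]
    if acf[0] == 0:
        raise ValueError("profile is constant, no periodicity to estimate")
    acf /= acf[0]  # normalise so that lag 0 equals 1
    # Search the first half of the lags, where enough samples still overlap.
    search = acf[min_lag: x.size // 2]
    return int(np.argmax(search)) + min_lag


# Synthetic grid mask (assumed input): one vertical grid line every 10 pixels.
grid = np.zeros((60, 200))
grid[:, ::10] = 1.0
column_profile = grid.sum(axis=0)  # project the mask onto the horizontal axis
period = estimate_grid_period(column_profile)
print(period)
```

On noisy real data the strongest peak can fall on a multiple of the true spacing, which is one reason the expected line counts and minor/major ratio are exposed as configuration.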

Questions or need help? Contact elias.stenhede at ahus.no

Installation

Requirements: Python 3.12 or later.

Note

This setup has been tested on Ubuntu 24.04.2 and Debian 12 with CUDA. You need to install git-lfs to download the weights.

Ensure you have installed `python3.12`, `git` and `git-lfs`.
Clone the repository: `git clone git@github.com:Ahus-AIM/Electrocardiogram-Digitization.git`
Navigate to the project_source_code folder.
Create and activate a virtual environment: `python3.12 -m venv venv && source venv/bin/activate`
Install dependencies: `python3 -m pip install -r requirements.txt`
Download the pre-trained weights: `git lfs pull`
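
Before running the steps above, it can help to confirm that the prerequisites are available. The following is a small sketch that uses only the Python standard library; the function name `check_prerequisites` is hypothetical and not part of this repository.

```python
import shutil
import sys


def check_prerequisites(min_python: tuple[int, int] = (3, 12)) -> dict[str, bool]:
    """Report whether the interpreter and required command-line tools are found."""
    return {
        # Compare (major, minor) of the running interpreter to the minimum.
        "python": sys.version_info[:2] >= min_python,
        # shutil.which returns None when the executable is not on PATH.
        "git": shutil.which("git") is not None,
        "git-lfs": shutil.which("git-lfs") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

If `git-lfs` is reported as missing, `git lfs pull` will not be able to download the weights.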

Running inference on a folder with images

Modify a config file with your paths and settings, for example `src/config/inference_wrapper_ahus_testset.yml`.
Ensure that your config file points to a layout file containing your expected layouts, for example `lead_layouts_reduced.yml` or `lead_layouts_george-moody-2024.yml`.
Run: `python3 -m src.digitize --config src/config/your_config_file.yml`
You can also override individual config values on the command line, for example: `python3 -m src.digitize --config src/config/your_config_file.yml DATA.output_path=my_output/folder`
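
The `DATA.output_path=my_output/folder` syntax addresses a nested config value by a dotted key. As an illustration of the idea only, the sketch below applies such an assignment to a nested dictionary. It is not how `src.digitize` parses its arguments; `apply_override` and the default value `default_output` are hypothetical.

```python
from typing import Any


def apply_override(config: dict[str, Any], assignment: str) -> None:
    """Apply one 'A.B.c=value' assignment to a nested dict, in place."""
    dotted_key, _, value = assignment.partition("=")
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        # Walk down the hierarchy, creating missing levels as empty dicts.
        node = node.setdefault(part, {})
    node[leaf] = value  # kept as a string; real tools usually parse the type


# Assumed config shape: only DATA.output_path is taken from the example above.
config = {"DATA": {"output_path": "default_output"}}
apply_override(config, "DATA.output_path=my_output/folder")
print(config)
```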

Note

The output values are expressed in microvolts (µV).
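
Many ECG tools work in millivolts, so a unit conversion is often the first post-processing step. The sketch below assumes a digitized lead has already been loaded into a NumPy array; the array contents and the helper name `microvolts_to_millivolts` are made up for illustration, and the output file format is not covered here.

```python
import numpy as np

UV_PER_MV = 1000.0  # 1 mV = 1000 µV


def microvolts_to_millivolts(signal_uv: np.ndarray) -> np.ndarray:
    """Convert a signal from µV to mV; NaN samples stay NaN."""
    return np.asarray(signal_uv, dtype=float) / UV_PER_MV


# Hypothetical lead samples in µV, with one missing sample stored as NaN.
lead_uv = np.array([0.0, 500.0, np.nan, 1000.0])
lead_mv = microvolts_to_millivolts(lead_uv)
peak_mv = np.nanmax(lead_mv)  # NaN-aware maximum ignores the missing sample
print(lead_mv, peak_mv)
```

Using NaN-aware reductions such as `np.nanmax` keeps missing samples from propagating into summary statistics.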

Train on custom dataset

Change `data_path` for TRAIN, VAL and TEST in `src/config/unet.yml` to the locations of your custom dataset.
Run: `python3 -m src.train`

Mandatory Citation

If you use this code or dataset in your research, please cite the following paper:

@misc{stenhede_digitizing_2025,
  title        = {Digitizing Paper {ECGs} at Scale: An Open-Source Algorithm for Clinical Research},
  author       = {Stenhede, Elias and Bjørnstad, Agnar Martin and Ranjbar, Arian},
  year         = {2025},
  doi          = {10.48550/ARXIV.2510.19590},
  shorttitle   = {Digitizing Paper {ECGs} at Scale}
}
