Radhu903/openDiagram

 
 


openDiagram

Extraction of semantic data from diagrams in scientific and other technical/business documents.

Overview

In many documents the diagrams are a key component of the information. Data are created in semantic form and output as machine-readable files and then, in one of the great barbarisms of this century, are trashed into bitmaps and further degraded by JPEG compression. This lost data leads to irreproducible science, and in the worst cases people die. (Clinical trials are often published as PDF, and data extraction is hard or nearly impossible.)

This project tackles the impossible - reconstituting semantic data for the world - "turning hamburgers into cows".

Among the subjects I have successfully extracted semantic data from:

  • phylogenetic trees
  • chemical structures and reactions
  • study baseline data
  • cyclic voltammograms
  • forest plots

Many of these share common semantic diagrammatic abstractions, and AMI builds these up using heuristics.
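To make the idea of a shared abstraction concrete, here is a minimal sketch of one that several of the diagram types above could plausibly have in common: a linear axis calibrated from two tick marks, used to map pixel coordinates back to data values. This is an illustration only and is not taken from the AMI code; the names `LinearAxis`, `to_value` and `pixels_to_data` are hypothetical, and the sketch assumes linear (not logarithmic) axes.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class LinearAxis:
    """Hypothetical abstraction: a linear axis calibrated from two tick marks.

    Each tick is given as (pixel position, data value read from its label).
    """
    pixel0: float
    value0: float
    pixel1: float
    value1: float

    def to_value(self, pixel: float) -> float:
        """Map a pixel coordinate to a data value by linear interpolation."""
        if self.pixel1 == self.pixel0:
            raise ValueError("calibration ticks must be at distinct pixel positions")
        scale = (self.value1 - self.value0) / (self.pixel1 - self.pixel0)
        return self.value0 + (pixel - self.pixel0) * scale


def pixels_to_data(
    points: Iterable[Tuple[float, float]],
    x_axis: LinearAxis,
    y_axis: LinearAxis,
) -> List[Tuple[float, float]]:
    """Convert (x, y) pixel points to data coordinates using two calibrated axes."""
    return [(x_axis.to_value(px), y_axis.to_value(py)) for px, py in points]


if __name__ == "__main__":
    # Assumed calibration: x ticks at pixels 100 and 500 labelled 0.0 and 1.0;
    # y ticks at pixels 400 and 100 labelled 0.0 and 10.0 (pixel y grows downwards).
    x_axis = LinearAxis(100, 0.0, 500, 1.0)
    y_axis = LinearAxis(400, 0.0, 100, 10.0)
    print(pixels_to_data([(300, 250)], x_axis, y_axis))
```

The same calibration step would apply whether the recovered points belong to a voltammogram trace or to the markers of a forest plot; only the later interpretation of the points differs.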

preprocessing with ami

See PREPROCESS.md.

creation of project


