AniketGurav/Table-detection-and-Document-layout-analysis

 
 

Table-detection-and-Document-layout-analysis

Introduction

This repository applies state-of-the-art techniques to table detection and document layout analysis. Table detection uses MMDetection version 1.2, whereas document layout analysis uses models developed with MMDetection version 2.0.

Setup

The models are developed in the PyTorch-based MMDetection framework (version 2.0).

git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
python setup.py install
python setup.py develop
pip install -r requirements.txt

Image Augmentation

We use dilation and smudge techniques for data augmentation.
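As a rough illustration of what these two augmentations do to a document image, the sketch below thickens dark strokes with a minimum filter (dilation of the ink) and softens them with a box blur used as a stand-in for smudging. The function names `dilate_dark` and `smudge`, the kernel size, and the choice of a box blur are assumptions made for illustration; the repository's own augmentation code may work differently.

```python
import numpy as np


def dilate_dark(img: np.ndarray, k: int = 3) -> np.ndarray:
    """Thicken dark strokes on a light page.

    Takes the minimum over each k x k window (k is assumed odd), so every
    dark pixel spreads to its neighbours.
    """
    pad = k // 2
    # Pad with white so the page border does not introduce dark pixels.
    padded = np.pad(img, pad, mode="constant", constant_values=255)
    h, w = img.shape
    out = np.full_like(img, 255)
    for dy in range(k):
        for dx in range(k):
            out = np.minimum(out, padded[dy:dy + h, dx:dx + w])
    return out


def smudge(img: np.ndarray, k: int = 3) -> np.ndarray:
    """Blur ink into its surroundings with a k x k box filter.

    This is an assumed approximation of a smudge effect, not the
    repository's implementation.
    """
    pad = k // 2
    padded = np.pad(img.astype(np.float32), pad, mode="edge")
    h, w = img.shape
    acc = np.zeros((h, w), dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            acc += padded[dy:dy + h, dx:dx + w]
    return (acc / (k * k)).round().astype(np.uint8)


# Tiny synthetic "page": white background with a single black pixel.
page = np.full((5, 5), 255, dtype=np.uint8)
page[2, 2] = 0
thick = dilate_dark(page)  # the black pixel grows into a 3 x 3 block
soft = smudge(page)        # the black pixel is blended into its neighbours
```

Both functions expect a single-channel 8-bit image with dark text on a light background; a colour scan would need to be converted to grayscale first.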


Model Zoo

Config files for the models:

  1. For table detection: Config_file

  2. For document layout analysis: Config_file

Note: The paths in the config files only need to be changed for training.

Checkpoints of the trained models:

| Model Name | Checkpoint File |
| --- | --- |
| Table structure recognition | Checkpoint |
| Document layout analysis | Checkpoint |
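Once a config file and its matching checkpoint have been downloaded, inference with the MMDetection 2.0 Python API might look like the sketch below. The file paths are hypothetical placeholders, `load_and_detect` and `keep_confident` are helpers introduced here for illustration, and the 0.5 score threshold is an arbitrary assumption.

```python
def load_and_detect(config_path, checkpoint_path, image_path, device="cuda:0"):
    """Run one image through a trained detector (requires mmdet 2.x)."""
    # Imported lazily so keep_confident can be used without mmdet installed.
    from mmdet.apis import inference_detector, init_detector

    model = init_detector(config_path, checkpoint_path, device=device)
    return inference_detector(model, image_path)


def keep_confident(result, score_thr=0.5):
    """Flatten a detection result into (class_id, box, score) tuples.

    Assumes the mmdet 2.x result layout: one sequence per class, each row
    being [x1, y1, x2, y2, score]. Models with a mask head return a
    (bbox_result, segm_result) tuple, of which only the boxes are used here.
    """
    bbox_result = result[0] if isinstance(result, tuple) else result
    kept = []
    for class_id, dets in enumerate(bbox_result):
        for x1, y1, x2, y2, score in dets:
            if score >= score_thr:
                box = (float(x1), float(y1), float(x2), float(y2))
                kept.append((class_id, box, float(score)))
    return kept


# Hypothetical usage; replace the placeholder paths with your own files.
# result = load_and_detect("path/to/config.py", "path/to/checkpoint.pth", "page.png")
# for class_id, box, score in keep_confident(result):
#     print(class_id, box, score)
```

The config and checkpoint must come from the same model and the same MMDetection version, since this repository uses different versions for its two tasks.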

Datasets

  1. Table detection and structure recognition: refer to Dataset for a better understanding of the dataset.

  2. Document layout analysis: refer to Dataset for a better understanding of the dataset.

Training

Refer to the two Colab notebooks that have been mentioned, as they walk you through the steps to follow. If you are using a custom dataset, go through the MMDet Docs.
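MMDetection configs are Python files, so the paths that usually need changing before training on a custom dataset can be collected in a small fragment like the one below. Every path here is a hypothetical placeholder, the COCO-style JSON annotation format is an assumption, and the exact fields depend on the base config being extended.

```python
# Hypothetical dataset root; adjust to your own directory layout.
data_root = "data/my_dataset/"

# Annotation files and image folders for each split (COCO-style JSON assumed).
data = dict(
    train=dict(
        ann_file=data_root + "annotations/train.json",
        img_prefix=data_root + "images/train/",
    ),
    val=dict(
        ann_file=data_root + "annotations/val.json",
        img_prefix=data_root + "images/val/",
    ),
    test=dict(
        ann_file=data_root + "annotations/val.json",
        img_prefix=data_root + "images/val/",
    ),
)

# Checkpoint used to initialise the weights before fine-tuning (placeholder).
load_from = "path/to/checkpoint.pth"

# Directory where logs and newly trained checkpoints are written (placeholder).
work_dir = "work_dirs/my_experiment"
```

In a real config these entries would sit alongside the model, pipeline, and schedule settings inherited from the chosen base config.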
