Nathan Beals edited this page Jul 1, 2020 · 1 revision

Textension Wiki

Welcome to the homepage of the Textension wiki. Here you can find technical details about the implementation as well as an overview of the key features.

Contents

Summary

Built With

  • Flask - The web framework used (Python 3)
  • Jinja2 - Template engine
  • Bootstrap - Front-end component library
  • Docker - Container / Dependency management

Versioning

This project is being developed using an iterative approach. Therefore, no releases have been made yet and the project is subject to drastic changes. No versioning practices will be followed until release. For a history of changes made to this project, see the commit history.

Key Features

All of the key features relate to the text interaction page (interact.html). Some supporting features (file upload, etc.) are not covered.

Home

  1. Upload File
  2. Upload Image

Interaction Page

Function Description
  1. Download
    1. Page - Downloads an image of the page as you currently see it (including drawings made with the “Draw” tool)
    2. Text - JSON-formatted .txt file of the OCR’d text
  2. Open/Close All Spaces
    • Opens or closes spacing in between lines/words on the interaction object
  3. Vertical Space / Horizontal Space
    • When the “Vertical Space” or “Horizontal Space” mode is selected, the Open/Close All Spaces actions apply only to the selected axis (vertical or horizontal)
  4. OCR
    • Enable/show OCR’d text generated from the uploaded page
  5. OCR Uncertainty
    • Display OCR uncertainty as a colour gradient. More Orange = More Uncertain
  6. N-Gram Usage
    • Display usage of N-Grams over time (from Google Ngrams)
  7. Locations
    • Identify locations within the OCR’d text (e.g. “Toronto”) and then display a Google map of that place
  8. Erase Words
    • Allows the user to erase or modify a word, or all occurrences of that word.
  9. Draw
    • Custom draw/markup the document
  10. Dictionary
    • Click and hold on a word to display a context menu with the dictionary definition
  11. Context map
    • Build a context map of words by clicking words in order to add them to the map.

Project Structure

/ - Root

  • Readme.md - project readme, getting started
  • file_upload.py - Main entrypoint for the Flask app; run export FLASK_APP=file_upload.py before starting Flask
  • run.py - alternate entrypoint that redirects to file_upload.py

/server

Storage for data used in the operation of the server (such as sample files).

/static - Bulk of Files

/static/css

Cascading stylesheets are found here, along with some image resources.

/static/js

Clientside javascript used to drive the UI/UX.

  • autosize.min.js

  • event.js

    • While interact.js implements the interactive functionality of the various tools, event.js is what actually creates and updates the corresponding objects in the DOM.
    • Details here.
  • interact.js

    • Bulk of the interaction resides here
    • Details found here
  • linguistic.js

    • Implementation of the core functionality of the “dictionary” and “context map” tools
    • Details found here
  • main.js

    • Entrypoint of the clientside javascript. Initializes values and then enters interact.js
  • textensionModel.js

    • New class (intended to be expanded if/when a re-write happens) that will be the single source of truth for the actual textension data
      • That means: the OCR’d text, confidence levels, image/mesh map, and locations of the images (TODO: currently stores images inline as base64; make it async file requests instead)

/static/lib

Libraries with discrete functionality are stored here.

  • Bootstrap 3
  • Capture
    • custom library for capturing from webcams
  • Dropzone
  • Jquery

/static/py

Bulk of the Python serverside components.

Note: file_upload.py modifies its own sys.path so that it can import the files in this directory directly, without addressing them through the intermediate directories (e.g. import pdf_text_extraction instead of import static.py.pdf_text_extraction).
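The note above can be illustrated with a minimal sketch. The helper name add_import_dir and the relative path are assumptions made for this example, not code taken from file_upload.py; the actual file may modify sys.path differently.

```python
import sys
from pathlib import Path


def add_import_dir(directory):
    """Prepend a directory to sys.path (only once) and return its absolute path."""
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        # Inserting at position 0 makes modules in this directory win name lookups.
        sys.path.insert(0, resolved)
    return resolved


# Assumed layout: the script is started from the repository root,
# so static/py resolves relative to the current working directory.
py_dir = add_import_dir(Path("static") / "py")

# After this call, a module stored at static/py/pdf_text_extraction.py
# could be imported as: import pdf_text_extraction
```

Prepending rather than appending means a module in /static/py shadows any installed module with the same name, which is worth keeping in mind when naming files in that directory.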

/templates - Flask templates

Contains all the flask (jinja2) templates that are rendered server side before being sent to the client.

  • interact.html - template for the interact page (the main page used when interacting with the site).
  • index.html - Index (home) page.
  • base.html - Basic re-usable base used in interact.html

Code Supporting Key Features

This list is non-exhaustive; it is meant as a starting point.

1. Download Page/Text

File Note
interact.html HTML element id #download_data, <a> links named “Download Text” and “Download Page”
main.js javascript function downloadText
interact.js javascript function print (actually downloads the image of the page)
“whole backend” Downloading the data for a page downloads all of the data generated about it, and the whole backend is involved in generating that data.

2. Open/Close All Spaces

File Note
interact.js functions toggleMode, disableMode, setActiveMode, openSpaces, closeSpaces, toggleSingeSpace
interact.html <a> links with text “Open All Spaces” & “Close all Spaces”
CAIS.py Opening and closing spaces between lines of text only works if there are lines to expand; this script (“Content Aware Image Slicing”) provides them.
file_upload.py Responds to the web route /interact, calls the routines that process the data, and then templates the result into interact.html
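To give a feel for what slicing a page into lines can involve, here is a minimal sketch of one common approach: scanning pixel rows and grouping consecutive rows that contain ink. This is not the algorithm in CAIS.py, which is not described on this page; the function name find_text_lines and the 0/1 pixel representation are assumptions made for the example.

```python
def find_text_lines(rows, ink_threshold=0):
    """Return (start, end) row ranges (end exclusive) for runs of rows containing ink.

    rows is a list of pixel rows, each a list of 0 (background) or 1 (ink).
    A row counts as text when it has more than ink_threshold ink pixels.
    """
    lines = []
    start = None
    for y, row in enumerate(rows):
        has_ink = sum(row) > ink_threshold
        if has_ink and start is None:
            start = y                      # a new line of text begins
        elif not has_ink and start is not None:
            lines.append((start, y))       # the current line ended on the previous row
            start = None
    if start is not None:
        lines.append((start, len(rows)))   # text ran to the bottom edge
    return lines


page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
print(find_text_lines(page))  # [(1, 3), (5, 6)]
```

The gaps between the returned ranges are the places where vertical space could be opened or closed without cutting through text.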

3. Vertical/Horizontal Space Mode

Similar to Open/Close All Spaces. The Vertical/Horizontal Space mode buttons change the mode that the Open/Close All Spaces buttons use. Each button calls toggleMode and carries its mode as data embedded directly in the HTML tag, e.g. <input id="vertical-space" type="checkbox" onclick="toggleMode(this, false, false, false, false);" data-mode="vertical"/>

4. OCR

File Note
interact.html ocr-text element id and associated elements within it.
At the bottom of the file, find var ocr = {{ ocr }}. This is where Flask templates data into JS on load.
interact.js injectMetaData() Loads data from the aforementioned var ocr = {{ocr}} onto the associated html
event.js Event handler for user interaction events, in this case, #ocr-text on click event which loads/unloads the ocr data into the appropriate fields
textension.py Container class pulling together all the functionality implemented in other files
ocr_*.py Various OCR related files, uses tesseract.
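The var ocr = {{ ocr }} line works because the server turns Python data into a JavaScript literal before the page is sent. The sketch below imitates that step with the standard library only, using json.dumps in place of the Flask/Jinja2 rendering. The function name render_ocr_script and the text and confidence field names are assumptions for illustration, not the project's actual data layout.

```python
import json


def render_ocr_script(ocr_words):
    """Serialize OCR data to JSON and embed it in a <script> tag as var ocr."""
    payload = json.dumps(ocr_words)
    # Escape "</" so text inside the data cannot close the <script> tag early.
    payload = payload.replace("</", "<\\/")
    return "<script>var ocr = " + payload + ";</script>"


snippet = render_ocr_script([{"text": "Toronto", "confidence": 96}])
print(snippet)
# <script>var ocr = [{"text": "Toronto", "confidence": 96}];</script>
```

In a real Jinja2 template, the tojson filter performs a comparable serialization and escaping step, which is generally safer than interpolating a raw value into a script.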

5. OCR Uncertainty

All the same files as OCR, except pertaining specifically to the uncertainty values generated by tesseract.
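As an illustration of the “More Orange = More Uncertain” idea, the sketch below maps a confidence value to a CSS colour whose opacity grows as confidence falls. It assumes confidences on a 0 to 100 scale; the function name uncertainty_colour, the orange value, and the max_alpha parameter are choices made for this example. The project's real gradient lives in the clientside JavaScript and may be computed differently.

```python
def uncertainty_colour(confidence, max_alpha=0.8):
    """Map a 0-100 confidence to an rgba() string; lower confidence is more opaque orange."""
    clamped = max(0.0, min(100.0, float(confidence)))
    # Uncertainty is the complement of confidence, scaled to the allowed opacity.
    alpha = round((1.0 - clamped / 100.0) * max_alpha, 2)
    return "rgba(255, 165, 0, {})".format(alpha)


print(uncertainty_colour(100))  # rgba(255, 165, 0, 0.0)
print(uncertainty_colour(50))   # rgba(255, 165, 0, 0.4)
print(uncertainty_colour(0))    # rgba(255, 165, 0, 0.8)
```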

6. N-Gram Usage

File Note
interact.html #ngram element id and associated elements within it.
At the bottom of the file, find var ngram_plot = {{ ngram_plot }}. This is where Flask templates data into JS on load.
interact.js setUniqueness() and drawOverlay both interact with the n-gram usage plots
event.js Event handler for user interaction events, in this case, #ngram on click event which hides/shows the usage plots
textension.py Container class pulling together all the functionality implemented in other files
getngrams.py Gets the N-Gram chart from the Google n-gram usage-over-time portal.

7. Locations

File Note
interact.html #locations element id and associated elements within it.
interact.js drawLocations turns on/off the drawing of location maps
event.js Event handler for location-related on click events.
googleMaps.py Gets the map image from the Google Maps API

8. Erase Words

Very similar to the OCR/OCR Uncertainty headings with the same files, but also:

File Note
textensionModel.js The beginnings of cleaning up the clientside data so that only one source of truth exists.

9. Draw

File Note
interact.html #draw element id and associated elements within it. (draw-color, draw-line, etc)
interact.js draw turns on/off drawing. changeMode changes to drawing mode.
event.js Event handler for draw-related on click events.

10. Dictionary

File Note
interact.html #dictionary element id and associated elements within it.
linguistic.js defineWord()
event.js Event handler for dictionary-related on click events.

11. Context Map

File Note
interact.html #context* element ids and associated elements within them.
linguistic.js createContextMap()
event.js Event handler for context-map-related on click events.