Session 7: Adapting and integrating existing open source projects

[Image: graphic recording of session 7]

Scope and purpose

  • Guiding question
    Which of the available open-source projects may meet project needs?

  • Considerations
    Open-ONI, OpeNER, Apache, NERC-Fr, Palladio, Voyant Tools, D3

  • Goal
    Shortlist of open source tools to adapt

  • Discussants
    Ludovic Moncla (lead), Mary Elings & Elena Azadbakht

Documentation

Discussion summary

During this session, we reviewed several open-source projects, paying particular attention to the research activities and experience of our grant participants. In discussing the available tools, we attended to four main project needs: 1) browsing and sharing the document collection; 2) annotating the corpus; 3) processing the corpus using machine learning, geoparsing, and text mining techniques; and 4) visualizing and exploring the corpus.

Decisions

We resolved to adapt tools for the following purposes in building our project:

  1. Browsing Corpus
       • OpenONI, the Online Newspaper Initiative: provides a function set for loading, modeling, and indexing data
  2. Annotating Corpus
       • BRAT: a server-based tool used to annotate the training and verification data for natural language processing
       • Pelagios and Recogito: a semantic annotation tool for texts and images that can identify and map places
       • PERDIDO Geoparser: a flexible geoparser that could be adapted to use a manually annotated gazetteer of historic place names
  3. Natural Language Processing
       • spaCy: a flexible Python library for natural language processing capable of performing most of the needed project tasks
  4. Visualization
       • D3 (Data-Driven Documents): a JavaScript library that will enable us to make custom web-based visualizations that are interoperable across browsers
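
To make the annotation step more concrete, here is a minimal sketch of gazetteer-based place matching whose output resembles brat's standoff text-bound annotation lines. It is not the PERDIDO, Recogito, or spaCy API; the gazetteer entries, the function names `find_places` and `to_standoff`, and the `Location` label are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical gazetteer of historic place names -> (latitude, longitude).
# The entries and coordinates are illustrative, not project data.
GAZETTEER = {
    "Lutetia": (48.8566, 2.3522),
    "Lugdunum": (45.7640, 4.8357),
}


def find_places(text, gazetteer):
    """Return (start, end, name) character spans for gazetteer names in text."""
    if not gazetteer:
        return []
    # Try longer names first so a multi-word name wins over a shorter prefix.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b"
    )
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


def to_standoff(spans, label="Location"):
    """Format spans as brat-style text-bound lines: ID, type start end, text."""
    return [
        f"T{i}\t{label} {start} {end}\t{name}"
        for i, (start, end, name) in enumerate(spans, start=1)
    ]


text = "The road from Lugdunum to Lutetia was long."
spans = find_places(text, GAZETTEER)
lines = to_standoff(spans)
for line in lines:
    print(line)
```

Exact string matching like this misses spelling variants and ambiguous names, which is one reason the manually annotated training data and a dedicated geoparser remain part of the plan.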
