This repository was archived by the owner on Jul 21, 2025. It is now read-only.

Octroy classroom use

These notes are for a course of 6 lectures and 4 labs, as taught at FaMAF in 2017, Cordoba, Argentina:

  • Class 1: Intro to Information Extraction (IE)
  • Lab 1: (octroy master branch) Perl baseline, Maven, UIMA pipeline
  • Class 2: Named Entity Recognition
  • Lab 2: (octroy branch class 2) OpenNLP part of the pipeline, re-training
  • Class 3: Rule-based IE
  • Class 4: Statistical IE (CRFs)
  • Lab 3: (octroy branch class 3) UIMA RuTA part of the pipeline
  • Class 5: Hybrid IE
  • Lab 4: (octroy master branch) ClearTk, training and execution
  • Class 6: Research directions

Lab 1

Objectives: familiarize the participant with the UIMA XMI format, the UIMA Eclipse environment, and command-line compilation and execution using Maven. Evaluation is done using ruta-evaluation-standalone.

Lab 2

Objectives: delve into Apache OpenNLP named entity MaxEnt training and execution, both within Apache UIMA and outside it. Prepare the background for ClearTk.

Lab 3

Objectives: familiarize the participant with the UIMA RuTA Workbench. Deployment of UIMA RuTA scripts written in the workbench. Debugging of scripts.

Lab 4

Objectives: create CRF annotators using ClearTk. Feature extraction. Training and deployment.