
Overview

The repository includes a number of convenience scripts that illustrate and automate common usage.

Basic evaluation and reporting

The basic evaluation scripts automate the following workflow:

  1. convert the gold data to the evaluation tool format,
  2. convert each system run output to the evaluation tool format,
  3. evaluate each system run.

The following are written to the output directory:

  • detailed evaluation report for each run (*.evaluation),
  • summary evaluation report for comparing runs (00report.tab).

Usage for TAC14 output format:

./scripts/run_tac14_evaluation.sh \
    /path/to/gold.xml \              # TAC14 gold standard queries/mentions
    /path/to/gold.tab \              # TAC14 gold standard link and nil annotations
    /system/output/directory \       # directory containing (only) TAC14 system output files
    /script/output/directory \       # directory to which results are written
    number_of_jobs                   # number of jobs for parallel mode
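
For example, the following call (all paths illustrative) evaluates every run in runs/tac14 using 8 parallel jobs, then prints the summary report:

./scripts/run_tac14_evaluation.sh \
    gold/tac14_queries.xml \
    gold/tac14_links.tab \
    runs/tac14 \
    out/tac14 \
    8
cat out/tac14/00report.tab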

Usage for TAC13 output format:

./scripts/run_tac13_evaluation.sh \
    /path/to/gold.xml \              # TAC13 gold standard queries/mentions
    /path/to/gold.tab \              # TAC13 gold standard link and nil annotations
    /system/output/directory \       # directory containing (only) TAC13 system output files
    /script/output/directory \       # directory to which results are written
    number_of_jobs                   # number of jobs for parallel mode 
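
A reasonable choice for number_of_jobs is the machine's core count. For example, on Linux (nproc is part of GNU coreutils; substitute a literal number elsewhere):

./scripts/run_tac13_evaluation.sh \
    gold/tac13_queries.xml \
    gold/tac13_links.tab \
    runs/tac13 \
    out/tac13 \
    "$(nproc)"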

Analysis and confidence reporting

The analysis scripts automate the following workflow:

  1. run the basic evaluation,
  2. calculate confidence intervals for each system run,
  3. count errors for each system run (nil-as-link, link-as-nil, wrong-link counts).

The following are written to the output directory:

  • detailed evaluation report for each run (*.evaluation),
  • summary evaluation report for comparing runs (00report.tab),
  • detailed confidence interval report for each run (*.confidence),
  • summary confidence interval report for comparing runs (00report.*),
  • error type distribution for each run (*.analysis).
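
The reports are plain text and can be inspected directly. A minimal sketch, assuming the script output directory is out/ (run names are illustrative):

ls out/                       # *.evaluation, *.confidence and *.analysis per run
cat out/00report.tab          # summary evaluation report
cat out/example_run.analysis  # error counts for one run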

Usage for TAC14 output format:

./scripts/run_tac14_all.sh \
    /path/to/gold.xml \              # TAC14 gold standard queries/mentions
    /path/to/gold.tab \              # TAC14 gold standard link and nil annotations
    /system/output/directory \       # directory containing (only) TAC14 system output files
    /script/output/directory         # directory to which results are written

Usage for TAC13 output format:

./scripts/run_tac13_all.sh \
    /path/to/gold.xml \              # TAC13 gold standard queries/mentions
    /path/to/gold.tab \              # TAC13 gold standard link and nil annotations
    /system/output/directory \       # directory containing (only) TAC13 system output files
    /script/output/directory         # directory to which results are written

Filtered evaluation

The filtered evaluation scripts automate the following workflow:

  1. filter gold data to include a specific subset of instances,
  2. filter each system run to include a specific subset of instances,
  3. run the basic evaluation over subset data.

The following are written to an output directory for each subset:

  • detailed evaluation report for each run (*.evaluation),
  • summary evaluation report for comparing runs (00report.tab).

The following subsets/directories are defined:

  • PER - mentions with person entity type,
  • ORG - mentions with organisation entity type,
  • GPE - mentions with geo-political entity type,
  • NW - mentions from newswire documents,
  • WB - mentions from newsgroup and blog documents,
  • DF - mentions from discussion forum documents,
  • entity-document type combinations (PER_NW, PER_WB, PER_DF, ORG_NW, etc.).
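
Assuming each subset is written to a subdirectory of the script output directory named after the subset (an assumption based on the description above), the summary reports can be compared across subsets with a simple loop:

# Print the per-subset summary reports one after another
# (subset subdirectory layout is an assumption; adjust to the actual output).
for subset in PER ORG GPE NW WB DF; do
    echo "== $subset =="
    cat "out/$subset/00report.tab"
done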

Usage for TAC14 output format:

./scripts/run_tac14_filtered.sh \
    /path/to/gold.xml \              # TAC14 gold standard queries/mentions
    /path/to/gold.tab \              # TAC14 gold standard link and nil annotations
    /system/output/directory \       # directory containing (only) TAC14 system output files
    /script/output/directory         # directory to which results are written

Usage for TAC13 output format:

./scripts/run_tac13_filtered.sh \
    /path/to/gold.xml \              # TAC13 gold standard queries/mentions
    /path/to/gold.tab \              # TAC13 gold standard link and nil annotations
    /system/output/directory \       # directory containing (only) TAC13 system output files
    /script/output/directory         # directory to which results are written

Test evaluation on TAC 2013 data

The test evaluation script automates the following workflow:

  1. run the basic evaluation,
  2. compare evaluation output to official TAC13 results.

The following are written to the output directory:

  • detailed evaluation report for each run (*.evaluation),
  • summary evaluation report for comparing runs (00report.tab),
  • copy of the official results sorted for comparison (00official.tab),
  • a diff report if the test fails (00diff.txt).
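
Because 00diff.txt is only written when the test fails, its presence makes a simple pass/fail check, e.g.:

if [ -f out/00diff.txt ]; then
    echo "Test failed; differences from official results:"
    cat out/00diff.txt
else
    echo "Evaluation output matches the official TAC13 results."
fi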

Usage for TAC13 official results:

./scripts/test_tac13_evaluation.sh \
    /path/to/gold.xml \              # TAC13 gold standard queries/mentions
    /path/to/gold.tab \              # TAC13 gold standard link and nil annotations
    /system/output/directory \       # directory containing (only) TAC13 system output files
    /system/scores/directory \       # directory containing official score summary reports
    /script/output/directory         # directory to which results are written

The gold data from TAC13 is distributed by LDC. When running the test evaluation script, provide:

  • LDC2013E90_TAC_2013_KBP_English_Entity_Linking_Evaluation_Queries_and_Knowledge_Base_Links_V1.1/data/tac_2013_kbp_english_entity_linking_evaluation_queries.xml,
  • LDC2013E90_TAC_2013_KBP_English_Entity_Linking_Evaluation_Queries_and_Knowledge_Base_Links_V1.1/data/tac_2013_kbp_english_entity_linking_evaluation_KB_links.tab.

The system data from TAC13 is distributed by NIST. When running the test evaluation script, provide:

  • KBP2013_English_Entity_Linking_Evaluation_Results/KBP2013_english_entity-linking_runs,
  • KBP2013_English_Entity_Linking_Evaluation_Results/KBP2013_english_entity-linking_scores.
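
Putting these together, a complete test invocation looks like the following (assuming both packages are unpacked in the working directory; the output path is illustrative):

LDC=LDC2013E90_TAC_2013_KBP_English_Entity_Linking_Evaluation_Queries_and_Knowledge_Base_Links_V1.1
NIST=KBP2013_English_Entity_Linking_Evaluation_Results
./scripts/test_tac13_evaluation.sh \
    "$LDC/data/tac_2013_kbp_english_entity_linking_evaluation_queries.xml" \
    "$LDC/data/tac_2013_kbp_english_entity_linking_evaluation_KB_links.tab" \
    "$NIST/KBP2013_english_entity-linking_runs" \
    "$NIST/KBP2013_english_entity-linking_scores" \
    out/tac13-test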