This folder contains the scripts used to convert the input and output data of our models between formats.

Tagger and parser evaluation

json_to_conll.py: Creates CoNLL-U formatted TSV files from the output files of our tagger or parser.

It takes the following arguments:

  • -m [mode] where [mode] can be either tagger or parser.
  • -p [pred_file] where [pred_file] is the model output from the tagger or parser, in the original .json format.
  • -o [output_file] where [output_file] is where the generated CoNLL-U file is to be saved. Defaults to <pred_file>.conllu if not specified.
  • --multi must be specified if and only if a single json file contains more than one recipe.
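
The argument list above can be sketched with argparse. This is only an illustration of the documented interface, not the script's actual code: the `dest` names, the `required` settings, and the helper `resolve_output` are hypothetical, and the sketch assumes that the default output path replaces the prediction file's extension with .conllu.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags documented for json_to_conll.py."""
    parser = argparse.ArgumentParser(prog="json_to_conll.py")
    # Assumption: -m and -p are mandatory; the README does not say so explicitly.
    parser.add_argument("-m", dest="mode", choices=["tagger", "parser"], required=True)
    parser.add_argument("-p", dest="pred_file", required=True)
    parser.add_argument("-o", dest="output_file", default=None)
    # Set only when one json file holds more than one recipe.
    parser.add_argument("--multi", action="store_true")
    return parser


def resolve_output(args: argparse.Namespace) -> str:
    """Return the explicit output path, or fall back to <pred_file>.conllu."""
    if args.output_file is not None:
        return args.output_file
    # Assumption: the default swaps the .json extension for .conllu.
    return str(Path(args.pred_file).with_suffix(".conllu"))


args = build_parser().parse_args(["-m", "tagger", "-p", "pred.json"])
print(resolve_output(args))  # prints pred.conllu
```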

error_analysis.py: Creates files for manual error analysis.

It takes the following arguments:

  • -m [mode] where [mode] can be either tagger or parser.
  • -p [pred_file] where [pred_file] is the model prediction from the tagger or parser, in the original .json format.
  • -g [gold_file] where [gold_file] is the gold file, against which the prediction is to be compared.
  • -o [output_file] where [output_file] is where the generated TSV file will be saved. Defaults to <pred_file>.tsv if not specified.
  • -f [format] where the gold file format can be optionally specified (otherwise it is inferred from file extension). Both conllu and conll03 are allowed for the tagger, but only conllu is allowed for the parser.
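
The interaction between -m and -f described above can be sketched as follows. The function name `resolve_gold_format` and the assumption that the file extension matches the format name exactly (.conllu, .conll03) are hypothetical; the script itself may infer the format differently.

```python
from pathlib import Path
from typing import Optional

# Gold formats accepted per mode, as stated in the argument list.
ALLOWED_FORMATS = {
    "tagger": {"conllu", "conll03"},
    "parser": {"conllu"},
}


def resolve_gold_format(gold_file: str, mode: str, explicit: Optional[str] = None) -> str:
    """Pick the gold file format: an explicit -f value wins, else the extension."""
    if explicit is not None:
        fmt = explicit
    else:
        # Assumption: the extension (without the dot) names the format.
        fmt = Path(gold_file).suffix.lstrip(".").lower()
    if fmt not in ALLOWED_FORMATS[mode]:
        raise ValueError(f"format {fmt!r} is not allowed in {mode} mode")
    return fmt


print(resolve_gold_format("gold.conllu", "parser"))            # prints conllu
print(resolve_gold_format("gold.txt", "tagger", "conll03"))    # prints conll03
```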

parser_evaluation.py: Performs labeled evaluation on parser outputs.

It takes the following arguments:

  • -p [pred_file] where [pred_file] is the model prediction from the tagger or parser, in the original .json format.
  • -g [gold_file] where [gold_file] is the gold file, against which the prediction is to be evaluated.
  • -o [output_file] where [output_file] is an optional path; if given, the evaluation results are saved there as a .tsv file in addition to being printed to the console.
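
This README does not state which metric the script reports. As a rough illustration of what a labeled evaluation can look like, the sketch below computes precision, recall and F1 over labeled dependency edges, counting an edge as correct only when head, dependent and label all match. The `Edge` representation and the function `labeled_prf` are hypothetical and not taken from parser_evaluation.py.

```python
from typing import Set, Tuple

# Hypothetical edge representation: (head index, dependent index, label).
Edge = Tuple[int, int, str]


def labeled_prf(pred: Set[Edge], gold: Set[Edge]) -> Tuple[float, float, float]:
    """Labeled precision, recall and F1 over sets of dependency edges."""
    correct = len(pred & gold)  # edges matching in head, dependent and label
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data with made-up labels: the third predicted edge has the wrong label.
gold = {(0, 2, "root"), (2, 1, "a"), (2, 3, "b")}
pred = {(0, 2, "root"), (2, 1, "a"), (2, 3, "c")}
precision, recall, f1 = labeled_prf(pred, gold)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")  # prints P=0.67 R=0.67 F1=0.67
```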

Others

  • brat_to_conll.py: Creates CoNLL-U and CoNLL2003 formatted TSV files from annotation files generated by the brat annotation tool and from POS tags annotated with the ParZu parser.
  • flowgraph_to_conll.py: Creates CoNLL-U and CoNLL2003 formatted TSV files from flowgraph annotation files (described here).
  • id_mappings.tsv: Associates the names we use for the recipes with the names L'20 used, i.e. with the URLs to the original recipes.
  • reduce_graph.py: Converts one CoNLL-U recipe graph with Y'20 labels and dependencies into an action graph or FAT graph.
  • reduce_dir_to_action_graphs: Traverses a directory and generates action graphs for all recipe graphs in it using reduce_graph.py.