forked from cltl/EventCoreference
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathusage
154 lines (135 loc) · 10.9 KB
/
usage
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
Class: EventCorefWordnetSim
Compares predicates using one of the WN similarity function.
Scripts:
- event-coreference-en.sh (for English texts)
- event-coreference-nl.sh (for Dutch texts)
Parameters:
--naf-folder <OPTIONAL: path to folder with input files>
--extension <extension of files to process when running in folder mode>
--naf-file <OPTIONAL: path to specific NAF file>
--output-tag <tag to label the extension of the output file. Only relevant for non-streaming output>
If naf-folder and naf-file are empty, we assume the input is an input stream NAF
--wn-lmf <path to wordnet file in lmf format>
--method <one of the following methods can be used leacock-chodorow, path, wu-palmer>
--sim <similarity threshold (double, e.g. 2.5) below which no coreference is no coreference relation is determined >
--sim-ont <fall back similarity threshold (double, e.g. 1.5) if sim is below threshold but there is still an ontology match (ESO or FrameNet) below which no coreference is no coreference relation is determined. This threshold needs to be lower than sim and wordnet-lmf need to have the ESO and FrameNet ontology layer>
--relations <synsets relations that are used for the distance measurement >
--drift-max <maximum number of lowest-common-subsumers allowed >
--wsd <all senses from WSD with proportional score above threshold (double) are used, e.g. 0.8 means all senses proportionally scoring 80% of the best scoring sense>
--wn-prefix <three letter prefix of the wordnet synset identifier in the >
--wn-source <if terms have been scored by different systems, e.g. ukb or ims, you can restrict the wsd to a system by giving the name. This will match the source field in the external reference of the term.>
--source-frames <List of framenet frames to be treates as source events. No coreference is applied>
--contexual-frames <List of framenet frames to be treates as contexual events. Coreference is applied>
--grammatical-frames <List of framenet frames to be treates as grammatical events. No coreference is applied>
--replace <Using this flag previous output of event-coreference is removed first>
--ignore-false <Using this flag removes srl with predicate status = \"false\">
Class: EventCorefLemmaBaseline
Groups events on the basis of lemmas in the SRL layer
Scripts:
- event-coreference-lemma.sh (any language)
Parameters:
--naf-folder <OPTIONAL: path to folder with input files>
--extension <extension of files to process when running in folder mode>
--naf-file <OPTIONAL: path to specific NAF file>
--output-tag <tag to label the extension of the output file. Only relevant for non-streaming output>
If naf-folder and naf-file are empty, we assume the input is an input stream NAF
--replace <Using this flag previous output of event-coreference is removed first>
--ignore-false <Using this flag removes srl with predicate status = \"false\">
Class: EventCorefSingletonBaseline
Every predicate in the SRL layer becomes a singleton set in the event coreference layer. No coreference.
Scripts:
- event-coreference-singleton.sh (any language)
Parameters:
--naf-folder <OPTIONAL: path to folder with input files>
--extension <extension of files to process when running in folder mode>
--naf-file <OPTIONAL: path to specific NAF file>
--output-tag <tag to label the extension of the output file. Only relevant for non-streaming output>
If naf-folder and naf-file are empty, we assume the input is an input stream NAF
--replace <Using this flag previous output of event-coreference is removed first>
--ignore-false <Using this flag removes srl with predicate status = \"false\">
Class: GetSemFromNafStream
Reads NAF stream and creates RDF-TRiG stream with SEm and GRasP
Scripts:
naf2sem-grasp.sh
Parameters:
--project <string> <The name of the project for creating URIs>
--non-entities <If used, additional FrameNet roles and non-entity phrases are included>
--contextual-frames <path> <Path to a file with the FrameNet frames considered contextual>
--communication-frames <path> <Path to a file with the FrameNet frames considered source>
--grammatical-frames <path> <Path to a file with the FrameNet frames considered grammatical>
--perspective <Perspective graph is added to the output>
--time-max <string int> <Maximum number of time-expressions allows for an event to be included in the output. Excessive time links are problematic. The defeault value is 5" +
--ili <(OPTIONAL) Path to ILI.ttl file to convert wordnet-synsets identifiers to ILI identifiers>
--ili-uri <(OPTIONAL) If used, the ILI-identifiers are used to represents events. This is necessary for cross-lingual extraction>
--verbose <(OPTIONAL) representation of mentions is extended with token ids, terms ids and sentence number\n"
--no-doc-time <(OPTIONAL) document creation time is ignored, default behaviour it is used>
--no-context-time <(OPTIONAL) context sentence times are ignored, default behaviour it is used>
Class: ProcessEventObjectsStream
Reads SEM-GRaSP input stream (generated by GetSemFromNafStream) and calls KnowledgeStore to do the event comparison
Scripts:
run_singl_naf.sh
Parameters:
--source-roles "pb\:A0,pb\:A1"
--contextual-match-type "ILILEMMA"
Class: ClusterEventObjects
Creates CompositeEvent objects and stores these into cluster folders based on type and time.
Scripts:
naf2sem-batch-cluster.sh
Parameters:
--naf-folder <path> <Folder with the NAF files to be processed. Reads NAF files recursively>
--event-folder <path> <Folder below which the event folders are created that hold the object file.
The output structure is event/other, event/grammatical and event/speech.>
--extension <string> <File extension to select the NAF files .>
--project <string> <The name of the project for creating URIs>
--non-entities <If used, additional FrameNet roles and non-entity phrases are included>
--contextual-frames <path> <Path to a file with the FrameNet frames considered contextual>
--communication-frames <path> <Path to a file with the FrameNet frames considered source>
--grammatical-frames <path> <Path to a file with the FrameNet frames considered grammatical>
--perspective <In addition to object files a spearate perspective RDF file is created for each NAF file>
--eurovoc-en <path> <Path to the eurovoc resource for converting labels to URIs>
--all <No restrictions are applied on the output, such as lacking participants, time-anchoring or too many time anchors.>
--no-doc-time <(OPTIONAL) document creation time is ignored, default behaviour it is used>
--no-context-time <(OPTIONAL) context sentence times are ignored, default behaviour it is used>
Class: NoClusterEventObjects
Creates CompositeEvent objects and stores these into a single folder "all".
Scripts:
naf2sem-batch-nocluster.sh
Parameters:
--naf-folder <path> <Folder with the NAF files to be processed. Reads NAF files recursively>
--event-folder <path> <Folder below which the event folders are created that hold the object file.
The output structure is event/other, event/grammatical and event/speech.>
--extension <string> <File extension to select the NAF files .>
--project <string> <The name of the project for creating URIs>
--non-entities <If used, additional FrameNet roles and non-entity phrases are included>
--contextual-frames <path> <Path to a file with the FrameNet frames considered contextual>
--communication-frames <path> <Path to a file with the FrameNet frames considered source>
--grammatical-frames <path> <Path to a file with the FrameNet frames considered grammatical>
--perspective <In addition to object files a spearate perspective RDF file is created for each NAF file>
--eurovoc-en <path> <Path to the eurovoc resource for converting labels to URIs>
--all <No restrictions are applied on the output, such as lacking participants, time-anchoring or too many time anchors.>
--no-doc-time <(OPTIONAL) document creation time is ignored, default behaviour it is used>
--no-context-time <(OPTIONAL) context sentence times are ignored, default behaviour it is used>
Class: MatchEventObjects
Reads object files with composite events and
Scripts:
naf2sem-batch-cluster.sh
naf2sem-batch-nocluster.sh
Parameters:
--event-folder <path> <Path to the event folder that has subfolders for each time-description, e.g. \"e-2012-03-29\". Object file (*.obj) should be stored in these subfolders
--gz <OPTIONAL for reading gzipped object files with gz extension and writing trig.gz"
--wordnet-lmf <path> <(OPTIONAL, not yet used) Path to a WordNet-LMF file
--concept-match <int> <threshold for conceptual matches of events, default is 50
--phrase-match <int> <threshold for phrase matches of events, default is 50>
--hypers <(OPTIONAL) use hypernyms to match events
--lcs <(OPTIONAL) use lowest-common-subsumers to match events
--chaining <Determines the chaining function used: 1, 2, 3, 4. Default value is 3
--match-type <string> <(OPTIONAL) Indicates what is used to match events across resources. Default value is \"LEMMA\". Values:\"LEMMA\", \"ILI\", \"ILILEMMA\">
--ili <(OPTIONAL) Path to ILI.ttl file to convert wordnet-synsets identifiers to ILI identifiers>
--source-data <path> <(OPTIONAL, Deprecated) Path to LexisNexis meta data on owners and authors to enrich the provenance>
--roles <string> <String with PropbBank roles, separated by \",\" for which there minimally needs to be a match, e.g. \"a0,a1\". This is especially relevant for sourceEvent, grammaticalEvent.
If value is \"all\", then all participants need to match. This can be used for futureEvent"
--verbose <(OPTIONAL) representation of mentions is extended with token ids, terms ids and sentence number
--time <string> <(OPTIONAL) year, month or day indicate granularity of temporal match. If empty time is not matched
--ili-uri <(OPTIONAL) If used, the ILI-identifiers are used to represents events. This is necessary for cross-lingual extraction>
--subfolder <(OPTIONAL) Processes any subfolder>
--debug <(OPTIONAL) default=0, 1=minimal, 2=max>;