-
Notifications
You must be signed in to change notification settings - Fork 4
DeepBank_OneOne
StephanOepen edited this page Mar 25, 2014
·
16 revisions
In early 2014, we are preparing a minor update of DeepBank to fix a handful of imperfections in the original release:
- missing segments: wsj09e, wsj12e, wsj20e
- synchronize final REPP treatment of ellipses and profiles
- benefit from bug fixes in EDS export
- corrections in head assignment for DT conversion
- pick up DM conversion improvements (from SDP release)
- include GML-based WDC profiles in release
Candidate fixes to apply (which would require an increment of the grammar version):
- introduction of more idiom patterns
- automated flagged of invalid idiom fragments in gold results
- ESD-related renaming (e.g. PT and its values)
- unknown_card and its CARG
- fix unbound ARG1 on ‘measure’ relation in ‘how quickly does he arrive?’
- known sources of systematic annotation consistency (if any; e.g. splitting U.S. as sandwiched period).
- rule naming (recover the one case using an underscore in the first field)
- predicate name on not ... but
- integrate --DT percolations
- cherry-pick from the trunk: ditch gapping analysis, plug SLASH blackholes; purge multi-word names.
n-nh_j-cpd_c => n-j_j-cpd_c n-nh_j-t-cpd_c => n-j_j-t-cpd_c n-nh_v-cpd_c => n-v_j-cpd_c
Finally, a few aspects of ‘packaging’ that did not make it into the first version
-
update WeScience and WDC treebanks to GML versions
-
export into MRG-style trees with PTB-style tokenization
-
investigate high failure percentage in robust meaning construction (and DM conversion)
Home | Forum | Discussions | Events