
A Python-based tool that parses the gzipped text database from the PPDB website (http://www.cis.upenn.edu/~ccb/ppdb/) into a Moses phrase translation table.


mauryquijada/ppdb-parser


ppdb-parser

A Python-based tool that parses text files from the PPDB website (http://www.cis.upenn.edu/~ccb/ppdb/) and writes the information to a file that Moses (an open-source statistical machine translation tool) can read. The tool expects every line of the input text file to have the form "LHS ||| SOURCE ||| TARGET ||| (FEATURE=VALUE )* ||| ALIGNMENT".
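
To make the expected line format concrete, here is a minimal sketch of how one such line could be split into its five fields. It is an illustration only, not the code used by this tool: the names `PPDBRule` and `parse_ppdb_line` are hypothetical, and feature values are kept as strings because the format description above does not say how they should be interpreted.

```python
from typing import Dict, NamedTuple


class PPDBRule(NamedTuple):
    """One parsed line (hypothetical container, not part of this tool)."""
    lhs: str
    source: str
    target: str
    features: Dict[str, str]
    alignment: str


def parse_ppdb_line(line: str) -> PPDBRule:
    """Split 'LHS ||| SOURCE ||| TARGET ||| (FEATURE=VALUE )* ||| ALIGNMENT'."""
    # Fields are separated by '|||'; surrounding whitespace is not significant.
    fields = [field.strip() for field in line.rstrip("\n").split("|||")]
    if len(fields) != 5:
        raise ValueError("expected 5 fields, got %d: %r" % (len(fields), line))
    lhs, source, target, feature_str, alignment = fields

    # The feature field is a whitespace-separated list of FEATURE=VALUE pairs.
    # Values are left as strings (assumption: the caller decides how to convert).
    features = {}
    for item in feature_str.split():
        name, _, value = item.partition("=")
        features[name] = value

    return PPDBRule(lhs, source, target, features, alignment)
```

For example, `parse_ppdb_line("[NN] ||| car ||| automobile ||| Abstract=0 ||| 0-0")` would yield a rule whose `source` is `"car"` and whose `target` is `"automobile"`.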

Furthermore, to parallelize processing, the tool also expects the input text file to be sorted by TARGET.
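
One reason sorted input helps: when the file is sorted by TARGET, all lines that share a TARGET are adjacent, so they can be collected in a single streaming pass and each group handed to a separate worker. The sketch below illustrates that idea with `itertools.groupby`. It is an assumption about how such grouping could be done, not a description of this tool's internals, and the names `target_of` and `group_by_target` are hypothetical.

```python
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple


def target_of(line: str) -> str:
    """Return the TARGET field (third '|||'-separated field) of a line."""
    return line.split("|||")[2].strip()


def group_by_target(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (target, lines) groups from input that is sorted by TARGET.

    groupby only merges *adjacent* items with the same key, which is why
    the input must already be sorted by TARGET for the groups to be complete.
    """
    for target, group in groupby(lines, key=target_of):
        yield target, list(group)
```

Each yielded group is independent of the others, so groups could be distributed across processes (for instance with `multiprocessing.Pool`) without any two workers handling the same TARGET.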
