PostgreSQL COPY command
Ben Serrette edited this page Jun 25, 2019 · 1 revision
Before you run a MAG CSV through the Postgres COPY command, clean the data file:
sed 's/\x00//g' mag_papers_1_paperAbstract.csv > mag_papers_1_paperAbstract_no_null.csv
sed -e 's/\\\\\"\([^\n]\)/\\\\\""\1/g' mag_papers_1_paperAbstract_no_null.csv > mag_papers_1_paperAbstract_no_null_quotes.csv
The first command removes null characters from the abstracts (there should be only about 10).
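To check how many null characters a file contains before and after the first command, something like the following could be used. This is a minimal sketch: the helper name count_nulls and the sample input are made up for illustration and are not part of the MAG tooling.

```shell
# Hypothetical helper: count NUL bytes arriving on stdin.
# tr -cd '\000' deletes every byte except NUL; wc -c counts what is left.
count_nulls() {
  tr -cd '\000' | wc -c
}

# Illustrative input containing two NUL bytes.
printf 'a\000b\000c\n' | count_nulls
```

On a real file, `count_nulls < mag_papers_1_paperAbstract.csv` would report the number of null characters, and the same check on the cleaned file should report 0.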
The second command converts \\" into \\"" to escape double quotes for Postgres. MAG includes a lot of triple-escaped quotes (\\\") that need to be escaped for Postgres by using doubled double quotes.
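The effect of the quote substitution is easier to see on a small sample. The sketch below chains the two sed commands from this page into one pipeline that reads standard input. The function name clean_mag_csv and the sample row are hypothetical, and GNU sed is assumed (for \x00 and for \n inside a bracket expression).

```shell
# Hypothetical helper: apply both cleanup steps from this page to stdin.
# Assumes GNU sed.
clean_mag_csv() {
  sed 's/\x00//g' | sed -e 's/\\\\\"\([^\n]\)/\\\\\""\1/g'
}

# Made-up sample row with \\" sequences inside a quoted field.
# %s prints the argument literally, so the backslashes are preserved.
printf '%s\n' '1,"said \\"hello\\" twice"' | clean_mag_csv
# prints: 1,"said \\""hello\\"" twice"
```

Note that the pattern requires at least one character after the quote, so a \\" sitting at the very end of a line is left unchanged.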