Darwin Core to SQL (dwca2sql) is a lightweight tool that eases the import of Darwin Core Archives into a relational database. It translates the structure and content of a Darwin Core Archive into CREATE TABLE and/or INSERT INTO SQL statements, packaged as an .sql file, which you can then import into your database.
You use the tool from the command line.
Build the tool with Maven:
mvn clean package
Once built, the .jar file will be available in the target folder. If you want to move the tool, make sure to carry the lib folder along with it.
Argument | Description | Default | Mandatory |
---|---|---|---|
-s | Your Darwin Core Archive file (zipped) or folder (unzipped). | | yes |
-c or -i or -ci | The type of SQL statements to generate: CREATE TABLE (-c), INSERT INTO (-i), or both (-ci). | | yes |
-o | The path and name of the output SQL file. | dwca2sql.sql in the current directory | no |
-p | A prefix for the generated table name. Will be combined with the core or extension name, e.g. prefix_occurrences. | none | no |
-d | The database vendor to target (mysql or postgres). Because not all vendors follow the SQL standard, specifying one gives better results. | postgres | no |
--max-row-per-insert | Maximum number of rows to include in a single INSERT statement. | 100 | no |
-f | Force the execution: do not ask before overwriting existing files. | false | no |
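
To make the -p and --max-row-per-insert options more concrete, here is a minimal Python sketch of how a prefixed table name and batched INSERT statements could be produced. It is an illustration only and not the tool's actual code: the function names, the `TEXT` column type, and the quoting rules are all assumptions.

```python
def table_name(prefix, name):
    """Combine an optional prefix with a core or extension name."""
    return f"{prefix}_{name}" if prefix else name


def quote(value):
    """Render a value as an SQL string literal, doubling single quotes."""
    if value is None:
        return "NULL"
    return "'" + str(value).replace("'", "''") + "'"


def create_table(table, columns):
    """Build a CREATE TABLE statement in which every column is text."""
    cols = ", ".join(f"{c} TEXT" for c in columns)
    return f"CREATE TABLE {table} ({cols});"


def insert_statements(table, columns, rows, max_rows_per_insert=100):
    """Yield INSERT statements holding at most max_rows_per_insert rows each."""
    col_list = ", ".join(columns)
    for start in range(0, len(rows), max_rows_per_insert):
        batch = rows[start:start + max_rows_per_insert]
        values = ", ".join(
            "(" + ", ".join(quote(v) for v in row) + ")" for row in batch
        )
        yield f"INSERT INTO {table} ({col_list}) VALUES {values};"


if __name__ == "__main__":
    # Hypothetical data: 250 rows split into batches of 100 rows.
    table = table_name("unicorn", "occurrences")
    columns = ["id", "scientificName"]
    rows = [(str(i), f"Species {i}") for i in range(250)]
    print(create_table(table, columns))
    statements = list(insert_statements(table, columns, rows))
    print(len(statements), "INSERT statements")
```

Batching rows this way keeps each statement to a bounded size, which is presumably why the limit is configurable; a lower value may help if your database rejects very large statements.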
Create the SQL file:
java -jar dwca2sql.jar -ci -s /Users/JamesHowlett/Docs/MyDarwinCoreArchive.zip -o /tmp/unicorn.sql -p unicorn -d mysql
Import into MySQL:
mysql -hHOST -uUSER -pPASSWORD -DDATABASE_NAME --default_character_set utf8 -e "source /tmp/unicorn.sql"
- Data types are currently limited to text. They could be read from the meta.xml file (see Issue #3), but doing so would not conform to the XSD.
- Only mysql and postgres are implemented as database types.
- Only executable from the command line.
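
Regarding the text-only data types noted above: a Darwin Core Archive describes its columns in meta.xml as field elements identified by an index and a term URI. The following Python sketch lists the columns of the core file and gives every one of them the text type. It is not part of the tool; the sample descriptor, the `core_columns` function name, and the choice of `TEXT` as the column type are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by Darwin Core Archive descriptors.
NS = {"dwc": "http://rs.tdwg.org/dwc/text/"}

# Hypothetical, heavily trimmed meta.xml content.
SAMPLE_META = """
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Occurrence">
    <files><location>occurrence.txt</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/eventDate"/>
  </core>
</archive>
"""


def core_columns(meta_xml):
    """Return (index, column name, SQL type) tuples for the core file."""
    root = ET.fromstring(meta_xml)
    core = root.find("dwc:core", NS)
    columns = []
    id_elem = core.find("dwc:id", NS)
    if id_elem is not None:
        columns.append((int(id_elem.get("index")), "id", "TEXT"))
    for field in core.findall("dwc:field", NS):
        index = field.get("index")
        if index is None:
            continue  # field without a column in the data file
        # Use the last segment of the term URI as the column name.
        name = field.get("term").rsplit("/", 1)[-1]
        columns.append((int(index), name, "TEXT"))
    return sorted(columns)


if __name__ == "__main__":
    for index, name, sql_type in core_columns(SAMPLE_META):
        print(index, name, sql_type)
```

Because this sample descriptor carries only an index and a term for each field, the sketch has nothing to infer a narrower type from, so even a date-like term such as eventDate ends up as text.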
The tool assumes that you trust the Darwin Core Archive file that you are importing into your database. Use the tool at your own risk.
This tool uses the following third-party components:
Component | Version | Developed by | License | Source |
---|---|---|---|---|
DarwinCore Archive Reader | 1.14 | GBIF | Apache License 2.0 | https://github.com/gbif/dwca-reader |
Apache Commons IO | 2.1 | Apache | Apache License 2.0 | http://commons.apache.org/io/ |
Apache Commons CLI | 1.2 | Apache | Apache License 2.0 | http://commons.apache.org/cli/ |
Apache Commons Lang | 3.1 | Apache | Apache License 2.0 | http://commons.apache.org/lang/ |
Apache Commons Compress | 1.3 | Apache | Apache License 2.0 | http://commons.apache.org/compress/ |