Rust PostgreSQL and MariaDB libraries for processing USDA Branded Food Products data. The Cargo workspace also includes a CLI utility for loading CSV and REST API and GraphQL servers.
The repository is organized as a Cargo workspace.
./mariadb -- MariadDB library
./pg -- PostgreSQL library
./graphql -- graphql server
./restapi -- REST server
./ingest-csv -- cli utility for importing the USDA csv files into the database
./data/pg -- Diesel migration scripts to create the PostgreSQL database and schema.rs
./data/mariadb -- Diesel migration scripts to create the MariaDB database and schema.rs
You can choose either PostgreSQL (13 recommended) or MariaDB 10. In either case, it's assumed you have a running instance of one or the other.
For PostgreSQL:
createdb bfpd
For MariaDB:
mysql -u user -p -e"create schema bfpd;"
At this point, you have a couple of options: 1) download a dump of a recent version of the Branded Food Products database from https://go.littlebunch.com/posgresql for a PostgreSQL dump or https://go.littlebunch.com/mariadb for a MariaDB dump and restore to your local instance or 2) build the database from the ground-up by importing the USDA csv files using the provided ingest-csv command line utility.
If you are using the first option, download the dump files and restore to your database:
For postgreSQL:
dropdb bfpd5 && createdb bfpd5 # only if database exists
xz -d -c < [downloaded.sql.xz] | psql -U [user] bfpd5
For MariaDB:
mysql -u [user] -p -e'drop database bfpd5' && mysql -u [user] -p -e'create database bfpd5' # only if database exists
xz -d -c [download.sql.xz] | mysql -u [user] -p bfpd5
-
Build the binary Instructions for building the ingest-csv executable are provided in the ingest-csv/README.md.
-
Download and unzip the latest csv from the FDC website into a directory of your choice. You will need the Branded Foods and Supporting data for All Downloads zip files:
wget https://fdc.nal.usda.gov/fdc-datasets/FoodData_Central_branded_food_csv_2021-10-28.zip
wget https://fdc.nal.usda.gov/fdc-datasets/FoodData_Central_Supporting_Data_csv_2021-10-28.zip
-
Use the Diesel migration scripts in the data directory to create an empty database
For PostgreSQL:
psql -U user -W bfpd < data/pg/up.sql
For MariaDB:
mysql -u user -p bfpd < data/mariadb/up.sql
Note: You can use the up.sql and down.sql scripts to create a diesel migration. This is probably more trouble than it's worth unless you need to change the schema or just want to learn a bit more about diesel migrations.
-
Load the data by pointing the program to the full path containing the csv
./target/release/ingest-cvs -p /path/to/csv/
The load takes about 30 minutes to 2+ hours depending on your hardware. Note: you need to set a DATABASE_URL variable as described in Step 4 in the ingest-csv README.
You can use the provided graphql or REST binaries to publish the data. Instructions for building and running the graphql server are provided in the graphql/README.md. Instructions for the REST api are provided in the restapi/README.md.