greenhub-project/dataset-generator

Dataset generator

Helper scripts to export GreenHub's dataset to .csv files

Features

  • Generate whole dataset to 7z format
  • Single mode execution to export just one table
  • Separate 7z files for main dataset tables (devices, samples)
  • .env file to store credentials
  • Script logs
  • Exports database schema
  • Containerized setup

Requirements

  • Docker
  • Docker Compose (the instructions below use docker-compose)

Instructions

# First set the credentials and env variables
$ cp .env.example .env
# Run the application in the background
$ docker-compose up -d
# Display the logs
$ docker-compose logs

Options

SCRIPT

Type: string
Values: generate, schema, query
Default: generate

This option sets which script is executed: generate exports the tables to .csv files, while schema exports the database schema to a .sql file. All output files are compressed into 7z archives.

TABLE

Type: string
Values: See the file tables.conf for valid options. An empty value exports all tables.
Default: ''
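
The option rules above can be sketched as two small shell helpers. This is only an illustration, not the project's actual code: the function names resolve_script and is_valid_table are hypothetical, and the sketch assumes tables.conf lists one table name per line, which may not match the real file format.

```shell
# Hypothetical helper: resolve the SCRIPT option.
# Falls back to "generate" when no value is given and rejects unknown values.
resolve_script() {
  case "${1:-generate}" in
    generate|schema|query) printf '%s\n' "${1:-generate}" ;;
    *) printf 'invalid SCRIPT value: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: check the TABLE option.
# $1 = table name, $2 = list file (assumed format: one table name per line).
is_valid_table() {
  if [ -z "${1:-}" ]; then
    return 0                      # empty value means "export all tables"
  fi
  grep -Fxq -- "$1" "${2:-tables.conf}"
}
```

For example, resolve_script "$SCRIPT" prints the script name to run, and is_valid_table "$TABLE" succeeds only for an empty value or a name found in the list file.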

Script Configuration

The script file can be edited to control which additional dataset files are generated.

# First, set the table name
TABLE_NAME="devices"

# Calling run_query without args only appends the results to dataset.7z
run_query

# Adding 'zip' arg will also create a separate TABLE_NAME.7z file
# and append results to dataset.7z
run_query "zip"
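
To show how the two call forms combine, here is a minimal sketch that gives the main tables their own archives and sends another table only to dataset.7z. The run_query defined here is a stand-in that merely prints what would happen, since the real function lives in the project's script; the table name battery_details is a hypothetical example.

```shell
# Stand-in for the project's run_query, for illustration only:
# it prints the target archives instead of querying the database.
run_query() {
  if [ "${1:-}" = "zip" ]; then
    printf '%s -> dataset.7z and %s.7z\n' "$TABLE_NAME" "$TABLE_NAME"
  else
    printf '%s -> dataset.7z\n' "$TABLE_NAME"
  fi
}

# Main dataset tables: separate archive plus dataset.7z
for TABLE_NAME in devices samples; do
  run_query "zip"
done

# Any other table: dataset.7z only (hypothetical table name)
TABLE_NAME="battery_details"
run_query
```

In the real script, only the TABLE_NAME assignments and the run_query calls would be needed.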