Datoosh is a Python tool that helps you upload CSV files into SQL databases. It uses Python's multiprocessing library to open multiple database connections and insert the data in parallel.
Getting started with Datoosh is easy. First, make sure you have Python 3.6 or later.
Start by creating your virtual environment:
- Linux:
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
- Windows:
python -m venv venv
venv\Scripts\activate.bat (for the regular CMD) or venv\Scripts\Activate.ps1 (for PowerShell)
pip install -r requirements.txt
You will need a YAML file that specifies some parameters of the database that will receive the data. Once it is ready, run Datoosh with:
python main.py [OPTIONS]
The options are:
-f - The CSV file to process (required)
-w - The maximum number of concurrent processes to read and process the CSV file (required)
-s - The YAML settings file (required)
-d - The delimiter of the CSV file (not required, default value: ",")
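To make the required/optional split concrete, here is a minimal sketch of how these four options could be declared with Python's standard argparse module. It is an illustration only: whether Datoosh actually uses argparse is an assumption, and the names `build_parser`, `file`, `workers`, `settings` and `delimiter` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (a sketch, not Datoosh's own code)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Upload a CSV file into a SQL database.",
    )
    # The three required options from the list above.
    parser.add_argument("-f", dest="file", required=True,
                        help="CSV file to process")
    parser.add_argument("-w", dest="workers", type=int, required=True,
                        help="maximum number of concurrent processes")
    parser.add_argument("-s", dest="settings", required=True,
                        help="YAML settings file")
    # The delimiter is optional and falls back to a comma.
    parser.add_argument("-d", dest="delimiter", default=",",
                        help="delimiter of the CSV file")
    return parser


# Parse a sample argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-w", "50", "-s", "settings.yaml", "-f", "file.csv"])
print(args.file, args.workers, args.settings, repr(args.delimiter))
```

With a parser like this, leaving out -f, -w or -s stops the program with a usage error, while leaving out -d simply selects the comma.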
Example:
python main.py -w 50 -s settings.yaml -f file.csv
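For readers curious how the -w value could map onto worker processes, below is a rough sketch of the general idea: split the CSV rows into chunks and let each process open its own connection to insert one chunk. This is not Datoosh's actual implementation. SQLite stands in for the target database so the sketch needs only the standard library, and the `people` table, its columns, and the function names `chunked`, `insert_chunk` and `load_csv` are all hypothetical.

```python
import csv
import multiprocessing as mp
import sqlite3


def chunked(rows, size):
    """Split a list of rows into consecutive chunks of at most `size` rows."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def insert_chunk(task):
    """Worker: open a dedicated connection and insert one chunk of rows."""
    db_path, chunk = task
    # The timeout lets a worker wait while another one holds SQLite's write lock.
    conn = sqlite3.connect(db_path, timeout=30)
    try:
        with conn:  # commits on success, rolls back on error
            # Hypothetical table: people(name, age) must already exist.
            conn.executemany("INSERT INTO people (name, age) VALUES (?, ?)", chunk)
    finally:
        conn.close()
    return len(chunk)


def load_csv(csv_path, db_path, workers, delimiter=",", chunk_size=1000):
    """Read the CSV, then insert its rows using up to `workers` processes."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        next(reader, None)  # skip the header row, if any
        rows = [tuple(row) for row in reader]
    tasks = [(db_path, chunk) for chunk in chunked(rows, chunk_size)]
    with mp.Pool(processes=workers) as pool:
        # Returns the total number of rows inserted across all workers.
        return sum(pool.map(insert_chunk, tasks))
```

When trying this out, call `load_csv` from inside an `if __name__ == "__main__":` block, which multiprocessing requires on platforms that start workers by spawning a fresh interpreter. A server database such as PostgreSQL or MySQL would typically benefit more from parallel connections than SQLite does, since SQLite allows only one writer at a time.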