run
To execute a data pipeline with pipelite, first create a configuration file, then run the following command from your shell prompt:
$ pipelite -cfg pipelite-config-file.json
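When the pipeline is launched from a script rather than typed by hand, the same command can be wrapped in a few lines of Python. This is only a minimal sketch, assuming pipelite is installed and available on the PATH; the helper names `build_command` and `run_pipelite` are hypothetical, and only the `-cfg` flag comes from the command above.

```python
import subprocess


def build_command(config_path):
    """Return the argument list for `pipelite -cfg <config_path>`."""
    return ["pipelite", "-cfg", config_path]


def run_pipelite(config_path):
    """Run pipelite on a configuration file and return its exit code.

    Hypothetical helper: it assumes the `pipelite` executable is on the PATH.
    """
    # check=False leaves the reaction to a non-zero exit code to the caller.
    completed = subprocess.run(build_command(config_path), check=False)
    return completed.returncode
```

Passing the arguments as a list avoids shell quoting problems when the configuration file path contains spaces.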
The command logs extensively while it runs. At the end, it prints a report summarizing all the work done by pipelite.
Example:
--- PIPELITE REPORT ---
id  Type         Description                                    Start                End                  Duration  Rows Processed  Order
S1  Extractor    excelFileDS -> Output: [S1]                    2023-10-16 17:31:02  2023-10-16 17:31:07  5.322478  1394            1
T   Transformer  passthroughTR -> Inputs: [S1] / Outputs: [S2]  2023-10-16 17:31:07  2023-10-16 17:31:07  0.000339  1394            2
S2  Loader       csvFileDS -> Input: [S2]                       2023-10-16 17:31:07  2023-10-16 17:31:07  0.015107  1394            3
Total Duration 5.337924000000001 sec
pipelite does not only write logs to the command-line output; it also writes them to a separate log file, which can be set up in the configuration file. A dedicated section named config, directly under the root, serves this purpose. Its logger entry lets you specify how the logs are created:
- path Specifies the log file location (with a trailing slash)
- filename Specifies the log file name
- level (DEBUG|INFO|WARNING|ERROR) Log trace level
- format Log format (Python logging format). Use %% to escape the % sign; % is the only character that needs to be escaped (see the Python reference)
- maxbytes Maximum size of the log file in bytes before it rolls over
{
  ...
  "config": {
    "logger": {
      "level": "DEBUG",
      "format": "%(asctime)s|%(levelname)s|%(message)s",
      "path": "logs/",
      "filename": "xes2csv_direct.log",
      "maxbytes": 1000000
    }
  }
  ...
}
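To make the meaning of these settings concrete, here is how a logger section of this shape could be mapped onto Python's standard logging module. This is a sketch under stated assumptions, not pipelite's actual implementation: the function name `build_logger`, the logger name, and the choice of one backup file are all hypothetical.

```python
import logging
import os
from logging.handlers import RotatingFileHandler


def build_logger(logger_cfg, name="pipelite-sketch"):
    """Build a rotating file logger from a dict shaped like config.logger.

    Hypothetical helper for illustration; pipelite may wire this differently.
    """
    # Create the log directory if it does not exist yet.
    os.makedirs(logger_cfg["path"], exist_ok=True)
    log_file = os.path.join(logger_cfg["path"], logger_cfg["filename"])

    # maxbytes maps to maxBytes. backupCount=1 is an assumption: with a
    # backupCount of 0 the standard handler never rolls over.
    handler = RotatingFileHandler(
        log_file, maxBytes=logger_cfg["maxbytes"], backupCount=1
    )
    # The format string uses %-style placeholders, so a literal % is "%%".
    handler.setFormatter(logging.Formatter(logger_cfg["format"]))

    logger = logging.getLogger(name)
    # Translate the level name ("DEBUG", "INFO", ...) to its numeric value.
    logger.setLevel(getattr(logging, logger_cfg["level"]))
    logger.addHandler(handler)
    return logger
```

With a format such as `%(levelname)s|100%%|%(message)s`, the doubled `%%` is written to the log file as a single `%`, which is why it is the only character that needs escaping.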