datacorner edited this page Oct 17, 2023 · 1 revision

Execute a pipeline

Execute from the command line

To execute a data pipeline with pipelite, first create a Configuration file, then run the following command from your shell prompt:

$ pipelite -cfg pipelite-config-file.json
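
When the pipeline has to be started from a script rather than typed by hand, the same command can be wrapped with Python's standard subprocess module. This is a minimal sketch, not part of pipelite itself: the helper names build_command and run_pipeline are hypothetical, and it assumes only that the pipelite executable is on the PATH and accepts the -cfg option shown above.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(config_file: str) -> list[str]:
    """Return the argument list for `pipelite -cfg <config_file>`."""
    return ["pipelite", "-cfg", str(Path(config_file))]


def run_pipeline(config_file: str) -> int:
    """Run pipelite with the given Configuration file and return its exit code."""
    if shutil.which("pipelite") is None:
        raise FileNotFoundError("pipelite executable not found on PATH")
    if not Path(config_file).is_file():
        raise FileNotFoundError(f"configuration file not found: {config_file}")
    # Output is not captured, so the logs and the final report
    # are printed to the console exactly as in a manual run.
    completed = subprocess.run(build_command(config_file), check=False)
    return completed.returncode
```

Calling run_pipeline("pipelite-config-file.json") is then equivalent to the shell command above. How pipelite signals a failed run through its exit code is not documented here, so treat the returned value as informational only.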

The command prints a large number of log lines and ends with a report that summarizes all the work done by pipelite.

Example:

--- PIPELITE REPORT ---
           Type                                    Description                Start                  End  Duration Rows Processed Order
id                                                                                                                                     
S1    Extractor                    excelFileDS -> Output: [S1]  2023-10-16 17:31:02  2023-10-16 17:31:07  5.322478           1394     1
T   Transformer  passthroughTR -> Inputs: [S1] / Outputs: [S2]  2023-10-16 17:31:07  2023-10-16 17:31:07  0.000339           1394     2
S2       Loader                       csvFileDS -> Input: [S2]  2023-10-16 17:31:07  2023-10-16 17:31:07  0.015107           1394     3

Total Duration 5.337924000000001 sec
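
In this example the Total Duration matches the sum of the Duration column, which makes it easy to see where the time goes. The short sketch below redoes that arithmetic with the three values copied from the report; the function name summarize is hypothetical, and the assumption that the total is always a plain sum is drawn from this single report only.

```python
# Durations (in seconds) copied from the report above, keyed by step id.
durations = {"S1": 5.322478, "T": 0.000339, "S2": 0.015107}


def summarize(step_durations: dict[str, float]) -> tuple[float, str]:
    """Return the summed duration and the id of the slowest step."""
    total = sum(step_durations.values())
    slowest = max(step_durations, key=step_durations.get)
    return total, slowest


total, slowest = summarize(durations)
print(f"Total Duration {total:.6f} sec")
print(f"Slowest step: {slowest} ({durations[slowest] / total:.1%} of the run)")
```

Here the Excel extraction step S1 accounts for almost the entire run, while the pass-through transformation and the CSV load are negligible.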

Managing logs

pipelite does not only print logs to the command-line output; it also writes them to a separate log file, which can be configured in the Configuration file. A dedicated section (named config) just under the root holds a logger entry for this purpose, where you can specify how these logs are created:

  • path: Log file location (with a trailing slash)
  • filename: Log file name
  • level: Log trace level (DEBUG|INFO|WARNING|ERROR)
  • format: Log format (Python format); use %% to escape a literal % sign (% is the only character that needs to be escaped) (see the Python reference)
  • maxbytes: Maximum size of the log file in bytes, after which it rolls over
{
    ...
    "config": {
        "logger" : {
            "level": "DEBUG",
            "format" : "%(asctime)s|%(levelname)s|%(message)s",
            "path": "logs/",
            "filename" : "xes2csv_direct.log",
            "maxbytes" : 1000000
        }
    }
    ...
}
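
To make the meaning of each setting concrete, the sketch below maps the logger section above onto Python's standard logging module. It assumes, without confirmation from the pipelite source, that the settings correspond to a logging.Formatter and a size-based RotatingFileHandler. The function name build_logger, the logger name and the backupCount value are hypothetical.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path

# The "logger" section from the example above, as a Python dict.
logger_cfg = {
    "level": "DEBUG",
    "format": "%(asctime)s|%(levelname)s|%(message)s",
    "path": "logs/",
    "filename": "xes2csv_direct.log",
    "maxbytes": 1000000,
}


def build_logger(cfg: dict, name: str = "pipelite-sketch") -> logging.Logger:
    """Create a size-rotated file logger from a 'logger' configuration section."""
    log_dir = Path(cfg["path"])
    log_dir.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(
        log_dir / cfg["filename"],
        maxBytes=cfg["maxbytes"],
        backupCount=1,  # assumption: the number of old files kept is not documented
    )
    handler.setFormatter(logging.Formatter(cfg["format"]))
    logger = logging.getLogger(name)
    logger.setLevel(cfg["level"])  # level names such as "DEBUG" are accepted
    logger.addHandler(handler)
    return logger
```

With this sketch, build_logger(logger_cfg).debug("pipeline started") appends one line to logs/xes2csv_direct.log made of a timestamp, the level name and the message, separated by | characters. Once the file would exceed maxbytes, the handler renames it and starts a new one.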
