Commit 7dc70c0

doc: Added description on contextual executions
1 parent 7a7d841 commit 7dc70c0

File tree

14 files changed

+223
-6
lines changed

docs/Gemfile.lock

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,7 @@ GEM
3131
safe_yaml (~> 1.0)
3232
terminal-table (>= 1.8, < 4.0)
3333
webrick (~> 1.7)
34+
jekyll-mermaid (1.0.0)
3435
jekyll-sass-converter (3.0.0)
3536
sass-embedded (~> 1.54)
3637
jekyll-watch (2.2.1)
@@ -67,6 +68,7 @@ PLATFORMS
6768

6869
DEPENDENCIES
6970
jekyll (~> 4.3.0)
71+
jekyll-mermaid
7072
kramdown-parser-gfm
7173
tzinfo (~> 1.2)
7274
tzinfo-data

docs/_includes/block/divider.html

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
<div class="ui divider"></div>

docs/_includes/block/etl-step.html

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,4 +18,5 @@
1818
</div>
1919
</div>
2020
</div>
21-
</div>
21+
</div>
22+
{% include block/divider.html %}

docs/_includes/block/mermaid.html

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
<pre class="mermaid">
2+
{{ include.mermaid }}
3+
</pre>

docs/_includes/head.html

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,3 +2,5 @@
22
<script src="https://cdn.jsdelivr.net/npm/semantic-ui@2.5.0/dist/semantic.min.js"></script>
33

44
<link rel="stylesheet" href='{{ "/assets/css/custom.css" | absolute_url }}' />
5+
6+
<script src="https://cdn.jsdelivr.net/npm/mermaid@11.4.0/dist/mermaid.min.js"></script>

docs/_includes/menu.html

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,9 @@
3333
<a class="item" href="/doc/01-understand-the-etl/the-concept.html">
3434
The concept
3535
</a>
36+
<a class="item" href="/doc/01-understand-the-etl/execution-context.html">
37+
Execution Context
38+
</a>
3639

3740
<a class="item" href="/doc/01-understand-the-etl/item-types">
3841
Item types

docs/assets/css/custom.css

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,9 @@
11
@import "code.css";
2+
3+
img {
4+
width: 100%;
5+
}
6+
27
#main-div {
38
width: 100%
49
}
Binary files not shown (13.8 KB, 14.3 KB, 15.4 KB, 22 KB)

Lines changed: 69 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,72 @@
11
---
22
layout: base
3-
title: PHP-ETL - Understand the ETL
3+
title: PHP-ETL - Understand the ETL
44
subTitle: Execution Context - Why to have an execution context & what it does
5-
---
5+
width: large
6+
---
7+
8+
## Execution Context - Why to have an execution context & what it does
9+
10+
In most of our examples our chain had access to the whole file system.
11+
This makes it impossible to run multiple chains side by side, or to list the files each execution has generated.
12+
13+
Both the 🎵 Symfony Bundle (and therefore the 🦢 Sylius integration) and the Magento2 Module will use contextual chains.
14+
This means the "main" operations only have access to a particular directory created for the execution of the chain.
15+
16+
Additional operations such as the ExternalFileFinderOperation and ExternalFileProcessor are used to
17+
process files that are either on a remote directory (sftp, bucket s3...) or files that are on the local file system.
18+
This is needed because operations such as the CsvLoader do not have access to those files unless they are copied into the contextual directory of the current execution.
19+
20+
Let's start with a simple example.
21+
22+
### Write the result of an API to a CSV File.
23+
24+
{% capture description %}
25+
For this we will first create a new ContextFactory using PerExecutionContextFactory.
26+
This context factory will create unique contexts for each execution. This means a unique directory to run the etl
27+
in; and a unique logger.
28+
29+
This is only needed if you are running the etl in **🐘 standalone**. With any integration this should be automatically
30+
handled for you. This chapter is the last one that mentions standalone integrations.
31+
32+
33+
{% endcapture %}
34+
{% capture code %}
35+
```php
36+
<?php
37+
$workdir = __DIR__ . "/var/";
38+
$dirManager = new ChainWorkDirManager($workdir);
39+
$loggerFactory = new NullLoggerFactory();
40+
$fileFactory = new LocalFileSystemFactory($dirManager);
41+
42+
return new PerExecutionContextFactory(
43+
$dirManager,
44+
$fileFactory,
45+
$loggerFactory
46+
);
47+
```
48+
{% endcapture %}
49+
{% include block/etl-step.html code=code description=description %}
50+
51+
{% capture description %}
52+
The execution is identified with objects of type ExecutionInterface set on the processor:
53+
{% endcapture %}
54+
{% capture code %}
55+
```php
56+
$options = [
57+
'etl' => [
58+
'execution' => new PockExecution(new DateTime())
59+
]
60+
];
61+
62+
$chainProcessor->process(
63+
new ArrayIterator([[]]),
64+
$options
65+
);
66+
```
67+
{% endcapture %}
68+
{% include block/etl-step.html code=code description=description %}
69+
70+
Executing this will create a directory in `var/` with the output result. Every time you execute the chain, a new
71+
directory will be created.
72+
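
The idea behind the page added above (one working directory per execution, with external files copied in before they are read) can be sketched in plain PHP. This is a minimal illustration only and not how PHP-ETL implements it: `createExecutionDir` and `importExternalFile` are hypothetical helpers, and the directory naming scheme and the sample CSV content are assumptions made for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, not part of PHP-ETL: creates a fresh directory
// under $baseDir for one execution and returns its path.
function createExecutionDir(string $baseDir, DateTimeInterface $startedAt): string
{
    // Timestamp plus a random suffix, so two executions started in the
    // same second still get different directories.
    $name = $startedAt->format('Ymd-His') . '-' . bin2hex(random_bytes(4));
    $dir = rtrim($baseDir, '/') . '/' . $name;

    if (!mkdir($dir, 0777, true) && !is_dir($dir)) {
        throw new RuntimeException("Unable to create execution directory: $dir");
    }

    return $dir;
}

// Hypothetical helper: copies a file that lives outside the execution
// directory into it, so later steps only read from their own directory.
function importExternalFile(string $externalPath, string $executionDir): string
{
    $target = $executionDir . '/' . basename($externalPath);
    if (!copy($externalPath, $target)) {
        throw new RuntimeException("Unable to copy $externalPath to $target");
    }

    return $target;
}

// Two executions sharing the same base directory.
$baseDir = sys_get_temp_dir() . '/etl-sketch-' . bin2hex(random_bytes(4));
$first = createExecutionDir($baseDir, new DateTimeImmutable());
$second = createExecutionDir($baseDir, new DateTimeImmutable());

// A file living outside of any execution directory (sample data).
$external = tempnam(sys_get_temp_dir(), 'customers');
file_put_contents($external, "id,name\n1,Alice\n");

// Only the first execution receives a copy of the external file.
$local = importExternalFile($external, $first);
```

Because each execution writes only inside its own directory, the files it produced can be listed by reading that directory, and concurrent executions do not overwrite each other's output.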

docs/doc/01-understand-the-etl/the-concept.md

Lines changed: 56 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -37,6 +37,61 @@ so a GroupedItem can not be in the input of an operation, they can only be the o
3737

3838
You can find the list of all native item types [here](doc/01-understand-the-etl/item-types.html).
3939

40+
41+
### How does it work
42+
43+
We will have more detailed real use cases with sample data a bit further in the document.
44+
45+
{% capture column1 %}
46+
In the simplest case, the chain receives an iterator containing 2 items as input; both items are processed by each chain operation.
47+
This could be, for example, a list of customers. Each operation changes the items.
48+
images/concept-flows
49+
{% endcapture %}
50+
{% capture column2 %}
51+
![rr](/assets/images/concept-flows/flow-1.png)
52+
{% endcapture %}
53+
{% include block/2column.html column1=column1 column2=column2 %}
54+
55+
{% include block/divider.html %}
56+
57+
{% capture column1 %}
58+
In the following example the iterator sends a single item. The first operation will then send GroupedItems containing 2 items.
59+
The first item could be a customer, and operation1 then fetches each order of that customer.
60+
{% endcapture %}
61+
{% capture column2 %}
62+
![rr](/assets/images/concept-flows/flow-2.png)
63+
{% endcapture %}
64+
{% include block/2column.html column1=column1 column2=column2 %}
65+
66+
{% include block/divider.html %}
67+
68+
{% capture column1 %}
69+
We can also group items to make aggregations. The chain receives an iterator containing 2 items, and the first operation processes both items.
70+
It breaks the chain for the first item and returns an aggregation of item 1 and item 2.
71+
This can be used to count the number of customers. This kind of grouping can use more memory and should therefore be used with care.
72+
{% endcapture %}
73+
{% capture column2 %}
74+
![rr](/assets/images/concept-flows/flow-3.png)
75+
{% endcapture %}
76+
{% include block/2column.html column1=column1 column2=column2 %}
77+
78+
{% include block/divider.html %}
79+
80+
{% capture column1 %}
81+
Chains can also be split; this allows 2 different operations to be executed on the same item.
82+
{% endcapture %}
83+
{% capture column2 %}
84+
![rrr](/assets/images/concept-flows/flow-4.png)
85+
{% endcapture %}
86+
{% include block/2column.html column1=column1 column2=column2 %}
87+
88+
{% include block/divider.html %}
89+
90+
The split operation is among the building blocks of complex executions. There are additional operations to merge
91+
multiple branches or to repeat a part of the chain.
92+
93+
94+
4095
## Example: Simple CSV Transformation
4196

4297
To demonstrate PHP-ETL’s capabilities, let’s walk through a basic example where we read a CSV file,
@@ -148,7 +203,7 @@ $chainProcessor->process(
148203
#### 🎵 Symfony
149204
For instance, the following command will process two input files and merge their output:
150205
```bash
151-
./bin/console etl:execute myetl.yaml "['./customers1.csv', './customers2.csv']"
206+
./bin/console etl:execute myetl "['./customers1.csv', './customers2.csv']"
152207
```
153208
{% endcapture %}
154209
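
The first two flows described in the "How does it work" section added to this file (every item passing through every operation, and one item fanning out into several) can be sketched with plain PHP generators. This is an illustration of the concept only and does not use the PHP-ETL API: `runChain`, `$upperCase`, `$fetchOrders` and the sample customers are all hypothetical names and data introduced for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch: an "operation" is any callable that takes one
// item and yields zero or more items for the next operation.
function runChain(iterable $items, callable ...$operations): Generator
{
    foreach ($operations as $operation) {
        // Wrap the current stream so each item goes through $operation.
        $items = (static function (iterable $input) use ($operation): Generator {
            foreach ($input as $item) {
                yield from $operation($item);
            }
        })($items);
    }

    yield from $items;
}

// Flow 1: one item in, one changed item out.
$upperCase = static function (array $customer): Generator {
    $customer['name'] = strtoupper($customer['name']);
    yield $customer;
};

// Flow 2: one item in, several items out (one per order of the customer).
$fetchOrders = static function (array $customer): Generator {
    foreach ($customer['orders'] as $orderId) {
        yield ['customer' => $customer['name'], 'order' => $orderId];
    }
};

// Sample input data, made up for the example.
$customers = [
    ['name' => 'alice', 'orders' => [10, 11]],
    ['name' => 'bob', 'orders' => [12]],
];

// The second argument discards keys: `yield from` reuses inner keys,
// which would otherwise overwrite earlier entries in the array.
$result = iterator_to_array(runChain($customers, $upperCase, $fetchOrders), false);
```

Here 2 customers enter the chain and 3 order rows leave it, which mirrors the second diagram: the number of items can change between operations.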

docs/index.md

Lines changed: 80 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ subTitle:
77
## What is PHP-ETL
88

99
PHP-ETL is the go-to library for executing complex data import, export, and transformation tasks within PHP applications.
10-
It offers seamless integrations with the [Symfony Framework](https://symfony.com/), [Sylius](https://sylius.com/fr/) , and can easily be extended to
10+
It offers seamless integrations with the [🎵 Symfony Framework](https://symfony.com/), [🦢 Sylius](https://sylius.com/fr/), and can easily be integrated into
1111
other CMS and frameworks, making it ideal for handling intricate data workflows with ease.
1212

1313
## Why PHP-ETL
@@ -29,6 +29,84 @@ PHP-ETL handles asynchronous operations—such as API calls—natively, allowing
2929
like loading data into the database while making API calls. The library also supports visualizing data flows
3030
through auto-generated diagrams, making complex workflows easier to understand and manage.
3131

32-
## A screenshot
32+
## An execution tree
33+
34+
{% capture mermaid %}
35+
flowchart TD
36+
37+
subgraph Execution
38+
%% Nodes
39+
0B(Extract Get Article API Params Data<br/><br/>2<i class="sign in alternate icon"></i> / 2<i class="sign out alternate icon"></i><br/>00:00.064<i class="hourglass half icon"></i>)
40+
style 0B fill:#EEE;
41+
1B(Get products/articles until api stop's<br/><br/>2<i class="sign in alternate icon"></i> / 2<i class="sign out alternate icon"></i><br/>00:00.000<i class="hourglass half icon"></i>)@{ shape: hex}
42+
subgraph 1S[Get articles until api stop's]
43+
100B(Make get Article API call<br/><br/>4<i class="sign in alternate icon"></i> / 1<i class="clock icon"></i> / 0<i class="sign out alternate icon"></i><br/>00:05.243<i class="hourglass half icon"></i>)
44+
style 100B fill:#ffe294;
45+
end
46+
style 1B fill:#EEE;
47+
2B(Write api response to file to keep history<br/><br/>4<i class="sign in alternate icon"></i> / 4<i class="sign out alternate icon"></i><br/>00:00.057<i class="hourglass half icon"></i>)
48+
style 2B fill:#EEE;
49+
3B(Split response<br/><br/>5<i class="sign in alternate icon"></i> / 5<i class="sign out alternate icon"></i><br/>00:00.008<i class="hourglass half icon"></i>)
50+
style 3B fill:#EEE;
51+
4B(Map Api fields with Sylius attributes code<br/><br/>2085<i class="sign in alternate icon"></i> / 2085<i class="sign out alternate icon"></i><br/>00:01.482<i class="hourglass half icon"></i>)
52+
style 4B fill:#EEE;
53+
5B(Branch to handle attribute option values & product imports<br/><br/>2085<i class="sign in alternate icon"></i> / 2085<i class="sign out alternate icon"></i><br/>04:28.817<i class="hourglass half icon"></i>)@{ shape: hex}
54+
subgraph 5S[Branch to handle attribute option values & product imports]
55+
500B(Split each attribute items<br/><br/>2085<i class="sign in alternate icon"></i> / 2085<i class="sign out alternate icon"></i><br/>00:00.248<i class="hourglass half icon"></i>)
56+
style 500B fill:#EEE;
57+
501B(Load Attribute from database<br/><br/>89571<i class="sign in alternate icon"></i> / 89571<i class="sign out alternate icon"></i><br/>00:46.995<i class="hourglass half icon"></i>)
58+
style 501B fill:#EEE;
59+
502B(Add new choices to select attributes<br/><br/>89571<i class="sign in alternate icon"></i> / 2<i class="sign out alternate icon"></i><br/>00:09.363<i class="hourglass half icon"></i>)
60+
style 502B fill:#EEE;
61+
503B(Persist attribute<br/><br/>2<i class="sign in alternate icon"></i> / 2<i class="sign out alternate icon"></i><br/>00:00.001<i class="hourglass half icon"></i>)
62+
style 503B fill:#EEE;
63+
510B(Flush Doctrine before importing products<br/><br/>2085<i class="sign in alternate icon"></i> / 2085<i class="sign out alternate icon"></i><br/>00:00.961<i class="hourglass half icon"></i>)
64+
style 510B fill:#EEE;
65+
511B(Load Product from database<br/><br/>2085<i class="sign in alternate icon"></i> / 2085<i class="sign out alternate icon"></i><br/>00:00.904<i class="hourglass half icon"></i>)
66+
style 511B fill:#EEE;
67+
512B(Create or Update product<br/><br/>2085<i class="sign in alternate icon"></i> / 2085<i class="sign out alternate icon"></i><br/>00:27.247<i class="hourglass half icon"></i>)
68+
style 512B fill:#EEE;
69+
513B(Add price to product<br/><br/>2085<i class="sign in alternate icon"></i> / 2085<i class="sign out alternate icon"></i><br/>00:01.651<i class="hourglass half icon"></i>)
70+
style 513B fill:#EEE;
71+
514B(Persist entities<br/><br/>2085<i class="sign in alternate icon"></i> / 2085<i class="sign out alternate icon"></i><br/>00:00.338<i class="hourglass half icon"></i>)
72+
style 514B fill:#EEE;
73+
515B(Flush entities<br/><br/>2085<i class="sign in alternate icon"></i> / 2085<i class="sign out alternate icon"></i><br/>00:02.117<i class="hourglass half icon"></i>)
74+
style 515B fill:#EEE;
75+
516B(Clear doctrine<br/><br/>2085<i class="sign in alternate icon"></i> / 2085<i class="sign out alternate icon"></i><br/>00:00.213<i class="hourglass half icon"></i>)
76+
style 516B fill:#EEE;
77+
517B(Prepare data for Set association product API<br/><br/>2085<i class="sign in alternate icon"></i> / 2085<i class="sign out alternate icon"></i><br/>00:00.201<i class="hourglass half icon"></i>)
78+
style 517B fill:#EEE;
79+
518B(Set Sylius Product ID association - API call<br/><br/>2085<i class="sign in alternate icon"></i> / 2<i class="sign out alternate icon"></i><br/>00:00.687<i class="hourglass half icon"></i>)
80+
style 518B fill:#EEE;
81+
519B(Log association response<br/><br/>2085<i class="sign in alternate icon"></i> / 4168<i class="sign out alternate icon"></i><br/>00:00.012<i class="hourglass half icon"></i>)
82+
style 519B fill:#EEE;
83+
end
84+
style 5B fill:#EEE;
85+
%% Links
86+
0B --> 1B
87+
1B --> 100B
88+
1B --> 2B
89+
1S ~~~ 2B
90+
2B --> 3B
91+
3B --> 4B
92+
4B --> 5B
93+
5B --> 500B
94+
500B --> 501B
95+
501B --> 502B
96+
502B --> 503B
97+
5B --> 510B
98+
510B --> 511B
99+
511B --> 512B
100+
512B --> 513B
101+
513B --> 514B
102+
514B --> 515B
103+
515B --> 516B
104+
516B --> 517B
105+
517B --> 518B
106+
518B --> 519B
107+
end
108+
{% endcapture %}
109+
110+
{% include block/mermaid.html mermaid=mermaid %}
33111

34112
