Docs
taehyounpark committed Nov 11, 2023
1 parent b59cfd9 commit 22e6680
Showing 3 changed files with 50 additions and 43 deletions.
43 changes: 22 additions & 21 deletions docs/features/aggregation/aggregation.md
@@ -1,11 +1,10 @@
## Create

Aggregations are defined analogous as for custom column definitions, i.e. by providing its concrete type+constructor arguments:
Aggregations are defined analogously to custom column definitions, i.e. by providing the concrete type and constructor arguments:
```cpp
auto hist = df.agg<Histogram<float>>(LinearAxis(100,0,100));
```


## Fill

An aggregation must be `fill()`ed with input columns of matching dimensionality:
@@ -25,8 +24,7 @@ An aggregation can be `fill()`ed as many times as needed:
```cpp title="Filling a histogram twice per-entry"
auto hist_xy = df.agg<Histogram<float>>("x_and_y",100,0,100).fill(x).fill(y);
```

!!! warning "Make sure to get the returned booker"
<!-- !!! warning "Make sure to get the returned booker"
Reminder: each (chained) method returns a new node with the lazy action booked.
In other words, make sure to obtain and use the returned aggregation for the columns to be actually filled!
@@ -43,36 +41,39 @@ auto hist_xy = df.agg<Histogram<float>>("x_and_y",100,0,100).fill(x).fill(y);
auto hbins = df.agg<Histogram<float>>(LinearAxis(100,0,100));
auto hx = hbins.fill(x);
auto hy = hbins.fill(y);
```

``` -->

## Book

An aggregation can be booked at (multiple) selection(s):

=== "One selection"
```cpp
hx_c = df.agg<Histogram<float>>(LinearAxis(100,0,100))
.fill(x)
.book(c);
auto hx = df.agg<Histogram<float>>(LinearAxis(100,0,100)).fill(x);

auto hx_a = hx.book(sel_a);
```
=== "Multiple selections"
```cpp
hxs = df.agg<Histogram<float>>(LinearAxis(100,0,100))
.fill(x)
.book(a, b, c);
auto hx = df.agg<Histogram<float>>(LinearAxis(100,0,100)).fill(x);
auto hx_abc = hx.book(a, b, c);
auto hx_a = hx_abc["a"];
```

### From a selection

When multiple aggregations are booked from a selection, they must be individually unpacked since aggregations can be of different types:

```cpp
auto hx = df.agg<Histogram<float>>("x",axis::regular(10,0,10));
auto hxy = df.agg<Histogram<float,float>>("xy",axis::regular(10,0,10),axis::regular(10,0,10));
=== "One aggregation"
```cpp
auto hx = df.agg<Histogram<float>>("x",LinearAxis(10,0,10));

auto [hx_a, hxy_a] = sel_a.book(hx, hxy);
```
auto hx_a = sel_a.book(hx);
```
=== "Multiple aggregations"
```cpp
auto hx = df.agg<Histogram<float>>("x",LinearAxis(10,0,10));
auto hxy = df.agg<Histogram<float,float>>("xy",LinearAxis(10,0,10),LinearAxis(10,0,10));
auto [hx_a, hxy_a] = sel_a.book(hx, hxy);
```

## Access result(s)

@@ -88,5 +89,5 @@ The aggregation behaves as a pointer to its result, which is automatically trigg
hist->at(0); // equivalent to hist.result()->at(0);
```
!!! note
Calling the result of any one aggregation triggers the dataset processing of *all*.
!!! info
Calling the result of any one aggregation triggers the dataset processing of *all* booked up to that point.
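
Reviewer-style aside on the note above: the "lazy booking, one pass for everything booked so far" behaviour can be pictured with a small standalone sketch. It does not use the library; `MiniFlow`, `Counter`, `book()`, `result()` and `passes()` are hypothetical names invented here purely for illustration, and the real implementation may differ in every detail.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a booked aggregation: counts entries passing a decision.
struct Counter {
  std::function<bool(std::size_t)> selection;  // per-entry decision
  std::size_t count = 0;
  bool filled = false;  // true once a pass over the entries has filled it
};

// Hypothetical miniature of a lazy dataflow over entries 0..n_entries-1.
class MiniFlow {
public:
  explicit MiniFlow(std::size_t n_entries) : n_entries_(n_entries) {}

  // Booking only registers the counter; no entry is processed yet.
  std::shared_ptr<Counter> book(std::function<bool(std::size_t)> selection) {
    auto counter = std::make_shared<Counter>();
    counter->selection = std::move(selection);
    pending_.push_back(counter);
    return counter;
  }

  // Asking for any one result runs a single pass that fills every pending counter.
  std::size_t result(const std::shared_ptr<Counter>& counter) {
    if (!counter->filled) run();
    return counter->count;
  }

  // Number of passes over the entries performed so far.
  std::size_t passes() const { return passes_; }

private:
  void run() {
    for (std::size_t i = 0; i < n_entries_; ++i)
      for (auto& counter : pending_)
        if (counter->selection(i)) ++counter->count;
    for (auto& counter : pending_) counter->filled = true;
    pending_.clear();
    ++passes_;
  }

  std::size_t n_entries_;
  std::size_t passes_ = 0;
  std::vector<std::shared_ptr<Counter>> pending_;
};

// Usage: two counters booked up front share one pass; a later one needs another.
void lazy_example() {
  MiniFlow flow(10);
  auto evens = flow.book([](std::size_t i) { return i % 2 == 0; });
  auto low = flow.book([](std::size_t i) { return i < 3; });
  assert(flow.result(evens) == 5);  // triggers the first pass
  assert(flow.result(low) == 3);    // already filled by that same pass
  assert(flow.passes() == 1);
}
```

In this sketch, anything booked after a result has been requested is only filled by a subsequent pass, which is one reading of "booked up to that point".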
46 changes: 27 additions & 19 deletions docs/features/selection/selection.md
@@ -5,23 +5,22 @@
A selection can be applied by providing a "path" string and a column that corresponds to the decision:
```{ .cpp .annotate }
auto decision = ds.read<bool>("decision");
auto cut_applied = df.filter("cut_on_decision")(decision);
auto filtered = df.filter("cut_on_decision")(decision);
```

Alternatively, one can apply a weight as:
```{ .cpp .annotate }
auto w = ds.read<float>("weight");
auto cut_and_weighted = cut_applied.weight("weight")(w);
auto filtered_n_weighted = filtered.weight("weight")(w);
```

!!! note "Compounding selections"

Notice that the second `weight` was called from a selection node, not the dataflow object; this compounds those two cuts, which is in this case:
!!! info
Calling a subsequent `filter()`/`weight()` operation from an existing selection node compounds it on top of the chain.
In the example above, the final cut and weight decisions are:
```cpp
cut = decision && true;
weight = 1.0 * weight;
```
will be applied for each entry.

Any valid column can enter as an argument; alternatively, an expression can be provided as an optional argument:
=== "This is equivalent..."
@@ -37,24 +36,33 @@ Any valid column can enter as an argument; alternatively, an expression can be p
})(entry_index);
```

## Example cutflow
## Branching

The example cutflow from the previous section can be expressed as the following:
Selections can branch out from a common one as:
```{ .cpp .annotate }
auto inclusive = df.filter("inclusive")(df.constant(true));
auto filtered_a = inclusive.filter("a")(a);
auto filtered_b = inclusive.filter("b")(b);
```

auto cut_inclusive = df.filter("inclusive")(inclusive);
## (Advanced) Joining

auto weighted_w = inclusive.weight("w")(w);
auto channel_a = weighted_w.channel("a")(a);
auto channel_b = weighted_w.channel("b")(b);
Consider an arbitrary set of selections in a cutflow. Taking the AND/OR of them is commonly required, including scenarios such as:

auto region_a = channel_a.filter("region")(r);
auto region_b = channel_b.filter("region")(r);
- AND: Studying overlap between two regions.
- OR: Consolidating two non-orthogonal signal regions into one.

auto region_z = inclusive.filter("x")(x).weight("y")(y).filter("z")(z);
```
In other libraries, these typically must be done by the error-prone and arduous approach of defining a separate branch of selections, or sometimes even re-structuring of the entire cutflow.
Here, they can be easily done "post-selection" as:

## (Advanced) Joining selections
```cpp
auto even_entries = df.filter("even")(!(entry_number % df.constant(2)));
auto odd_entries = df.filter("odd")(entry_number % df.constant(2));
auto third_entries = df.filter("third")(!(entry_number % df.constant(3)));

auto all_entries = df.filter("all")(even_entries || odd_entries);
auto sixth_entries = df.filter("sixth")(even_entries && third_entries);
```

!!! tip "Selections are columns"
A selection is just a special type of column whose output value is its decision.
!!! tip
Treat a selection as a column whose output value is its decision; with that in mind, it is clear as to why the above example should work so simply.
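
Reviewer-style aside on the tip above: the idea that a selection is just a column of decisions, and can therefore be joined with AND/OR, can be sketched without the library. Everything below (`Selection`, `count_passed`, `joining_example`) is a hypothetical stand-in written for illustration, not the library's API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical stand-in for a selection: a column whose value is a per-entry decision.
struct Selection {
  std::function<bool(std::size_t)> decision;
  bool operator()(std::size_t entry) const { return decision(entry); }
};

// AND of two selections: an entry passes only if it passes both.
Selection operator&&(Selection a, Selection b) {
  return Selection{[a = std::move(a), b = std::move(b)](std::size_t i) {
    return a(i) && b(i);
  }};
}

// OR of two selections: an entry passes if it passes either one.
Selection operator||(Selection a, Selection b) {
  return Selection{[a = std::move(a), b = std::move(b)](std::size_t i) {
    return a(i) || b(i);
  }};
}

// Count how many of the entries 0..n_entries-1 pass a selection.
std::size_t count_passed(const Selection& sel, std::size_t n_entries) {
  std::size_t n = 0;
  for (std::size_t i = 0; i < n_entries; ++i)
    if (sel(i)) ++n;
  return n;
}

// Usage: join existing selections after the fact instead of building new branches.
void joining_example() {
  Selection even{[](std::size_t i) { return i % 2 == 0; }};
  Selection odd{[](std::size_t i) { return i % 2 != 0; }};
  Selection third{[](std::size_t i) { return i % 3 == 0; }};

  Selection all = even || odd;      // every entry passes
  Selection sixth = even && third;  // only multiples of 6 pass

  assert(count_passed(all, 12) == 12);
  assert(count_passed(sixth, 12) == 2);  // entries 0 and 6
}
```

Because the joined object is again a `Selection`, it can be joined further or counted like any other, which mirrors why the example above works "post-selection".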
4 changes: 1 addition & 3 deletions docs/features/sensitivity/variation.md
@@ -1,7 +1,5 @@
To perform a sensitivity analysis means to determine how variations in the input of a system affect its output.

## Systematic variation

A **systematic variation** constitutes a __change in a column value that affects the outcome of selections and aggregations__.
Processing them within a single dataflow object offers the following benefits over applying them independently:

@@ -17,7 +15,7 @@ Any column can be varied with an alternate constructor of the same type, which t
| `equation` | Callable expression + input columns |
| `definition`/`representation` | Constructor arguments + input columns + direct-access |

## Propagation of variations
# Propagation of variations

Nominal and varied operations are automatically carried forward during an analysis, meaning:
