Docs

taehyounpark · Nov 11, 2023 · 22e6680
1 parent b59cfd9
commit 22e6680
Showing 3 changed files with 50 additions and 43 deletions.
diff --git a/docs/features/aggregation/aggregation.md b/docs/features/aggregation/aggregation.md
@@ -1,11 +1,10 @@
 ## Create
 
-Aggregations are defined analogous as for custom column definitions, i.e. by providing its concrete type+constructor arguments:
+Aggregations are defined analogously to custom columns, i.e. by providing their concrete type and constructor arguments:
 ```cpp
 auto hist = df.agg<Histogram<float>>(LinearAxis(100,0,100));
 ```
 
-
 ## Fill
 
 An aggregation must be `fill()`ed with input columns of matching dimensionality:
@@ -25,8 +24,7 @@ An aggregation can be `fill()`ed as many times as needed:
 ```cpp title="Filling a histogram twice per-entry"
 auto hist_xy = df.agg<Histogram<float>>("x_and_y",100,0,100).fill(x).fill(y);
 ```
-
-!!! warning "Make sure to get the returned booker"
+<!-- !!! warning "Make sure to get the returned booker"
 
     Reminder: each (chained) method returns a new node with the lazy action booked.
     In other words, make sure to obtain and use the returned aggregation for the columns to be actually filled!
@@ -43,36 +41,39 @@ auto hist_xy = df.agg<Histogram<float>>("x_and_y",100,0,100).fill(x).fill(y);
     auto hbins = df.agg<Histogram<float>>(LinearAxis(100,0,100));
     auto hx = hbins.fill(x);
     auto hy = hbins.fill(y);
-    ```
-
+    ``` -->
 
 ## Book
 
 An aggregation can be booked at (multiple) selection(s):
 
 === "One selection"
     ```cpp
-    hx_c = df.agg<Histogram<float>>(LinearAxis(100,0,100))
-             .fill(x)
-             .book(c);
+    auto hx = df.agg<Histogram<float>>(LinearAxis(100,0,100)).fill(x);
+
+    auto hx_a = hx.book(sel_a);
     ```
 === "Multiple selections"
     ```cpp
-    hxs = df.agg<Histogram<float>>(LinearAxis(100,0,100))
-              .fill(x)
-              .book(a, b, c);
+    auto hx = df.agg<Histogram<float>>(LinearAxis(100,0,100)).fill(x);
+    auto hx_abc = hx.book(a, b, c);
+    auto hx_a = hx_abc["a"];
     ```
 
-### From a selection
-
 When multiple aggregations are booked from a selection, they must be individually unpacked since aggregations can be of different types:
 
-```cpp
-auto hx = df.agg<Histogram<float>>("x",axis::regular(10,0,10));
-auto hxy = df.agg<Histogram<float,float>>("xy",axis::regular(10,0,10),axis::regular(10,0,10));
+=== "One aggregation"
+    ```cpp
+    auto hx = df.agg<Histogram<float>>("x",LinearAxis(10,0,10));
 
-auto [hx_a, hxy_a] = sel_a.book(hx, hxy);
-```
+    auto hx_a = sel_a.book(hx);
+    ```
+=== "Multiple aggregations"
+    ```cpp
+    auto hx = df.agg<Histogram<float>>("x",LinearAxis(10,0,10));
+    auto hxy = df.agg<Histogram<float,float>>("xy",LinearAxis(10,0,10),LinearAxis(10,0,10));
+    auto [hx_a, hxy_a] = sel_a.book(hx, hxy);
+    ```
 
 ## Access result(s)
 
@@ -88,5 +89,5 @@ The aggregation behaves as a pointer to its result, which is automatically trigg
 hist->at(0);  // equivalent to hist.result()->at(0);
 ```
 
-!!! note
-    Calling the result of any one aggregation triggers the dataset processing of *all*.
+!!! info
+    Accessing the result of any one aggregation triggers the dataset processing for *all* aggregations booked up to that point.
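
The note above describes lazy evaluation: booking only records work, and the first result access runs one pass over the dataset for everything booked so far. The following is a minimal toy sketch of that behavior in plain C++. `ToyDataflow`, `LazySum`, `book()`, `result()`, and `passes()` are hypothetical names invented for this illustration and are not the library's API or implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy aggregation: a scaled sum over all entries.
struct LazySum {
  int scale = 1;        // multiplier applied to every entry
  long value = 0;       // accumulated result
  bool filled = false;  // true once a dataset pass has filled this sum
};

// Hypothetical toy dataflow (NOT the library's class): it owns the entries
// and every aggregation booked so far.
class ToyDataflow {
 public:
  explicit ToyDataflow(std::vector<int> entries) : entries_(std::move(entries)) {}

  // Book a lazy aggregation and return its id; no entry is read here.
  std::size_t book(int scale) {
    LazySum agg;
    agg.scale = scale;
    booked_.push_back(agg);
    return booked_.size() - 1;
  }

  // Accessing one result runs a single pass that fills every pending sum.
  long result(std::size_t id) {
    assert(id < booked_.size());
    if (!booked_[id].filled) process();
    return booked_[id].value;
  }

  // Number of dataset passes performed so far.
  int passes() const { return passes_; }

 private:
  void process() {
    ++passes_;
    for (int x : entries_) {
      for (LazySum& agg : booked_) {
        if (!agg.filled) agg.value += static_cast<long>(agg.scale) * x;
      }
    }
    for (LazySum& agg : booked_) agg.filled = true;
  }

  std::vector<int> entries_;
  std::vector<LazySum> booked_;
  int passes_ = 0;
};
```

In this sketch, booking two sums and reading either one performs a single pass that fills both. A sum booked after that first access needs a second pass, which mirrors the "booked up to that point" wording.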
diff --git a/docs/features/selection/selection.md b/docs/features/selection/selection.md
@@ -5,23 +5,22 @@
 A selection can be applied by providing a "path" string and a column that corresponds to the decision:
 ```{ .cpp .annotate }
 auto decision = ds.read<bool>("decision");
-auto cut_applied = df.filter("cut_on_decision")(decision);
+auto filtered = df.filter("cut_on_decision")(decision);
 ```
 
 Alternatively, one can apply a weight as:
 ```{ .cpp .annotate }
 auto w = ds.read<float>("weight");
-auto cut_and_weighted = cut_applied.weight("weight")(w);
+auto filtered_n_weighted = filtered.weight("weight")(w);
 ```
 
-!!! note "Compounding selections"
-
-    Notice that the second `weight` was called from a selection node, not the dataflow object; this compounds those two cuts, which is in this case:
+!!! info
+    Calling a subsequent `filter()` or `weight()` operation from an existing selection node compounds it on top of the chain.
+    In the example above, the final cut and weight decisions are:
     ```cpp
     cut = decision && true;
     weight = 1.0 * weight;
     ```
-    will be applied for each entry.
 
 Any valid column can enter as an argument; alternatively, an expression can be provided as an optional argument:
 === "This is equivalent..."
@@ -37,24 +36,33 @@ Any valid column can enter as an argument; alternatively, an expression can be p
       })(entry_index);
     ```
 
-## Example cutflow
+## Branching
 
-The example cutflow from the previous section can be expressed as the following:
+Multiple selections can branch out from a common one:
 ```{ .cpp .annotate }
+auto inclusive = df.filter("inclusive")(df.constant(true));
+auto filtered_a = inclusive.filter("a")(a);
+auto filtered_b = inclusive.filter("b")(b);
+```
 
-auto cut_inclusive = df.filter("inclusive")(inclusive);
+## (Advanced) Joining
 
-auto weighted_w = inclusive.weight("w")(w);
-auto channel_a = weighted_w.channel("a")(a);
-auto channel_b = weighted_w.channel("b")(b);
+Taking the AND/OR of an arbitrary set of selections in a cutflow is commonly required, for example:
 
-auto region_a = channel_a.filter("region")(r);
-auto region_b = channel_b.filter("region")(r);
+- AND: Studying overlap between two regions.
+- OR: Consolidating two non-orthogonal signal regions into one.
 
-auto region_z = inclusive.filter("x")(x).weight("y")(y).filter("z")(z);
-```
+In other libraries, this typically requires the error-prone and arduous approach of defining a separate branch of selections, or sometimes even restructuring the entire cutflow.
+Here, it can be done "post-selection" as:
 
-## (Advanced) Joining selections
+```cpp
+auto even_entries = df.filter("even")(!(entry_number % df.constant(2)));
+auto odd_entries = df.filter("odd")(entry_number % df.constant(2));
+auto third_entries = df.filter("third")(!(entry_number % df.constant(3)));
+
+auto all_entries = df.filter("all")(even_entries || odd_entries);
+auto sixth_entries = df.filter("sixth")(even_entries && third_entries);
+```
 
-!!! tip "Selections are columns"
-    A selection is just a special type of column whose output value is its decision.
+!!! tip
+    Treat a selection as a column whose output value is its decision; this is why selections can be combined with `&&` and `||` as in the example above.
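
The joining idea in the tip above can be sketched without the library at all: if a selection is reduced to a per-entry boolean decision, AND/OR joins are ordinary function composition. The sketch below is plain C++ and only an illustration. `Decision`, `both()`, `either()`, and `count_passed()` are hypothetical helpers, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical stand-in for a selection: a per-entry boolean decision.
// This is NOT the library's type; it only mimics the idea that a selection
// is a column whose output value is its decision.
using Decision = std::function<bool(std::size_t)>;

// AND-join: an entry passes only if it passes both decisions.
Decision both(Decision lhs, Decision rhs) {
  assert(lhs && rhs);  // both operands must hold a callable
  return [lhs, rhs](std::size_t entry) { return lhs(entry) && rhs(entry); };
}

// OR-join: an entry passes if it passes at least one decision.
Decision either(Decision lhs, Decision rhs) {
  assert(lhs && rhs);  // both operands must hold a callable
  return [lhs, rhs](std::size_t entry) { return lhs(entry) || rhs(entry); };
}

// Count how many of the first n entries (0 .. n-1) pass a decision.
std::size_t count_passed(const Decision& decision, std::size_t n) {
  std::size_t passed = 0;
  for (std::size_t entry = 0; entry < n; ++entry) {
    if (decision(entry)) ++passed;
  }
  return passed;
}
```

With an "even" decision (`entry % 2 == 0`) and a "third" decision (`entry % 3 == 0`), `both(even, third)` passes exactly the entries divisible by six, and `either(even, odd)` passes every entry. Neither join requires redefining the individual decisions.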
diff --git a/docs/features/sensitivity/variation.md b/docs/features/sensitivity/variation.md
@@ -1,7 +1,5 @@
 To perform a sensitivity analysis means to determine how variations in the input of a system affect its output.
 
-## Systematic variation
-
 A **systematic variation** constitutes a __change in a column value that affects the outcome of selections and aggregations__.
 Processing them within a single dataflow object offers the following benefits over applying them independently:
 
@@ -17,7 +15,7 @@ Any column can be varied with an alternate constructor of the same type, which t
 | `equation` | Callable expression + input columns |
 | `definition`/`representation` | Constructor arguments + input columns + direct-access |
 
-## Propagation of variations
+# Propagation of variations
 
 Nominal and varied operations are automatically carried forward during an analysis, meaning: