
Commit 66584ab

committed
Docs
1 parent 24bfe57 commit 66584ab

16 files changed

+272
-159
lines changed

docs/guide/dataset.md renamed to docs/guide/datasets.md

Lines changed: 8 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -12,8 +12,8 @@ auto col = ds.read(dataset::column<DTYPE>(NAME));
1212
```
1313

1414
```{seealso}
15-
- [`dataset::reader`](#dataset-reader)
16-
- [`dataset::column`](#dataset-column)
15+
- [`dataset::source`](#dataset-source) and [`dataset::reader`](#dataset-reader)
16+
- [`column::reader`](#column-reader)
1717
```
1818

1919
## Loading-in a dataset
@@ -58,11 +58,15 @@ An input dataset should be loaded-in *once*, i.e. make sure whatever columns bei
5858

5959
## Working with multiple datasets
6060

61-
A dataflow can load multiple datasets (of different input formats) into one dataflow.
61+
A dataflow can load multiple datasets of different input formats into one dataflow.
6262

63-
:::{card} Loading JSON and CSV datasets side-by-side.
63+
:::{card}
64+
:text-align: center
65+
<!-- :::{topic} JSON and CSV side-by-side. -->
6466
```{image} ../images/json_csv.png
6567
```
68+
+++
69+
JSON and CSV side-by-side.
6670
:::
6771

6872
```{code} cpp

docs/guide/index.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
44
:maxdepth: 2
55
66
dataflow
7-
dataset
7+
datasets
88
columns
99
selections
1010
queries
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
1+
(column-definition)=
2+
# column::definition
3+
4+
```{eval-rst}
5+
.. doxygenclass:: queryosity::column::definition< Out(Ins...)>
6+
:project: queryosity
7+
:members:
8+
```

docs/references/abc/column_reader.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
1+
(column-reader)=
2+
# column::reader
3+
4+
```{eval-rst}
5+
.. doxygenclass:: queryosity::column::reader
6+
:project: queryosity
7+
:members:
8+
```

docs/references/abc/dataset_column.md

Lines changed: 0 additions & 8 deletions
This file was deleted.

docs/references/abc/dataset_source.md

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
(dataset-source)=
2+
# dataset::source
3+
4+
```{eval-rst}
5+
.. doxygenclass:: queryosity::dataset::source
6+
:project: queryosity
7+
:members:
8+
```

docs/references/abc/index.md

Lines changed: 4 additions & 1 deletion
@@ -5,6 +5,9 @@
55
```{toctree}
66
:maxdepth: 2
77
8-
dataset_column.md
8+
dataset_source.md
99
dataset_reader.md
10+
column_reader.md
11+
column_definition.md
12+
query_definition.md
1013
```
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
1+
(query-definition)=
2+
# query::definition
3+
4+
```{eval-rst}
5+
.. doxygenclass:: queryosity::query::definition< Out(Ins...)>
6+
:project: queryosity
7+
:members:
8+
```
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
# column::observable
2+
3+
```{eval-rst}
4+
.. doxygenclass:: queryosity::column::observable
5+
:project: queryosity
6+
:members:
7+
```

docs/references/api/index.md

Lines changed: 4 additions & 2 deletions
@@ -3,6 +3,8 @@
33
```{toctree}
44
:maxdepth: 2
55
6-
dataflow
7-
lazy
6+
dataflow.md
7+
lazy.md
8+
todo.md
9+
column_observable.md
810
```

docs/references/api/todo.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
# todo
2+
3+
```{eval-rst}
4+
.. doxygenclass:: queryosity::todo
5+
:project: queryosity
6+
:members:
7+
```

docs/start/conceptual.md

Lines changed: 75 additions & 35 deletions
@@ -2,12 +2,16 @@
22

33
## Dataflow
44

5-
A `dataflow` consists of a directed, acyclic graph of tasks performed for each entry.
5+
Dataflow
6+
: A directed, acyclic graph of tasks performed for each entry of a tabular dataset.
7+
![dataflow](../images/dataflow.png)
68

7-
![dataflow](../images/dataflow.png)
9+
Action
10+
: A node in the dataflow belonging to one of three task sub-graphs.
811

9-
An action is a node belonging to one of three task sub-graphs, each of which are associated with a set of applicable methods.
10-
Actions of each task graph can receive ones of the previous graphs as inputs:
12+
***
13+
14+
Actions of each task sub-graph belong to their own sub-type, and can receive actions from the previous graph(s) as inputs:
1115

1216
| Action | Description | Methods | Description | Task Graph | Inputs |
1317
| :--- | :-- | :-- | :-- | :-- | :-- |
@@ -22,8 +26,12 @@ Actions of each task graph can receive ones of the previous graphs as inputs:
2226

2327
## Lazy actions
2428

25-
All actions are *lazy*, meaning they are not executed them unless required.
26-
Accessing the result of a query turns it and all other actions *eager*, triggering the dataset traversal.
29+
Lazy action
30+
: An action that is not performed, i.e. initialized/executed/finalized, unless requested by the user.
31+
32+
***
33+
34+
Accessing the result of a lazy query turns it and all other actions *eager*, triggering the dataset traversal.
2735
The eagerness of actions in each entry is as follows:
2836

2937
1. A query is performed only if its associated selection passes the cut.
@@ -32,65 +40,91 @@ The eagerness of actions in each entry is as follows:
3240

3341
## Columns
3442

35-
A `column` holds a value of some data type `T` to be updated for each entry.
36-
Columns that are read-in from a dataset or user-defined constants are *independent*, i.e. their values do not depend on others, whereas columns evaluated out of existing ones as inputs are *dependent*.
37-
The tower of dependent columns evaluated out of more independent ones forms the computation graph:
43+
Column
44+
: An action that holds a value of some data type `T` to be updated for each entry.
45+
46+
Independent column
47+
: A column whose value does not depend on others.
48+
49+
Dependent column
50+
: A column whose value is evaluated out of those from other columns as inputs.
51+
52+
***
53+
54+
The tower of dependent columns can be constructed to form the computation graph:
3855

56+
:::{card}
57+
:text-align: center
3958
![computation](../images/computation.png)
59+
+++
60+
Example computation graph.
61+
:::
4062

4163
Only the minimum number of computations needed are performed for each entry:
42-
- If and when a column value is computed for an entry, it is cached and never re-computed.
43-
- A column value is not copied when used as an input for dependent columns.
44-
- It *is* copied if a conversion is required.
64+
- A column value is computed *once* for an entry (if needed), then cached and never re-computed.
65+
- A column value is not copied when used as an input for dependent columns (unless a conversion is needed).
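As a reader's illustration of the caching rule above (a minimal sketch, not queryosity's implementation; `cached_column`, `next_entry()` and `evaluations()` are hypothetical names), a column value can be computed on first access within an entry, cached, and handed out by const reference so that dependent columns do not copy it:

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical illustration only; these are not queryosity's classes.
template <typename T>
class cached_column {
public:
  explicit cached_column(std::function<T()> eval) : m_eval(std::move(eval)) {}

  // Called when moving to a new entry: discard the previous entry's value.
  void next_entry() { m_cache.reset(); }

  // Compute on first access within the entry, then reuse the cached value.
  // Returning a const reference lets dependent columns read it without a copy.
  const T& value() {
    if (!m_cache) {
      m_cache = m_eval();
      ++m_evaluations;
    }
    return *m_cache;
  }

  // Number of times the underlying expression was actually evaluated.
  int evaluations() const { return m_evaluations; }

private:
  std::function<T()> m_eval;
  std::optional<T> m_cache;
  int m_evaluations = 0;
};
```

In this sketch, calling `value()` several times within one entry evaluates the expression once; it is re-evaluated only after `next_entry()`, and never if no one asks for it.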
4566

4667
## Selections
4768

48-
A `selection` represents a scalar-valued decision made on an entry:
49-
50-
- A boolean `cut` to determine if a query should be performed for a given entry.
69+
Selection
70+
: A scalar-valued column corresponding to a "decision" on an entry:
71+
- A boolean `cut` to determine if a query should be performed for the entry.
5172
- A series of two or more cuts becomes their intersection, `and`
52-
- A floating-point `weight` to assign a statistical significance to the entry.
73+
- A floating-point `weight` to assign a statistical significance to the entry.
5374
- A series of two or more weights becomes their product, `*`.
5475

55-
A cutflow can have from the following types connections between selections:
76+
***
5677

57-
![cutflow](../images/cutflow.png)
78+
A cutflow can contain the following types of connections between selections:
5879

5980
- Applying a selection from an existing node, which determines the order in which they are compounded.
6081
- Branching selections by applying more than one selection from a common node.
6182
- Merging two selections, e.g. taking the union/intersection of two cuts.
6283

63-
Selections constitute a specific type of columns; as such, they are subject to the value-caching and evaluation behaviour of the computation graph.
64-
Addditionally, the cutflow imposes the following rules on them:
84+
:::{card}
85+
:text-align: center
86+
![cutflow](../images/cutflow.png)
87+
+++
88+
Example cutflow.
89+
:::
90+
91+
Selections constitute a specific type of columns, so they are subject to the lazy-evaluation and value-caching behaviour of the computation graph.
92+
Additionally, the cutflow imposes the following rules:
6593
- The cut at a selection is evaluated only if all previous cuts have passed.
6694
- The weight at a selection is evaluated only if its cut has passed.
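The two cutflow rules can be sketched as follows. This is an illustrative assumption about how such short-circuiting might look, not the library's code: `decision`, `selection` and `evaluate` are hypothetical names, and reporting a weight of `0.0` for a failed selection is a choice made only for this example.

```cpp
#include <cassert>
#include <functional>

// Hypothetical types for illustration; not part of queryosity's API.
struct decision {
  bool passed;    // intersection (`and`) of all cuts so far
  double weight;  // product (`*`) of all weights so far
};

struct selection {
  const selection* previous = nullptr;  // upstream selection, if any
  std::function<bool()> cut;            // empty means "always pass"
  std::function<double()> weight;       // empty means a weight of 1.0
};

// Evaluate a selection for the current entry, short-circuiting per the rules.
inline decision evaluate(const selection& sel) {
  decision upstream{true, 1.0};
  if (sel.previous != nullptr) {
    upstream = evaluate(*sel.previous);
  }
  // Rule 1: this cut is evaluated only if all previous cuts have passed.
  if (!upstream.passed) return {false, 0.0};
  if (sel.cut && !sel.cut()) return {false, 0.0};
  // Rule 2: the weight is evaluated only once its cut has passed.
  const double w = sel.weight ? sel.weight() : 1.0;
  return {true, upstream.weight * w};
}
```

With an upstream cut that fails, neither the downstream cut nor any weight callable is invoked; when every cut passes, the returned weight is the product of the weights along the chain.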
6795

6896
## Queries
6997

70-
A `query` specifies an output result obtained from counting entries of the dataset.
71-
For multithreaded runs, the user must also define how outputs from individual threads should be merged together to yield a result representative of the full dataset.
72-
73-
- It must be associated with a selection whose cut determines which entries to count.
98+
Query
99+
: An action that outputs a result of some data type `T` after traversing the dataset.
100+
- It must be associated with a selection whose cut determines which entries to count.
74101
- (Optional) The result is populated with the weight taken into account.
75-
- How an entry populates the query depends on its implementation.
102+
- How the query counts an entry is a user-implemented arbitrary action.
76103
- (Optional) The result is populated based on values of input columns.
77104

78-
Two common workflows exist in associating queries with selections:
105+
***
106+
107+
:::{card}
108+
:text-align: center
109+
```{image} ../images/query_1.png
110+
+++
111+
Making, filling, and booking a query.
112+
:::
79113
80-
@image html query_1.png "Running a single query at multiple selections."
114+
## Systematic variations
81115
82-
@image html query_2.png "Running multiple queries at a selection."
116+
Systematic variation
117+
: A change in a column value that affects the outcomes of associated selections and queries.
83118
84-
@section conceptual-variations Systematic variations
119+
***
85120
86-
A sensitivity analysis means to study how changes in the system's inputs affect the output.
87-
In the context of dataset queries, a **systematic variation** constitutes a __change in a column value that affects the outcome of selections and queries__.
121+
A sensitivity analysis studies how changes in the system's inputs affect its output.
122+
In the context of a dataflow, the inputs are column values and outputs are query results.
88123
89-
Encapsulating the nominal and variations of a column creates a `varied` node in which each variation is mapped by the name of its associated systematic variation.
90-
A varied node can be treated functionally identical to a non-varied one, with all nominal+variations being propagated through the relevant task graphs implicitly:
124+
The nominal and variations of a column can be encapsulated within a *varied* node, which can be treated as functionally identical to a nominal-only one, except that the nominal and all variations are propagated through downstream actions implicitly:
91125
92126
- Any column definitions and selections evaluated out of varied input columns will be varied.
93-
- Any queries performed filled with varied input columns and/or at varied selections will be varied.
127+
- Any queries performed with varied input columns and/or at varied selections will be varied.
94128
95129
The propagation proceeds in the following fashion:
96130
@@ -99,6 +133,12 @@ The propagation proceeds in the following fashion:
99133
100134
All variations are processed at once in a single dataset traversal; in other words, they do not incur any additional runtime overhead other than what is needed to perform the actions themselves.
101135
102-
@image html variation.png "Propagation of systematic variations."
136+
:::{card}
137+
:text-align: center
138+
```{image} ../images/variation.png
139+
```
140+
+++
141+
Propagation of systematic variations on $z = x+y$.
142+
:::
103143

104144
@see @ref guide

include/queryosity/column_definition.h

Lines changed: 0 additions & 6 deletions
@@ -26,13 +26,7 @@ class column::definition<Out(Ins...)> : public column::calculation<Out> {
2626
using obstuple_type = std::tuple<observable<Ins>...>;
2727

2828
public:
29-
/**
30-
* @brief Default constructor.
31-
*/
3229
definition() = default;
33-
/**
34-
* @brief Default destructor.
35-
*/
3630
virtual ~definition() = default;
3731

3832
public:
