Docs

taehyounpark · Oct 23, 2023 · e062a40
1 parent dfcf5f2

Showing 6 changed files with 18 additions and 20 deletions.
diff --git a/README.md b/README.md
@@ -20,10 +20,10 @@ Its key features include:
 
 ## Design goals
 
-- **Clear interface.** Higher-level languages have a myriad of libraries available to do intuitive and efficient data analysis. The syntax here aims to achieve a similar level of abstraction in its own way.
-- **Interface-only.** No implementation of a data formats or aggregation output is provided out-of-the-box. Instead, the interface allows defining operations with arbitrary inputs, execution, and outputs as needed.
-- **Sensitivity analysis.** Often times, changes to an analysis need to be explored for sensitivity analysis. How many times has this required the dataset to be re-processed? With built-in handling of systematic variations, changes can their impacts retrieved all together.
-- **Computational efficiency.** All operations within the dataset processing is performed at most once per-entry, only when needed. All systematic variations are processed at once. The dataset processing is multithreaded for thread-safe plugins.
+- **Clear interface.** Higher-level languages have an abundance of libraries for intuitive and efficient data analysis. The aim is an interface with a similar level of abstraction, written in modern C++ syntax.
+- **Customizable plugins.** Arbitrary operations with custom input(s), execution, and output(s) receive first-class treatment. From non-trivial datasets to complex computations and aggregations, there is an abstract base class (ABC) available to implement.
+- **Sensitivity analysis.** With built-in handling of systematic variations, the dataset is processed *once* to retrieve all results under nominal and varied scenarios simultaneously.
+- **Computational efficiency.** Operations within the dataset processing are performed at most once per entry and only when needed. If enabled, the processing is multithreaded.
 
 ## Documentation
 
@@ -32,8 +32,6 @@ Its key features include:
 
 ## Installation
 
-Requirements: Unix OS, C++17
-
 ### [Single-header](https://raw.githubusercontent.com/taehyounpark/analogical/master/analogical.h)
 ```cpp
 #include "analogical.h"

diff --git a/docs/features/basic.md b/docs/features/basic.md
@@ -30,6 +30,6 @@ table th:nth-of-type(4) {
 | `selection` | A boolean/floating-point decision | `filter()` | Apply a cut. | 
 | | | `weight()` | Apply a statistical significance. |
 | | | `channel()` | Same as filter, but remember its "path". |
-| `aggregation` | Perform an action and output a result | `book()` | Book the creation of a result. |
-| | | `fill()` | Perform aggregation with column value(s). |
-| | | `at()` | Perform aggregation for entries passing the selection(s). |
+| `aggregation` | Perform an action and output a result | `agg()` | Create an aggregation. |
+| | | `fill()` | Fill with column value(s) of the entry. |
+| | | `book()` | Book execution for entries passing the selection(s). |
diff --git a/docs/features/column/column.md b/docs/features/column/column.md
@@ -1,4 +1,4 @@
-## Reading from dataset
+## Read from dataset
 
 Consider the following JSON data:
 ```json
@@ -84,7 +84,7 @@ Consider the following JSON data:
 It can be opened by a dataflow:
 ```{ .cpp .annotate } 
 #include <nlohmann/json.hpp>
-#include "analogical"
+#include "analogical.h"
 
 using dataflow = ana::dataflow;
 
@@ -109,7 +109,7 @@ auto [a, b, c] = df.open<ana::json>(data)\
 1.    Note the initializer braces around the column names.
 
 !!! info "Arbitrary column types"
-    The interface is agnostic (ignorant, to be more precise) to the underlying column data types.
+    The interface is completely agnostic to the underlying column data types.
     As long as the `dataset::column` of a given arbitrary type is properly implemented, it can be used.
     Even in the "worst" case, explicit template specialization can be used to cherry-pick how to read a specific data type.
     ```cpp
@@ -122,7 +122,7 @@ auto [a, b, c] = df.open<ana::json>(data)\
     ```cpp
     auto x = ds.read<CustomData>("x");  // success!
     ```
-## Computing from dataflow
+## Compute quantities
 
 ### Simple expressions
 

diff --git a/docs/home/design.md b/docs/home/design.md
@@ -1,10 +1,10 @@
 ## Promises
 
-- **Clear interface.** Higher-level languages have an abundance of available libraries to do intuitive and efficient data analysis. The aim is to achieve a similar level of abstraction with modern C++ syntax.
-- **Customizable plugins.** Custom operations with arbitrary input(s), execution, and output(s) receive first-class treatment. From non-trivial datasets to complex computations and aggregations, there is an ABC that can be implemented.
-- **Sensitivity analysis.** When changes to select column(s) need to be explored for sensitivity analysis, they have often required the dataset to be re-processed each time. With built-in handling of systematic variations, the dataset is processed *once* to retrieve all results under nominal and varied scenarios together.
-- **Computational efficiency.** All operations within the dataset processing is performed at most once per-entry and only when needed. The dataset processing can be multithreaded for thread-safe operations.
+- **Clear interface.** Higher-level languages have an abundance of libraries for intuitive and efficient data analysis. The aim is an interface with a similar level of abstraction, written in modern C++ syntax.
+- **Customizable plugins.** Arbitrary operations with custom input(s), execution, and output(s) receive first-class treatment. From non-trivial datasets to complex computations and aggregations, there is an abstract base class (ABC) available to implement.
+- **Sensitivity analysis.** With built-in handling of systematic variations, the dataset is processed *once* to retrieve all results under nominal and varied scenarios simultaneously.
+- **Computational efficiency.** Operations within the dataset processing are performed at most once per entry and only when needed. If enabled, the processing is multithreaded.
 
 ## What it is *not* suited for
 
-- Columnar analysis. `analogical` is **designed to handle non-trivial/highly-nested data types**, and the dataset processing is **inherently row-wise**. If an analysis can be expressed entirely in terms of by array(-esque) operations, e.g. [`awkward`](https://awkward-array.org/doc/main/), then those libraries with an indexing API and SIMD support will likely be cleaner and faster.
+- Columnar analysis. `analogical` is **designed to handle non-trivial/highly-nested data types**, and the dataset processing is **inherently row-wise**. If an analysis can be expressed entirely in terms of array operations, then libraries with an index-based API (and SIMD support) will likely be cleaner (and faster).
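
Note on the "Sensitivity analysis" promise in the hunk above: the idea of processing the dataset once while collecting nominal and varied results together can be illustrated with a small standalone sketch. This is not `analogical`'s API. The `scenario` alias and `sum_all_scenarios` function are hypothetical names chosen for illustration, and the "dataset" is assumed to be a plain vector of doubles.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical illustration only; none of these names come from analogical.
// A scenario maps an entry's raw value to the quantity being aggregated.
using scenario = std::function<double(double)>;

// Sum the transformed value under every scenario in a single pass
// over the entries, instead of re-reading the dataset per scenario.
inline std::map<std::string, double>
sum_all_scenarios(const std::vector<double> &entries,
                  const std::map<std::string, scenario> &scenarios) {
  std::map<std::string, double> sums;
  for (const auto &kv : scenarios) {
    sums[kv.first] = 0.0; // one accumulator per scenario
  }
  for (double x : entries) { // the only loop over the dataset
    for (const auto &[name, fn] : scenarios) {
      sums[name] += fn(x); // nominal and varied results filled together
    }
  }
  return sums;
}
```

Passing a map with a `"nominal"` entry and any number of varied entries returns one sum per scenario after a single traversal of `entries`, which is the property the bullet describes.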
diff --git a/docs/index.md b/docs/index.md
@@ -10,7 +10,7 @@ _**Ana**lysis **Logic** **A**bstraction **L**ayer_
 ![Version](https://img.shields.io/badge/Version-0.1.0-blue.svg)
 [![Ubuntu](https://github.com/taehyounpark/analogical/actions/workflows/ubuntu.yml/badge.svg?branch=master)](https://github.com/taehyounpark/analogical/actions/workflows/ubuntu.yml)
 [![macOS](https://github.com/taehyounpark/analogical/actions/workflows/macos.yml/badge.svg?branch=master)](https://github.com/taehyounpark/analogical/actions/workflows/macos.yml)
-[![Documentation](https://img.shields.io/badge/mkdocs-Documentation-blue.svg)](https://opensource.org/licenses/MIT)
+[![Documentation](https://img.shields.io/badge/Documentation-mkdocs-blue.svg)](https://opensource.org/licenses/MIT)
 [![MIT License](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 
 `analogical` is a C++ library for dataset transformation.

diff --git a/include/ana/interface/dataset_column.h b/include/ana/interface/dataset_column.h
@@ -63,7 +63,7 @@ template <typename T> T const &ana::dataset::column<T>::value() const {
 template <typename T>
 void ana::dataset::column<T>::execute(const ana::dataset::range &part,
                                       unsigned long long entry) {
-  this->m_entry = entry;
   this->m_part = &part;
+  this->m_entry = entry;
   this->m_updated = false;
 }
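
Note on this hunk: it only swaps the order of two independent member assignments, so it should not change behaviour on its own. The `m_updated = false` reset is what ties in with the "at most once per entry and only when needed" promise. A minimal sketch of that pattern follows, assuming a cached value that is recomputed lazily on first access. `lazy_column`, its `read` callback, and `evaluations()` are hypothetical and are not the library's `ana::dataset::column<T>`.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for a lazily evaluated column.
// Not the library's ana::dataset::column<T>; names are illustrative only.
template <typename T> class lazy_column {
public:
  explicit lazy_column(std::function<T(unsigned long long)> read)
      : m_read(std::move(read)) {}

  // Called once per entry: record the position and invalidate the cache.
  void execute(unsigned long long entry) {
    m_entry = entry;
    m_initialized = true;
    m_updated = false;
  }

  // Compute the value on first access for this entry, then reuse it.
  T const &value() const {
    assert(m_initialized && "execute() must be called before value()");
    if (!m_updated) {
      m_value = m_read(m_entry);
      m_updated = true;
      ++m_evaluations;
    }
    return m_value;
  }

  // Number of times the read callback has actually run.
  unsigned long long evaluations() const { return m_evaluations; }

private:
  std::function<T(unsigned long long)> m_read;
  unsigned long long m_entry = 0;
  bool m_initialized = false;
  mutable T m_value{};
  mutable bool m_updated = false;
  mutable unsigned long long m_evaluations = 0;
};
```

In this sketch, calling `value()` several times after one `execute()` runs the callback once, and an entry whose value is never requested costs no evaluation at all.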