NIM simPLEX: A concise high-performance scientific Nim library (with CLI and Python bindings) providing samplings, uniform grids, traversal graphs, and more in compositional (simplex) spaces, where traditional methods designed for Euclidean spaces fail or otherwise become impractical.
Such spaces are considered when an entity can be split into a set of distinct components (a composition), and they play a critical role in many disciplines of science, engineering, and mathematics. For instance, in materials science, chemical composition refers to the way a material (or, more generally, matter) is split into distinct components, such as chemical elements, based on considerations such as fraction of atoms, occupied volume, or contributed mass. And in economics, portfolio composition may refer to how finite capital is split across assets, such as cash, equity instruments, real estate, and commodities, based on their monetary value.
If you have a GitHub account, you can get started with nimplex very quickly by just clicking the button below to launch a Codespaces environment with everything installed (per instructions in Reproducible Installation section) and ready to go! From there, you can either use the CLI tool (as explained in CLI section) or import the library in Python (as explained in Usage in Python section) and start using it right away. Of course, it also comes with a full Nim compiler and VSCode IDE extensions for Nim, so you can effortlessly modify/extend the source code and re-compile it if you wish.
There are several easy ways to quickly get nimplex up and running on your system. The choice depends primarily on your preferred way of interacting with the library (CLI, Nim, or Python) and your system configuration.
The recommended way is compiling the library yourself, which may sound scary but is fairly easy, and the whole process should not take more than a couple of minutes.
First, you need to install the Nim language compiler, which on most Unix (Linux/MacOS) systems is very straightforward.
- On MacOS, assuming you have Homebrew installed, simply:
  brew install nim
- Using the conda, miniconda, mamba, or micromamba cross-platform package manager:
  conda install -c conda-forge nim
- On most Linux distributions, you should also be able to use your built-in package manager like pacman, apt, yum, or rpm; however, the default channel/repository, especially on enterprise systems, may have an unsupported version (nim<2.0). While we do test nimCSO with 1.6 versions too, your experience may be degraded, so you may want to update it or go with another option.
- You can, of course, also build it yourself from nim source code! It is relatively straightforward and fast compared to many other languages.
On Windows, you may consider using WSL, i.e., Windows Subsystem for Linux, which is strongly recommended, interplays well with VS Code, and will let you act as if you were on Linux. If you need to use Windows directly, you can follow these installation instructions.
Then, you can use the bundled Nimble tool (package manager for Nim, similar to Rust's cargo or Python's pip) to install two top-level nim dependencies:
- arraymancer, which is a powerful N-dimensional array library
- nimpy, which helps with the Python bindings
It's a single command:
nimble install --depsOnly
or, explicitly:
nimble install -y arraymancer nimpy
Finally, you can clone the repository and compile nimplex with:
git clone https://github.com/amkrajewski/nimplex
cd nimplex
nim c -r -d:release nimplex.nim --benchmark
which will compile the library and run a few benchmarks to make sure everything runs smoothly. You should then see a compiled binary file nimplex in the current directory, which exposes the CLI tool.
If you want to use the Python bindings, you can now compile the library with slightly different flags (depending on your system configuration) like so for Linux/MacOS:
nim c --d:release --threads:on --app:lib --out:nimplex.so nimplex
and you should see a compiled library file nimplex.so in the current directory, which can be immediately imported and used in Python as explained later. For Windows and other platforms, consult the nimpy documentation on what flags and formats should be used.
If you happen to be on one of the common systems (for which we auto-compile the binaries) and you do not need to modify anything in the source code, there is a good chance you can simply download the latest release from the nimplex GitHub repository and run the executable (nimplex / nimplex.exe) or Python library (nimplex.so / nimplex.pyd) directly just by placing it in your working directory and using it as:
- An interactive command line interface (CLI) tool, which will guide you through how to use it if you run it without any arguments like so (on Linux/MacOS):
  ./nimplex
  or with a concise configuration defining the task type and parameters (explained later in the CLI section):
  ./nimplex -c IFP 3 10
- A compiled Python library for Unix, which you can import in your Python code like so:
  import nimplex
  and immediately use the functions provided by the library, as described in Usage in Python:
  nimplex.simplex_internal_grid_fractional_py(dim=3, ndiv=10)
Note: Full technical discussion of methods and motivations is provided in the manuscript. The sections below are meant to provide a concise overview of the library's capabilities.
The library provides a growing number of methods specific to compositional (simplex) spaces:
-
Monte Carlo sampling is the simplest method conceptually, where points are randomly sampled from a simplex. In low dimensional cases, this can be accomplished by sampling from a uniform distribution in (d-1)-Cartesian space and then rejecting points outside the simplex (left panel below). However, in this approach, the inefficiency grows factorially with the dimensionality of the simplex space. Instead, some try to sample from a uniform distribution in (d)-Cartesian space and normalize the points to sum to 1; however, this leads to over-sampling in the center of each simplex dimension (middle panel below).
One can, however, fairly easily sample from a special case of the Dirichlet distribution, as explained in the manuscript, which leads to uniform sampling in the simplex space (right panel below). Nimplex can sample around 10M points per second in 9-dimensional space on a modern CPU.
-
Simplex / Compositional Grids are a more structured approach to sampling, where all possible compositions quantized to a given resolution, like 1% for 100 divisions per dimension, are generated. This is useful, for example, when one wants to map a function over the simplex space. In total
N_S(d, n_d) = \binom{d-1+n_d}{d-1} = \binom{d-1+n_d}{n_d}
points are generated, where d is the dimensionality of the simplex space and n_d is the number of divisions per dimension. Nimplex uses a modified version of the NEXCOM algorithm to do that procedurally (see manuscript for details) and can generate around 5M points per second in 9-dimensional space on a modern CPU. A choice is given between generating the grid as a list of integer numbers of quantum units (left panel below) or as a list of fractional positions (right panel below).
-
Internal Simplex / Compositional Grids are a modification of the above method, where only points inside the simplex, i.e., those in which all components are present, are generated. This is useful in cases where one cannot discard any component entirely, for instance, because the manufacturing setup has a minimum feed rate (leakage). Nimplex introduces a new algorithm to generate these points procedurally (see manuscript for details) based on a further modification of the NEXCOM algorithm.
In total
N_I(d, n_d) = \binom{n_d-1}{d-1}
points are generated, critically without any performance penalty compared to the full grid, which can reach orders of magnitude when d approaches n_d. Similar to the full grid, a choice is given between generating the grid as a list of integer numbers of quantum units or as a list of fractional positions.
-
Simplex / Compositional Graphs generation is the most critical capability, first introduced in the nimplex manuscript. They are created by using combinatorics and discovered patterns to assign edges between all neighboring nodes during the simplex grid (graph nodes) generation process. Effectively, a traversal graph is generated, spanning all possible compositions (given a resolution) and creating an extremely efficient representation of the problem space, which allows deployment of numerous graph algorithms.
Critically, unlike the O(N^2) distance-based graph generation methods, this approach scales linearly with the resulting number of nodes. Because of that, it is extremely efficient even in high-dimensional spaces, where the number of edges goes into trillions and beyond. Nimplex can both generate and find neighbors for around 2M points per second in 9-dimensional space on a modern CPU.
As explored in the manuscript, such representations, even of different dimensions, can then be used to efficiently encode complex problem spaces where some prior assumptions and knowledge are available. In Example #2 from the manuscript, inspired by the problem of joining titanium with stainless steel in 10.1016/j.addma.2022.102649 using 3-component spaces, one encodes 3 separate paths where some components are shared in a predetermined fashion. This allows one to efficiently encode the problem space in the form of a structure graph (left panel below) and then use it to construct a simplex graph complex (right panel below) as a single consistent structure.
With such a graph representation, one can very easily deploy any scientific library for graph exploration, constrained and biased by models operating in the elemental space mapping nimplex provides. A neat and concise demonstration of this is provided in the 02.AdditiveManufacturingPathPlanning.ipynb under the examples directory, where thermodynamic phase stability models constrain a 4-component (tetrahedral) design space existing in a 7-component chemical space, and a property model related to yield strength (RMSAD) is used to bias designed paths towards objectives like property maximization or gradient minimization, with extremely concise code simply modifying the weights on unidirectional edges in the graph. For instance, the figure below (approximately) depicts the shortest path through a subset of the tetrahedron formed by solid solution phases, later stretched in space proportionally to the RMSAD gradient magnitude.
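The grid sizes and the neighbor structure described in the list above can be checked with a small brute-force sketch in plain Python. This is not the modified NEXCOM algorithm nimplex uses; it is a stars-and-bars enumeration written only for illustration. The function names mirror the benchmark task names but are local to this example, and the neighbor rule (two grid points are adjacent when one quantum unit is moved from one component to another) is an assumption about what "neighboring nodes" means here.

```python
from itertools import combinations
from math import comb


def simplex_grid(dim, ndiv):
    """Integer grid: every way to split ndiv quantum units among dim components."""
    slots = ndiv + dim - 1
    points = []
    # Stars and bars: choose where the dim-1 separators sit among the slots.
    for bars in combinations(range(slots), dim - 1):
        edges = (-1,) + bars + (slots,)
        points.append(tuple(edges[i + 1] - edges[i] - 1 for i in range(dim)))
    return points


def simplex_internal_grid(dim, ndiv):
    """Internal grid: give each component one unit, then distribute the rest."""
    if ndiv < dim:
        return []
    return [tuple(x + 1 for x in p) for p in simplex_grid(dim, ndiv - dim)]


def simplex_graph(dim, ndiv):
    """Grid nodes plus, for each node, the indices of its neighbors.

    Assumed rule: a neighbor is reached by moving one unit from i to j.
    """
    nodes = simplex_grid(dim, ndiv)
    index = {p: k for k, p in enumerate(nodes)}
    neighbors = []
    for p in nodes:
        adjacent = []
        for i in range(dim):
            if p[i] == 0:
                continue  # nothing to take from component i
            for j in range(dim):
                if j != i:
                    q = list(p)
                    q[i] -= 1
                    q[j] += 1
                    adjacent.append(index[tuple(q)])
        neighbors.append(adjacent)
    return nodes, neighbors


full = simplex_grid(3, 10)
internal = simplex_internal_grid(3, 10)
assert len(full) == comb(3 - 1 + 10, 3 - 1)  # N_S(3, 10) = 66
assert len(internal) == comb(10 - 1, 3 - 1)  # N_I(3, 10) = 36

# Fractional positions are the integer positions divided by ndiv.
fractional = [tuple(x / 10 for x in p) for p in internal]

nodes, neighbors = simplex_graph(3, 10)
```

Because the internal grid is built directly rather than by filtering the full grid, its cost scales with its own size, which is the property the text highlights. The resulting `nodes` and `neighbors` lists can be handed to a general-purpose graph library for path finding.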
Several other methods are in testing and will likely be added in future releases. If you have any suggestions, please open an issue on GitHub, as we are always soliciting new ideas and use cases based on real-world problems in the scientific computing community.
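Returning to the Monte Carlo sampling item above, the uniform sampling idea can be sketched in a few lines of standard-library Python. The sketch assumes the "special case of the Dirichlet distribution" is the flat one, with all concentration parameters equal to 1, and uses the standard fact that independent Exponential(1) draws divided by their sum follow that distribution. The function name `sample_simplex_uniform` is made up for this example and is not part of the nimplex API.

```python
import random


def sample_simplex_uniform(dim, n_samples, seed=None):
    """Draw n_samples points uniformly from the simplex with dim components."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Exponential(1) draws normalized by their sum are Dirichlet(1, ..., 1),
        # i.e. uniformly distributed over the simplex.
        draws = [rng.expovariate(1.0) for _ in range(dim)]
        total = sum(draws)
        samples.append([x / total for x in draws])
    return samples


points = sample_simplex_uniform(dim=9, n_samples=1000, seed=0)
```

Every draw is accepted, so unlike rejection sampling the cost per point does not grow with the dimensionality, and unlike normalizing uniform draws the points are not concentrated towards the center.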
Usage within Nim is fairly straightforward. You can install it using Nimble as explained earlier, or install it directly from GitHub, making sure to use the slightly modified @#nimble branch:
nimble install -y https://github.com/amkrajewski/nimplex@#nimble
or, if you wish to modify the source code, you can simply download the core file nimplex.nim and place it in your own code, as long as you have the dependencies installed, since it is standalone.
Then simply follow the API documentation (amkrajewski.github.io/nimplex) which goes over all core functions and extra utilities like nimplex/utils/plotting and nimplex/utils/stitching.
To use the library in Python, you can interact with it just like any other Python library. All input/output types are native Python types, so no additional conversion is necessary! Once you have the library installed and imported, simply follow the API documentation, with the exception that you need to add _py to the function names. If you happen to forget adding _py, the Python interpreter will throw an error with a suggestion to do so. A couple of additional convenience functions are listed under nimplex/#usage-in-python.
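As a small illustration of the _py naming rule, the sketch below calls the internal fractional grid function with the suffix added, using the dim and ndiv keyword arguments shown in the quick-start snippet. The wrapper `internal_grid_via_nimplex` is a hypothetical helper, and returning None when the compiled module is not importable is a choice made for this example so that a script can degrade gracefully.

```python
from math import comb


def internal_grid_via_nimplex(dim, ndiv):
    """Return the internal fractional grid, or None if nimplex is unavailable."""
    try:
        import nimplex  # compiled nimplex.so / nimplex.pyd in the working directory
    except ImportError:
        return None
    # Note the _py suffix appended to the name used in the API documentation.
    return nimplex.simplex_internal_grid_fractional_py(dim=dim, ndiv=ndiv)


grid = internal_grid_via_nimplex(3, 10)
if grid is not None:
    # The number of points should match N_I(3, 10) = binom(9, 2).
    print(len(grid), comb(10 - 1, 3 - 1))
```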
Using Nimplex through the CLI relies on the same core library, but provides a simple interface for users who do not want to write any code. It can be used interactively, where the user is guided through the configuration process by just running the executable without any arguments:
./nimplex
Or it can be run with a concise configuration defining the task type and parameters. The configuration is a 3-letter string and 2-3 additional parameters, as explained below.
- 3-letter configuration:
  - Grid type or uniform random sampling:
    - F: Full grid (including the simplex boundary)
    - I: Internal grid (only points inside the simplex)
    - R: Random/Monte Carlo uniform sampling over simplex.
    - G: Graph (list of grid nodes and list of their neighbors)
  - Fractional or Integer positions:
    - F: Fractional grid/graph (points are normalized to fractions of 1)
    - I: Integer grid/graph (points are integers)
  - Print full result, its shape, or persist in a file:
    - P: Print (presents full result as a table)
    - S: Shape (only the shape / size information)
    - N: Persist to NumPy array file ("nimplex_.npy" or optionally a custom path as an additional argument)
- Simplex Dimensions / N of Components: An integer number of components in the simplex space.
- N Divisions per Dimension / N of Samples: An integer number of either:
  - Divisions per each simplex dimension for grid or graph tasks (F/I/G__)
  - Number of samples for random sampling tasks (R__)
- (optional) NumPy Array Output Filename: A custom path to the output NumPy array file (only for __N tasks).
For instance, to generate a 3-dimensional internal fractional grid with 10 divisions per dimension and persist it to a NumPy array file, you can run:
./nimplex -c IFN 3 10
and the output will be saved to nimplex_IF_3_10.npy in the current directory. If you want to save it to a different path, you can provide it as an additional argument:
./nimplex -c IFN 3 10 path/to/outfile.npy
Or if you want to print the full result to the console, allowing you to pipe it to virtually any other language or tool as plain text, you can run:
./nimplex -c IFP 3 10
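When driving the CLI from a script, it can help to validate the 3-letter configuration before launching the executable. The sketch below encodes the letter choices listed above and assembles the argument list in the same order as the example commands. The names `build_nimplex_command` and `run_nimplex` are hypothetical helpers, not part of nimplex, and the rule that a custom output path is accepted only with the N output mode is taken from the option list above.

```python
import subprocess

TASKS = {"F", "I", "R", "G"}   # full grid, internal grid, random sampling, graph
POSITIONS = {"F", "I"}         # fractional or integer positions
OUTPUTS = {"P", "S", "N"}      # print, shape, NumPy file


def build_nimplex_command(config, dim, n, out_path=None, executable="./nimplex"):
    """Assemble the argument list for a concise-configuration nimplex run."""
    if len(config) != 3:
        raise ValueError("configuration must be exactly 3 letters")
    task, position, output = config
    if task not in TASKS or position not in POSITIONS or output not in OUTPUTS:
        raise ValueError(f"unsupported configuration: {config!r}")
    if out_path is not None and output != "N":
        raise ValueError("a custom output path only applies to __N tasks")
    command = [executable, "-c", config, str(dim), str(n)]
    if out_path is not None:
        command.append(out_path)
    return command


def run_nimplex(config, dim, n, out_path=None):
    """Run the CLI and return whatever it printed as plain text."""
    command = build_nimplex_command(config, dim, n, out_path)
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


command = build_nimplex_command("IFN", 3, 10, "path/to/outfile.npy")
```

Calling `run_nimplex("IFP", 3, 10)` requires the compiled nimplex binary in the working directory; `build_nimplex_command` on its own has no such requirement and can be unit-tested in isolation.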
You can also utilize the following auxiliary flags:
- --help or -h --> Show help.
- --benchmark or -b --> Run a set of tasks to benchmark performance (simplex_grid(9, 12), simplex_internal_grid(9, 12), simplex_sampling_mc(9, 1_000_000), simplex_graph(9, 12)) and compare performance across implementations (simplex_graph(3, 1000) vs simplex_graph_3C(1000)).