diff --git a/CITATION.cff b/CITATION.cff
new file mode 100644
index 0000000..c1a6a8e
--- /dev/null
+++ b/CITATION.cff
@@ -0,0 +1,15 @@
+# YAML 1.2
+---
+abstract: "Memory efficient stack of multiple 2D sparse arrays."
+authors:
+  -
+    affiliation: "Centre for Digitalisation and Digitality, University of Applied Sciences Düsseldorf"
+    family-names: Huber
+    given-names: Florian
+    orcid: https://orcid.org/0000-0002-3535-9406
+
+cff-version: 1.2.0
+license: MIT
+message: "If you use this software, please cite it using these metadata."
+repository-code: "https://github.com/florian-huber/sparsestack"
+title: sparsestack
diff --git a/README.md b/README.md
index 4adb170..13724ef 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,7 @@
 ![GitHub](https://img.shields.io/github/license/florian-huber/sparsestack)
 [![PyPI](https://img.shields.io/pypi/v/sparsestack?color=teal)](https://pypi.org/project/sparsestack/)
 ![GitHub Workflow Status](https://img.shields.io/github/workflow/status/florian-huber/sparsestack/CI%20Build)
+[![fair-software.eu](https://img.shields.io/badge/fair--software.eu-%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8B-yellow)](https://fair-software.eu)
 
 # sparsestack
 Memory efficient stack of multiple 2D sparse arrays.
@@ -43,3 +44,30 @@ sparsestack[3, :, "scores_1"] # => same as the one before
 # Scores can also be converted to a dense numpy array:
 scores2_after_merge = sparsestack.to_array("scores_2")
 ```
+
+## Adding data to a `sparsestack`-array
+Sparsestack provides three options to add data to a new layer.
+1) `.add_dense_matrix(input_array)`
+Adds all non-zero elements of `input_array` to the sparsestack. Depending on the chosen `join_type`, either all such values are added (`join_type="outer"` or `join_type="right"`) or only those at positions already present in the underlying layers (`join_type="left"` or `join_type="inner"`).
+2) `.add_sparse_matrix(input_coo_matrix)`
+Expects a COO-style matrix (e.g. from scipy) with the attributes `.row`, `.col` and `.data`. The join type can again be specified using `join_type`.
+3) `.add_sparse_data(row, col, data)`
+Does essentially the same as `.add_sparse_matrix(input_coo_matrix)`, but can be more flexible in some cases because `row`, `col` and `data` are separate input arguments.
+
+## Accessing data from `sparsestack`-array
+The collected sparse data can be accessed in multiple ways.
+
+1) Slicing.
+`sparsestack` allows multiple types of slicing (see also code example above).
+```python
+sparsestack[3, 4] # => tuple with all scores at position row=3, col=4
+sparsestack[3, :] # => tuple with row, col, scores for all entries in row=3
+sparsestack[:, 2] # => tuple with row, col, scores for all entries in col=2
+sparsestack[3, :, 0] # => tuple with row, col, scores_1 for all entries in row=3
+sparsestack[3, :, "scores_1"] # => same as the one before
+```
+2) `.to_array()`
+Creates and returns a dense numpy array of size `.shape`. Can also be used to create a dense numpy array of only a single layer when used like `.to_array(name="layerX")`.
+**Careful:** Converting to a dense array discards the sparse representation, and all empty positions in the stack are filled with zeros.
+3) `.to_coo(name="layerX")`
+Returns a scipy sparse COO-matrix of the specified layer.
diff --git a/sparsestack/__version__.py b/sparsestack/__version__.py
index 10939f0..7fd229a 100644
--- a/sparsestack/__version__.py
+++ b/sparsestack/__version__.py
@@ -1 +1 @@
-__version__ = '0.1.2'
+__version__ = '0.2.0'
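
Note on the `join_type` wording in the README hunk: the rule it describes for which new values end up in the added layer can be sketched in plain Python. This is only an illustration under stated assumptions, not sparsestack's implementation and not its API. `select_new_entries` is a hypothetical helper, layers are modelled as dicts keyed by `(row, col)`, and the sketch covers only the behaviour stated in the README text (which new values are kept), not any other difference between the join types.

```python
def select_new_entries(existing_positions, new_entries, join_type="outer"):
    """Return the entries of a new layer that would be kept.

    Hypothetical helper for illustration only (not part of sparsestack).
    existing_positions: set of (row, col) tuples already present in the stack.
    new_entries: dict mapping (row, col) -> value for the incoming layer.
    """
    if join_type not in ("outer", "right", "left", "inner"):
        raise ValueError(f"Unknown join_type: {join_type!r}")

    # Only non-zero values count as entries of the new layer.
    nonzero = {pos: val for pos, val in new_entries.items() if val != 0}

    if join_type in ("outer", "right"):
        # All non-zero values of the new layer are kept.
        return nonzero

    # "left" or "inner": keep only values at positions that already exist.
    return {pos: val for pos, val in nonzero.items() if pos in existing_positions}


existing = {(0, 1), (2, 3)}
new_layer = {(0, 1): 0.9, (1, 1): 0.5, (2, 3): 0.0}

added_outer = select_new_entries(existing, new_layer, "outer")
added_left = select_new_entries(existing, new_layer, "left")
print(added_outer)
print(added_left)
```

With these inputs, the "outer" call keeps both non-zero entries, while the "left" call keeps only the entry at `(0, 1)`, the one position that is both non-zero and already present.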