Merge branch 'main' of https://github.com/florian-huber/stacked-spars…

…e-array
matchms · Oct 6, 2022 · 8d3979a
2 parents 17d3ece + e64c65c
commit 8d3979a
Showing 3 changed files with 44 additions and 1 deletion.
diff --git a/CITATION.cff b/CITATION.cff
@@ -0,0 +1,15 @@
+# YAML 1.2
+---
+abstract: "Memory efficient stack of multiple 2D sparse arrays."
+authors:
+  -
+    affiliation: "Centre for Digitalisation and Digitality, University of Applied Sciences Düsseldorf"
+    family-names: Huber
+    given-names: Florian
+    orcid: https://orcid.org/0000-0002-3535-9406
+
+cff-version: 1.2.0
+license: MIT
+message: "If you use this software, please cite it using these metadata."
+repository-code: "https://github.com/florian-huber/sparsestack"
+title: sparsestack
diff --git a/README.md b/README.md
@@ -1,6 +1,7 @@
 ![GitHub](https://img.shields.io/github/license/florian-huber/sparsestack)
 [![PyPI](https://img.shields.io/pypi/v/sparsestack?color=teal)](https://pypi.org/project/sparsestack/)
 ![GitHub Workflow Status](https://img.shields.io/github/workflow/status/florian-huber/sparsestack/CI%20Build)
+[![fair-software.eu](https://img.shields.io/badge/fair--software.eu-%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8B-yellow)](https://fair-software.eu)
 
 # sparsestack
 Memory efficient stack of multiple 2D sparse arrays.
@@ -43,3 +44,30 @@ sparsestack[3, :, "scores_1"]  # => same as the one before
 # Scores can also be converted to a dense numpy array:
 scores2_after_merge = sparsestack.to_array("scores_2")
 ```
+
+## Adding data to a `sparsestack`-array
+Sparsestack provides three options to add data to a new layer.
+1) `.add_dense_matrix(input_array)`
+Adds all non-zero elements of `input_array` to the sparsestack. Depending on the chosen `join_type`, either all such values are added (`join_type="outer"` or `join_type="right"`), or only those at positions already present in the underlying layers (`join_type="left"` or `join_type="inner"`).
+2) `.add_sparse_matrix(input_coo_matrix)`
+Expects a COO-style matrix (e.g. from scipy) with the attributes `.row`, `.col` and `.data`. The join type can again be specified using `join_type`.
+3) `.add_sparse_data(row, col, data)`
+Does essentially the same as `.add_sparse_matrix(input_coo_matrix)`, but can be more flexible in some cases because `row`, `col` and `data` are passed as separate arguments.
+
+## Accessing data from a `sparsestack`-array
+The collected sparse data can be accessed in multiple ways.
+
+1) Slicing.
+`sparsestack` allows multiple types of slicing (see also code example above).
+```python
+sparsestack[3, 4]  # => tuple with all scores at position row=3, col=4
+sparsestack[3, :]  # => tuple with row, col, scores for all entries in row=3
+sparsestack[:, 2]  # => tuple with row, col, scores for all entries in col=2
+sparsestack[3, :, 0]  # => tuple with row, col, scores_1 for all entries in row=3
+sparsestack[3, :, "scores_1"]  # => same as the one before
+```
+2) `.to_array()`
+Creates and returns a dense numpy array of size `.shape`. Can also be used to create a dense numpy array of only a single layer when used like `.to_array(name="layerX")`.
+**Careful:** Converting to a dense array discards the sparse structure; all empty positions in the stack are filled with zeros.
+3) `.to_coo(name="layerX")`
+Returns a scipy sparse COO-matrix of the specified layer.
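
Note on the README hunk above: the `join_type` options and the dense conversion can be illustrated with a small, library-free sketch. It is not sparsestack's implementation. Plain Python sets and dicts stand in for the sparse layers, the helper names `join_layer` and `to_dense` are hypothetical, and the distinction between "left" and "inner" (and between "right" and "outer") follows the usual relational join meaning, which is an assumption the README text does not confirm.

```python
# Conceptual sketch only: plain sets/dicts stand in for sparse layers.
# NOT sparsestack's implementation; join semantics are assumed to follow
# the usual relational meaning of inner/left/right/outer.

def join_layer(existing, new, join_type="left"):
    """Return the (row, col) positions kept after joining a new layer.

    existing: set of (row, col) positions already stored in the stack
    new:      dict {(row, col): value} with the non-zero entries to add
    """
    new_positions = set(new)
    if join_type == "inner":
        return existing & new_positions   # only positions present in both
    if join_type == "left":
        return set(existing)              # keep all existing positions
    if join_type == "right":
        return new_positions              # keep all new positions
    if join_type == "outer":
        return existing | new_positions   # keep the union
    raise ValueError(f"Unknown join_type: {join_type!r}")


def to_dense(positions, values, shape):
    """Build a dense list of lists; positions without a value become 0."""
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for (r, c) in positions:
        dense[r][c] = values.get((r, c), 0)
    return dense


existing = {(0, 0), (1, 2)}
new_layer = {(0, 0): 0.5, (2, 1): 0.9}

kept = {jt: join_layer(existing, new_layer, jt)
        for jt in ("inner", "left", "right", "outer")}
dense_outer = to_dense(kept["outer"], new_layer, shape=(3, 3))
```

With "left" or "inner", the new value at `(2, 1)` is dropped because that position is not yet in the stack; with "right" or "outer" it is kept. The dense result shows why the README warns about conversion: every position without a stored value is filled with a zero.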
diff --git a/sparsestack/__version__.py b/sparsestack/__version__.py
@@ -1 +1 @@
-__version__ = '0.1.2'
+__version__ = '0.2.0'