florian-huber committed Oct 6, 2022
2 parents 17d3ece + e64c65c commit 8d3979a
Showing 3 changed files with 44 additions and 1 deletion.
15 changes: 15 additions & 0 deletions CITATION.cff
@@ -0,0 +1,15 @@
# YAML 1.2
---
abstract: "Memory efficient stack of multiple 2D sparse arrays."
authors:
  -
    affiliation: "Centre for Digitalisation and Digitality, University of Applied Sciences Düsseldorf"
    family-names: Huber
    given-names: Florian
    orcid: https://orcid.org/0000-0002-3535-9406

cff-version: 1.2.0
license: MIT
message: "If you use this software, please cite it using these metadata."
repository-code: "https://github.com/florian-huber/sparsestack"
title: sparsestack
28 changes: 28 additions & 0 deletions README.md
@@ -1,6 +1,7 @@
![GitHub](https://img.shields.io/github/license/florian-huber/sparsestack)
[![PyPI](https://img.shields.io/pypi/v/sparsestack?color=teal)](https://pypi.org/project/sparsestack/)
![GitHub Workflow Status](https://img.shields.io/github/workflow/status/florian-huber/sparsestack/CI%20Build)
[![fair-software.eu](https://img.shields.io/badge/fair--software.eu-%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8B-yellow)](https://fair-software.eu)

# sparsestack
Memory efficient stack of multiple 2D sparse arrays.
@@ -43,3 +44,30 @@ sparsestack[3, :, "scores_1"] # => same as the one before
# Scores can also be converted to a dense numpy array:
scores2_after_merge = sparsestack.to_array("scores_2")
```

## Adding data to a `sparsestack`-array
Sparsestack provides three options to add data to a new layer.
1) `.add_dense_matrix(input_array)`
Adds all non-zero elements of `input_array` to the sparsestack. Depending on the chosen `join_type`, either all such values are added (`join_type="outer"` or `join_type="right"`), or only those at positions already present in the underlying layers (`join_type="left"` or `join_type="inner"`).
2) `.add_sparse_matrix(input_coo_matrix)`
Expects a COO-style matrix (e.g. from scipy) with the attributes `.row`, `.col` and `.data`. The join type can again be specified using `join_type`.
3) `.add_sparse_data(row, col, data)`
Does essentially the same as `.add_sparse_matrix(input_coo_matrix)`, but can be more flexible in some cases because `row`, `col` and `data` are passed as separate arguments.
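
To make the effect of `join_type` more concrete, here is a minimal sketch in plain Python. It does not use `sparsestack` itself: `dense_to_coo` and `kept_positions` are hypothetical helpers written only for this illustration, and the join behaviour shown assumes the usual relational meaning of "left", "right", "inner" and "outer", which matches the description above.

```python
def dense_to_coo(matrix):
    """Collect row, col and data lists for all non-zero entries of a nested list."""
    row, col, data = [], [], []
    for r, line in enumerate(matrix):
        for c, value in enumerate(line):
            if value != 0:
                row.append(r)
                col.append(c)
                data.append(value)
    return row, col, data


def kept_positions(existing, new, join_type):
    """Return the (row, col) positions that remain after joining a new layer.

    Hypothetical helper: assumes standard join semantics.
    """
    existing, new = set(existing), set(new)
    if join_type == "left":
        return existing
    if join_type == "right":
        return new
    if join_type == "inner":
        return existing & new
    if join_type == "outer":
        return existing | new
    raise ValueError(f"Unknown join_type: {join_type!r}")


# Positions already present in the underlying layers (made-up example data)
existing = {(0, 1), (1, 1)}

# A small dense input; only its non-zero entries are candidates for the new layer
dense = [
    [0, 0.9, 0],
    [0, 0, 0],
    [0.5, 0, 0.7],
]
row, col, data = dense_to_coo(dense)
new = set(zip(row, col))

for join_type in ("left", "right", "inner", "outer"):
    print(join_type, sorted(kept_positions(existing, new, join_type)))
```

With "left" and "inner", the only new value that ends up in the stack is the one at `(0, 1)`, because that position already exists. With "right" and "outer", all three non-zero values are added.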

## Accessing data from a `sparsestack`-array
The collected sparse data can be accessed in multiple ways.

1) Slicing.
`sparsestack` allows multiple types of slicing (see also code example above).
```python
sparsestack[3, 4] # => tuple with all scores at position row=3, col=4
sparsestack[3, :] # => tuple with row, col, scores for all entries in row=3
sparsestack[:, 2] # => tuple with row, col, scores for all entries in col=2
sparsestack[3, :, 0] # => tuple with row, col, scores_1 for all entries in row=3
sparsestack[3, :, "scores_1"] # => same as the one before
```
2) `.to_array()`
Creates and returns a dense numpy array of size `.shape`. Can also be used to create a dense numpy array of a single layer only, e.g. `.to_array(name="layerX")`.
**Careful:** Converting to a dense array discards the sparse representation; all empty positions in the stack are filled with zeros.
3) `.to_coo(name="layerX")`
Returns a scipy sparse COO-matrix of the specified layer.
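
The warning about dense conversion can be illustrated with a short plain-Python sketch. It does not call `sparsestack`; `coo_to_dense` is a hypothetical helper that mimics what a dense conversion of one layer has to do, namely allocate every cell of the full shape and fill the empty ones with zeros.

```python
def coo_to_dense(shape, row, col, data):
    """Build a dense nested list from COO-style row, col and data lists."""
    n_rows, n_cols = shape
    # Every position gets a cell, whether or not a value was stored for it
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for r, c, value in zip(row, col, data):
        dense[r][c] = value
    return dense


# Three stored values in a 3 x 4 layer (made-up example data)
row, col, data = [0, 2, 2], [1, 0, 3], [0.9, 0.5, 0.7]
dense = coo_to_dense((3, 4), row, col, data)

stored = len(data)
cells = sum(len(line) for line in dense)
print(f"{stored} stored values become {cells} dense cells")
```

For large, very sparse stacks the ratio between stored values and dense cells grows quickly, so converting a single layer via its name, or staying with the COO representation, is usually the safer choice.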
2 changes: 1 addition & 1 deletion sparsestack/__version__.py
@@ -1 +1 @@
-__version__ = '0.1.2'
+__version__ = '0.2.0'
