Merge pull request #13 from solislemuslab/develop
MiNAA v1.1.0
Reed Nelson authored Jan 18, 2024
2 parents ef5522d + 35f5537 commit 7dcb4ba
Showing 49 changed files with 4,546 additions and 2,525 deletions.
42 changes: 21 additions & 21 deletions LICENCE → LICENSE
@@ -1,21 +1,21 @@
MIT License
Copyright (c) 2023 Solis-Lemus Lab
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
MIT License

Copyright (c) 2024 Solis-Lemus Lab

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
2 changes: 1 addition & 1 deletion Makefile
@@ -8,7 +8,7 @@ HEADER_FILES = $(addprefix include/,$(HEADER))
SOURCE_FILES = $(addprefix src/,$(SOURCE))
OBJECT_FILES = $(addprefix obj/,$(SOURCE:.cpp=.o))

ifdef SystemRoot # Wndows
ifdef SystemRoot # Windows
RM = del /Q
RMOBJ = obj\*.o
MKDIR =
92 changes: 61 additions & 31 deletions README.md
@@ -2,7 +2,7 @@

<img src="logo.png" style="width:40%;" align=right>

[![GitHub Releases](https://img.shields.io/github/v/release/solislemuslab/minaa?display_name=tag)](https://github.com/solislemuslab/minaa/releases) [![GitHub license](https://img.shields.io/github/license/solislemuslab/minaa)](https://github.com/solislemuslab/minaa/blob/main/LICENCE) [![GitHub Issues](https://img.shields.io/github/issues/solislemuslab/minaa)](https://github.com/solislemuslab/minaa/issues) ![ ](https://img.shields.io/github/languages/code-size/solislemuslab/minaa) [![status](https://joss.theoj.org/papers/b4d9f26021065b1759d50413f60aa9c3/status.svg)](https://joss.theoj.org/papers/b4d9f26021065b1759d50413f60aa9c3)
[![GitHub Releases](https://img.shields.io/github/v/release/solislemuslab/minaa?display_name=tag)](https://github.com/solislemuslab/minaa/releases) [![GitHub license](https://img.shields.io/github/license/solislemuslab/minaa)](https://github.com/solislemuslab/minaa/blob/main/LICENSE) [![GitHub Issues](https://img.shields.io/github/issues/solislemuslab/minaa)](https://github.com/solislemuslab/minaa/issues) ![ ](https://img.shields.io/github/languages/code-size/solislemuslab/minaa) [![status](https://joss.theoj.org/papers/b4d9f26021065b1759d50413f60aa9c3/status.svg)](https://joss.theoj.org/papers/b4d9f26021065b1759d50413f60aa9c3)

## Description

@@ -16,11 +16,13 @@ This program requires C++20 or higher, and g++.

### Unix

`make`
```bash
make
```

### Windows

```cmd
```bash
mkdir obj
make
```
@@ -33,61 +35,89 @@ This utility has the form `./minaa.exe <G> <H> [-B=bio] [-a=alpha] [-b=beta]`.

### Required Arguments (ordered)

1. **G**: A network to align.
2. **H**: A network to align.
1. G; a network to align.
2. H; a network to align.

- Require:
- The networks are represented by adjacency matrices in CSV format, with labels in both the first column and row.
- The CSV delimiter must be one of {comma, semicolon, space, tab}, and will be detected automatically.
- |G| is lesser or equal to |H|.
- Notes:
- Any nonzero entry is considered an edge.
- |G| should be less than or equal to |H|.
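
The input format described above can be illustrated with a short Python sketch that builds a labeled adjacency matrix as CSV text. This is only an illustration, not part of MiNAA: the function name `adjacency_csv`, the node labels, the empty top-left corner cell, and the choice of an undirected graph with a comma delimiter are all assumptions.

```python
import csv
import io


def adjacency_csv(labels, edges):
    """Return CSV text for an undirected graph as a labeled adjacency matrix.

    Hypothetical helper for illustration; labels appear in both the
    first row and the first column, and a nonzero entry marks an edge.
    """
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        # Assumption: edges are undirected, so the matrix is symmetric.
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Assumption: the top-left corner cell is left empty.
    writer.writerow([""] + list(labels))
    for label, row in zip(labels, matrix):
        writer.writerow([label] + row)
    return out.getvalue()


# A three-node path a - b - c (made-up labels).
g_csv = adjacency_csv(["a", "b", "c"], [("a", "b"), ("b", "c")])
print(g_csv)
```

Writing `g_csv` to a file would give one candidate input for the `G` or `H` argument, provided the corner-cell assumption matches what MiNAA accepts.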

### Optional Arguments (unordered)

- **bio**: The path to the biological cost matrix file.
- Require: CSV adjacency matrix where the first column consists of the labels of G, and first row consists of the labels of H.
#### Common

- **-B=**: the path to the biological cost matrix file.
- Require: a CSV adjacency matrix where the first column consists of the labels of G, in order, and first row consists of the labels of H, in order.
- Default: the algorithm will run using only topological calculations.
- **alpha**: GDV-edge weight balancer
- Notes:
- The input matrix is normalized by MiNAA such that all entries are in range [0, 1].
- The input is assumed to be a cost matrix. If it is a similarity matrix, use the **-s** option detailed below.
- **-a=**: alpha; the GDV-edge weight balancer.
- Require: a real number in range [0, 1].
- Default: 1 (100% GDV data).
- **-b=**: beta; the topological-biological cost matrix balancer.
- Require: a real number in range [0, 1].
- Default: 1
- **beta**: topological-biological cost matrix balancer.
- Default: 1 (100% topological data).
- **-st=**: similarity threshold; the similarity value above which aligned pairs are included in the output.
- Require: a real number in range [0, 1].
- Default: 1
- Default: 0.
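
To make the roles of **-s** and **-b=** concrete, here is a minimal Python sketch. It is not MiNAA's implementation: the helper names `to_cost` and `blend` are hypothetical, the matrices are made-up values already in range [0, 1], and the linear blend `beta * topological + (1 - beta) * biological` is an assumption inferred from the description of beta as a balancer whose default of 1 means 100% topological data.

```python
def to_cost(similarity):
    """Replace each normalized similarity value with 1 - value (the -s idea)."""
    return [[1.0 - value for value in row] for row in similarity]


def blend(top_costs, bio_costs, beta):
    """Combine two equally sized cost matrices.

    Assumption: a linear blend, where beta = 1 uses only topological costs.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be a real number in range [0, 1]")
    return [
        [beta * t + (1.0 - beta) * b for t, b in zip(top_row, bio_row)]
        for top_row, bio_row in zip(top_costs, bio_costs)
    ]


# Made-up 2x2 example: rows are nodes of G, columns are nodes of H.
top = [[0.25, 0.75], [0.5, 0.0]]
bio_similarity = [[1.0, 0.25], [0.5, 0.75]]

bio = to_cost(bio_similarity)          # similarity matrix -> cost matrix
overall_default = blend(top, bio, 1.0)  # default beta: topology only
overall_half = blend(top, bio, 0.5)     # equal weight to both sources
```

With `beta = 1.0` the blended matrix equals the topological one, which matches the documented default behavior.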

#### Uncommon

- **-Galias=**: an alias for the G file.
- Require: a valid file name.
- Default: the G file keeps its original name.
- **-Halias=**: an alias for the H file.
- Require: a valid file name.
- Default: the H file keeps its original name.
- **-Balias=**: an alias for the B file.
- Require: a valid file name.
- Default: the B file keeps its original name.
- **-p**: passthrough; whether or not to write the input files into the output folder.
- Require: none.
- Default: the files are not passed through to the output folder.
- Note: the output reflects the input data after it has been processed by the algorithm; it is not a direct copy of the input files.
- **-t**: timestamp; the output folder's name includes the date and time of execution.
- Require: none.
- Default: the output folder's name does not include date and time.
- **-g**: greekstamp; the output folder's name includes the values for alpha and beta.
- Require: none.
- Default: the output folder's name does not include the values for alpha and beta.
- **-s**: similarity conversion; for each entry in the given biological matrix, the value (post normalization) is replaced with 1 - value.
- Require: none.
- Default: the given biological matrix is left as is.
- Note: use this if and only if the provided biological matrix is a similarity matrix.

### Outputs

- **X-Y-T/**: (where "X", "Y" are the input networks, "T" is the date and time of execution) The folder containing the output files specified below.
- **log.txt**: Record of the important details from the alignment.
- **X_gdvs.csv**: (where "X" is the input network) The Graphlet Degree Vectors for network "X".
- **top_costs.csv**: The topological cost matrix.
- **bio_costs.csv**: The biologocal cost matrix (as inputed). Not created unless biological input is given.
- **overall_costs.csv**: The combination of the topological and biological cost matrix. Not created unless biological input is given.
- **alignment_list.csv**: A complete list of all aligned nodes, with rows in the format `g_node,h_node,similarity`, descending acording to similarity. The first row in this list is the total cost of the alignment, or the sum of (1 - similarity) for all aligned pairs.
- **alignment_matrix.csv**: A matrix form of the same alignment, where the first column and row are the labels from the two input networks, respectively.
- **G-H/**: (where G, H are the input networks) The folder containing the output files specified below.
- **log.txt**: record of the important details from the alignment.
- **G_gdvs.csv**: (where G is the input network) the Graphlet Degree Vectors for network G.
- **H_gdvs.csv**: (where H is the input network) the Graphlet Degree Vectors for network H.
- **top_costs.csv**: the topological cost matrix.
- **bio_costs.csv**: the biological cost matrix (as input). Not created unless biological input is given.
- **overall_costs.csv**: the combination of the topological and biological cost matrix. Not created unless biological input is given.
- **alignment_list.csv**: a complete list of all aligned nodes, with rows in the format `g_node,h_node,similarity`, descending according to similarity. The first row in this list is the total *cost* of the alignment, or the sum of (1 - similarity) for all aligned pairs.
- **alignment_matrix.csv**: a matrix form of the same alignment, where the first column and row are the labels from the two input networks, respectively.
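
The relationship between the pair rows and the total cost in **alignment_list.csv** can be checked with a short Python sketch. It is an illustration under assumptions: the node labels and similarity values are made up, the rows are assumed to be comma-delimited, and only the `g_node,h_node,similarity` rows are parsed, because the exact layout of the first (total cost) row is not specified here.

```python
import csv
import io


def read_pairs(text):
    """Parse rows of the form g_node,h_node,similarity into tuples."""
    pairs = []
    for g_node, h_node, similarity in csv.reader(io.StringIO(text)):
        pairs.append((g_node, h_node, float(similarity)))
    return pairs


def total_cost(pairs):
    """Sum of (1 - similarity) over all aligned pairs."""
    return sum(1.0 - similarity for _, _, similarity in pairs)


def above_threshold(pairs, threshold):
    """Keep pairs whose similarity is above the given threshold."""
    return [pair for pair in pairs if pair[2] > threshold]


# Made-up pair rows, in descending order of similarity.
rows = "g1,h4,1.0\ng2,h1,0.75\ng3,h2,0.5\n"
pairs = read_pairs(rows)
cost = total_cost(pairs)
strong = above_threshold(pairs, 0.6)
```

Comparing `cost` with the value in the first row of a real output file is one way to sanity-check an alignment.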

### Examples

`./minaa.exe network0.csv network1.csv -a=0.6`

Here we align network0 with network1 using no biological data. `-a=0.6` sets alpha equal to 0.6, meaning 60% of the topological cost function comes from similarity calculated by GDVs, and 40% from simpler node degree data.

`./minaa.exe network0.csv network1.csv bio_costs.csv -b=0.85`

Here we align network0 with network1 using topological information and the given biological cost matrix, bio_costs. Since alpha and gamma were unspecified, they default to 0.5 and 1 respectively. Since beta was set to 0.85, 85% of the cost weight is from the topological cost matrix, and 15% is from the given biological cost matrix.

See the `example/` directory for a sample input and output to MiNAA, which you can look at and replicate yourself.
Examples of MiNAA's usage with real data and in-depth explanations can be found in the `examples/` directory.

## Simulations in the Manuscript

All scripts and instructions to reproduce the analyses in the manuscript can be found in the `simulations` folder.
All scripts and instructions to reproduce the analyses in the manuscript can be found in the `simulations/` directory.

## Contributions, Questions, Issues, and Feedback

Users interested in expanding functionalities in MiNAA are welcome to do so. Issue reports are encouraged through GitHub's [issue tracker](https://github.com/solislemuslab/minaa/issues). See details on how to contribute and report issues in [CONTRIBUTING.md](https://github.com/solislemuslab/minaa/blob/master/CONTRIBUTING.md).

## License

MiNAA is licensed under the [MIT](https://opensource.org/licenses/MIT) licence. &copy; SolisLemus lab projects (2023)
MiNAA is licensed under the [MIT](https://opensource.org/licenses/MIT) license. &copy; SolisLemus lab (2024).

## Citation
