documentation (apps)
mbaudis committed Jan 31, 2024
1 parent fb5b8ca commit 269edd5
Showing 10 changed files with 119 additions and 25 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -57,16 +57,16 @@ the [`byconaut`](https://github.com/progenetix/byconaut/) repository using the

## Data maintenance scripts

### `callsetsStatusmapsRefresher` (CNV)
### `analysesStatusmapsRefresher` (CNV)

The `callsetsStatusmapsRefresher` script creates CNV status data for binned
The `analysesStatusmapsRefresher` script creates CNV status data for binned
genomic intervals, for each CNV callset (_i.e._ the CNV data of all corresponding
variants from the same experiment/sample).


#### Examples

* `bin/callsetsStatusmapsRefresher.py -d examplez`
* `bin/analysesStatusmapsRefresher.py -d examplez`

### `collationsCreator`

2 changes: 1 addition & 1 deletion bin/ISCNsegmenter.py
@@ -31,7 +31,7 @@ def iscn_segmenter():
initialize_bycon_service(byc)
set_processing_modes(byc)
parse_variants(byc)
generate_genomic_mappings(byc)
set_genome_rsrc_path(byc)
generate_genome_bins(byc)

group_parameter = byc["form_data"].get("groupBy", "histological_diagnosis_id")
@@ -15,15 +15,14 @@

"""
## `callsetsStatusmapsRefresher`
## `analysesStatusmapsRefresher`
"""

################################################################################
"""
* `bin/callsetsStatusmapsRefresher.py -d 1000genomesDRAGEN -s variants`
* `bin/callsetsStatusmapsRefresher.py -d progenetix -s biosamples -f "icdom-81703"`
* `bin/callsetsStatusmapsRefresher.py`
* `bin/analysesStatusmapsRefresher.py -d progenetix -f "icdom-81703"`
* `bin/analysesStatusmapsRefresher.py`
- default; new statusmaps for all `progenetix` analyses
"""
################################################################################
2 changes: 1 addition & 1 deletion bin/frequencymapsCreator.py
@@ -44,7 +44,7 @@ def frequencymaps_creator():

# re-doing the interval generation for non-standard CNV binning
# genome_binning_from_args(byc)
generate_genomic_mappings(byc)
set_genome_rsrc_path(byc)
generate_genome_bins(byc)

print(f'=> Using data values from {ds_id} for {byc.get("genomic_interval_count", 0)} intervals...')
4 changes: 2 additions & 2 deletions bin/housekeeping.py
@@ -66,8 +66,8 @@ def housekeeping():
#>------------------------- analyses -------------------------------------<#

if "y" in todos.get("update_cs_statusmaps", "y").lower():
print(f'==> executing "{dir_path}/callsetsStatusmapsRefresher.py -d {ds_id}"')
system(f'{dir_path}/callsetsStatusmapsRefresher.py -d {ds_id}')
print(f'==> executing "{dir_path}/analysesStatusmapsRefresher.py -d {ds_id}"')
system(f'{dir_path}/analysesStatusmapsRefresher.py -d {ds_id}')

#>------------------------ / analyses ------------------------------------<#

68 changes: 68 additions & 0 deletions docs/applications.md
@@ -0,0 +1,68 @@
---
title: Helper Applications
---

The `byconaut` repository provides a number of helper applications with different
types of functionalities, e.g.

* data I/O
* plotting (see [plotting](plotting.md))
* database maintenance
* data transformation

These applications are in some way used to populate or manage data resources for
`bycon` driven implementations of the Beacon protocol (_i.e._ genomic data resources).


## Plotting Apps

For more information see the dedicated [documentation page](plotting.md).

## Data transformation & database maintenance

### `analysesStatusmapsRefresher`

This is one of the housekeeping scripts that have to be run after CNV data has
been added or modified in the database. It creates CNV status data for binned
genome intervals, used for histogram generation, sample clustering etc.,
as well as some other statistics (e.g. CNV coverage per chromosomal arm).
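
As a rough illustration of what "CNV status data for binned genome intervals" can mean, the following sketch computes, for one chromosome, the fraction of each fixed-size bin covered by gains and by losses. It is not the `byconaut` implementation: the `Segment` class, the `"DUP"`/`"DEL"` labels, the coordinate convention and the `status_maps` function are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One CNV segment on a single chromosome (hypothetical structure)."""
    start: int  # 0-based, inclusive
    end: int    # exclusive
    kind: str   # "DUP" for a gain, "DEL" for a loss (labels assumed here)


def status_maps(segments, chrom_length, bin_size):
    """Return, per bin, the fraction covered by gains and by losses."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    maps = {"DUP": [0.0] * n_bins, "DEL": [0.0] * n_bins}
    for seg in segments:
        first = seg.start // bin_size
        last = (min(seg.end, chrom_length) - 1) // bin_size
        for i in range(first, last + 1):
            bin_start = i * bin_size
            bin_end = min(bin_start + bin_size, chrom_length)
            overlap = min(seg.end, bin_end) - max(seg.start, bin_start)
            if overlap > 0:
                covered = maps[seg.kind][i] + overlap / (bin_end - bin_start)
                # Cap at 1.0 in case segments of the same kind overlap.
                maps[seg.kind][i] = min(1.0, covered)
    return maps
```

Per-sample maps of this kind are the sort of input that histogram generation or sample clustering could then work on.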

#### Arguments

* `-d`, `--datasetIds` ... to select the dataset (only one per run)
* `--filters` ... to (optionally) limit the processing to a subset of samples
(e.g. after a limited update)

#### Use

* `bin/analysesStatusmapsRefresher.py -d progenetix`
* `bin/analysesStatusmapsRefresher.py -d progenetix --filters "pgx:icdom-81703"`
* `bin/analysesStatusmapsRefresher.py -d cellz --filters "cellosaurus:CVCL_0312"`
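
Since only one dataset can be selected per run, refreshing several datasets means calling the script repeatedly. A minimal Python sketch of such a loop is shown here; the helper names `refresher_command` and `refresh_all` are hypothetical, and launching the script through the current Python interpreter (rather than executing it directly) is an assumption. Only the `-d` and `--filters` options listed above are used.

```python
import subprocess
import sys


def refresher_command(dataset_id, filters=None,
                      script="bin/analysesStatusmapsRefresher.py"):
    """Build the argument list for one refresher run (one dataset per run)."""
    cmd = [sys.executable, script, "-d", dataset_id]
    if filters:
        cmd += ["--filters", filters]
    return cmd


def refresh_all(dataset_ids, run=subprocess.run):
    """Run the refresher once per dataset, stopping at the first failure."""
    for ds_id in dataset_ids:
        # check=True raises CalledProcessError on a non-zero exit code.
        run(refresher_command(ds_id), check=True)
```

Calling `refresh_all(["progenetix", "cellz"])` from the repository root would then run the script once for each dataset. Passing the arguments as a list avoids shell quoting problems with filter values.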

### `collationsCreator`

The `collationsCreator` script updates the dataset specific `collations` collections
which provide the aggregated data (sample numbers, hierarchy trees etc.) for all
individual codes belonging to one of the entities defined in the `filter_definitions`
in the `bycon` configuration.

**TBD** The filter definition should be one of the configurations where users can
provide additions and overrides in the `byconaut/local` directory.

#### Arguments

* `bin/collationsCreator.py -d progenetix`
* `bin/collationsCreator.py -d examplez --collationTypes "PMID"`

### `frequencymapsCreator`

This app creates the frequency maps for the "collations" collection. Basically,
all samples matching any of the collation codes and representing CNV analyses
are selected and the frequencies of CNVs per genomic bin are aggregated. The
result contains the gain and loss frequencies for all genomic intervals, for the
given entity.
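
The aggregation described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `frequencymapsCreator.py`: each sample is assumed to be a dict with `"DUP"` and `"DEL"` lists holding one coverage value per genomic bin, a bin counts as altered when its value exceeds `threshold`, and the function name and output keys are hypothetical.

```python
def frequency_map(sample_maps, threshold=0.0):
    """Percentage of samples with a gain or loss in each genomic bin.

    sample_maps: list of {"DUP": [...], "DEL": [...]} dicts, all with the
    same number of bins (an assumed input format for this sketch).
    """
    if not sample_maps:
        return {"gain_frequency": [], "loss_frequency": []}
    n_samples = len(sample_maps)
    n_bins = len(sample_maps[0]["DUP"])
    gains = [0] * n_bins
    losses = [0] * n_bins
    for sample in sample_maps:
        for i in range(n_bins):
            if sample["DUP"][i] > threshold:
                gains[i] += 1
            if sample["DEL"][i] > threshold:
                losses[i] += 1
    return {
        "gain_frequency": [100 * g / n_samples for g in gains],
        "loss_frequency": [100 * l / n_samples for l in losses],
    }
```

Counting a bin as altered only above a threshold is a design choice of this sketch; a real implementation may weight partially covered bins differently.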

#### Arguments

* `bin/frequencymapsCreator.py -d progenetix`
* `bin/frequencymapsCreator.py -d examplez --collationTypes "icdot"`
6 changes: 3 additions & 3 deletions docs/index.md
@@ -48,16 +48,16 @@ mongorestore --db $database .../mongodump/examplez/

## Data maintenance scripts

### `callsetsStatusmapsRefresher` (CNV)
### `analysesStatusmapsRefresher` (CNV)

The `callsetsStatusmapsRefresher` script creates CNV status data for binned
The `analysesStatusmapsRefresher` script creates CNV status data for binned
genomic intervals, for each CNV callset (_i.e._ the CNV data of all corresponding
variants from the same experiment/sample).


#### Examples

* `bin/callsetsStatusmapsRefresher.py -d examplez`
* `bin/analysesStatusmapsRefresher.py -d examplez`

### `collationsCreator`

46 changes: 36 additions & 10 deletions docs/plotting.md
@@ -4,10 +4,33 @@ title: Plotting

## `byconaut` Plot functionality

Starting with version v1.0.30 (2023-04-14) the bycon package added the ability
Starting with version v1.0.30 (2023-04-14) the `bycon` package added the ability
to produce the typical _Progenetix_-style CNV histograms and CNV sample plots,
later (`bycon` v.1.3.4) moved to the `byconaut` repository since being too specific for Beacon
core functionality.
later (`bycon` v.1.3.4) moved to the `byconaut` repository.

### Plotting services

Plotting "services" are online service endpoints for generating visualizations
of mostly CNV data from the databases of the respective beaconized resources. In
the case of the developers these would be e.g. [progenetix.org](https://progenetix.org)
and [cancercelllines.org](https://cancercelllines.org), which are also used
for the active examples below. The plotting services, which are maintained inside
`byconaut/services` but installed in the corresponding webserver CGI directory,
are:

* `services/collationplots/`
* `services/sampleplots/`

### Plotting Applications

Plotting "applications" provide command line utilities for the plotting of database
content and local files. These are maintained in the `byconaut/bin` directory:

* `bin/collationsPlotter.py`
* `bin/samplesPlotter.py`
* `bin/pgxsegPlotter.py`

### Plotting Functionality

Plots can now be generated:

@@ -47,11 +70,13 @@ if not indicated.

##### Examples

* [/services/collationplots/?filters=NCIT:C35562,NCIT:C3709](http://progenetix.org/services/collationplots/?filters=NCIT:C35562,NCIT:C3709)
The examples below link to `progenetix.org`.

* [/services/collationplots/?filters=NCIT:C35562,NCIT:C3709](https://progenetix.org/services/collationplots/?filters=NCIT:C35562,NCIT:C3709)
- a combination of 2 histograms
* [/services/collationplots?filters=NCIT:C35562,NCIT:C3709&datasetIds=progenetix,cellz](http://progenetix.org/services/collationplots?filters=NCIT:C35562,NCIT:C3709&datasetIds=progenetix,cellz)
* [/services/collationplots?filters=NCIT:C35562,NCIT:C3709&datasetIds=progenetix,cellz](https://progenetix.org/services/collationplots?filters=NCIT:C35562,NCIT:C3709&datasetIds=progenetix,cellz)
- a combination of 2 histograms
* [/services/collationplots/?filters=pgx:icdom-85003,pgx:icdom-81703,pgx:icdom-87003,pgx:icdom-87203,pgx:icdom-94003,pgx:icdom-95003,pgx:icdom-81403&plotPars=plot_title=CNV+Comparison::plot_area_height=50::plot_axis_y_max=80::plot_label_y_values=50](http://progenetix.org/services/collationplots/?filters=pgx:icdom-85003,pgx:icdom-81703,pgx:icdom-87003,pgx:icdom-87203,pgx:icdom-94003,pgx:icdom-95003,pgx:icdom-81403&plotPars=plot_title=CNV+Comparison::plot_area_height=50::plot_axis_y_max=80::plot_label_y_values=50)
* [/services/collationplots/?filters=pgx:icdom-85003,pgx:icdom-81703,pgx:icdom-87003,pgx:icdom-87203,pgx:icdom-94003,pgx:icdom-95003,pgx:icdom-81403&plotPars=plot_title=CNV+Comparison::plot_area_height=50::plot_axis_y_max=80::plot_label_y_values=50](https://progenetix.org/services/collationplots/?filters=pgx:icdom-85003,pgx:icdom-81703,pgx:icdom-87003,pgx:icdom-87203,pgx:icdom-94003,pgx:icdom-95003,pgx:icdom-81403&plotPars=plot_title=CNV+Comparison::plot_area_height=50::plot_axis_y_max=80::plot_label_y_values=50)
- a collations based example showing the use of some extra parameters such as
* `plot_title`
* `plot_area_height`
@@ -63,17 +88,18 @@ Sample selection based plotting uses the standard bycon query stack for sample r
(_i.e._ aggregation over the data model) and then generates CNV plots from the found
samples, either as clustered individual profiles or as binned frequency plot data (histograms or heatstrips).

CAVE: Sample plots are _very_ time consuming due to the retrieval and plotting of
all variants per sample.
**CAVE**: Sample plots may be time consuming due to the retrieval and plotting of
all variants per sample. Therefore, a limit (default or via the Beacon `limit`
parameter) is usually applied.

##### Examples

* [/services/sampleplots?filters=pgx:icdom-95003&plotPars=plot_filter_empty_samples=y::plotGeneSymbols=MYCN::plotType=samplesplot&limit=100](http://progenetix.org/services/sampleplots?filters=pgx:icdom-95003&plotPars=plot_filter_empty_samples=y::plotGeneSymbols=MYCN::plotType=samplesplot&limit=100)
* [/services/sampleplots?filters=pgx:icdom-95003&plotPars=plot_filter_empty_samples=y::plotGeneSymbols=MYCN::plotType=samplesplot&limit=100](https://progenetix.org/services/sampleplots?filters=pgx:icdom-95003&plotPars=plot_filter_empty_samples=y::plotGeneSymbols=MYCN::plotType=samplesplot&limit=100)
- this example is based on the histoplot example above, but based on individual
sample retrieval and plotting and with some plot modifications:
* limits the output to 100 samples (`limit=100`)
* removes samples w/o CNVs (`plot_filter_empty_samples=y`)
* [/services/sampleplots?filters=pgx:icdom-95003&plotPars=plotGeneSymbols=MYCN&limit=100&plotType=samplesplot](http://progenetix.org/services/sampleplots?filters=pgx:icdom-95003&plotPars=plotGeneSymbols=MYCN::limit=100&plotType=samplesplot)
* [/services/sampleplots?filters=pgx:icdom-95003&plotPars=plotGeneSymbols=MYCN&limit=100&plotType=samplesplot](https://progenetix.org/services/sampleplots?filters=pgx:icdom-95003&plotPars=plotGeneSymbols=MYCN::limit=100&plotType=samplesplot)
- this example gets samples for ICD-O Morphology 95003/3 (a.k.a. `pgx:icdom-95003`)
- limits the output to the first 100 samples (`limit=100`)
- adds a label for the **MYCN** gene
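
The example links above pass plot options in a single `plotPars` parameter whose `key=value` entries are joined by `::`. A small helper for assembling such URLs might look like the sketch below; the function name `plot_url` is hypothetical, and leaving `:`, `,` and `=` unescaped simply mirrors the example links rather than any documented requirement of the services.

```python
from urllib.parse import urlencode


def plot_url(base, filters, plot_pars=None, **extra):
    """Assemble a plotting service URL in the style of the examples above."""
    query = {"filters": ",".join(filters)}
    if plot_pars:
        # plotPars entries are "key=value" pairs joined by "::".
        query["plotPars"] = "::".join(f"{k}={v}" for k, v in plot_pars.items())
    query.update(extra)  # e.g. limit=100 or datasetIds="progenetix,cellz"
    # Keep ":", "," and "=" readable, as in the documented example links.
    return f"{base}?{urlencode(query, safe=':,=')}"
```

For instance, `plot_url("https://progenetix.org/services/sampleplots", ["pgx:icdom-95003"], {"plotGeneSymbols": "MYCN"}, limit=100)` builds a sample plot request restricted to 100 samples.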
1 change: 1 addition & 0 deletions mkdocs.yaml
@@ -10,6 +10,7 @@ nav:
- Documentation Home: /
- Changes & To Do: changes-todo
- Plotting: plotting
- Helper Applications: applications
- bycon Documentation: http://progenetix.org
- Progenetix Site: http://progenetix.org
- Baudisgroup: http://info.baudisgroup.org
2 changes: 1 addition & 1 deletion services/doc/services.md
@@ -1,5 +1,5 @@
---
title: Bycon <i>services</i>
title: Byconaut <i>services</i>
---

The _bycon_ environment provides a number of data services which make use of
