Updated draft for the volcanoplot function by vbrennsteiner · Pull Request #167 · MannLabs/alphapepttools

vbrennsteiner · 2026-01-28T16:42:18Z

"After archiving the initial volcanoplot branch, this is the new and improved volcanoplot branch"

This PR adds a volcanoplot wrapper

By popular and LLM demand, there should be a volcanoplot wrapper function in alphatools. As such, it should abstract the following functionalities:

scatterplot
adding lines to the scatterplot
adding anchored labels to the scatterplot
adding a legend
setting plot x/y limits

Main difference to the previous PR

Previously, layering of plot features was handled by volcano(), which entangled general-purpose plotting functionality (layering plots) with the specific application of a volcanoplot. In this PR, two new methods plus helpers have been added to the plots.py module:

layered_plot: Calls a plotting function repeatedly for any number of specified data layers, ensuring that indices are only plotted once and that indices not used by any layer are plotted in a default color. It operates using a PlotConfig instance and a Callable capable of interpreting the parameters from the PlotConfig and layer-specific parameters. An example is added in tutorials/tutorial_02_basic_plotting_workflow.ipynb.

color_dict = {
    "upregulated": BaseColors.get("red"),
    "downregulated": BaseColors.get("blue"),
}

To avoid having to specify basic parameters for each layer, we can summarize them in a PlotConfig instance with flexible attributes to accommodate different plotting functions:

plot_config = pl.make_scatter_config(
    data=adata,
    x_column="x",
    y_column="y",
    scatter_kwargs={"alpha": 0.7, "s": 50},  # add any kind of coloring/marker specification that should apply to all layers
)

The main specification of 'layered_plot' is the layers list, which consists of tuples with 3-4 elements:

1. Which column in the data to use for filtering (here 'diff_exp_status')
2. Which value to match in that column, either a single value or a list (here 'upregulated' for the first and 'downregulated' for the second layer)
3. Which key to look up in the color_dict for that layer (here identical to the layer match values in 2.)
4. Optional: kwargs dict for that layer (here, upregulated points should be triangular and slightly larger)

plot_layers = [
    ("diff_exp_status", "upregulated", "upregulated", {"marker": "^", "s": 100}),
    ("diff_exp_status", "downregulated", "downregulated"),
]

The function is called like this on an AnnData or DataFrame object:

fig, axm = create_figure(1, 1, figsize=(6, 6))
ax = axm.next()
Plots.layered_plot(
    ax=ax,
    base_config=plot_config,
    layers=plot_layers,
    color_dict=color_dict,
)
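The "plot each index only once" behavior described above can be illustrated without any plotting. The following is a minimal sketch, not the implementation in plots.py: `assign_layers` is a hypothetical helper, rows are plain dicts instead of an AnnData/DataFrame, and it assumes that earlier layers take precedence over later ones (the PR text does not state the precedence rule).

```python
def assign_layers(rows, layers, default_key="default"):
    """Assign every row index to at most one layer (hypothetical helper).

    rows: list of dicts, one per data point.
    layers: tuples of (column, match_value_or_list, color_key[, kwargs]).
    Assumption: earlier layers win when a row matches several layers.
    """
    assigned = {}
    for column, match, color_key, *_ in layers:
        # Accept a single value or a list of values, as in the layer tuples
        matches = match if isinstance(match, (list, tuple, set)) else [match]
        for idx, row in enumerate(rows):
            if idx not in assigned and row.get(column) in matches:
                assigned[idx] = color_key
    # Rows not claimed by any layer fall back to the default color key
    return [assigned.get(idx, default_key) for idx in range(len(rows))]


rows = [
    {"id": "P1", "diff_exp_status": "upregulated"},
    {"id": "P2", "diff_exp_status": "downregulated"},
    {"id": "P3", "diff_exp_status": "unchanged"},
]
layers = [
    ("diff_exp_status", "upregulated", "upregulated", {"marker": "^", "s": 100}),
    ("id", ["P1", "P3"], "highlight"),  # P1 is already claimed by the first layer
]
result = assign_layers(rows, layers)
print(result)
```

Here P1 matches both layers but is only assigned to the first, P3 is picked up by the second layer, and P2 falls back to the default key.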

New `volcano()`:

Using layered_plot internally, the volcano function is drastically simplified and can be called as demonstrated in tutorials/tutorial_04_volcanoplot.ipynb. It also incorporates labelling of different layers, which can be specified by their color_dict key. volcano internally constructs a PlotConfig instance to handle default arguments, meaning that users do not have to interact with PlotConfig.

As before, the first element refers to the column in data to be used for filtering, the second element specifies a match value or list for that column and the third provides the color lookup for that layer.

pois = ["P10291", "P10292", "P10293", "P10294", "P10295"]
plot_layers = [
    ("id", pois, "POI_hypothesis"),
    ("diff_exp_status", "upregulated", "upregulated"),
    ("diff_exp_status", "downregulated", "downregulated"),
]

color_dict = {
    "POI_hypothesis": BaseColors.get("purple", lighten=0.7), 
    "upregulated": BaseColors.get("orange"),
    "downregulated": BaseColors.get("blue"), 
}

label_layers = [
    "POI_hypothesis",  # label only the points whose layer points to the "POI_hypothesis" color-key
]

Generating the volcanoplot (with anchored labels):

Plots.volcano(
    data=adata,
    x_column="log2fc",
    y_column="neg_log10pval",
    color_dict=color_dict,
    layers=plot_layers,
    label_layers=label_layers,
    x_label_anchors=[-3.5, 3.5],
)

To create something like this:

Open points:

Pending the merge of Extend pl module docstrings #164, the label_plot method will take AnnData/DataFrame instances with specified columns instead of separate arrays.
The capabilities of layered_plot theoretically allow for layering arbitrary Callable instances (plotting functions) with arbitrary global and layer-specific arguments. Perhaps there is a good use case for that in future demonstration notebooks?

Copilot

Pull request overview

This pull request introduces a volcano plot wrapper function and a general layered plotting system for alphapepttools. The implementation adds new data manipulation utilities and a flexible PlotConfig dataclass to support hierarchical scatterplot layering, where points can be colored and styled based on multiple overlapping criteria.

Changes:

Added layered_plot method to enable hierarchical plotting with automatic point assignment and deduplication
Added volcano wrapper function that uses layered_plot internally for differential expression visualization
Introduced PlotConfig dataclass and make_scatter_config factory for flexible plot configuration
Added utility functions (data_index_to_array, subset_data, _tolist) to support data manipulation
Enhanced scatter method with automatic limit calculation and renamed limit parameters from singular to plural

Reviewed changes

Copilot reviewed 4 out of 6 changed files in this pull request and generated 16 comments.

File	Description
src/alphapepttools/pp/data.py	Added data utility functions for index extraction, subsetting, and list conversion; updated documentation
src/alphapepttools/pl/plots.py	Core implementation of layered plotting system, PlotConfig dataclass, volcano function, and helper utilities
src/alphapepttools/pl/__init__.py	Exported new PlotConfig and make_scatter_config for public API
docs/notebooks/studies/study_01_biomarker_csf.ipynb	Unintentional notebook output changes from re-execution

Comments suppressed due to low confidence (1)

src/alphapepttools/pl/plots.py:1083

The documentation still references the old parameter names 'xlim' and 'ylim', but the actual parameters have been renamed to 'xlims' and 'ylims' (plural). Update the documentation to match the actual parameter names.

        xlim : tuple[float, float], optional
            Limits for the x-axis. By default None.
        ylim : tuple[float, float], optional
            Limits for the y-axis. By default None.

codecov-commenter · 2026-02-24T09:59:11Z

Codecov Report

❌ Patch coverage is 13.48315% with 154 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.32%. Comparing base (cbd4ca3) to head (83498b1).
⚠️ Report is 9 commits behind head on main.

Files with missing lines	Patch %	Lines
src/alphapepttools/pl/plots.py	12.57%	139 Missing ⚠️
src/alphapepttools/pp/data.py	16.66%	15 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #167      +/-   ##
==========================================
- Coverage   73.61%   72.32%   -1.30%     
==========================================
  Files          35       35              
  Lines        1823     1980     +157     
==========================================
+ Hits         1342     1432      +90     
- Misses        481      548      +67

Files with missing lines	Coverage Δ
src/alphapepttools/pl/__init__.py	`100.00% <100.00%> (ø)`
src/alphapepttools/pp/data.py	`76.33% <16.66%> (-5.01%)`	⬇️
src/alphapepttools/pl/plots.py	`31.58% <12.57%> (-8.59%)`	⬇️

... and 4 files with indirect coverage changes

lucas-diedrich

I like the plots that are returned by the function a lot, they look really clean.

I commented on some specific things associated with the pull request.

I marked issues that were identified by Claude with [Claude] based on the following prompt:

You are a scientific software developer who is putting a lot of thought into user-friendliness of APIs etc. You are working on a software package for the analysis of proteomics data with anndata as data container. Critically review the changes in the current branch by comparing them to the main branch that  aim to implement a convenience function for volcano plots

On a general note, I find the plots module increasingly confusing to review:

the Plots class does not provide a unifying config, as its config is not used in any of the functions. In combination with the @classmethod pattern, this means that individual module-level functions would behave equivalently. Switching to individual functions would also make the package more aligned with the other scverse packages (which use individual functions).
the number of utility functions + implicit contracts (e.g. expected behavior of configs) makes it complicated to understand the logic

lucas-diedrich · 2026-02-24T10:02:32Z

src/alphapepttools/pl/plots.py

+    -----
+    Uses pandas.isna() to handle both NaN and None values correctly.
+    """
+    keep_mask = ~(pd.isna(x_values) | pd.isna(y_values))


This action might lead to surprising results (imagine all data coords have a missing value and suddenly no point gets plotted) - should it emit a warning?

I suppose this is the tradeoff: if a datapoint has a missing value on either the x or y axis, how would it be plottable at all? I am quite averse to imputing missing values with zeros. With nonstandard axis choices, or when only a single point is concerned, that could lead to false conclusions about points that have no actual values. I'd much rather handle this upstream with imputation.

As for warnings, this would then raise a warning if any point has a missing value, which could be seen as excessive/annoying.

The scenario where this causes (or rather exposes) a problem is when users expect to see e.g. a certain gene but it does not show up in the plot, which would likely cause them to search for it in the data where the missing values could be identified & addressed.

A pure discovery scenario would (in my opinion) not benefit from somehow showing/including points that have no values associated.

lucas-diedrich · 2026-02-24T10:03:52Z

src/alphapepttools/pl/plots.py

+    >>> _get_plot_lims(values, 1.1, set_left=0)
+    (0, 3.3)
+    """
+    series = pd.Series(values)


Why is this conversion necessary?

Wouldn't

max(abs(min(values)), abs(max(values)))

produce equivalent results?
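One difference worth keeping in mind when comparing the two: pandas reductions such as Series.min() and Series.max() skip NaN by default (skipna=True), whereas Python's builtin min() and max() give order-dependent results when NaN is present. For NaN-free input both approaches agree. A small stdlib-only illustration, with hypothetical helper names:

```python
import math


def builtin_abs_max(values):
    # The expression suggested in the review, using Python builtins
    return max(abs(min(values)), abs(max(values)))


def nan_skipping_abs_max(values):
    # Mimics the skip-NaN behavior of pandas reductions by filtering first
    finite = [v for v in values if not math.isnan(v)]
    return max(abs(min(finite)), abs(max(finite)))


nan = float("nan")
middle = builtin_abs_max([1.0, nan, -3.0])       # 3.0: NaN never wins a comparison
first = builtin_abs_max([nan, 1.0, -3.0])        # nan: NaN is kept as the running result
skipped = nan_skipping_abs_max([nan, 1.0, -3.0])  # 3.0
print(middle, first, skipped)
```

Whether this matters here depends on whether missing values have already been filtered out before the limits are computed.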

lucas-diedrich · 2026-02-24T10:07:53Z

src/alphapepttools/pl/plots.py

+        cls,
+        # Required data parameters
+        data: ad.AnnData | pd.DataFrame,
+        x_column: str,


could these values all default to the standardized output of the tl.diff_exp functions?

We would expect that the users just plug in our DEG results (and if they don't, they can change it)

Nevermind, I see that the diff-exp outputs are not really standardized at the moment as they vary depending on which kinds of contrasts are passed (t_value__condition1_condition2). Should we update that?

The diff exp outputs are standardized with respect to the necessary columns, but we keep the method-specific columns around as well in case users want them. The standardized columns are in tl.defaults.py, so we could set column defaults to those

I would do that

lucas-diedrich · 2026-02-24T10:12:13Z

src/alphapepttools/pl/plots.py

-        xlim: tuple[float, float] | None = None,
-        ylim: tuple[float, float] | None = None,
+        figure_kwargs: dict | None = None,
+        xlims: tuple[float, float] | None = None,


This is a breaking change - Is it necessary to rename the arguments in this PR?
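If the rename is kept, one common way to soften a breaking change like this is to keep accepting the old keyword names for a transition period and emit a DeprecationWarning. A minimal sketch follows; this `scatter` is a stand-in with a hypothetical signature, not the real Plots.scatter.

```python
import warnings


def scatter(xlims=None, ylims=None, **kwargs):
    """Stand-in for a plotting function whose limit arguments were renamed."""
    limits = {"xlims": xlims, "ylims": ylims}
    # Map deprecated singular names onto the new plural ones
    for old, new in (("xlim", "xlims"), ("ylim", "ylims")):
        if old in kwargs:
            warnings.warn(
                f"'{old}' is deprecated, use '{new}' instead",
                DeprecationWarning,
                stacklevel=2,
            )
            value = kwargs.pop(old)
            # An explicitly passed new-style argument takes precedence
            if limits[new] is None:
                limits[new] = value
    if kwargs:
        raise TypeError(f"Unexpected keyword arguments: {sorted(kwargs)}")
    return limits


with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    old_style = scatter(xlim=(-4, 4))  # old name still works, with a warning
new_style = scatter(xlims=(-4, 4))
print(old_style == new_style)
```

Existing notebooks that call the function with `xlim=`/`ylim=` would then keep running while users migrate.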

lucas-diedrich · 2026-02-24T10:27:41Z

src/alphapepttools/pl/plots.py

+        layers: list[tuple] | None = None,
+        color_dict: dict[str, str | tuple] | None = None,
+        # Volcano-specific thresholds
+        x_thresholds: tuple | None = (-1, 1),


I think the None option does not add value here

lucas-diedrich · 2026-02-24T10:38:24Z

src/alphapepttools/pl/plots.py

+        max_labels: int | None = None,
+        x_label_anchors: list[float] | None = None,
+        y_display_start: float | None = 1,
+        y_padding_factor: float | None = 4,


Would drop None default

lucas-diedrich · 2026-02-24T10:38:31Z

src/alphapepttools/pl/plots.py

+        display_id_column: str | None = None,
+        max_labels: int | None = None,
+        x_label_anchors: list[float] | None = None,
+        y_display_start: float | None = 1,


Would drop None default

lucas-diedrich · 2026-02-24T10:42:21Z

src/alphapepttools/pl/plots.py

+        left = -abs_max * padding_factor
+        right = abs_max * padding_factor
+    else:
+        left = series.min() * padding_factor


[Claude] This assumes that the minimal value of series is negative (otherwise the limit is shifted to the right)
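To see the effect, consider an all-positive axis such as -log10(p): multiplying a positive minimum by a padding factor above 1 moves the left limit to the right of the smallest point. Padding by a fraction of the data range avoids the sign assumption. A minimal sketch with hypothetical helper names, not the PR's `_get_plot_lims`:

```python
def scaled_left(values, padding_factor=1.1):
    # Mirrors the line shown in the diff: left = min * padding_factor
    return min(values) * padding_factor


def range_padded_limits(values, padding_factor=1.1):
    """Pad both sides by a fraction of the data range, independent of sign."""
    lo, hi = min(values), max(values)
    pad = (hi - lo) * (padding_factor - 1) / 2
    return lo - pad, hi + pad


positive = [2.0, 3.0, 4.0]
left_scaled = scaled_left(positive)  # lands to the right of the minimum (2.0)
left_padded, right_padded = range_padded_limits(positive)
print(left_scaled, left_padded, right_padded)
```

Note that the range-based variant adds no padding when all values are identical, so a real implementation would need a fallback for that case.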

lucas-diedrich · 2026-02-24T10:48:08Z

src/alphapepttools/pl/plots.py

+    data: ad.AnnData | pd.DataFrame | None = None
+    _extra: dict | None = None  # Store additional fields
+
+    def __post_init__(self):


from dataclasses import dataclass, field

@dataclass
class Data:
    extra: dict[str, int] = field(default_factory=dict)
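For context on this suggestion: dataclasses reject a mutable default such as `{}` with a ValueError at class definition, and `field(default_factory=dict)` gives every instance its own dict without needing a None sentinel plus `__post_init__`. A minimal sketch; `ConfigSketch` and its fields are hypothetical and only loosely modeled on the diff.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ConfigSketch:
    x_column: str = "x"
    # Each instance gets a fresh dict, so no None check is needed later
    extra: dict[str, Any] = field(default_factory=dict)


a = ConfigSketch(x_column="log2fc")
b = ConfigSketch()
a.extra["alpha"] = 0.7  # does not leak into b
print(a.extra, b.extra)
```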

vbrennsteiner and others added 19 commits November 14, 2025 13:37

add draft for volcanoplot wrapper function

f33f321

Merge branch 'main' into volcanoplot_wrapper

852eb9e

update with main

Merge branch 'main' into volcanoplot_wrapper

281bc4f

update with main

refactor volcanoplot wrapper to use anndata and dataframe inputs, extract some steps to helper functions and extend the docstring

63d1894

Merge branch 'main' into volcanoplot_wrapper

cb0ed49

Merge branch 'main' into volcanoplot_wrapper

ee93e3b

update with main

updated notebook structure

87a5526

refactor volcanoplot

23fe98a

extend data_column_to_array

6bc9ff1

adapt data_column_to_array to use index by default

fa5f437

update volcanoplot function and demonstration notebook

79b21ed

update volcanoplot function and demonstration notebook

5fa3bdc

update volcanoplot and demo notebook

d9fff2c

update volcanoplot wrapper and demo notebook

1c06f37

draft enhanced scattering logic with config class and separate plot_layers

d41afce

restructure scatter_layers and ScatterConfig

81e994f

change plot config pattern to hold general plot configurations

63582b0

change plot config pattern to hold general plot configurations

220b4ef

add comprehensive example to volcanoplot docstring and update notebook 04

092166f

vbrennsteiner self-assigned this Jan 28, 2026

vbrennsteiner requested review from Copilot, lucas-diedrich, mschwoer and shanibmo January 28, 2026 17:07

Copilot started reviewing on behalf of vbrennsteiner January 28, 2026 17:08

Copilot AI reviewed Jan 28, 2026

vbrennsteiner added 4 commits February 2, 2026 18:43

update notebook and api documentation

66a8ba3

Merge branch 'main' into enhanced_scatter

914401f

update with main

integrate review comments

eccbfd5

integrate reviews

e6dff6c

vbrennsteiner added 2 commits February 23, 2026 16:07

resolve merge conflict

09fe143

update call to xlims (scatter) in notebook

83498b1

lucas-diedrich reviewed Feb 24, 2026

add two larger preset figure sizes to defaults and rerun notebooks

befe29e

vbrennsteiner · Mar 15, 2026 (edited)
