Allow for chain selection in CNS scoring modules #1421

VGPReys · 2025-10-17T07:33:57Z

Checklist

Tests added for the new code
Documentation added for the code changes
Modifications / enhancements are reflected on the haddock3 user-manual
CHANGELOG.md is updated to incorporate new changes
Does not break licensing
Does not add any dependencies, if it does please add a thorough explanation

Summary of the Pull Request

This PR adds a new parameter, interface_combinations = [], to the CNS scoring modules (emscoring and mdscoring).

Each entry must be composed of two comma-separated chains
     e.g.: 
     []             -> Consider all interfaces (default)
     ["A,B"]        -> Consider only the interface score between A and B
     ["A,H", "A,L"] -> Sum interface scores between A,H and A,L

Note that the header of the PDB file is not modified; only the score attribute written in the io.json is affected.

As this addition is used by both the emscoring and mdscoring modules, it has been implemented in their shared class CNSScoringModule in modules/scoring/__init__.py.

The reading of the score components from PDB files has been moved from the CNSScoringModule to the HaddockModel.

Finally, a small optimization reduces IO when performing the per_interface_output: the PDB files are no longer read twice.

Unfortunately, I had to add an antibody-antigen PDB structure and its PSF file for the integration tests...

Related Issue

Closes #1414

VGPReys · 2025-10-17T08:00:54Z

After discussion with Alex:
Let's change the parameter to behave differently, so that the user needs to specify the chains of interest: ["A,C", "B,C"]
This would allow more flexibility

AnnaKravchenko · 2025-10-27T11:04:16Z

It's "interface_combinationS", not "interface_combination". It would be nice to edit this PR description.

AnnaKravchenko · 2025-10-27T11:20:03Z

Something is not right. If I use this new parameter, my run crashes:

[2025-10-27 12:19:07,133 __init__ INFO] [emscoring] CNS jobs have finished
[2025-10-27 12:19:07,135 libutil ERROR] local variable 'interface_score' referenced before assignment
Traceback (most recent call last):
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/libs/libutil.py", line 382, in log_error_and_exit
    yield
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/clis/cli.py", line 193, in main
    workflow.run()
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/libs/libworkflow.py", line 43, in run
    step.execute()
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/libs/libworkflow.py", line 173, in execute
    self.module.run()  # type: ignore
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/modules/base_cns_module.py", line 61, in run
    self._run()
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/modules/scoring/emscoring/__init__.py", line 92, in _run
    output_haddock_models = self.update_pdb_scores(interface_combinations)
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/modules/scoring/__init__.py", line 180, in update_pdb_scores
    haddock_score = self.compute_interfaces_score(
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/modules/scoring/__init__.py", line 251, in compute_interfaces_score
    selected_interfaces_scores.append(interface_score)
UnboundLocalError: local variable 'interface_score' referenced before assignment
[2025-10-27 12:19:07,138 libutil ERROR] local variable 'interface_score' referenced before assignment
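
For readers unfamiliar with this error, the snippet below reproduces one common way it arises: a variable assigned only inside a conditional branch, then used unconditionally. It is a hypothetical reduction, not the actual body of compute_interfaces_score, and both function names are made up for illustration.

```python
def select_scores_buggy(available: dict[str, float], wanted: list[str]) -> list[float]:
    """Collect scores for the wanted interfaces (contains the bug)."""
    selected = []
    for name in wanted:
        if name in available:
            interface_score = available[name]
        # Bug: this line also runs when the branch above was skipped, so
        # interface_score may never have been assigned.
        selected.append(interface_score)
    return selected


def select_scores_guarded(available: dict[str, float], wanted: list[str]) -> list[float]:
    """Same selection, but interfaces that are not found are skipped."""
    selected = []
    for name in wanted:
        interface_score = available.get(name)
        if interface_score is None:
            continue
        selected.append(interface_score)
    return selected
```

With the guarded version, requesting an interface that is absent from the model yields an empty selection instead of an exception, which the caller then has to handle explicitly.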

More details:

My TOML files:
Version1:

[topoaa]
[emscoring]
interface_combinations = ["A,B"]
[caprieval]

Version2:

[topoaa]
[emref]
[emscoring]
interface_combinations = ["A,B"]
[caprieval]

If I do not use interface_combinations, there are no errors and [emscoring] runs fine (or is it something else? "[emscoring] CNS jobs have finished" is logged before the error occurs).

Even more details: I have 400 DNA-ligand flexref models, which I modified by splitting the DNA into chains A and C, merging it back with the ligand (chain B), and putting all 400 models into an ensemble. The idea now is to emscore this ensemble with 1. no interface_combinations, 2. interface_combinations = ["A,B", "C,B"], and 3. just because I can, only ["A,B"].

Runs here: /trinity/login/arha/test_pr

src/haddock/modules/scoring/__init__.py

AnnaKravchenko · 2025-10-27T12:14:27Z

Ah, I see a big issue with my testing. Chain B does not exist after emscoring, because I messed up my topology!
So this behavior is due to the attempt to evaluate an interface using a non-existing chain name.

AnnaKravchenko · 2025-10-27T12:31:13Z

It turned out that at the moment it does not matter whether chain B exists or not: same behavior in both cases.
Sorry for the messy comments!

src/haddock/modules/scoring/__init__.py

Fix from reviews

AnnaKravchenko · 2025-10-27T14:02:58Z

Now [emscoring] finished successfully, but libutil in [caprieval] failed:

[2025-10-27 14:59:25,248 libutil ERROR] '<' not supported between instances of 'NoneType' and 'NoneType'
Traceback (most recent call last):
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/libs/libutil.py", line 382, in log_error_and_exit
    yield
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/clis/cli.py", line 193, in main
    workflow.run()
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/libs/libworkflow.py", line 43, in run
    step.execute()
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/libs/libworkflow.py", line 173, in execute
    self.module.run()  # type: ignore
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/modules/__init__.py", line 246, in run
    self._run()
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/modules/analysis/caprieval/__init__.py", line 228, in _run
    extract_data_from_capri_class(
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/modules/analysis/caprieval/capri.py", line 924, in extract_data_from_capri_class
    ranked_data = rank_according_to_score(
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/modules/analysis/caprieval/capri.py", line 810, in rank_according_to_score
    score_rankkey_values.sort(key=lambda x: x[1])
TypeError: '<' not supported between instances of 'NoneType' and 'NoneType'

This happens both when reference_fname is given (I made sure the reference has the correct number of chains and chain names) and when no reference is given. Same error in both cases.

AnnaKravchenko

The same workflow with interface_combinations commented out runs without this error, so the error is probably caused by the new addition to the code.

VGPReys · 2025-10-27T14:18:39Z

let me investigate...
This must be related to the fact that the function computing the interface combination can return None if none of the interfaces are found in the input model
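
The failure mode can be shown in isolation: sorting on a key that is None for more than one entry raises exactly this TypeError. The helper below is a hypothetical sketch (rank_by_score is not a haddock3 function) of a None-tolerant ranking that places models without a score last.

```python
from typing import Optional


def rank_by_score(
    entries: list[tuple[str, Optional[float]]],
) -> list[tuple[str, Optional[float]]]:
    """Sort (model, score) pairs by ascending score, missing scores last."""
    return sorted(
        entries,
        # The first key element pushes None scores to the end; the second
        # orders the remaining entries and is never compared against None.
        key=lambda entry: (entry[1] is None, entry[1] if entry[1] is not None else 0.0),
    )
```

Whether silently ranking unscored models last is desirable is a design choice; failing early with a clear message when no requested interface is found may be preferable.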

VGPReys · 2025-10-27T15:23:37Z

After some investigation, it turned out that using the interface_combinations parameter also required another parameter, per_interface_scoring = true.

I applied a small patch that temporarily sets per_interface_scoring to True when interface_combinations != [], making sure the interface scores are written to the PDB headers as REMARK lines, so that they can be read back to compute the desired combination(s).

The original value of the per_interface_scoring set by the user is kept and used when outputting the results.
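
A minimal sketch of the idea described above, assuming the parameters are available as a plain dictionary; resolve_per_interface_scoring is a hypothetical helper and not the patch itself.

```python
def resolve_per_interface_scoring(params: dict) -> tuple[bool, bool]:
    """Return (value used for the CNS run, value used when writing output).

    The CNS run needs per-interface REMARK lines whenever specific
    interfaces are requested, while the output step follows the user.
    """
    user_value = bool(params.get("per_interface_scoring", False))
    needs_interfaces = bool(params.get("interface_combinations"))
    return (user_value or needs_interfaces, user_value)
```

Keeping the two values separate avoids mutating the user's setting, so the final output still follows what was written in the configuration file.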

AnnaKravchenko

Double interface like interface_combinations = ["A,C","C,B”] is functional, both with per_interface_scoring set to true and false.
Single interface like interface_combinations = ["A,C”] with per_interface_scoring set to true and false creates expected headers in emscoring.pdb:

REMARK Interface Chain1 Chain2 HADDOCKscore Evdw Eelec Edesol BSA
REMARK Interface: A B -40.0928 -30.984 -40.9368 -0.921424 573.25
REMARK Interface: A C -211.158 -196.942 -565.691 98.9224 3194.26
REMARK Interface: B C -38.1605 -26.0944 -55.2351 -1.01912 530.589
REMARK Total HADDOCK score without restraints: -289.411
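
Headers in this layout can be read back with a few lines of Python. The function below is a sketch under the assumption that the column order is exactly the one shown (chain 1, chain 2, HADDOCK score, ...); read_interface_scores is a hypothetical name, not the reader implemented in HaddockModel.

```python
def read_interface_scores(lines: list[str]) -> dict[tuple[str, str], float]:
    """Map each sorted chain pair to its per-interface HADDOCK score."""
    scores: dict[tuple[str, str], float] = {}
    for line in lines:
        # The column-title line has no colon after "Interface" and is skipped.
        if not line.startswith("REMARK Interface:"):
            continue
        fields = line.split()
        # fields: REMARK, Interface:, chain1, chain2, HADDOCKscore, Evdw, ...
        chain1, chain2 = sorted(fields[2:4])
        scores[(chain1, chain2)] = float(fields[4])
    return scores
```

Summing the three interface scores shown above gives the value reported on the "Total HADDOCK score without restraints" line, up to rounding.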

But then libutil (in between [emscoring] and [caprieval]) crashes:

[2025-10-28 12:01:37,899 libutil ERROR] local variable 'score' referenced before assignment
Traceback (most recent call last):
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/modules/scoring/__init__.py", line 196, in update_pdb_scores
    score = self.compute_interfaces_score(
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/modules/scoring/__init__.py", line 279, in compute_interfaces_score
    raise ValueError
ValueError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/libs/libutil.py", line 382, in log_error_and_exit
    yield
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/clis/cli.py", line 193, in main
    workflow.run()
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/libs/libworkflow.py", line 43, in run
    step.execute()
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/libs/libworkflow.py", line 173, in execute
    self.module.run()  # type: ignore
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/modules/base_cns_module.py", line 61, in run
    self._run()
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/modules/scoring/emscoring/__init__.py", line 92, in _run
    output_haddock_models = self.update_pdb_scores(interface_combinations)
  File "/trinity/login/arha/dev-h3/haddock3/src/haddock/modules/scoring/__init__.py", line 206, in update_pdb_scores
    haddock_score = score

VGPReys · 2025-10-28T11:52:53Z

Yes, I was throwing a ValueError but excepting a TypeError, my bad.
It is fixed now
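
As a generic illustration of this kind of mismatch (not the code from the fix): an except clause only handles the exception types it names, so a ValueError passes straight through a TypeError handler. All names below are hypothetical stand-ins.

```python
from typing import Optional


def compute_score(found: bool) -> float:
    """Stand-in that raises ValueError when no requested interface is found."""
    if not found:
        raise ValueError("none of the requested interfaces were found")
    return -211.158


def score_wrong_handler(found: bool) -> Optional[float]:
    """Handler names the wrong type, so the ValueError propagates."""
    try:
        return compute_score(found)
    except TypeError:
        # Never reached for a ValueError.
        return None


def score_matching_handler(found: bool) -> Optional[float]:
    """Handler names the type that is actually raised."""
    try:
        return compute_score(found)
    except ValueError:
        return None
```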

AnnaKravchenko · 2025-10-28T14:52:58Z

It seems something is still not right, or I did something wrong:

In workflow: interface_combinations = ["A,C”]; per_interface_scoring = true
In emscoring_1.pdb:

REMARK Interface Chain1 Chain2 HADDOCKscore Evdw Eelec Edesol BSA
REMARK Interface: A B -40.0928 -30.984 -40.9368 -0.921424 573.25
REMARK Interface: A C -211.158 -196.942 -565.691 98.9224 3194.26
REMARK Interface: B C -38.1605 -26.0944 -55.2351 -1.01912 530.589
REMARK Total HADDOCK score without restraints: -289.411

In capri_ss:
../1_emscoring/emscoring_1.pdb - 21 -292.491

So capri shows a full score between all available chains (listed also in emscoring_1.pdb as REMARK HADDOCK score: -292.491). But I expect only chains A and C.

VGPReys · 2025-10-28T16:02:08Z

In workflow: interface_combinations = ["A,C”]; per_interface_scoring = true In emscoring_1.pdb

What if you try interface_combinations = ["A,C"] ?
maybe an issue of the character ” != " ?

AnnaKravchenko · 2025-10-28T16:12:59Z

In workflow: interface_combinations = ["A,C”]; per_interface_scoring = true In emscoring_1.pdb

What if you try interface_combinations = ["A,C"] ? maybe an issue of the character ” != " ?

Unfortunately no, this is just a quirk of GitHub. Both my quotes are the same, take a look at the cfg: /trinity/login/arha/test_pr/run1/data/configurations/raw_input.toml

I also tried the interface_combinations = ["A,C", "A,C”] - and capri_ss correctly shows -211.158*2 value.

VGPReys · 2025-10-28T20:05:04Z

so when chain combinations are duplicated it is functional but when only one chain is there it is not ?

AnnaKravchenko · 2025-10-28T20:06:06Z

so when chain combinations are duplicated it is functional but when only one chain is there it is not ?

Yes! Only one pair of chains, i.e. only 1 interface. I think this is what you meant

VGPReys · 2025-10-28T20:12:56Z

I cannot reproduce it so far...
Let's discuss it tomorrow in person I guess

allow for chain selection in CNS scoring modules

36bbf49

VGPReys self-assigned this Oct 17, 2025

VGPReys added enhancement Improving something in the codebase m|emscoring Related to emscoring module m|mdscoring mdscoring module labels Oct 17, 2025

Merge branch 'main' into interface-combinations-scoring

7a5af69

VGPReys marked this pull request as draft October 17, 2025 08:01

VGPReys added 4 commits October 22, 2025 10:39

reduce reading of remarks

c8b9213

modify parameter behavior

15dcb88

update changelog

e420e14

adding integration tests

9137570

VGPReys marked this pull request as ready for review October 22, 2025 13:01

VGPReys requested review from AnnaKravchenko and amjjbonvin October 22, 2025 13:01

rvhonorato reviewed Oct 27, 2025

src/haddock/modules/scoring/__init__.py

VGPReys commented Oct 27, 2025

src/haddock/modules/scoring/__init__.py

VGPReys added 2 commits October 27, 2025 14:16

Update src/haddock/modules/scoring/__init__.py

a887b4c

Fix from reviews

Merge branch 'main' into interface-combinations-scoring

5b8bbc3

AnnaKravchenko reviewed Oct 27, 2025

overcome impossible parameter combinations to ease their use

9d0405b

fix return

4fb7e21

VGPReys requested a review from AnnaKravchenko October 27, 2025 18:20

AnnaKravchenko reviewed Oct 28, 2025

fix excepted error type

be18fc4

adding test for reversed chain combinations

57fedb9

VGPReys added 2 commits October 29, 2025 11:59

fix condition

a3565cb

remove import

463d864

AnnaKravchenko self-requested a review October 29, 2025 11:03

AnnaKravchenko previously approved these changes Oct 29, 2025

add docstring

440d628

VGPReys dismissed AnnaKravchenko’s stale review via 440d628 October 29, 2025 11:06

4 participants