Refactor llm vii #358

Merged
merged 18 commits into from
Nov 8, 2024

Contributor

@mschwoer mschwoer commented Oct 23, 2024

  • make link between LLM function definitions and actual functions transparent
  • add general dimensionality reduction to DataSet
  • simplify some helpers
  • slightly refactor llm_integration
  • add unit tests for whole LLM module
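
One way to keep the link between an LLM function definition and the function it calls transparent is to derive the definition from the function itself and to look both up through a single registry. The following is a minimal sketch of that idea, not the code from this PR: `REGISTRY`, `register`, `describe`, `call_tool`, and `plot_pca_stub` are hypothetical names used only for illustration.

```python
import inspect
from typing import Any, Callable, Dict, Optional

# Hypothetical registry: one place that maps a tool name to its callable.
REGISTRY: Dict[str, Callable[..., Any]] = {}


def register(func: Callable[..., Any]) -> Callable[..., Any]:
    """Store the function under its own name so the definition cannot drift."""
    REGISTRY[func.__name__] = func
    return func


def describe(func: Callable[..., Any]) -> Dict[str, Any]:
    """Derive a minimal definition from the signature and docstring."""
    signature = inspect.signature(func)
    required = [
        name
        for name, param in signature.parameters.items()
        if param.default is inspect.Parameter.empty
    ]
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": list(signature.parameters),
        "required": required,
    }


@register
def plot_pca_stub(group: Optional[str] = None, circle: bool = False) -> str:
    """Hypothetical stand-in for a plotting function."""
    return f"pca(group={group}, circle={circle})"


def call_tool(name: str, arguments: Dict[str, Any]) -> Any:
    """Resolve a tool name requested by the model to the registered function."""
    if name not in REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    return REGISTRY[name](**arguments)


definitions = [describe(func) for func in REGISTRY.values()]
output = call_tool("plot_pca_stub", {"group": "disease"})
```

Because the definition is generated from the registered function, renaming a parameter changes both sides at once.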

Comment on lines +44 to +47
    models,
    index=models.index(st.session_state.get(StateKeys.MODEL_NAME))
    if current_model is not None
    else 0,
Collaborator

Why not pass key=StateKeys.MODEL_NAME here?
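
Independent of the `key` question, the index computation in the snippet under review relies on `list.index`, which raises `ValueError` when the stored name is no longer in `models`. A minimal sketch of a more defensive lookup follows; `safe_index` and the model names are hypothetical and not part of this PR.

```python
from typing import Optional, Sequence


def safe_index(options: Sequence[str], current: Optional[str], default: int = 0) -> int:
    """Return the position of `current` in `options`, or `default` if it is missing."""
    if current is None:
        return default
    try:
        return list(options).index(current)
    except ValueError:
        # The stored value is stale, e.g. the list of available models changed.
        return default


models = ["model-a", "model-b"]  # hypothetical model names
index_for_known = safe_index(models, "model-b")
index_for_none = safe_index(models, None)
index_for_stale = safe_index(models, "model-removed")
```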


- if model_before != st.session_state[StateKeys.MODEL_NAME]:
+ if current_model != st.session_state[StateKeys.MODEL_NAME]:
Collaborator

If this is the intended behaviour, then why is the config an st.fragment?

Comment on lines +314 to +333
    def perform_dimensionality_reduction(
        self, method: str, group: Optional[str] = None, circle: bool = False
    ):
        """Generic wrapper for dimensionality reduction methods to be used by LLM.

        Args:
            method (str): "pca", "tsne", "umap"
            group (str, optional): column in metadata that should be used for coloring. Defaults to None.
            circle (bool, optional): draw circle around each group. Defaults to False.
        """

        result = {
            "pca": self.plot_pca,
            "tsne": self.plot_tsne,
            "umap": self.plot_umap,
        }.get(method)
        if result is None:
            raise ValueError(f"Invalid method: {method}")

        return result(group=group, circle=circle)
Collaborator

It would be great to have something similarly simple for the differential analysis in the long term.
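
As a rough illustration of what "similarly simple" could look like, the dictionary dispatch used by the wrapper can be factored into a small helper and reused with a different method table. This is only a sketch under assumptions: `dispatch`, `run_ttest`, `run_anova`, and `DIFF_METHODS` are hypothetical names and do not exist in this PR.

```python
from typing import Any, Callable, Dict


def dispatch(methods: Dict[str, Callable[..., Any]], method: str, **kwargs: Any) -> Any:
    """Look up `method` in `methods` and call it with the given keyword arguments."""
    func = methods.get(method)
    if func is None:
        valid = ", ".join(sorted(methods))
        raise ValueError(f"Invalid method: {method}. Expected one of: {valid}")
    return func(**kwargs)


# Hypothetical stand-ins for differential analysis routines.
def run_ttest(group1: str, group2: str) -> str:
    return f"ttest:{group1}-vs-{group2}"


def run_anova(group1: str, group2: str) -> str:
    return f"anova:{group1}-vs-{group2}"


DIFF_METHODS = {"ttest": run_ttest, "anova": run_anova}
result = dispatch(DIFF_METHODS, "ttest", group1="A", group2="B")
```

Listing the valid method names in the error message gives the LLM something concrete to correct its call with.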

Comment on lines +91 to +94
def test_get_protein_id_multiple_matches(gene_to_prot_map):
    """Test with a gene that appears in multiple compound keys."""
    result = get_protein_id_for_gene_name("MULTI", gene_to_prot_map)
    assert result == "PROT1;PROT2;PROT3"
Collaborator

I think this is the same as test_get_protein_id_compound_key. Actually VCL would be the one matching multiple protein ids.
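
To make the distinction concrete, here is a small sketch that separates "gene found inside a compound key" from "gene mapped to several protein IDs". Everything in it is an assumption for illustration: `lookup_protein_ids` is a hypothetical stand-in, not `get_protein_id_for_gene_name`, and `example_map` is not the `gene_to_prot_map` fixture from this PR.

```python
from typing import Dict


def lookup_protein_ids(gene_name: str, gene_to_prot_map: Dict[str, str]) -> str:
    """Hypothetical lookup: exact key first, then membership in a compound key."""
    if gene_name in gene_to_prot_map:
        return gene_to_prot_map[gene_name]
    for key, protein_ids in gene_to_prot_map.items():
        if gene_name in key.split(";"):
            return protein_ids
    raise KeyError(gene_name)


# Assumed example data, chosen to show the two cases side by side.
example_map = {
    "GENE1": "PROT_A",           # simple key, one protein ID
    "GENE2;GENE3": "PROT_B",     # compound key, one protein ID
    "VCL": "PROT1;PROT2;PROT3",  # simple key, several protein IDs
}

compound_key_case = lookup_protein_ids("GENE3", example_map)
multiple_ids_case = lookup_protein_ids("VCL", example_map)
```

With data shaped like this, a compound-key test and a multiple-matches test exercise different branches, so they need different inputs to be distinct tests.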

Collaborator

@JuliaS92 JuliaS92 left a comment


I think we need to have a conversation with everyone about genes vs. proteins and decide on one to use in the backend (I am in favor of protein IDs). The display can be different, but it should be consistent across all cases. Otherwise, kudos :)

Base automatically changed from refactor_llm_VI to development November 8, 2024 16:38
@mschwoer mschwoer merged commit 85a9409 into development Nov 8, 2024
4 of 5 checks passed
@mschwoer mschwoer deleted the refactor_llm_VII branch November 8, 2024 16:40
2 participants