Refactor llm vii #358
mschwoer commented on Oct 23, 2024 (edited)
- make link between LLM function definitions and actual functions transparent
- add general dimensionality reduction to DataSet
- simplify some helpers
- slightly refactor llm_integration
- add unit tests for whole LLM module
    models,
    index=models.index(st.session_state.get(StateKeys.MODEL_NAME))
    if current_model is not None
    else 0,
key = StateKeys.MODEL_NAME?
- if model_before != st.session_state[StateKeys.MODEL_NAME]:
+ if current_model != st.session_state[StateKeys.MODEL_NAME]:
If this is the intended behaviour, then why is the config an st.fragment?
def perform_dimensionality_reduction(
    self, method: str, group: Optional[str] = None, circle: bool = False
):
    """Generic wrapper for dimensionality reduction methods to be used by LLM.

    Args:
        method (str): "pca", "tsne", "umap"
        group (str, optional): column in metadata that should be used for coloring. Defaults to None.
        circle (bool, optional): draw circle around each group. Defaults to False.
    """

    result = {
        "pca": self.plot_pca,
        "tsne": self.plot_tsne,
        "umap": self.plot_umap,
    }.get(method)
    if result is None:
        raise ValueError(f"Invalid method: {method}")

    return result(group=group, circle=circle)
In the long term, it would be great to have something similarly simple for the differential analysis.
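A rough sketch of what that could look like, mirroring the dispatch-by-name pattern of perform_dimensionality_reduction. Everything here is an assumption for illustration: the method names ("ttest", "wald"), the stand-in functions, and the string return values are hypothetical and are not part of the existing code base.

```python
from typing import Callable, Dict


def _ttest_stub(group1: str, group2: str) -> str:
    # Hypothetical stand-in for a real t-test based differential analysis.
    return f"ttest: {group1} vs {group2}"


def _wald_stub(group1: str, group2: str) -> str:
    # Hypothetical stand-in for a Wald-test based differential analysis.
    return f"wald: {group1} vs {group2}"


def perform_differential_analysis(method: str, group1: str, group2: str) -> str:
    """Dispatch to a differential analysis implementation by name (sketch)."""
    methods: Dict[str, Callable[[str, str], str]] = {
        "ttest": _ttest_stub,
        "wald": _wald_stub,
    }
    analysis = methods.get(method)
    if analysis is None:
        # List the valid options so that a caller (or an LLM) can recover.
        raise ValueError(
            f"Invalid method: {method}. Choose one of {sorted(methods)}"
        )
    return analysis(group1=group1, group2=group2)
```

Including the valid options in the error message is a small design choice: when the caller is an LLM, the message itself tells it how to correct the call.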
def test_get_protein_id_multiple_matches(gene_to_prot_map):
    """Test with a gene that appears in multiple compound keys."""
    result = get_protein_id_for_gene_name("MULTI", gene_to_prot_map)
    assert result == "PROT1;PROT2;PROT3"
I think this is the same as test_get_protein_id_compound_key. Actually VCL would be the one matching multiple protein ids.
I think we need to have a conversation with everyone about genes vs. proteins and decide on one to use in the backend (I am in favor of protein IDs). The display can be different, but it should match in all cases. Otherwise kudos :)