
How to use the mvc decoder at test-time? #283

Open
syr-cn opened this issue Jan 25, 2025 · 1 comment

Comments

@syr-cn

syr-cn commented Jan 25, 2025

I'm trying to reproduce scGPT's performance on the gene expression prediction (GEPC) objective. I've tried two approaches:

  1. Disable both gradients and gene masking, and only run test-time reconstruction:

```python
# Encode the tokenized example, pool a cell embedding from the
# per-gene embeddings, then decode expression values for the same gene ids.
gene_embs = model._encode(data_example['gene_ids'], data_example['values'], data_example['padding_mask'])
cell_emb = model._get_cell_emb_from_layer(gene_embs)
print('cell_emb', cell_emb.shape)

pred = model.generate(cell_emb, data_example['gene_ids'])
values = data_example['values']
print(f'values ({values.mean():.2f}±{values.std():.2f})\t', values.shape, values)
print(f'pred ({pred.mean():.2f}±{pred.std():.2f})\t', pred.shape, pred)
```

The output I get is:

```
cell_emb torch.Size([1, 512])
values (2.56±6.06)	 torch.Size([1, 2048]) tensor([[-2.,  1.,  1.,  ...,  1.,  1.,  1.]], device='cuda:0')
pred (32.29±0.43)	 torch.Size([1, 2048]) tensor([[32.3906, 32.2649, 32.3063,  ..., 32.7302, 32.8762, 32.7573]],
       device='cuda:0', grad_fn=<SqueezeBackward1>)
```

The target values and the predicted values are on completely different scales: the predictions are nearly constant around 32, while the targets average about 2.5.
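A quick way to quantify how badly a near-constant prediction hurts a reconstruction metric is to compute the masked MSE by hand. This is a standalone sketch with toy numbers shaped like the output above, not scGPT code; `masked_mse` is a hypothetical helper, and the toy values are illustrative only.

```python
def masked_mse(pred, target, mask):
    """Mean squared error over positions where mask is True."""
    diffs = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    return sum(diffs) / len(diffs)

# Toy numbers shaped like the output above: targets are small integers,
# predictions sit near a constant 32, and one position is excluded by the mask.
target = [1.0, 1.0, 5.0, 1.0]
pred = [32.4, 32.3, 32.3, 32.7]
mask = [True, True, True, False]

print(masked_mse(pred, target, mask))  # ≈ 903.6, dominated by the constant offset
```

With a gap of ~30 between the two scales, the squared error per position is on the order of 900, so no per-gene structure in the prediction can matter until the scale mismatch is resolved.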

  2. Use the fine-tuning code in scGPT/tutorials/Tutorial_Annotation.ipynb and set MVC=True during training.

Doing this gives an extremely large loss_mvc of approximately 150, while all the other losses stay below 0.1.

I believe there's something wrong with my implementation. Could you please help me figure out what it is?

@Thodorissio

Hey, I am having a similar problem (predictions around 30), but in my case it extends to loss_gep as well. I describe my case in detail in issue #285.

Have you gained any insight into the gene expression prediction (GEP) objective, since your GEP loss turns out to be less than 0.1?
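One thing worth ruling out (an assumption on my part, not something confirmed in this thread) is a scale mismatch between the space the decoder outputs and the space of the targets. If predictions live in raw-count space while the targets are log1p-normalized or binned, even a reasonable prediction looks constant and huge next to the targets. A minimal illustration:

```python
import math

# Hypothetical illustration, not the scGPT pipeline: a raw-count prediction
# of ~30 collapses to ~3.4 once mapped into log1p space, so comparing the
# two spaces directly inflates every error by roughly an order of magnitude.
raw_pred = 30.0
log_pred = math.log1p(raw_pred)
print(round(log_pred, 2))  # 3.43
```

If this were the cause, checking which normalization the data loader applies to `values` (and whether the same transform is applied to the decoder output) would be the place to start.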
