Data missing #1

Open
zickun opened this issue Jun 12, 2024 · 2 comments

Comments


zickun commented Jun 12, 2024

Dear author,
Thank you very much for your great contribution to the community; I am very interested in your research! Unfortunately, when I tried to reproduce your work I found that the PDB folder is missing from the dataset folder. I know from the paper that I need to download it, but I do not know which files should be downloaded. Could you please update the data section of the README to help me replicate your work? In addition, I sent you an email to your Google address; I wonder if you have received it.

Jh-SYSU (Contributor) commented Jun 12, 2024

Thanks for your interest.

The native PDB structures can be obtained from https://github.com/zqgao22/HIGH-PPI (edge_list_12, x_list).
Alternatively, you can use PDB structures predicted from the protein sequences with the pre-trained ESMFold model (https://github.com/facebookresearch/esm).
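
For reference, a minimal sketch of predicting a structure with ESMFold and writing it out as a PDB file, following the usage shown in the esm repository README; the example sequence and output filename are placeholders:

```python
import torch
import esm

# Load the pre-trained ESMFold model (weights are downloaded on first use;
# requires the esmfold extras of the fair-esm package).
model = esm.pretrained.esmfold_v1()
model = model.eval().cuda()

# Placeholder sequence; in practice, loop over the protein sequences of the dataset.
sequence = "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG"

with torch.no_grad():
    pdb_string = model.infer_pdb(sequence)  # returns the predicted structure as a PDB-format string

# Name the output after the protein identifier used by the dataset.
with open("predicted.pdb", "w") as f:
    f.write(pdb_string)
```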

zickun (Author) commented Jun 13, 2024

<<<<<<<<<< Protein GNN training >>>>>>>>>>
Processing...
Processing protein-protein interaction graph...
Processing protein graphs...
0%| | 0/1553 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/ryz/MUSE/trainer_ppi.py", line 651, in <module>
trainer.multi_scale_em_train()
File "/home/ryz/MUSE/trainer_ppi.py", line 77, in multi_scale_em_train
self.gnn_model, _ = self._maximization(link_model=self.link_model,
File "/home/ryz/MUSE/trainer_ppi.py", line 102, in _maximization
self.gnn_trainer = ProteinGNNTrainer(args=self.args,
File "/home/ryz/MUSE/trainer_ppi.py", line 211, in __init__
self.train_loader, self.test_loader = self.create_dataloaders()
File "/home/ryz/MUSE/trainer_ppi.py", line 225, in create_dataloaders
train_dataset = ProteinDataset(self.args, self.config, split='train')
File "/home/ryz/MUSE/dataset.py", line 209, in __init__
super(ProteinDataset, self).__init__(root=os.path.join(self.inter_dataset_root, self.dataset_name.replace('-', '_')))
File "/home/ryz/anaconda3/envs/MUSE/lib/python3.9/site-packages/torch_geometric/data/in_memory_dataset.py", line 57, in __init__
super().__init__(root, transform, pre_transform, pre_filter, log)
File "/home/ryz/anaconda3/envs/MUSE/lib/python3.9/site-packages/torch_geometric/data/dataset.py", line 97, in __init__
self._process()
File "/home/ryz/anaconda3/envs/MUSE/lib/python3.9/site-packages/torch_geometric/data/dataset.py", line 230, in _process
self.process()
File "/home/ryz/MUSE/dataset.py", line 324, in process
protein_graph_list = self.process_protein_graph(list(protein_idx2protein.values()), [protein_idx2sequence[i] for i in protein_idx2protein.keys()])
File "/home/ryz/MUSE/dataset.py", line 334, in process_protein_graph
X = torch.load(self.raw_dir + "/pdb/" + name + ".tensor")
File "/home/ryz/anaconda3/envs/MUSE/lib/python3.9/site-packages/torch/serialization.py", line 699, in load
with _open_file_like(f, 'rb') as opened_file:
File "/home/ryz/anaconda3/envs/MUSE/lib/python3.9/site-packages/torch/serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/home/ryz/anaconda3/envs/MUSE/lib/python3.9/site-packages/torch/serialization.py", line 211, in __init__
super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/home/ryz/MUSE/datasets/high_ppi/raw/pdb/9606.ENSP00000000233.tensor'

I already have these two files (edge_list_12, x_list) in '/home/ryz/MUSE/datasets/high_ppi/raw/'. Can you tell me what should go in the 'pdb' folder?
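
From the traceback, dataset.py calls torch.load(self.raw_dir + "/pdb/" + name + ".tensor"), so the pdb folder appears to need one .tensor file per protein, named after the protein identifier. Below is a minimal sketch of how such files could be produced from PDB structures, assuming each tensor should hold per-residue C-alpha coordinates extracted with Biopython; the expected tensor contents and the helper pdb_to_ca_tensor are assumptions for illustration, not part of the repository.

```python
import os
import numpy as np
import torch
from Bio.PDB import PDBParser  # Biopython

def pdb_to_ca_tensor(pdb_path: str) -> torch.Tensor:
    """Extract per-residue C-alpha coordinates from a PDB file as an (L, 3) float tensor.
    Assumption: this is the kind of tensor dataset.py expects; the repository may expect a different layout."""
    parser = PDBParser(QUIET=True)
    structure = parser.get_structure("protein", pdb_path)
    coords = [res["CA"].coord for res in structure.get_residues() if "CA" in res]
    return torch.tensor(np.array(coords), dtype=torch.float32)

raw_dir = "/home/ryz/MUSE/datasets/high_ppi/raw"
pdb_dir = os.path.join(raw_dir, "pdb")
os.makedirs(pdb_dir, exist_ok=True)

# For each native or ESMFold-predicted structure, save a tensor named after the protein,
# e.g. 9606.ENSP00000000233.tensor, which is the filename dataset.py tries to torch.load.
name = "9606.ENSP00000000233"           # placeholder protein identifier
pdb_file = os.path.join(pdb_dir, name + ".pdb")  # placeholder location of the structure file
X = pdb_to_ca_tensor(pdb_file)
torch.save(X, os.path.join(pdb_dir, name + ".tensor"))
```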
