Eval does not run properly #3
Comments
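Did you download the corresponding pdb files? Our method has to process each pdb file into a graph before it can run inference; for example, the test_protein_12.pt here is one of those processed graph files.
Yes, I downloaded the corresponding pdb files, but the errors started once these .pt files had been generated under the pdb folder.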
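The output says a file is missing. Sorry about that, I will try to reproduce it.
python eval.py
Indeed, it runs after deleting that. I assumed this step still needed to reference the pdb files, but apparently it does not?
The pdb files are used to build the graphs and to obtain the physicochemical features. Once both have been processed, the pdb files are no longer needed.
Thanks a lot!
Good luck with your research, and may you publish many papers!
ExternalTest runs fine, but as soon as I switch to my own dataset the following problem appears. I also noticed that, to use your own dataset, the pdb folder has to be named esmfold_pdb, otherwise other bugs show up.
12/24/2024 22:33:06 - INFO - main - ***** Load Model *****
Ah yes, at the moment the pdb folder has to be named esmfold_pdb; I will fix this bug. Did you extract the physicochemical features beforehand?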
I fixed this in the commit I just pushed, which adds a check on the name of the folder that stores the pdb files.
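For readers hitting the same folder-name problem: the traceback in this issue loads graphs from `{graph_dir.capitalize()}/processed/`, and the failing path contains `Esmfold_pdb_k20`. The sketch below shows one way such a directory name could be derived from whatever the pdb folder is called. It is not the code from the commit; `graph_dir_name` and the `_k{k}` suffix handling are assumptions made for illustration.

```python
import os


def graph_dir_name(supv_dataset: str, k: int = 20) -> str:
    """Hypothetical helper: build the graph folder name from the pdb folder name.

    Assumes the processed graphs live in "<pdb folder name>_k<k>", capitalized,
    as suggested by the path "Esmfold_pdb_k20" in the traceback.
    """
    # normpath drops a trailing slash so basename returns the folder itself
    folder = os.path.basename(os.path.normpath(supv_dataset))
    # str.capitalize() upper-cases the first character and lower-cases the rest
    return f"{folder}_k{k}".capitalize()


print(graph_dir_name("data/ExternalTest/esmfold_pdb"))
```

Note that `str.capitalize()` lower-cases everything after the first character, so a folder named `MyPDB` would map to `Mypdb_k20`. On a case-sensitive filesystem the directory on disk has to match that exactly.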
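No, I did not extract the physical features beforehand, because from your graph pipeline it looked as if the features were extracted directly from the sequence.
Do you mean running get_feature.py? I did run that, but I still get the same error.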
I finally tracked down the problem: with multithreading, feature_dict may not be visible as a global variable, so it ends up empty.
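A common way to avoid depending on a module-level `feature_dict` is to bind it to the collate function explicitly. The sketch below uses only the standard library. `process_item`, `collate`, and the feature values are hypothetical stand-ins, not the actual functions in eval.py.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def process_item(name, feature_dict):
    # Fail loudly instead of silently working with an empty dictionary
    if name not in feature_dict:
        raise KeyError(f"no features loaded for {name!r}")
    return {"name": name, "features": feature_dict[name]}


def collate(names, feature_dict):
    # feature_dict arrives as an argument, so no global lookup is involved
    worker = partial(process_item, feature_dict=feature_dict)
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map preserves the order of the input names
        return list(pool.map(worker, names))


# Hypothetical feature values, for illustration only
features = {"protein_a": [0.1, 0.2], "protein_b": [0.3, 0.4]}

# Bind the dictionary once; the result has the one-argument shape a collate_fn needs
collate_fn = partial(collate, feature_dict=features)
batch = collate_fn(["protein_a", "protein_b"])
```

Because the dictionary travels with the callable, it should not matter whether the loader's workers see the main module's globals.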
The problem seems to occur in the DataLoader.
Here is the local configuration we changed so that the model loads properly:
parser = argparse.ArgumentParser()
# model config
parser.add_argument("--gnn", type=str, default="egnn", help="gat, gcn or egnn")
parser.add_argument("--gnn_config", type=str, default="src/config/egnn.yaml", help="gnn config")
parser.add_argument("--gnn_hidden_dim", type=int, default=512, help="hidden size of gnn")
parser.add_argument("--plm", type=str, default="./model/facebook", help="esm param number")
parser.add_argument("--plm_hidden_size", type=int, default=1280, help="hidden size of plm")
parser.add_argument("--pooling_method", type=str, default="attention1d", help="pooling method")
parser.add_argument("--pooling_dropout", type=float, default=0.1, help="pooling dropout")
This is the script suggested in the README:
python eval.py \
    --supv_dataset data/ExternalTest/esmfold_pdb \
    --test_file data/ExternalTest/ExternalTest.csv \
    --test_result_dir result/protssn_k20_h512/experiment \
    --feature_file data/ExternalTest/ExternalTest_feature.csv \
    --feature_name "aa_composition" "gravy" "ss_composition" "hygrogen_bonds" "exposed_res_fraction" "pLDDT" \
    --use_plddt_penalty \
    --batch_token_num 3000
This is the error output:
12/24/2024 11:40:04 - INFO - main - ***** Loading Feature *****
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7843/7843 [00:00<00:00, 13289.17it/s]
12/24/2024 11:40:05 - INFO - main - ***** Loading Dataset *****
Processing...
0it [00:00, ?it/s]
Total proteins: []
Wrong proteins: []
0it [00:00, ?it/s]
Done!
12/24/2024 11:40:05 - INFO - main - ***** Load Model *****
12/24/2024 11:40:06 - INFO - main - Number of parameter: 3.24M
12/24/2024 11:40:06 - INFO - main - Number of trainable parameter: 3.24M
12/24/2024 11:40:06 - INFO - main - ***** Running eval *****
12/24/2024 11:40:06 - INFO - main - Num test examples = 7579
12/24/2024 11:40:06 - INFO - main - Batch token num = 3000
0%| | 0/635 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/gaoyuan/ProtSolM/eval.py", line 341, in <module>
eval_model(
File "/home/gaoyuan/ProtSolM/eval.py", line 128, in eval_model
model, epoch_metric_results, result_dict, ssn_embeds = test_epoch_runner(test_data)
File "/home/gaoyuan/ProtSolM/eval.py", line 95, in __call__
for batch in loop:
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/site-packages/tqdm/std.py", line 1181, in __iter__
for obj in iterable:
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/site-packages/accelerate/data_loader.py", line 552, in __iter__
current_batch = next(dataloader_iter)
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 701, in __next__
data = self._next_data()
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1465, in _next_data
return self._process_data(data)
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1491, in _process_data
data.reraise()
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/site-packages/torch/_utils.py", line 715, in reraise
raise exception
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 351, in _worker_loop
data = fetcher.fetch(index) # type: ignore[possibly-undefined]
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
return self.collate_fn(data)
File "/home/gaoyuan/ProtSolM/eval.py", line 297, in <lambda>
collate_fn=lambda x: collect_fn(x),
File "/home/gaoyuan/ProtSolM/eval.py", line 291, in collect_fn
graph = future.result()
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/gaoyuan/ProtSolM/eval.py", line 278, in process_data
data = torch.load(f"{args.supv_dataset}/{graph_dir.capitalize()}/processed/{name}.pt")
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/site-packages/torch/serialization.py", line 1319, in load
with _open_file_like(f, "rb") as opened_file:
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/site-packages/torch/serialization.py", line 659, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/home/gaoyuan/anaconda3/envs/protsolm/lib/python3.10/site-packages/torch/serialization.py", line 640, in __init__
super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'data/ExternalTest/esmfold_pdb/Esmfold_pdb_k20/processed/test_protein_12.pt'
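The failure above only surfaces inside a DataLoader worker, several frames deep. A quick pre-flight check can list missing graph files before evaluation starts. The sketch below assumes the path layout shown in the traceback (`{supv_dataset}/{graph_dir.capitalize()}/processed/{name}.pt`). `missing_graph_files` is a hypothetical helper, not part of ProtSolM.

```python
import os


def missing_graph_files(supv_dataset, graph_dir, names):
    """Return the protein names whose processed graph file is absent.

    Mirrors the path pattern from the traceback:
    {supv_dataset}/{graph_dir.capitalize()}/processed/{name}.pt
    """
    processed = os.path.join(supv_dataset, graph_dir.capitalize(), "processed")
    return [
        name
        for name in names
        if not os.path.isfile(os.path.join(processed, f"{name}.pt"))
    ]


# Example call with the values that appear in this issue
missing = missing_graph_files(
    "data/ExternalTest/esmfold_pdb", "esmfold_pdb_k20", ["test_protein_12"]
)
if missing:
    print(f"{len(missing)} graph file(s) missing, e.g. {missing[0]}.pt")
```

In the log above, the "Processing..." step reports `0it` and an empty protein list before the crash, so a check like this would have flagged the empty `processed` folder right away.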