Checklist:
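Describe the problem
PaddleX supports dataset validation, which checks that the dataset format meets PaddleX's requirements. During validation it can also analyze the dataset and report basic statistics about it.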
python main.py -c paddlex/configs/object_detection/PP-YOLOE_plus-S.yaml \
    -o Global.mode=check_dataset \
    -o Global.dataset_dir=./dataset/det_coco_examples
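The dataset check succeeded.
Reproduction
Have you been able to run the tutorials we provide successfully?
Have you modified the code on top of the tutorial? If so, please provide the code you ran.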
python main.py -c paddlex/configs/object_detection/PP-YOLOE_plus-S.yaml \
    -o Global.mode=train \
    -o Global.dataset_dir=./dataset/det_coco_examples \
    -o Global.output=ppyolo_plus_s_output \
    -o Global.device="npu:0,1,2,3"
Which dataset are you using?
Please provide the error message and the relevant logs.
======================= Modified FLAGS detected =======================
FLAGS(name='FLAGS_use_stride_kernel', current_value=False, default_value=True)
=======================================================================
I0918 23:46:05.051712 973133 tcp_utils.cc:130] Successfully connected to 127.0.0.1:52457
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
W0918 23:46:24.526521 973133 dygraph_functions.cc:83150] got different data type, run type promotion automatically, this may cause data type been changed.
C++ Traceback (most recent call last):
0 egr::Backward(std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, bool)
1 egr::RunBackward(std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, bool, bool, std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&, bool, std::vector<paddle::Tensor, std::allocator<paddle::Tensor> > const&)
2 Conv2dGradNodeFinal::operator()(paddle::small_vector<std::vector<paddle::Tensor, std::allocator<paddle::Tensor> >, 15u>&, bool, bool)
3 paddle::experimental::conv2d_grad(paddle::Tensor const&, paddle::Tensor const&, paddle::Tensor const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::string const&, std::vector<int, std::allocator<int> > const&, int, std::string const&, paddle::Tensor*, paddle::Tensor*)
4 void custom_kernel::Conv2DGradKernel<float, phi::CustomContext>(phi::CustomContext const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::string const&, std::vector<int, std::allocator<int> > const&, int, std::string const&, phi::DenseTensor*, phi::DenseTensor*)
5 aclnnConvolutionBackward
6 InitL2Phase2Context(char const*, aclOpExecutor*)
7 GetOpExecCacheFromExecutor(aclOpExecutor*)
Error Message Summary:
FatalError: Segmentation fault is detected by the operating system.
[TimeInfo: *** Aborted at 1726674458 (unix time) try "date -d @1726674458" if you are using GNU date ***]
LAUNCH INFO 2024-09-18 23:47:48,695 Exit code -11
[SignalInfo: *** SIGSEGV (@0xed94d) received by PID 973133 (TID 0xffffa00bae90) from PID 973133 ***]
Traceback (most recent call last):
File "/work/workspace/PaddleX/paddlex/utils/result_saver.py", line 30, in wrap
result = func(self, *args, **kwargs)
File "/work/workspace/PaddleX/paddlex/engine.py", line 42, in run
trainer.train()
File "/work/workspace/PaddleX/paddlex/modules/base/trainer/trainer.py", line 61, in train
train_result = self.pdx_model.train(**self.get_train_kwargs())
File "/work/workspace/PaddleX/paddlex/repo_apis/PaddleDetection_api/object_det/model.py", line 109, in train
return self.runner.train(
File "/work/workspace/PaddleX/paddlex/repo_apis/PaddleDetection_api/object_det/runner.py", line 54, in train
return self.run_cmd(
File "/work/workspace/PaddleX/paddlex/repo_apis/base/runner.py", line 359, in run_cmd
raise CalledProcessError(
paddlex.utils.errors.others.CalledProcessError: Command ['/usr/bin/python', '-m', 'paddle.distributed.launch', '--devices', '0,1,2,3', '--log_dir', '/work/workspace/PaddleX/ppyolo_plus_s_output/distributed_train_logs', 'tools/train.py', '--eval', '--config', '/root/.paddlex/tmp99soy5_c/detmodel_PP-YOLOE_plus-S.yml', '--use_vdl', 'True', '--vdl_log_dir', '/work/workspace/PaddleX/ppyolo_plus_s_output'] returned non-zero exit status 245.
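A note on the two exit codes in the log above: the launcher reports "Exit code -11", while PaddleX reports "non-zero exit status 245". These look like the same failure, since 245 equals -11 wrapped to an unsigned 8-bit value (256 - 11), and signal 11 is SIGSEGV, matching the SignalInfo line. This is an inference from the numbers, not something the PaddleX or Paddle logs state. A minimal Python sketch of that arithmetic (the helper name `signal_from_wrapped_status` is hypothetical, not part of PaddleX or Paddle):

```python
import signal


def signal_from_wrapped_status(status):
    """Treat an 8-bit exit status as a negative return code wrapped modulo 256.

    Assumption: the parent process exited with a negative code such as -11,
    which the OS truncated to 0..255. Returns the matching signal, or None.
    """
    if not 0 <= status <= 255:
        raise ValueError("exit status must be in the range 0..255")
    # Undo the modulo-256 wrap: 245 -> -11, while small codes stay unchanged.
    signed = status - 256 if status >= 128 else status
    if signed >= 0:
        return None
    try:
        return signal.Signals(-signed)
    except ValueError:
        # The number does not correspond to a signal on this platform.
        return None


sig = signal_from_wrapped_status(245)
print(sig.name if sig is not None else "not a signal")
```

If this reading is right, the status 245 in the CalledProcessError carries no extra information beyond the segmentation fault raised inside aclnnConvolutionBackward.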
Environment
Please provide the versions of PaddlePaddle and PaddleX you are using.
3.0-beta
Please provide your operating system information, e.g. Linux/Windows/MacOS.
Which Python version are you using?
Which CUDA/cuDNN version are you using?