Error while training "Caused by op 'loss_1/data_loss/GatherV2_1', defined at:" #15

Open
levanpon98 opened this issue Aug 14, 2020 · 9 comments

levanpon98 commented Aug 14, 2020

I got an error while training.

Version

  • tensorflow==1.12.0
  • mesh-renderer==1.4
08-14 14:01:58 - x - INFO: - render_lambda: 0.000000, refine_lambda: 1.000000
08-14 14:02:54 - x - INFO: - Error Occured in Sess Run.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[16447] = 35709 is not in [0, 35709)
	 [[{{node loss_1/data_loss/GatherV2_1}} = GatherV2[Taxis=DT_INT32, Tindices=DT_INT64, Tparams=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mesh_generator_1/Tanh, loss_1/data_loss/GatherV2_1/indices, GatherV2_9/axis)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 134, in <module>
    main()
  File "main.py", line 94, in main
    model.fit()
  File "/content/3D-Face-GCNs/base_model.py", line 619, in fit
    string, results = self.evaluate(val_image)
  File "/content/3D-Face-GCNs/base_model.py", line 719, in evaluate
    result = self.predict(batch_image)
  File "/content/3D-Face-GCNs/base_model.py", line 791, in predict
    proj_loss, refine_loss, perc_loss = self.sess.run(fetches, feed_dict)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[16447] = 35709 is not in [0, 35709)
	 [[node loss_1/data_loss/GatherV2_1 (defined at /content/3D-Face-GCNs/base_model.py:429)  = GatherV2[Taxis=DT_INT32, Tindices=DT_INT64, Tparams=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mesh_generator_1/Tanh, loss_1/data_loss/GatherV2_1/indices, GatherV2_9/axis)]]

Caused by op 'loss_1/data_loss/GatherV2_1', defined at:
  File "main.py", line 134, in <module>
    main()
  File "main.py", line 86, in main
    model = NormalModel(args, sess, graph, refer_mesh, image_paths, img_file)
  File "/content/3D-Face-GCNs/model_normal.py", line 16, in __init__
    super(Model, self).__init__(*args, **kwargs)
  File "/content/3D-Face-GCNs/base_model.py", line 106, in __init__
    self.build_graph()
  File "/content/3D-Face-GCNs/base_model.py", line 234, in build_graph
    image_feat_test, gcn_image_feat_test, self.regularization, True)
  File "/content/3D-Face-GCNs/base_model.py", line 429, in compute_loss
    sym_diff = tf.gather(gcn_texture, self.bfm.left_index, axis=1) - tf.gather(
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py", line 2669, in gather
    return gen_array_ops.gather_v2(params, indices, axis, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 3332, in gather_v2
    "GatherV2", params=params, indices=indices, axis=axis, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): indices[16447] = 35709 is not in [0, 35709)
	 [[node loss_1/data_loss/GatherV2_1 (defined at /content/3D-Face-GCNs/base_model.py:429)  = GatherV2[Taxis=DT_INT32, Tindices=DT_INT64, Tparams=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](mesh_generator_1/Tanh, loss_1/data_loss/GatherV2_1/indices, GatherV2_9/axis)]]

How do I solve this error? Thank you
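
The message `indices[16447] = 35709 is not in [0, 35709)` says that position 16447 of the index array holds the value 35709, while the gathered axis has 35709 entries, so valid indices run from 0 to 35708. A minimal sketch for checking an index array before it reaches `tf.gather` is shown below. It is plain Python rather than TensorFlow, `find_out_of_range` is a hypothetical helper, and the short list only stands in for `self.bfm.left_index`. Whether the indices in this repository are off by one (for example 1-based) is an assumption that this thread does not confirm.

```python
def find_out_of_range(indices, size):
    """Return (position, value) pairs for indices outside [0, size)."""
    return [(pos, idx) for pos, idx in enumerate(indices)
            if not 0 <= idx < size]


# Toy stand-in for the real index array (hypothetical values).
num_vertices = 5
left_index = [0, 3, 5, 4]   # 5 is out of range for 5 vertices

bad = find_out_of_range(left_index, num_vertices)
print(bad)                  # [(2, 5)]
```

An empty result means every index is valid for the given size; a non-empty result pinpoints the entries that would trigger the error above.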

cyjouc commented Jan 12, 2021

Have you solved this problem?

@levanpon98
Author

@cyjouc I think we should train on a GPU, because tf.gather can raise this error when training on the CPU. See https://www.tensorflow.org/api_docs/python/tf/gather for more information.
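
The linked `tf.gather` documentation notes that on CPU an out-of-bound index returns an error, while on GPU a 0 is stored in the corresponding output value. The sketch below emulates both behaviours in plain Python to illustrate that running on a GPU would be expected to hide an invalid index rather than correct it. It does not call TensorFlow, and `gather_cpu_like` and `gather_gpu_like` are hypothetical names used only for this illustration.

```python
def gather_cpu_like(params, indices):
    """Emulate the documented CPU behaviour: fail on a bad index."""
    for pos, idx in enumerate(indices):
        if not 0 <= idx < len(params):
            raise IndexError(
                "indices[%d] = %d is not in [0, %d)" % (pos, idx, len(params)))
    return [params[idx] for idx in indices]


def gather_gpu_like(params, indices):
    """Emulate the documented GPU behaviour: write 0 for a bad index."""
    return [params[idx] if 0 <= idx < len(params) else 0 for idx in indices]


texture = [0.1, 0.2, 0.3]          # toy data, 3 entries
indices = [0, 2, 3]                # 3 is out of range

print(gather_gpu_like(texture, indices))   # [0.1, 0.3, 0]
try:
    gather_cpu_like(texture, indices)
except IndexError as err:
    print(err)                     # indices[2] = 3 is not in [0, 3)
```

Under this reading, the silent 0 would still feed into the symmetry loss, so checking the index array itself may be the safer route.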

cyjouc commented Jan 12, 2021

I also train on a GPU. I still can't work out the details of the model. Could you share your experience with me? My WeChat ID is hicaicaihi.

cyjouc commented Jan 13, 2021

Hi, could you share your experiments with me? My WeChat ID is hicaicahi.

cyjouc commented Jan 15, 2021

Hi, did you successfully prepare the dataset? I am now running into errors while preparing it.

cyjouc commented Jan 19, 2021

Hi, would you be willing to share the prepared dataset for training? I am having some trouble with the preparation method.

cyjouc commented Jan 19, 2021

Hi, did you successfully prepare the dataset? I am now running into errors while preparing it.

djx99 commented Jul 6, 2021

Hi, I have the same problem. Did you solve it, and if so, how?

djx99 commented Jul 6, 2021

Hi, I have the same problem. Did you solve it, and if so, how?
