Error in Bayesian Optimization #4

Open
MichailChatzianastasis opened this issue Aug 27, 2020 · 3 comments

Comments

@MichailChatzianastasis

Hey, I get the following error when I try to run Bayesian optimization in ENAS. Any ideas on how to fix this?

2020-08-27 09:45:05.767860: E tensorflow/core/common_runtime/executor.cc:623] Executor failed to create kernel. Invalid argument: Default AvgPoolingOp only supports NHWC on device type CPU
[[{{node child_1/layer_1/pool_at_1/from_0/AvgPool}} = AvgPoolT=DT_FLOAT, data_format="NCHW", ksize=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 2, 2], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
Traceback (most recent call last):
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Default AvgPoolingOp only supports NHWC on device type CPU
[[{{node child_1/layer_1/pool_at_1/from_0/AvgPool}} = AvgPoolT=DT_FLOAT, data_format="NCHW", ksize=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 2, 2], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "bo.py", line 268, in <module>
score = -eva.eval(arc)
File "/home/mchatzi/my_projects/D-VAE/bayesian_optimization/../software/enas/src/cifar10/evaluation.py", line 260, in eval
return self.ops["eval_func"](self.sess, "valid", feed_dict={self.ops["controller"]["sample_arc3"]: np.asarray(arch_str)})
File "/home/mchatzi/my_projects/D-VAE/bayesian_optimization/../software/enas/src/cifar10/eval_child.py", line 715, in customized_eval_once
acc = self.eval_once(sess, eval_set, feed_dict, verbose)
File "/home/mchatzi/my_projects/D-VAE/bayesian_optimization/../software/enas/src/cifar10/models.py", line 179, in eval_once
acc = sess.run(acc_op, feed_dict=feed_dict)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 671, in run
run_metadata=run_metadata)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1255, in run
raise six.reraise(*original_exc_info)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/six.py", line 703, in reraise
raise value
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1240, in run
return self._sess.run(*args, **kwargs)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1312, in run
run_metadata=run_metadata)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1076, in run
return self._sess.run(*args, **kwargs)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Default AvgPoolingOp only supports NHWC on device type CPU
[[node child_1/layer_1/pool_at_1/from_0/AvgPool (defined at /home/mchatzi/my_projects/D-VAE/bayesian_optimization/../software/enas/src/cifar10/eval_child.py:145) = AvgPoolT=DT_FLOAT, data_format="NCHW", ksize=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 2, 2], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Caused by op 'child_1/layer_1/pool_at_1/from_0/AvgPool', defined at:
File "bo.py", line 130, in <module>
eva = Eval_NN() # build the network acc evaluater
File "/home/mchatzi/my_projects/D-VAE/bayesian_optimization/../software/enas/src/cifar10/evaluation.py", line 304, in Eval_NN
eva = Eval()
File "/home/mchatzi/my_projects/D-VAE/bayesian_optimization/../software/enas/src/cifar10/evaluation.py", line 234, in __init__
self.ops = self.get_ops(images, labels)
File "/home/mchatzi/my_projects/D-VAE/bayesian_optimization/../software/enas/src/cifar10/evaluation.py", line 178, in get_ops
child_model.connect_controller(controller_model)
File "/home/mchatzi/my_projects/D-VAE/bayesian_optimization/../software/enas/src/cifar10/eval_child.py", line 736, in connect_controller
self._build_valid()
File "/home/mchatzi/my_projects/D-VAE/bayesian_optimization/../software/enas/src/cifar10/eval_child.py", line 648, in _build_valid
logits = self._model(self.x_valid, False, reuse=True)
File "/home/mchatzi/my_projects/D-VAE/bayesian_optimization/../software/enas/src/cifar10/eval_child.py", line 222, in _model
layer, out_filters, 2, is_training)
File "/home/mchatzi/my_projects/D-VAE/bayesian_optimization/../software/enas/src/cifar10/eval_child.py", line 145, in _factorized_reduction
x, [1, 1, 1, 1], stride_spec, "VALID", data_format=self.data_format)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 2110, in avg_pool
name=name)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 72, in avg_pool
data_format=data_format, name=name)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/anaconda/envs/azureml_py36_automl/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Default AvgPoolingOp only supports NHWC on device type CPU
[[node child_1/layer_1/pool_at_1/from_0/AvgPool (defined at /home/mchatzi/my_projects/D-VAE/bayesian_optimization/../software/enas/src/cifar10/eval_child.py:145) = AvgPoolT=DT_FLOAT, data_format="NCHW", ksize=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 2, 2], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

@muhanzhang
Owner

Judging from the error message, are you running on CPU instead of GPU? Running on CPU will be very slow. Also, TensorFlow's CPU average-pooling op only supports NHWC-format images, while the pretrained ENAS model was built on GPU and relies on the NCHW format.
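To make the layout difference concrete, here is a minimal sketch in plain Python. It is not the ENAS code: the helper names `choose_data_format`, `pool_spec`, and `nchw_to_nhwc_shape` are hypothetical, and the batch size in the usage lines is an illustrative assumption. It only shows where the spatial stride sits in the 4-element spec for each layout, which is why the failing node in the traceback carries strides=[1, 1, 2, 2] with data_format="NCHW". Whether the pretrained model works unchanged after switching to NHWC is not established in this thread.

```python
def choose_data_format(has_gpu):
    """Pick a layout the device can run: NCHW on GPU, NHWC on CPU (assumed policy)."""
    return "NCHW" if has_gpu else "NHWC"


def pool_spec(data_format, stride):
    """Return the 4-element strides list for a pooling op in the given layout."""
    if data_format == "NHWC":
        # Axes are (batch, height, width, channels): stride the two middle axes.
        return [1, stride, stride, 1]
    if data_format == "NCHW":
        # Axes are (batch, channels, height, width): stride the two last axes.
        return [1, 1, stride, stride]
    raise ValueError("Unknown data_format: {!r}".format(data_format))


def nchw_to_nhwc_shape(shape):
    """Reorder an NCHW shape tuple into NHWC (axis permutation 0, 2, 3, 1)."""
    n, c, h, w = shape
    return (n, h, w, c)


if __name__ == "__main__":
    print(pool_spec("NCHW", 2))  # [1, 1, 2, 2], the strides shown in the error
    print(pool_spec(choose_data_format(has_gpu=False), 2))  # [1, 2, 2, 1]
    print(nchw_to_nhwc_shape((128, 3, 32, 32)))  # (128, 32, 32, 3)
```

The point of the sketch is only that the same stride of 2 is written differently under the two layouts, so a graph built with NCHW specs cannot simply be placed on a CPU.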

@MichailChatzianastasis
Author

Thanks for your reply. I followed the instructions in the README. Should I change something in order to use the GPU?

@muhanzhang
Owner

If your machine has a GPU and you have installed the GPU version of TensorFlow, the program should use the GPU automatically. You can also run "watch nvidia-smi" to check GPU usage and verify that the GPU is really being used.
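As a scripted variant of that check, the following sketch uses only the Python standard library to ask nvidia-smi whether any GPU is listed. The function name `nvidia_gpu_visible` is hypothetical, and the check assumes that `nvidia-smi -L` prints one line per device beginning with "GPU ". It only tells you whether the driver sees a GPU, not whether TensorFlow is using it.

```python
import shutil
import subprocess


def nvidia_gpu_visible(executable="nvidia-smi"):
    """Return True if `<executable> -L` succeeds and lists at least one GPU."""
    path = shutil.which(executable)
    if path is None:
        # The tool is not on PATH, so no NVIDIA driver tooling is available.
        return False
    try:
        result = subprocess.run(
            [path, "-L"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    if result.returncode != 0:
        return False
    # Assumed output shape: one line per device, each starting with "GPU ".
    return any(line.startswith("GPU ") for line in result.stdout.splitlines())


if __name__ == "__main__":
    print("GPU visible to the driver:", nvidia_gpu_visible())
```

If this returns True but the error persists, the installed TensorFlow build may be the CPU-only one; in TensorFlow 1.x, `tf.test.is_gpu_available()` reports whether TensorFlow itself can use a GPU.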
