Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loss=nan #11

Open
wml666 opened this issue Mar 19, 2019 · 3 comments
Open

loss=nan #11

wml666 opened this issue Mar 19, 2019 · 3 comments

Comments

@wml666
Copy link

wml666 commented Mar 19, 2019

按照您所说的方式,采用vot数据进行训练,出现loss=nan的情况
step = 0,loss = nan,cls_loss = 0.694078922272,reg_loss = nan,lr = 0.0010000000475,time = 2.76396989822

@sing-hui
Copy link

sing-hui commented May 6, 2019

@wml666 您好,请问你是用vot2016吗?我使用vot2016总是在训练过程出现错误,显示为:
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]
[[Node: batch/_497 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_181_batch", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train.py", line 106, in
t.train()
File "train.py", line 74, in train
sess.run([train_op,loss,cls_loss,reg_loss,lr,debug_pre_cls,debug_pre_reg,debug_pre_score,debug_pre_box,label,target_box,detection,gt_box])
File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 877, in run
run_metadata_ptr)
File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1100, in _run
feed_dict_tensor, options, run_metadata)
File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1272, in _do_run
run_metadata)
File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1291, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]
[[Node: batch/_497 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_181_batch", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]

Caused by op 'batch', defined at:
File "train.py", line 106, in
t.train()
File "train.py", line 31, in train
template,,detection,gt_box,,_=self.reader.get_batch()
File "/home/h/user/Siamese-RPN-tensorflow-master/utils/image_reader_cuda.py", line 182, in get_batch
batch_size,num_threads=32,capacity=2048,shapes=[(127,127,3),(4),(255,255,3),(4),(2),()])
File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 988, in batch
name=name)
File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 762, in _batch
dequeued = queue.dequeue_many(batch_size, name=name)
File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/ops/data_flow_ops.py", line 476, in dequeue_many
self._queue_ref, n=n, component_types=self._dtypes, name=name)
File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3480, in queue_dequeue_many_v2
component_types=component_types, timeout_ms=timeout_ms, name=name)
File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
return func(*args, **kwargs)
File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
op_def=op_def)
File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1717, in init
self._traceback = tf_stack.extract_stack()

OutOfRangeError (see above for traceback): FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]
[[Node: batch/_497 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_181_batch", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]
谢谢!

@zhaoym55
Copy link

我和你们遇到同样问题,请问你们怎么解决的??

@liangliu123
Copy link

@sing-hui 请问你这个问题怎么解决的

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants