GCS gfile operations fail in TF 2.17.0rc0 and 2.18 nightly outside of GCP #2016

Closed
lgeiger opened this issue Jun 19, 2024 · 1 comment

lgeiger (Contributor) commented Jun 19, 2024

I posted this issue a couple of days ago upstream at tensorflow/tensorflow#69789, but I'm posting it here as well since it might be related to the gcs-filesystem package. /cc @yongtang

Running GCS operations with tf.io.gfile on TensorFlow 2.17.0rc0 or the 2.18 nightly anywhere outside of a GCP VM hangs and eventually fails after 10 retries with the error message below:

import tensorflow as tf

tf.io.gfile.exists("gs://tfds-data/dataset_info/mnist/3.0.1/dataset_info.json")
2024-06-14 15:45:30.081439: W external/local_tsl/tsl/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "NOT_FOUND: Could not locate the credentials file.". Retrieving token from GCE failed with "FAILED_PRECONDITION: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Could not resolve host: metadata.google.internal".
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.11/site-packages/tensorflow/python/lib/io/file_io.py", line 290, in file_exists_v2
    _pywrap_file_io.FileExists(compat.path_to_bytes(path))
tensorflow.python.framework.errors_impl.AbortedError: All 10 retry attempts failed. The last failure: Error executing an HTTP request: HTTP response code 301 with body '<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="https://www.googleapis.com/storage/v1/b/tfds-data/o/dataset_info%2Fmnist%2F3.0.1%2Fdataset_info.json?fields=size%2Cgeneration%2Cupdated">here</A>.
</BODY></HTML>
'
	 when reading metadata of gs://tfds-data/dataset_info/mnist/3.0.1/dataset_info.json
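
To check whether the object itself is reachable without going through TensorFlow, one option is to rebuild the metadata URL that appears in the 301 body and request it directly. The following is only a minimal sketch: `gcs_metadata_url` and `fetch_metadata` are hypothetical helper names, the host and `fields` values are copied from the redirect target in the log, and the anonymous request assumes the object is publicly readable.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Host and path prefix copied from the redirect target in the 301 body.
GCS_API_ROOT = "https://www.googleapis.com/storage/v1"


def gcs_metadata_url(gs_path, fields=("size", "generation", "updated")):
    """Build the JSON API metadata URL for a gs://bucket/object path."""
    prefix = "gs://"
    if not gs_path.startswith(prefix):
        raise ValueError(f"expected a gs:// path, got {gs_path!r}")
    bucket, _, object_name = gs_path[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError(f"expected gs://bucket/object, got {gs_path!r}")
    # The object name is a single URL path segment, so "/" is percent-encoded.
    encoded_object = quote(object_name, safe="")
    encoded_fields = quote(",".join(fields), safe="")
    return f"{GCS_API_ROOT}/b/{bucket}/o/{encoded_object}?fields={encoded_fields}"


def fetch_metadata(gs_path, timeout=10.0):
    """Anonymous GET of the metadata (assumes a publicly readable object)."""
    with urlopen(gcs_metadata_url(gs_path), timeout=timeout) as response:
        return json.load(response)
```

If `fetch_metadata("gs://tfds-data/dataset_info/mnist/3.0.1/dataset_info.json")` succeeds on a machine where `tf.io.gfile.exists` fails, that would suggest the problem is in how TensorFlow issues the request rather than in network access to the bucket.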

I can't reproduce this issue on either Colab or a GCP VM, but it fails consistently locally on my Mac, inside a python:3.11 Docker container, on GitHub Actions, and inside a Kaggle notebook. The same code works fine with TensorFlow 2.16, so I don't think this is due to my local setup.
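
Since the call hangs for a while before failing, reproducing it in CI or a container is easier with a hard time limit. Below is a rough sketch that runs the repro in a child process and reports whether it passed, failed, or timed out. `run_with_timeout` and `REPRO_SNIPPET` are hypothetical names, and the timeout value is an assumption to be tuned per environment.

```python
import subprocess
import sys

# The same call as in the report, run in a child process so a hang can be cut off.
REPRO_SNIPPET = (
    "import tensorflow as tf\n"
    "print(tf.io.gfile.exists("
    "'gs://tfds-data/dataset_info/mnist/3.0.1/dataset_info.json'))\n"
)


def run_with_timeout(snippet, timeout_s):
    """Run a Python snippet in a subprocess.

    Returns (status, output) where status is "ok", "failed", or "timeout"
    and output is the tail of the combined stdout and stderr.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising.
        return "timeout", ""
    status = "ok" if proc.returncode == 0 else "failed"
    return status, (proc.stdout + proc.stderr)[-2000:]
```

Calling `run_with_timeout(REPRO_SNIPPET, timeout_s=300)` in each environment gives one comparable status per platform. The 300 second limit is only a guess at how long the retries take.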

It also seems like other people are running into this with the latest TF nightly: tensorflow/datasets#5360

Would be great to get this fixed before the next stable release.

lgeiger (Contributor, Author) commented Jun 25, 2024

Looks like this issue is indeed caused by upstream TF: tensorflow/tensorflow#69789 (comment)

lgeiger closed this as completed Jun 25, 2024