better handling of invalid files in open_mfdataset #6736
Comments
+1. You could make this change to
would be quite helpful!
This I'm not sure about because a user wouldn't know if they were missing some data in the middle...
My vote is to have both: a warning, and an option to fill missing data with NaNs. My use case: I have an archive of 15 years of monthly forecasts. For one month, one of the ensemble members is missing. I am converting a binary format to zarr. The code is:
Currently, my only option is to remove the remaining ensemble member data files before processing. Since I have to use a custom backend (based on https://github.com/aurghs/xarray-backend-tutorial/tree/main), I tried to add code to return an array filled with NaNs when
Contributions welcome!
Is your feature request related to a problem?
Suppose I'm trying to read a large number of netCDF files with open_mfdataset.
Now suppose that one of those files is for some reason incorrect: for instance, there was a problem during the creation of that particular file, and its file size is zero, or it is not valid netCDF. The file exists, but it is invalid.
Currently open_mfdataset will raise an exception with the message:
ValueError: did not find a match in any of xarray's currently installed IO backends
As far as I can tell, there is currently no way to identify which one(s) of the files being read is the source of the problem. If there are several hundreds of those, finding the problematic files is a task by itself, even though xarray probably knows them.
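Until something like this is built in, one possible workaround is to try opening each file on its own before handing the list to open_mfdataset. The sketch below is only an illustration, not part of xarray: find_invalid_files is a hypothetical helper name, and the opener parameter is an assumption added so that the function can be tried without real netCDF files. By default it falls back to xarray.open_dataset.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def find_invalid_files(
    paths: Iterable[str],
    opener: Optional[Callable[[str], object]] = None,
) -> Tuple[List[str], Dict[str, Exception]]:
    """Open each path individually and report which ones fail.

    Hypothetical helper (not an xarray API). Returns the list of paths
    that opened cleanly and a dict mapping each failing path to the
    exception it raised.
    """
    if opener is None:
        # Imported lazily so the helper can be exercised with a stub opener.
        import xarray as xr

        opener = xr.open_dataset

    valid: List[str] = []
    invalid: Dict[str, Exception] = {}
    for path in paths:
        try:
            ds = opener(path)
        except Exception as exc:  # broad on purpose: this is a diagnostic pass
            invalid[path] = exc
        else:
            # Release the file handle if the returned object has close().
            close = getattr(ds, "close", None)
            if callable(close):
                close()
            valid.append(path)
    return valid, invalid
```

With real data, one could call valid, invalid = find_invalid_files(sorted(glob.glob("data/*.nc"))), print the keys of invalid to see the offending files, and then pass valid to xarray.open_mfdataset (the "data/*.nc" pattern is a placeholder). This opens every file twice, so it costs extra I/O on large archives, and silently skipping files can leave gaps in the middle of the combined dataset.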
Describe the solution you'd like
It would be most useful to this particular user if the error message could somehow identify the file(s) responsible for the exception.
Apart from better reporting, I would find it very useful if I could pass to open_mfdataset some kind of argument that would make it ignore invalid files altogether (ignore_invalid=False comes to mind).
Describe alternatives you've considered
No response
Additional context
No response