Releases: tensorflow/datasets
v1.2.0
Features
- Add shuffle_files argument to the tfds.load function. The semantics are the same as in the builder.as_dataset function, which for now means that by default, files will be shuffled for the TRAIN split, and not for other splits. The default behaviour will change to always be False in the next release.
- Most datasets now support the new S3 API (documentation)
- Support for uint16 PNG images
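As a minimal sketch of the shuffle_files argument described above, assuming the tensorflow_datasets package is installed: passing the argument explicitly avoids relying on the per-split default, which these notes say will change. The helper name load_split is hypothetical and only used for illustration.

```python
def load_split(name, split, shuffle_files=False):
    """Load one split with an explicit file-shuffling choice.

    Hypothetical helper: shuffle_files is always passed explicitly so the
    result does not depend on the library's per-split default.
    """
    # Imported lazily so that defining the helper does not require the package.
    import tensorflow_datasets as tfds

    return tfds.load(name, split=split, shuffle_files=shuffle_files)
```

For example, load_split("mnist", "train", shuffle_files=True) would request shuffled input files for the training split; the dataset name here is only an illustration, and the first call downloads and prepares the data.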
Misc
- Fix crash while shuffling on Windows
- Various documentation improvements
New datasets
- AFLW2000-3D
- Amazon_US_Reviews
- binarized_mnist
- BinaryAlphaDigits
- Caltech Birds 2010
- Coil100
- DeepWeeds
- Food101
- MIT Scene Parse 150
- RockYou leaked password
- Stanford Dogs
- Stanford Online Products
- Visual Domain Decathlon
v1.1.0
Features
- Add in_memory option to cache small datasets in RAM.
- Better sharding, shuffling and sub-split
- It is now possible to add arbitrary metadata to tfds.core.DatasetInfo which will be stored/restored with the dataset. See tfds.core.Metadata.
- Better proxy support, possibility to add certificate
- Add decoders kwargs to override the default feature decoding (guide).
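The decoders kwarg mentioned above can be sketched as follows, assuming the tensorflow_datasets package is installed and that the dataset has an image feature stored under the key "image". The helper name load_without_image_decoding and the default key are hypothetical; tfds.decode.SkipDecoding is used here as one way to override the default decoding, leaving the image as encoded bytes.

```python
def load_without_image_decoding(name, split, image_key="image"):
    """Load a dataset while skipping the default decoding of one feature.

    Hypothetical helper: image_key is assumed to name an image feature
    of the dataset; that feature is returned still encoded.
    """
    # Imported lazily so that defining the helper does not require the package.
    import tensorflow_datasets as tfds

    return tfds.load(
        name,
        split=split,
        # Override the default decoder for a single feature only.
        decoders={image_key: tfds.decode.SkipDecoding()},
    )
```

Skipping decoding can be useful when images are filtered or cropped before being decoded, so that discarded examples are never decoded at all.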
New datasets
More datasets added:
- downsampled_imagenet
- patch_camelyon
- coco 2017 (with and without panoptic annotations)
- uc_merced
- trivia_qa
- super_glue
- so2sat
- snli
- resisc45
- pet_finder
- mnist_corrupted
- kitti
- eurosat
- definite_pronoun_resolution
- curated_breast_imaging_ddsm
- clevr
- bigearthnet
v1.0.2
- Add Apache Beam support
- Add direct GCS access for MNIST (with tfds.load('mnist', try_gcs=True))
- More datasets added
- Option to turn off tqdm bar (tfds.disable_progress_bar())
- Subsplits no longer depend on the number of shards (#292)
- Various bug fixes
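The two calls named in this list can be combined in a short sketch, assuming the tensorflow_datasets package is installed and the public GCS bucket is reachable. The helper name load_mnist_quietly is hypothetical.

```python
def load_mnist_quietly():
    """Load MNIST from GCS when possible, without the tqdm progress bar.

    Hypothetical helper combining the two options listed in these notes.
    """
    # Imported lazily so that defining the helper does not require the package.
    import tensorflow_datasets as tfds

    # Turn off the tqdm bar before any download or preparation step runs.
    tfds.disable_progress_bar()
    # try_gcs=True asks the library to read the prepared dataset from GCS.
    return tfds.load("mnist", try_gcs=True)
```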
Thanks to all external contributors for raising issues, giving feedback, and submitting pull requests.