This is the code release for the TMLR publication "Feature Distillation Improves Zero-Shot Transfer from Synthetic Images" by Niclas Popp, Jan Hendrik Metzen and Matthias Hein. The following figure summarizes the experimental setup:
The codebase consists of three components:
- Domain-agnostic Distillation
- Synthetic Data Generation
- Domain-specific Distillation
The required packages are listed in the requirements.txt file. The code was tested on NVIDIA V100 and H100 GPUs.
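For instance, the dependencies can be installed into an existing Python environment with pip:

```bash
pip install -r requirements.txt
```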
For an example file to run domain-agnostic distillation together with the available hyperparameters, see example_domain_agnostic.sh. The code is built to use the webdataset dataloader together with .tar files. For details on how to set up the data for this kind of dataloader, see here.
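As a rough sketch of the expected data format (the directory layout and file names below are hypothetical, not part of this repository): a webdataset shard is a plain .tar archive in which all files belonging to one sample share the same basename, so packing a folder of image/label pairs into a shard can be as simple as:

```bash
# Hypothetical layout: one image and one label file per sample, where files
# of the same sample share a basename (this basename becomes the sample key):
#   data/000000.jpg  data/000000.cls  data/000001.jpg  data/000001.cls  ...
mkdir -p shards
# Sorting by name keeps the files of each sample adjacent in the archive,
# which the webdataset loader relies on when grouping samples (GNU tar >= 1.28).
(cd data && tar --sort=name -cf ../shards/train-000000.tar *.jpg *.cls)
```

The webdataset package also provides Python helpers such as ShardWriter for writing shards programmatically.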
The synthetic data generation process can be started as shown in the example_data_generation.sh file. For different domains, select the corresponding dataset option.
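For example, assuming the hyperparameters and the dataset option have already been set inside the script:

```bash
# Generate synthetic images for the domain selected via the dataset option
# configured inside the example script.
bash example_data_generation.sh
```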
The final step of our framework is domain-specific distillation. An example together with the available options is given in the file example_domain_specific.sh. This step requires the final model checkpoint from step 1 and the synthetic data from step 2.
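A minimal invocation could then look as follows, assuming the paths to the checkpoint from step 1 and the synthetic data from step 2 have been set inside the script:

```bash
# Domain-specific distillation using the checkpoint from step 1 and the
# synthetic images from step 2 (both configured inside the example script).
bash example_domain_specific.sh
```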