All of the algorithms with offline-to-online finetuning log training regret (the regret accrued by the online interactions used for training) under both `train/regret` and `eval/regret`. So we effectively report only train regret, which differs from the Cal-QL paper, where the authors report eval regret. Reporting eval regret is strange in any case, since the quantity we actually want to minimize in practice is train regret, so this bug is not critical, but it should be kept in mind. I will fix it, but without rerunning all of the algorithms due to compute limitations (we may rerun them later).
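For reference, a minimal sketch of what the fix could look like, assuming a wandb-based logging loop similar to the ones in this repo; the project name, score values, and variable names are all illustrative, not the repo's actual code:

```python
import wandb

# Hedged sketch of the intended fix: regret from the online interactions
# used for training is logged under train/regret only, instead of being
# duplicated under eval/regret. All names below are illustrative.

wandb.init(project="corl-finetune", mode="disabled")  # disabled for a dry run

optimal_score = 1.0                    # best achievable normalized score
online_scores = [0.2, 0.4, 0.7, 0.9]   # toy per-episode normalized scores
train_regret = 0.0

for step, score in enumerate(online_scores):
    train_regret += optimal_score - score  # cumulative training regret
    wandb.log({"train/regret": train_regret}, step=step)
    # eval/regret would instead come from separate evaluation rollouts,
    # as reported in the Cal-QL paper, rather than aliasing this value.
```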