Add train, val, test folder paths in data.yaml at save_data_yaml() #1422
base: develop
Conversation
Hi @xaristeidou 👋🏻! Thanks for staying active in the supervision repo! As for the PR you opened, the main problem I see is that YOLOv8 shares a single [...]
@SkalskiP Hello there 👋🏻! Yes indeed, if we want to create train, valid, and test subset folders for dataset training, we have to run the process once for each subset. Nevertheless, all 3 subset folder datasets contain the same [...] I think there are 3 scenarios in that case: [...]
@xaristeidou, One idea I had is that [...]
@SkalskiP Well, not too complicated. We could modify [...] The function should modify the [...]
@xaristeidou, wanna try to implement the PoC of this solution?
@SkalskiP Yes! I will be back when it is ready.
@SkalskiP I have added the new changes. Here is a Colab notebook for easy testing: https://colab.research.google.com/drive/1BL7c2ycXkuCrEE5JOqh7u7Zf1kJzgb43?usp=sharing
@SkalskiP Did you manage to take a look at the newly committed changes and test with the notebook?
Description
While developing the notebook scheduled in #1388, I used the sv.DetectionDataset().as_yolo() method, which calls the save_data_yaml() function in the backend to create the data.yaml file needed for the dataset.
According to the way the YOLO model is set up for training, and to its documentation, the model requires the train and val arguments in the data.yaml file; the test argument is not required but can also be passed if a test dataset exists. When trying to run model.train(), the following error is raised:
SyntaxError: /content/dataset/data.yaml 'train:' key missing ❌. 'train' and 'val' are required in all data YAMLs.
Therefore, in save_data_yaml(), besides the nc and names arguments, we should also export the train, val, and test paths so that the file is ready for running model.train(). I think we should export the default paths as follows:
If someone's working directory differs from the root of the folder containing data.yaml, they should change these paths manually. At least it will be easier to debug and modify an existing path if needed than to add the missing arguments to the yaml file.
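A minimal sketch of the idea, not the PR's actual diff: it writes a data.yaml that carries train and val (and optionally test) alongside nc and names. The helper name write_data_yaml_sketch and the example paths train/images and valid/images used below are hypothetical placeholders, not the defaults proposed in this PR and not supervision's implementation. To stay dependency-free, the sketch serializes values with json.dumps, whose output for strings and lists is also valid YAML; a real implementation would more likely use a YAML library.

```python
import json
from pathlib import Path
from typing import List, Optional


def write_data_yaml_sketch(
    yaml_path: str,
    classes: List[str],
    train: str,
    val: str,
    test: Optional[str] = None,
) -> str:
    """Hypothetical helper (NOT supervision's save_data_yaml()).

    Writes a data.yaml containing the 'train' and 'val' keys that the
    error message above reports as required, plus 'nc' and 'names'.
    The 'test' key is written only when a test path is supplied.
    """
    # json.dumps emits double-quoted strings and [..] lists, both of
    # which are valid YAML, so no YAML library is needed for this sketch.
    lines = [
        f"train: {json.dumps(train)}",
        f"val: {json.dumps(val)}",
    ]
    if test is not None:
        lines.append(f"test: {json.dumps(test)}")
    lines.append(f"nc: {len(classes)}")
    lines.append(f"names: {json.dumps(classes)}")

    text = "\n".join(lines) + "\n"
    target = Path(yaml_path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return text
```

For example, write_data_yaml_sketch("dataset/data.yaml", ["cat", "dog"], train="train/images", val="valid/images") would produce a file with train, val, nc, and names entries. The paths are passed in by the caller rather than hard-coded, since the right values depend on the working directory, as noted above.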
List any dependencies that are required for this change.
None
Please delete options that are not relevant.
How has this change been tested, please provide a testcase or example of how you tested the change?
Using sv.DetectionDataset.as_yolo() exports the data.yaml file with the prerequisite train, val, test paths.
Any specific deployment considerations
None
Docs