Dataset Preparation
“The model is only as good as the data.” - Daniel Puratich?
Because the AEAC Student UAS Competition CONOPS changes every year, a new model must be trained every year as well. Unfortunately, this means a lot of manual labour.
Data Collection
Software
See: Image Collection Repository
Hardware
Flight computer: A Raspberry Pi with a 32GB microSD card is sufficient
Camera: The same one used at competition (e.g. $200 CV Camera)
Flight
Find the drone height in the appropriate AEAC document: SysInt
Data Cleaning
Once images have been collected, every image containing features of interest (e.g. landing pads) is kept, and the rest are discarded. This work can be parallelized by uploading the images to a folder on OneDrive and creating a signup sheet so that each member processes a few hundred images at a time.
Members can use the Tiles view to see the images without having to click through each one individually
Images can be deleted in the OneDrive webpage
2022-2023 instructions: Landing Pad Data Cleaning Instructions
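The signup sheet can be prepared by splitting the image list into fixed-size batches. The following is a minimal sketch, not part of the existing tooling: the function names and the batch size of 300 are assumptions chosen to match "a few hundred images at a time".

```python
def make_batches(image_names, batch_size=300):
    """Split image file names into consecutive batches (batch_size is an assumed value)."""
    names = sorted(image_names)
    return [names[i:i + batch_size] for i in range(0, len(names), batch_size)]


def signup_rows(batches):
    """Build one (batch number, first image, last image, image count) row per batch."""
    return [(number, batch[0], batch[-1], len(batch))
            for number, batch in enumerate(batches, start=1)]
```

Each row can be pasted into the signup sheet so that a member claims a range of file names rather than individual images.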
Data Labelling
Once images have been cleaned, they are downloaded and then uploaded to Roboflow for labelling. A Roboflow account can be created with a non-existent email address (e.g. warg-0a@warg.warg). Each Roboflow project can have at most 3 collaborators, so multiple projects can be created if more people are labelling. It is unclear whether simultaneous logins cause accounts to be banned, so it is safer to create new accounts for every project and to keep a signup sheet showing which member is currently logged into each account.
New Roboflow project:
Create a Roboflow account (e.g. warg-0a@warg.warg)
In the main account, create a new workspace (e.g. warglandingpad)
In the workspace, create a new project (e.g. warglandingpad)
Typically the use case requires Object Detection
Add the first class; additional classes are added in further instructions
Log out
Create 2 additional Roboflow accounts (e.g. warg-0b@warg.warg, warg-0c@warg.warg)
Log out
In the main account, navigate to the workspace (left bar)
Invite the additional Roboflow accounts (top right)
Log out
In the additional Roboflow accounts, accept the invitation (bell icon in top right)
Log out
In the main account, navigate to the workspace (left bar)
Navigate to the project (bottom centre)
Uploading:
Upload the images (left bar)
The images are cached locally in the browser (i.e. in RAM), so upload them in several smaller batches and give each batch a descriptive name
Save and Continue
Repeat until the images for the project are uploaded
Classes:
Navigate to Annotate (left bar)
Assign a batch to the main account
Click on an image
Use the bounding box tool (right bar) to draw a box anywhere
Add the new class
Repeat the previous two steps (draw a box, add the new class) until all classes have been added
Click on the project name (top left)
Click on the 3 dots (left bar) and navigate to Project Settings
Check Lock Annotation Classes
Return to the image that was just labelled
Delete the boxes
Repeat for as many projects as are required.
2022-2023 instructions: Landing Pad Data Labelling Instructions
When a batch is labelled, verify the labels are correctly placed by sampling a few images (scroll down and randomly pick an image, repeat). Once the batch is verified, add that batch to the dataset with the default 70/20/10 split.
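If a list of the batch's file names is available, the images to spot-check can be chosen programmatically instead of by scrolling. This is only a sketch: the function name, the default sample size of 10, and the optional seed are assumptions, not part of the documented process.

```python
import random


def pick_samples(image_names, k=10, seed=None):
    """Pick up to k distinct image names at random for manual label verification."""
    names = list(image_names)
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    return rng.sample(names, min(k, len(names)))
```

Passing a seed lets two members check the same images; leaving it as None gives a different selection on every call.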
Data Augmentation
Once all images have been labelled, navigate to Dataset in the left bar and ensure that the split is as close to 70/20/10 as possible (a few images of variance is fine). Then, navigate to Generate.
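The split check is simple arithmetic on the three image counts shown by Roboflow. A minimal sketch follows; the tolerance of 2 percentage points is an assumed stand-in for "a few images of variance" and should be adjusted to the dataset size.

```python
def split_fractions(train, valid, test):
    """Return the (train, valid, test) fractions of the total image count."""
    total = train + valid + test
    if total == 0:
        raise ValueError("dataset is empty")
    return train / total, valid / total, test / total


def split_is_close(train, valid, test, target=(0.7, 0.2, 0.1), tolerance=0.02):
    """True if every fraction is within `tolerance` (assumed value) of the target split."""
    fractions = split_fractions(train, valid, test)
    return all(abs(f - t) <= tolerance for f, t in zip(fractions, target))
```

For example, counts of 703/198/99 pass the check, while 500/300/200 do not.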
Generate augmented version:
Source Images: leave default
Train/Test Split: leave default
Preprocessing: none
Augmentation: Flip, Hue, Saturation, Brightness.
2022-2023 used:
Flip: Horizontal, Vertical
Hue: -27° and +27°
Saturation: -75% and +75%
Brightness: -25% and +0%
Multiplier: 3x (use a higher multiplier if one is available)
Click Generate. It takes a few minutes for the version to be generated.
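To estimate how large the generated version will be, the multiplier can be applied to the split counts. The sketch below rests on two assumptions about Roboflow's behaviour that this page does not state: augmentation is applied to the training set only, and the multiplier gives an upper bound on the number of training images. The function name is hypothetical.

```python
def augmented_counts(train, valid, test, multiplier=3):
    """Estimate the upper bound of images per split after augmentation.

    Assumption: only the training set is multiplied; valid and test are unchanged.
    """
    return {"train": train * multiplier, "valid": valid, "test": test}
```

Under these assumptions, a 700/200/100 split with a 3x multiplier yields at most 2100 training images and 2400 images in total.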
Download version:
Click on the version to download under Versions
Click on Export Dataset (left)
Export:
Format: YOLOv8
Select download zip to computer
Uncheck Also train a model etc.
Click Continue
Download the zip file
Data Recombination
Extract all zip files to the same location. It is fine for the README and .yaml files to overwrite each other: the READMEs can be ignored, and the .yaml files are almost identical (they differ only in the Roboflow section).
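With several projects there are several zip files, so the extraction can be scripted. The following is a sketch using Python's standard zipfile module; the function name and the two directory arguments are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_all(zip_dir, output_dir):
    """Extract every .zip in zip_dir into output_dir and return the archive names."""
    output = Path(output_dir)
    output.mkdir(parents=True, exist_ok=True)
    archives = sorted(Path(zip_dir).glob("*.zip"))
    for archive in archives:
        with zipfile.ZipFile(archive) as zf:
            # Same-named files (README, .yaml) are overwritten by later archives.
            zf.extractall(output)
    return [archive.name for archive in archives]
```

Because archives are processed in sorted order, the README and .yaml files that remain are the ones from the last archive alphabetically.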
Ideally, the image order is randomized. However, each label file must still correspond to its image after renaming.
Roboflow appends a hash to the end of each file name. Copying that hash to the start of the name is sufficient: it shuffles the alphabetical order while keeping image and label names matched. Bulk Rename Utility is used to rename files: https://www.bulkrenameutility.co.uk/
Bulk Rename Utility settings:
Move/Copy Parts (6): Copy last n, 32, To start, 1, Sep.: . (dot)
Apply the rename to all files (test, train, valid).
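For members without Bulk Rename Utility (it is a Windows program), the same rename can be approximated in Python. This is a sketch, not the documented method: it assumes each file name ends with a 32-character hash immediately before the extension, and that the separator dot goes between the copied hash and the original name. The function names are hypothetical.

```python
from pathlib import Path

HASH_LENGTH = 32  # matches the "Copy last n, 32" setting


def hashed_name(file_name):
    """Return file_name prefixed with the last 32 characters of its stem and a dot."""
    path = Path(file_name)
    if len(path.stem) < HASH_LENGTH:
        raise ValueError(f"{file_name} is too short to contain a hash")
    return f"{path.stem[-HASH_LENGTH:]}.{path.name}"


def rename_dataset(dataset_dir, splits=("test", "train", "valid")):
    """Rename every file under the split directories; return how many were renamed."""
    renamed = 0
    for split in splits:
        # Materialize the file list first so renaming does not disturb iteration.
        files = sorted(p for p in (Path(dataset_dir) / split).rglob("*") if p.is_file())
        for path in files:
            new_name = hashed_name(path.name)
            if path.name.startswith(new_name[:HASH_LENGTH + 1]):
                continue  # already prefixed, e.g. from an earlier run
            path.rename(path.with_name(new_name))
            renamed += 1
    return renamed
```

Images and labels that share a base name receive the same prefix, so the correspondence between them is preserved.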
The 3 directories now contain the dataset and are ready for training.
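Before training, it may be worth confirming that every image still has a matching label file. The sketch below assumes each of the 3 directories contains an images folder and a labels folder with .txt label files, as in a typical YOLOv8 export; the function name and the list of image extensions are assumptions.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def unmatched_files(split_dir):
    """Return (images without a label, labels without an image) as sorted stem lists."""
    split = Path(split_dir)
    images = {p.stem for p in (split / "images").iterdir()
              if p.suffix.lower() in IMAGE_SUFFIXES}
    labels = {p.stem for p in (split / "labels").glob("*.txt")}
    return sorted(images - labels), sorted(labels - images)
```

Running it on each of test, train, and valid should return two empty lists if the recombination and rename went as intended.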