Training an Object Detection Model¶
Model training requires three main steps:
- Download the data to be used for training, validation, and testing the model
- Prepare the data (transform, convert to YOLO format, and split)
- Initiate the training
For our system, this involves the mbari-aidata and deepsea-ai Python packages. An AWS account is required to use the deepsea-ai package.
Tip
Get the project YAML file from the project page and install the aidata tool; see the installation instructions.
Here is a sequence of commands, by way of example:
Download¶
```shell
aidata download dataset \
  --config https://docs.mbari.org/internal/ai/projects/config/config_uav.yml \
  --base-path $PWD \
  --labels "Surfboard","Batray","Plume","Sea_Lion","Bird","Seal","Wave","Foam","Egregia","Reflectance","Buoy","Shark","Person","Mooring","Otter","Boat","Kelp","Mola","Secci_Disc","Jelly","Whale","RIB" \
  --voc \
  --token $TATOR_TOKEN
```
For more information on the download command, see the aidata download documentation.
Prepare¶
Transform the data, making crops of the images with overlap:

```shell
aidata transform voc --base-path $PWD/Baseline --crop-size 1280 --crop-overlap 0.5
```

Convert the data to YOLO format:

```shell
aidata transform voc-to-yolo --base-path $PWD/Baseline/transformed
```

Split the data into train/validate/test sets:

```shell
aidata transform split -i $PWD/Baseline/transformed -o $PWD/BaselineSplit
```
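Before moving on, it can help to confirm the split produced the layout the later steps assume: images/ and labels/ directories, each with train/val/test subsets. A minimal sketch of such a check; the helper name is illustrative, not part of aidata:

```python
from pathlib import Path
import tempfile

def check_split_layout(root: str) -> list[str]:
    """Return any expected split subdirectories missing under root."""
    expected = [
        Path(root) / kind / split
        for kind in ("images", "labels")
        for split in ("train", "val", "test")
    ]
    return [str(p) for p in expected if not p.is_dir()]

# Demo against a freshly created layout (all six directories present):
root = tempfile.mkdtemp()
for kind in ("images", "labels"):
    for split in ("train", "val", "test"):
        (Path(root) / kind / split).mkdir(parents=True)
print(check_split_layout(root))  # -> []
```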
Train in AWS SageMaker with deepsea-ai 🚀¶
Before training, see the instructions on preparing your data.
If your training images have been scaled to 1280x1280, use yolov5x6, e.g.
Info
Be sure your --batch-size is a multiple of the number of available GPUs, e.g. --batch-size 1, 2, 3, or 4 for ml.p3.2xlarge (1 GPU); --batch-size 4 for ml.p3.8xlarge (4 GPUs); and --batch-size 8 or 16 for ml.p3.16xlarge (8 GPUs).
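The rule in the note can be sketched as a quick divisibility check; the GPU counts per instance type are assumptions taken from the note, and the helper is illustrative:

```python
# GPUs per SageMaker instance type (counts assumed from the note above)
GPUS = {"ml.p3.2xlarge": 1, "ml.p3.8xlarge": 4, "ml.p3.16xlarge": 8}

def batch_size_ok(batch_size: int, instance_type: str) -> bool:
    """A batch size is valid when it splits evenly across the GPUs."""
    return batch_size % GPUS[instance_type] == 0

print(batch_size_ok(16, "ml.p3.16xlarge"))  # -> True
print(batch_size_ok(12, "ml.p3.16xlarge"))  # -> False (12 is not a multiple of 8)
```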
```shell
deepsea-ai train --model yolov5x6 --instance-type ml.p3.16xlarge \
  --config 901902_uavs.ini \
  --labels $PWD/BaselineSplit/labels.tar.gz \
  --images $PWD/BaselineSplit/images.tar.gz \
  --label-map $PWD/Baseline/labels.txt \
  --input-s3 s3://901902-new-starting-checkpoint/megafish_ROV_weights.pt \
  --output-s3 s3://901902-new-model-checkpoints/ \
  --resume True \
  --epochs 60 \
  --batch-size 16
```
Info
Before running the train command, be sure to check that there is no yolov5x6 folder in the output-s3 bucket, since the training job will override anything in that directory. Outputs from previous training runs should be moved to a new folder with a different name.

Info
Before running the train command, be sure to check that there is no training folder in the input-s3 bucket, since the training job will use that directory for the training images, labels, and labels text file. Inputs from previous training runs should be moved to a new folder with a different name.
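The pre-flight checks in the two notes can be sketched as a small helper; the function is illustrative, not part of deepsea-ai, and the bucket listing (e.g. via boto3) is left as a comment since it needs AWS credentials:

```python
def prefix_in_use(keys: list[str], prefix: str) -> bool:
    """True if any existing S3 key sits under the given folder prefix."""
    prefix = prefix.rstrip("/") + "/"
    return any(key.startswith(prefix) for key in keys)

# In practice the keys would come from the bucket, e.g. with boto3:
#   import boto3
#   resp = boto3.client("s3").list_objects_v2(
#       Bucket="901902-new-model-checkpoints", Prefix="yolov5x6/")
#   keys = [obj["Key"] for obj in resp.get("Contents", [])]
keys = ["old-run/weights.pt", "yolov5x6/last.pt"]
print(prefix_in_use(keys, "yolov5x6"))  # -> True: move these outputs first
print(prefix_in_use(keys, "training"))  # -> False: safe to start
```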
Train yolo11x model in Google Colab environment 🚀¶
Documentation on training a yolo11x detector model with UAV images on an A100 instance in Colab. These steps were run on icefish, in the train-drone-model directory.
Setup:

```shell
pyenv shell 3.11.6
python3 -m venv train-drone
source train-drone/bin/activate
pip install mbari-aidata
```

Use the YAML file from https://docs.mbari.org/internal/ai/projects/config/config_uav.yml.
Set TATOR_TOKEN¶
```shell
export TATOR_TOKEN=<blahblahblah> # token value from TATOR credentials
echo $TATOR_TOKEN
```
Download data¶
```shell
aidata download dataset \
  --config https://docs.mbari.org/internal/ai/projects/config/config_uav.yml \
  --base-path $PWD/Sept232025 --voc \
  --labels "Batray","Bird","Boat","Cement_Ship","Egregia","Fish","Jelly","Kayak","Kelp","Mola","Mooring_Buoy","Otter","Person","Pinniped","Secci_Disc","Shark","Surfboard","Velella_velella","Velella_velella_raft","Whale" \
  --single-class "object" \
  --verified \
  --token $TATOR_TOKEN \
  --disable-ssl-verify
```
Logs are written to ~/mbari_aidata/logs/.
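To find the most recent log after a run, something like the following works (the directory comes from the note above; the helper name is illustrative):

```python
from pathlib import Path
from typing import Optional

def latest_log(log_dir: str) -> Optional[Path]:
    """Return the most recently modified file in the log directory."""
    files = [p for p in Path(log_dir).expanduser().iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)

# Usage (path from the note above):
# print(latest_log("~/mbari_aidata/logs/"))
```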
Transform data¶
Transform the data, making crops of the images with overlap:

```shell
# transform.sh
aidata transform voc --base-path $PWD/Sept232025 --resize 640 --crop-size 640 --crop-overlap 0.5 --min-visibility 0.0 --min-dim 20
```

Convert the data to YOLO format:

```shell
# voc_to_yolo.sh
aidata transform voc-to-yolo --base-path $PWD/Sept232025/transformed
```

Then split the data into train/validate/test sets; see the transform command for more details:

```shell
# split.sh
aidata transform split -i $PWD/Sept232025/transformed -o $PWD/Sept232025split
```
Train yolo11x model¶
Upload or create the data.yaml file:

```yaml
train: /content/datasets/train/images
val: /content/datasets/val/images
test: /content/datasets/test/images
nc: 1
names: ['object']
```
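A quick sanity check that nc agrees with names catches a common data.yaml mistake before training; the helper below is illustrative and works on the parsed dict form to stay dependency-free:

```python
def check_dataset_config(cfg: dict) -> None:
    """Raise ValueError if required keys are missing or nc disagrees with names."""
    for key in ("train", "val", "nc", "names"):
        if key not in cfg:
            raise ValueError(f"data.yaml is missing '{key}'")
    if cfg["nc"] != len(cfg["names"]):
        raise ValueError(f"nc={cfg['nc']} but {len(cfg['names'])} names are listed")

# The single-class config above passes silently:
check_dataset_config({
    "train": "/content/datasets/train/images",
    "val": "/content/datasets/val/images",
    "test": "/content/datasets/test/images",
    "nc": 1,
    "names": ["object"],
})
```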
Train model from COCO weights¶
In the Colab notebook, upload the data to Google Drive. Here the folder on Google Drive is named "uavs".
Install YOLO11 via Ultralytics:

```python
%pip install ultralytics supervision roboflow -q

import ultralytics
ultralytics.checks()
```
Then mount Google Drive and create the working folders:

```python
# Allow access to personal Google Drive and add new folders
import os
from google.colab import drive

# Connect Google Drive; this will prompt for authorization.
drive.mount("/content/drive", force_remount=True)

# Create the uavs folder if it doesn't exist.
folders = ["uavs/"]
for folder in folders:
    path = "/content/drive/MyDrive/" + folder
    if not os.path.exists(path):
        os.mkdir(path)
```
Set up HOME, copy in the compressed data files, and unpack them:

```python
import os

HOME = os.getcwd()
print(HOME)

!mkdir -p {HOME}/datasets
%cd {HOME}/datasets

uavs_folder = "/content/drive/MyDrive/uavs/"
!mkdir -p /content/datasets/savedir/
!cp "/content/drive/MyDrive/uavs/images.tar.gz" "/content/datasets/savedir/"
!cp "/content/drive/MyDrive/uavs/labels.tar.gz" "/content/datasets/savedir/"
!tar xf /content/datasets/savedir/images.tar.gz --directory /content/datasets/savedir/
!tar xf /content/datasets/savedir/labels.tar.gz --directory /content/datasets/savedir/
```
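The same unpacking can be done with the stdlib tarfile module, which also reports how many files an archive contains; the unpack helper is illustrative:

```python
import tarfile
from pathlib import Path

def unpack(archive: str, dest: str) -> int:
    """Extract a .tar.gz archive into dest and return the number of members."""
    Path(dest).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        members = tar.getmembers()
        tar.extractall(dest)
    return len(members)

# Usage with the paths above:
# unpack("/content/datasets/savedir/images.tar.gz", "/content/datasets/savedir/")
```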
Move the data to the directory structure YOLO expects:

```python
# Make the directories that yolo11 expects
!mkdir -p /content/datasets/train/images/ /content/datasets/train/labels/
!mkdir -p /content/datasets/test/images/ /content/datasets/test/labels/
!mkdir -p /content/datasets/val/images/ /content/datasets/val/labels/

# Get the data.yaml file
!cp "/content/drive/MyDrive/uavs/data.yaml" "/content/datasets/data.yaml"
!ls /content/datasets/

# Move the data to the expected directories
!cp -r "/content/datasets/savedir/images/train/" "/content/datasets/train/images/"
!cp -r "/content/datasets/savedir/labels/train/" "/content/datasets/train/labels/"
!cp -r "/content/datasets/savedir/images/test/" "/content/datasets/test/images/"
!cp -r "/content/datasets/savedir/labels/test/" "/content/datasets/test/labels/"
!cp -r "/content/datasets/savedir/images/val/" "/content/datasets/val/images/"
!cp -r "/content/datasets/savedir/labels/val/" "/content/datasets/val/labels/"
!ls /content/datasets/
```
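Before launching training, it is worth confirming that every image has a matching YOLO label file; a minimal pairing check (the helper name is illustrative):

```python
from pathlib import Path

def unlabeled_images(images_dir: str, labels_dir: str) -> list[str]:
    """Return image file names that have no matching YOLO .txt label."""
    labels = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return sorted(
        p.name for p in Path(images_dir).iterdir()
        if p.is_file() and p.stem not in labels
    )

# Usage with the nested Colab layout created above:
# print(unlabeled_images("/content/datasets/train/images/train/",
#                        "/content/datasets/train/labels/train/"))
```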
Train the model:

```python
!yolo task=detect mode=train model=yolo11x.pt data=/content/datasets/data.yaml epochs=40 patience=5 imgsz=640 plots=True
```

Here are examples of yolo11x training commands, with different starting weights:

```shell
# Build a new model from YAML and start training from scratch
yolo detect train data=coco8.yaml model=yolo11x.yaml epochs=100 imgsz=640

# Start training from a pretrained *.pt model
yolo detect train data=coco8.yaml model=yolo11x.pt epochs=100 imgsz=640

# Build a new model from YAML, transfer pretrained weights to it and start training
yolo detect train data=coco8.yaml model=yolo11x.yaml pretrained=yolo11x.pt epochs=100 imgsz=640
```
Save the results of training:

```python
!cp "/content/runs/detect/train/weights/best.pt" "/content/drive/MyDrive/uavs/best.pt"
!cp "/content/runs/detect/train/weights/last.pt" "/content/drive/MyDrive/uavs/last.pt"
!cp -r "/content/runs/detect/train/" "/content/drive/MyDrive/uavs/train/"
!ls {HOME}/runs/detect/train/
```
Display the training plots:

```python
from IPython.display import Image as IPyImage

IPyImage(filename=f'{HOME}/runs/detect/train/confusion_matrix.png', width=600)
IPyImage(filename=f'{HOME}/runs/detect/train/results.png', width=600)
IPyImage(filename=f'{HOME}/runs/detect/train/val_batch0_pred.jpg', width=600)
```
Validate and run predictions with the best weights:

```python
!yolo task=detect mode=val model=/content/runs/detect/train/weights/best.pt data=/content/datasets/data.yaml
!yolo task=detect mode=predict model=/content/runs/detect/train/weights/best.pt conf=0.25 source=/content/datasets/test/images/test/ save=True
```
🗓️ Updated: 2025-10-01