Training an Object Detection Model

Model training requires three main steps:

  1. Download the data to be used for training, validating, and testing the model
  2. Prepare the data: crop, convert to YOLO format, and split into train/validate/test sets
  3. Initiate the training

For our system, this involves the mbari-aidata and deepsea-ai Python packages. An AWS account is required to use the deepsea-ai package.

Tip

Get the project YAML file from the project page and install the aidata tool; see the installation instructions.
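Both packages install with pip. A minimal setup sketch, assuming a clean Python environment and AWS credentials already configured for deepsea-ai:

python3 -m venv train-env
source train-env/bin/activate
pip install mbari-aidata deepsea-ai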

Here is a sequence of commands, by way of example:

Download

aidata download dataset \
--config https://docs.mbari.org/internal/ai/projects/config/config_uav.yml \
--base-path $PWD \
--labels "Surfboard","Batray","Plume","Sea_Lion","Bird","Seal","Wave","Foam","Egregia","Reflectance","Buoy","Shark","Person","Mooring","Otter","Boat","Kelp","Mola","Secci_Disc","Jelly","Whale","RIB" \
--voc \
--token $TATOR_TOKEN

For more information on the download command, see the aidata download documentation.
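Before transforming, it can help to sanity-check the download. A minimal sketch, assuming the VOC annotations and images land under $PWD/Baseline (the path used by the commands below):

find $PWD/Baseline -name "*.xml" | wc -l                    # VOC annotation count
find $PWD/Baseline -name "*.jpg" -o -name "*.png" | wc -l   # image count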

Prepare

Transform the data, making overlapping crops of the images (with --crop-size 1280 and --crop-overlap 0.5, each 1280x1280 tile overlaps its neighbors by about half)

aidata transform voc --base-path $PWD/Baseline --crop-size 1280 --crop-overlap 0.5

Convert the data to YOLO format

aidata transform voc-to-yolo --base-path $PWD/Baseline/transformed

Split the data into train/validate/test sets

aidata transform split -i $PWD/Baseline/transformed -o $PWD/BaselineSplit
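The split step produces the compressed archives referenced by the train command below. A quick inspection sketch, assuming the split writes images.tar.gz and labels.tar.gz into the output directory:

tar tzf $PWD/BaselineSplit/images.tar.gz | head
tar tzf $PWD/BaselineSplit/labels.tar.gz | head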

Train in AWS SageMaker with deepsea-ai 🚀

Before training, see the instructions on preparing your data.

If your training data has been scaled to 1280x1280, use the yolov5x6 model, e.g.

info

Be sure your --batch-size is a multiple of the available GPUs, e.g. --batch-size 1, 2, 3, or 4 for ml.p3.2xlarge; --batch-size 4 for ml.p3.8xlarge; and --batch-size 8 or 16 for ml.p3.16xlarge.

deepsea-ai train --model yolov5x6 --instance-type ml.p3.16xlarge \
--config 901902_uavs.ini \
--labels $PWD/BaselineSplit/labels.tar.gz \
--images $PWD/BaselineSplit/images.tar.gz \
--label-map $PWD/Baseline/labels.txt \
--input-s3 s3://901902-new-starting-checkpoint/megafish_ROV_weights.pt \
--output-s3 s3://901902-new-model-checkpoints/ \
--resume True \
--epochs 60 \
--batch-size 16

info

Before running the train command, be sure to check that there is no yolov5x6 folder in the output-s3 bucket, since the training job will overwrite anything in that directory. Outputs from previous training runs should be moved to a new folder with a different name.

info

Before running the train command, be sure to check that there is no training folder in the input-s3 bucket, since the training job will use that directory for the training images, labels, and labels text file. Inputs from previous training runs should be moved to a new folder with a different name.
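A pre-flight sketch for both checks using the AWS CLI (the archive folder name here is hypothetical):

# Confirm the buckets hold no leftover inputs or outputs
aws s3 ls s3://901902-new-starting-checkpoint/
aws s3 ls s3://901902-new-model-checkpoints/

# Archive a previous run under a different name, e.g.
aws s3 mv s3://901902-new-model-checkpoints/yolov5x6/ \
          s3://901902-new-model-checkpoints/yolov5x6-previous/ --recursive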

Train a yolo11x model in the Google Colab environment 🚀

Documentation on training the yolo11x detector model with UAV images on an A100 instance in Colab.

On icefish, in the directory train-drone-model:

Setup

pyenv shell 3.11.6
python3 -m venv train-drone
source train-drone/bin/activate
pip install mbari-aidata

Use the YAML file from https://docs.mbari.org/internal/ai/projects/config/config_uav.yml.

Set TATOR_TOKEN

export TATOR_TOKEN=<blahblahblah>  # token value from TATOR credentials
echo $TATOR_TOKEN

Download data

aidata download dataset \
  --config https://docs.mbari.org/internal/ai/projects/config/config_uav.yml \
  --base-path $PWD/Sept232025 --voc \
  --labels "Batray","Bird","Boat","Cement_Ship","Egregia","Fish","Jelly","Kayak","Kelp","Mola","Mooring_Buoy","Otter","Person","Pinniped","Secci_Disc","Shark","Surfboard","Velella_velella","Velella_velella_raft","Whale" \
  --single-class "object" \
  --verified \
  --token $TATOR_TOKEN \
  --disable-ssl-verify

Logs are written to

~/mbari_aidata/logs/
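To watch a run in progress, list the newest log and tail it (the exact log file name is not fixed here, so it is left as a placeholder):

ls -lt ~/mbari_aidata/logs/ | head
tail -f ~/mbari_aidata/logs/<latest-log-file>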

Transform data

./transform.sh

aidata transform voc --base-path $PWD/Sept232025 --resize 640 --crop-size 640  --crop-overlap 0.5 --min-visibility 0.0 --min-dim 20

./voc_to_yolo.sh

aidata transform voc-to-yolo  --base-path $PWD/Sept232025/transformed

./split.sh

Then run the split. See the transform command documentation for more details.

aidata transform split -i $PWD/Sept232025/transformed -o $PWD/Sept232025split
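The Colab steps below expect images.tar.gz and labels.tar.gz on Google Drive. A packaging sketch, assuming the split output contains images/ and labels/ directories with train/val/test subdirectories (the layout unpacked in Colab below):

cd $PWD/Sept232025split
tar czf images.tar.gz images
tar czf labels.tar.gz labels
# upload both archives to the "uavs" folder on Google Drive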

Train yolo11x model

Upload or create the data.yaml file:

train: /content/datasets/train/images
val: /content/datasets/val/images
test: /content/datasets/test/images

nc: 1
names: ['object']

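A quick sanity check that the paths in data.yaml point at real directories (a minimal sketch to run in a Colab cell after the data is unpacked below):

import os
import yaml

# Load the dataset config and confirm each split directory exists
with open('/content/datasets/data.yaml') as f:
    cfg = yaml.safe_load(f)

for split in ('train', 'val', 'test'):
    print(split, cfg[split], 'OK' if os.path.isdir(cfg[split]) else 'MISSING')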

Train model from COCO weights

In the Colab notebook, upload the data to Google Drive. Here the folder on Google Drive is named "uavs".


Install YOLO11 via Ultralytics

%pip install ultralytics supervision roboflow -q
import ultralytics
ultralytics.checks()

and then mount Google Drive and create the working folder

# Allow access to personal Google Drive and add new folders
import os

# Connect Google Drive
from google.colab import drive
drive.mount("/content/drive", force_remount=True)  # This will prompt for authorization.

# This will create the uavs folder if it doesn't exist.
folders = ["uavs/"]
for folder in folders:
    path = "/content/drive/MyDrive/" + folder
    if not os.path.exists(path):  # Create the folder if it does not exist
        os.mkdir(path)

Set up HOME, copy in the compressed data files, and unpack them.

import os
HOME = os.getcwd()
print(HOME)

# Create the dataset directories (-p avoids errors if they already exist)
!mkdir -p {HOME}/datasets/savedir
%cd {HOME}/datasets

# Copy the compressed data in from Google Drive
uavs_folder = "/content/drive/MyDrive/uavs/"
!cp "{uavs_folder}images.tar.gz" /content/datasets/savedir/
!cp "{uavs_folder}labels.tar.gz" /content/datasets/savedir/

# Unpack the archives
!tar xf /content/datasets/savedir/images.tar.gz --directory /content/datasets/savedir/
!tar xf /content/datasets/savedir/labels.tar.gz --directory /content/datasets/savedir/

Move the data into the directory structure YOLO expects

## make the directories that yolo11 expects
!mkdir /content/datasets/train/
!mkdir /content/datasets/train/images/
!mkdir /content/datasets/train/labels/
!mkdir /content/datasets/test/
!mkdir /content/datasets/test/images/
!mkdir /content/datasets/test/labels/
!mkdir /content/datasets/val/
!mkdir /content/datasets/val/images/
!mkdir /content/datasets/val/labels/

#get the data.yaml file
!cp "/content/drive/MyDrive/uavs/data.yaml" "/content/datasets/data.yaml"
!ls /content/datasets/

#move the data to the expected directories
!cp -r "/content/datasets/savedir/images/train/" "/content/datasets/train/images/"
!cp -r "/content/datasets/savedir/labels/train/" "/content/datasets/train/labels/"

!cp -r "/content/datasets/savedir/images/test/" "/content/datasets/test/images/"
!cp -r "/content/datasets/savedir/labels/test/" "/content/datasets/test/labels/"

!cp -r "/content/datasets/savedir/images/val/" "/content/datasets/val/images/"
!cp -r "/content/datasets/savedir/labels/val/" "/content/datasets/val/labels/"

!ls /content/datasets/

Run the training. In this case, we start from pretrained COCO model weights:

!yolo task=detect mode=train model=yolo11x.pt data=/content/datasets/data.yaml epochs=40 patience=5 imgsz=640 plots=True

Here are examples of yolo11x training commands with different starting weights:

# Build a new model from YAML and start training from scratch
yolo detect train data=coco8.yaml model=yolo11x.yaml epochs=100 imgsz=640

# Start training from a pretrained *.pt model
yolo detect train data=coco8.yaml model=yolo11x.pt epochs=100 imgsz=640

# Build a new model from YAML, transfer pretrained weights to it and start training
yolo detect train data=coco8.yaml model=yolo11x.yaml pretrained=yolo11x.pt epochs=100 imgsz=640
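If a Colab session disconnects mid-run, Ultralytics can resume training from the last saved checkpoint:

# Resume an interrupted training run from its last checkpoint
yolo train resume model=/content/runs/detect/train/weights/last.pt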

Save the results of training

!cp "/content/runs/detect/train/weights/best.pt" "/content/drive/MyDrive/uavs/best.pt"
!cp "/content/runs/detect/train/weights/last.pt" "/content/drive/MyDrive/uavs/last.pt"
!cp -r "/content/runs/detect/train/" "/content/drive/MyDrive/uavs/train/"

NOTE: The results of the completed training are saved in {HOME}/runs/detect/train/. Let's examine them.

!ls {HOME}/runs/detect/train/

from IPython.display import Image as IPyImage, display
# display() is needed so all three images render from a single cell
display(IPyImage(filename=f'{HOME}/runs/detect/train/confusion_matrix.png', width=600))
display(IPyImage(filename=f'{HOME}/runs/detect/train/results.png', width=600))
display(IPyImage(filename=f'{HOME}/runs/detect/train/val_batch0_pred.jpg', width=600))

Validate the trained model

!yolo task=detect mode=val model=/content/runs/detect/train/weights/best.pt data=/content/datasets/data.yaml

Run inference on the test set with the trained model

!yolo task=detect mode=predict model=/content/runs/detect/train/weights/best.pt conf=0.25 source=/content/datasets/test/images/test/ save=True
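With save=True, Ultralytics writes the annotated images to a runs/detect/predict/ folder by default; a short sketch to view a few of them in the notebook:

import glob
from IPython.display import Image as IPyImage, display

# Show the first three annotated predictions from the default output folder
for img_path in sorted(glob.glob('/content/runs/detect/predict/*.jpg'))[:3]:
    display(IPyImage(filename=img_path, width=600))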

🗓️ Updated: 2025-10-01