Training data annotation workflow

The training data annotation workflow is iterative. The goal is to build a high-quality labeled dataset, train a model, review the results, and then use what you learned to improve the next pass.

Most projects need only two or three iterations before the dataset is in good shape for training and evaluation.

Workflow steps

1. Preview cluster grids in Mantis or review grids generated by SDCAT.
2. Assign each cluster to the appropriate class, such as kelp, bird, whale, or diatom, either in the web interface or through the bulk REST API.
3. If you are annotating images or video, review the bounding boxes and correct or add annotations as needed. Skip this step for ROI-only workflows.
4. Train a model on the current dataset. Use either a classification workflow or an object detection workflow.
5. Review the performance metrics and sample outputs to find weak labels, missing classes, or confusing examples.
6. Repeat the cycle until the labels and model behavior are stable enough for your use case.
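
The review step above can be sketched in a few lines of Python. This is a minimal illustration, not part of Mantis or SDCAT: it assumes you have exported paired lists of true and predicted labels yourself, and the function names, the sample labels, and the 0.6 threshold are all hypothetical choices made for the example.

```python
from collections import Counter


def per_class_report(true_labels, predicted_labels):
    """Return {class_name: (precision, recall)} from paired label lists."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gained a false positive
            fn[truth] += 1  # true class was missed
    report = {}
    for cls in sorted(set(true_labels) | set(predicted_labels)):
        predicted_total = tp[cls] + fp[cls]
        actual_total = tp[cls] + fn[cls]
        precision = tp[cls] / predicted_total if predicted_total else 0.0
        recall = tp[cls] / actual_total if actual_total else 0.0
        report[cls] = (precision, recall)
    return report


def weak_classes(report, threshold):
    """List classes whose precision or recall falls below the threshold."""
    return [cls for cls, (p, r) in report.items() if min(p, r) < threshold]


def confusion_pairs(true_labels, predicted_labels):
    """Count each (true, predicted) pair where the model was wrong."""
    return Counter(
        (t, p) for t, p in zip(true_labels, predicted_labels) if t != p
    )


# Hypothetical labels, made up for illustration only.
truth = ["kelp", "kelp", "kelp", "bird", "bird",
         "whale", "whale", "diatom", "diatom", "diatom"]
preds = ["kelp", "kelp", "diatom", "bird", "bird",
         "whale", "bird", "diatom", "diatom", "kelp"]

report = per_class_report(truth, preds)
weak = weak_classes(report, threshold=0.6)  # threshold is an assumed value
confusions = confusion_pairs(truth, preds)

for cls, (precision, recall) in report.items():
    print(f"{cls}: precision={precision:.2f} recall={recall:.2f}")
print("classes to re-check:", weak)
print("confused pairs:", dict(confusions))
```

Classes flagged here are candidates for another labeling pass, and the confused pairs suggest which cluster grids to look at first. The right threshold depends on your use case.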

Example grids from SDCAT


Figure captions: (mostly) akashiwo cluster, jellies cluster, velella cluster

Updated: 2026-04-01