Training data annotation workflow
The training data annotation workflow is iterative. The goal is to build a high-quality labeled dataset, train a model, review the results, and then use what you learned to improve the next pass.
Most projects need only 2 to 3 iterations before the dataset is in good shape for training and evaluation.
Workflow steps
- Preview cluster grids in Mantis or review grids generated by SDCAT.
- Assign each cluster to the appropriate class, such as kelp, bird, whale, or diatom, either in the web interface or through the bulk REST API.
- If you are annotating images or video, review the bounding boxes and correct or add annotations as needed. Skip this step for ROI-only workflows.
- Train a model on the current dataset. Use either a classification workflow or an object detection workflow.
- Review the performance metrics and sample outputs to find weak labels, missing classes, or confusing examples.
- Repeat the cycle until the labels and model behavior are stable enough for your use case.
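One way to make "stable enough" concrete is to compare consecutive passes: how many labels changed, and how much the review metric improved. The sketch below is illustrative only. `PassResult`, `label_churn`, `is_stable`, and the threshold values are hypothetical names and numbers chosen for this example. They are not part of Mantis or SDCAT, and `accuracy` stands in for whichever summary metric you review.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PassResult:
    """Outcome of one annotate-train-review pass (hypothetical structure)."""
    labels: dict[str, str]  # cluster or ROI id -> assigned class
    accuracy: float         # any single summary metric from the review step


def label_churn(previous: dict[str, str], current: dict[str, str]) -> float:
    """Fraction of items in `current` whose label is new or changed since `previous`."""
    if not current:
        return 0.0
    changed = sum(1 for key, label in current.items() if previous.get(key) != label)
    return changed / len(current)


def is_stable(previous: PassResult, current: PassResult,
              max_churn: float = 0.05, min_gain: float = 0.01) -> bool:
    """Treat the dataset as stable when few labels changed and the metric barely moved."""
    churn = label_churn(previous.labels, current.labels)
    gain = current.accuracy - previous.accuracy
    return churn <= max_churn and gain < min_gain


# Made-up example data: three passes over four clusters.
pass_1 = PassResult({"c1": "kelp", "c2": "bird", "c3": "whale", "c4": "diatom"}, 0.80)
pass_2 = PassResult({"c1": "kelp", "c2": "bird", "c3": "bird", "c4": "diatom"}, 0.88)
pass_3 = PassResult({"c1": "kelp", "c2": "bird", "c3": "bird", "c4": "diatom"}, 0.885)

print(label_churn(pass_1.labels, pass_2.labels))  # 0.25 (one of four labels changed)
print(is_stable(pass_1, pass_2))  # False: labels and metric are still moving
print(is_stable(pass_2, pass_3))  # True: no label changes, small metric gain
```

The thresholds are a judgment call. A project with rare classes may want to track churn per class rather than over the whole dataset.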
Example grids from SDCAT
Figure: example SDCAT cluster grids for a (mostly) akashiwo cluster, a jellies cluster, and a velella cluster.
Updated: 2026-04-01



