Training data annotation workflow
The training data annotation workflow is iterative and generally takes 4-5 steps per pass, depending on whether you need to review bounding boxes. The goal is to create a high-quality labeled dataset for training machine learning models. Typically, 2-3 passes are enough to produce a satisfactory dataset.

1. Preview cluster grids, either in the Mantis web interface or from the grids generated by SDCAT.
2. Assign each cluster to the appropriate class (e.g. kelp, bird, whale, diatom), either through the Mantis web interface or in bulk through the bulk REST API.
3. If you are annotating images or video, review and correct the bounding boxes, or add new ones. Skip this step if you are only labeling region-of-interest (ROI) data.
4. Train a model on the data. Currently this can be either a classification model or a detection model.
5. Preview the performance metrics for your labels.
6. Repeat steps 1-5 until you are satisfied with the results.
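
To make the "preview the metrics, then repeat" part of the workflow concrete, here is a minimal sketch of one way to decide which classes may need another annotation pass. It does not use Mantis or SDCAT; the function names, the toy labels, and the 0.8 threshold are all assumptions for illustration, and the metrics your project reports may differ.

```python
from collections import Counter


def per_class_metrics(true_labels, predicted_labels):
    """Return {class: (precision, recall)} from two equal-length label lists."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    tp = Counter()  # predicted as class c and truly c
    fp = Counter()  # predicted as class c but truly something else
    fn = Counter()  # truly class c but predicted as something else
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1
    metrics = {}
    for cls in sorted(set(true_labels) | set(predicted_labels)):
        predicted = tp[cls] + fp[cls]
        actual = tp[cls] + fn[cls]
        precision = tp[cls] / predicted if predicted else 0.0
        recall = tp[cls] / actual if actual else 0.0
        metrics[cls] = (precision, recall)
    return metrics


def classes_needing_review(metrics, threshold=0.8):
    """List classes whose precision or recall falls below a hypothetical threshold."""
    return [cls for cls, (p, r) in metrics.items() if p < threshold or r < threshold]


# Toy, made-up labels for illustration only (not real results).
truth = ["kelp", "kelp", "bird", "whale", "diatom", "bird"]
preds = ["kelp", "bird", "bird", "whale", "diatom", "bird"]

metrics = per_class_metrics(truth, preds)
review = classes_needing_review(metrics)
print(review)
```

In this toy example one kelp item is predicted as bird, so both classes fall below the threshold and would be candidates for a closer look at their cluster assignments on the next pass.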
Example grids from SDCAT
[Images: example cluster grids showing a (mostly) akashiwo cluster, a jellies cluster, and a velella cluster]
🗓️ Updated: 2025-09-02


