Training data annotation workflow

The training data annotation workflow is iterative and generally takes four to five steps per pass (the bounding-box review step applies only when annotating images or video). The goal is to create a high-quality labeled dataset for training machine learning models. Typically, repeating the process 2-3 times is enough to reach a satisfactory dataset.

1. Preview cluster grids, either through the Mantis web interface or from grids generated by SDCAT.
2. Assign clusters to the appropriate class (e.g. kelp, bird, whale, diatom) through the Mantis web interface, or through the bulk REST API.
3. If annotating images or video, review and correct the bounding boxes or add new ones. Skip this step if you are only labeling region-of-interest (ROI) data.
4. Train a model on the data. This can currently be either a classification model or a detection model.
5. Preview the performance metrics for your labels.
6. Repeat steps 1-5 until you are satisfied with the results.
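
To make the performance-metrics step more concrete, here is a minimal, generic sketch of per-class precision and recall computed from paired ground-truth and predicted labels. It is not the Mantis or SDCAT implementation; the function name `per_class_metrics` and the sample labels are assumptions made purely for illustration.

```python
from collections import Counter


def per_class_metrics(true_labels, predicted_labels):
    """Return {class_name: (precision, recall)} for paired label lists.

    Hypothetical helper for illustration; not part of Mantis or SDCAT.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    true_pos = Counter()   # predicted class matches the ground truth
    false_pos = Counter()  # class was predicted but the truth differs
    false_neg = Counter()  # class was the truth but was not predicted

    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            true_pos[truth] += 1
        else:
            false_pos[pred] += 1
            false_neg[truth] += 1

    metrics = {}
    for cls in sorted(set(true_labels) | set(predicted_labels)):
        n_predicted = true_pos[cls] + false_pos[cls]
        n_actual = true_pos[cls] + false_neg[cls]
        precision = true_pos[cls] / n_predicted if n_predicted else 0.0
        recall = true_pos[cls] / n_actual if n_actual else 0.0
        metrics[cls] = (precision, recall)
    return metrics


# Made-up labels, using the example classes from the workflow.
truth = ["kelp", "kelp", "bird", "whale", "diatom", "bird"]
preds = ["kelp", "bird", "bird", "whale", "diatom", "kelp"]

metrics = per_class_metrics(truth, preds)
for cls, (precision, recall) in metrics.items():
    print(f"{cls:8s} precision={precision:.2f} recall={recall:.2f}")
```

Classes with low precision or recall are reasonable candidates for a closer look at their cluster assignments on the next pass through the workflow.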

Example grids from SDCAT


Example cluster grids (images): (mostly) akashiwo cluster, jellies cluster, velella cluster.

🗓️ Updated: 2025-09-02