# Training a Classification Model
Model training requires two steps:
- Download the data used to train, validate, and test the model
- Initiate the training
This involves the mbari-aidata package to download the data and the vittrain code to train the model.
Tip
Get the project YAML file from the project page and install the aidata tool; see the installation instructions.
The following sequence of commands shows, by way of example, how to download only verified data for specific classes. TATOR_TOKEN is an environment variable that should be set to your Tator token; see Tator Token for instructions on obtaining one.
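For example, on Linux or macOS you can export the token in your shell before running the download commands below (a minimal sketch; the value shown is a placeholder):

```bash
# Placeholder value -- paste the token copied from your Tator account page.
export TATOR_TOKEN="your-tator-api-token"
```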
## Download
Download all verified ROIs from the dataset using the aidata command-line tool:

```bash
aidata download dataset \
  --config https://docs.mbari.org/internal/ai/projects/902004-Planktivore/config_highmag.yml \
  --crop-roi \
  --resize 224 \
  --base-path $PWD \
  --verified \
  --token $TATOR_TOKEN
```
Or, for specific labels, use the following command:

```bash
aidata download dataset \
  --config https://docs.mbari.org/internal/ai/projects/902004-Planktivore/config_highmag.yml \
  --crop-roi \
  --resize 224 \
  --base-path $PWD \
  --verified \
  --labels "Nano_plankton,Ceratium" \
  --token $TATOR_TOKEN
```
For more information on the download command, see the aidata download documentation.
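You can also inspect the available options directly from the command line. Assuming the tool follows the usual CLI convention of a `--help` flag, something like:

```bash
# Show the options accepted by the download subcommand (assumes a standard --help flag).
aidata download dataset --help
```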
Tip
Before running the download command, you can check the available labels in the dataset for your project through the fast lookup for all labels. For this project, that is available at http://mantis.shore.mbari.org:8001/labels/902004-Planktivore, which returns all labels sorted from largest to smallest. Clicking "pretty-print" makes the output easier to read.
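If you prefer to check the labels from the command line, here is a sketch using the same endpoint (assuming it returns JSON, as the pretty-print option suggests):

```bash
# Fetch the label counts for this project and pretty-print the JSON response.
curl -s http://mantis.shore.mbari.org:8001/labels/902004-Planktivore | python3 -m json.tool
```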

## Verify ROI Download Count
Ensure that you have downloaded the expected number of ROIs by running the following command:

```bash
find . -maxdepth 1 -type d | while read -r dir; do printf "%s:\t" "$dir"; find "$dir" -type f | wc -l; done
```

Example output:

```text
.: 3372
./Tiarina: 24
./Nano_plankton: 243
./Detonula_Cerataulina_Lauderia: 41
./Ceratium: 38
./Strombidium: 11
./Truncated: 35
./Medium_pennate: 383
./Mesodinium: 158
./Cylindrotheca: 11
./Prorocentrum: 166
./Dinoflagellate: 53
./Detritus: 497
./Ciliate: 23
./Thalassionema: 43
./Akashiwo: 70
./Chaetoceros: 818
./Pseudo-nitzschia: 355
./Eucampia: 12
./Polykrikos: 14
./Gyrodinium: 101
./Amphidinium_Oxyphysis: 247
./Guinardia_Dactyliosolen: 28
```
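As a quick cross-check, the total number of downloaded files should match the count reported for `.` above (3372 in this example):

```bash
# Count every downloaded file under the current directory;
# this should match the total reported for "." above.
find . -type f | wc -l
```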
## Train 🚀
First, clone the repository and install the requirements:

```bash
git clone https://github.com/mbari-org/vittrain
cd vittrain
pip install -r requirements.txt
```
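Optionally, run the pip install inside a virtual environment to keep the dependencies isolated (a minimal sketch using Python's built-in venv module):

```bash
# Create and activate an isolated environment, then install into it.
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```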
### Train a Vision Transformer (ViT) with patch size 16 and name it mbari-uav-vits-b16
```bash
python src/fine_tune_vits.py \
  --model-name mbari-uav-vits-b16 \
  --base-model google/vit-base-patch16-224-in21k \
  --raw-data $PWD/crops \
  --filter-data $PWD/filtered \
  --add-rotations True \
  --num-epochs 5
```
### Train a Vision Transformer (ViT) with patch size 8 and name it mbari-uav-vits-b8
```bash
python src/fine_tune_vits.py \
  --model-name mbari-uav-vits-b8 \
  --base-model facebook/dino-vitb8 \
  --raw-data $PWD/crops \
  --filter-data $PWD/filtered \
  --add-rotations True \
  --num-epochs 5
```
### Train a Vision Transformer (ViT) with patch size 32 and name it mbari-uav-vits-b32
```bash
python src/fine_tune_vits.py \
  --model-name mbari-uav-vits-b32 \
  --base-model openai/clip-vit-base-patch32 \
  --raw-data $PWD/crops \
  --filter-data $PWD/filtered \
  --add-rotations True \
  --num-epochs 5
```
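If you want to fine-tune all three backbones in one pass, the commands above can be wrapped in a small loop. This sketch reuses only the flags shown above:

```bash
# Fine-tune each backbone in turn, reusing the flags from the examples above.
while read -r name base; do
  python src/fine_tune_vits.py \
    --model-name "$name" \
    --base-model "$base" \
    --raw-data $PWD/crops \
    --filter-data $PWD/filtered \
    --add-rotations True \
    --num-epochs 5
done <<'EOF'
mbari-uav-vits-b16 google/vit-base-patch16-224-in21k
mbari-uav-vits-b8 facebook/dino-vitb8
mbari-uav-vits-b32 openai/clip-vit-base-patch32
EOF
```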
🗓️ Updated: 2025-07-04