Download data

💾 Downloading your data for training

If you are training a model, the downloaded data must first be converted into a format that your training code accepts. To assist with that, we have a tool called aidata.

⬇️ Downloading your data for analysis

You may download data to a simple CSV directly from the Tator web interface through the metadata button.

Tator_download_metadata.png

Here is a quick video showing how to do this: Download Metadata from Tator

This saves a CSV file whose columns depend on your project configuration. For example, for the UAV project, a useful export would be the following:

| (media) altitude | (media) date | (media) latitude | (media) longitude | (media) make | (media) model | $version_name | $x_pixels | $y_pixels | $width_pixels | $height_pixels | Label | score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 60.62410003789310 | 2024-05-02T16:06:48+00:00 | 36.976068022980300 | 121.92875717895700 | SONY | DSC-RX1RM2 | Baseline | 6435.381443 | 860.939130 | 271.618557 | 410.060870 | Shark | 1 |
| 59.8184 | 2024-05-02T17:13:32+00:00 | 36.97105967800670 | 121.91850362801700 | SONY | DSC-RX1RM2 | Baseline | 3827.412371 | 2777.553623 | 317.587629 | 179.446377 | Shark | 1 |
| 59.27389984825490 | 2024-05-02T17:16:14+00:00 | 36.96902484400940 | 121.91568711392600 | SONY | DSC-RX1RM2 | Baseline | 4601.092784 | 2464.950725 | 250.907216 | 400.049275 | Shark | 1 |
| 59.39959973315540 | 2024-05-02T17:17:05+00:00 | 36.96702581099970 | 121.91080038101100 | SONY | DSC-RX1RM2 | Baseline | 6127.958763 | 3807.605797 | 261.041237 | 363.394203 | Shark | 1 |
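Once exported, the CSV can be post-processed with a few lines of Python. The following is a minimal sketch, not part of Tator or aidata. It assumes the export is comma-separated, that the header names match the columns shown above, and that ($x_pixels, $y_pixels) is the top-left corner of each box; check all three against your own export. The function name `load_boxes` is hypothetical, and two rows from the example are inlined so the snippet runs without a file.

```python
import csv
import io

# Two rows copied from the example export above. In practice, open your
# downloaded CSV file instead of this inline string.
# ASSUMPTION: comma-separated values with exactly these header names.
SAMPLE_CSV = """(media) altitude,(media) date,(media) latitude,(media) longitude,(media) make,(media) model,$version_name,$x_pixels,$y_pixels,$width_pixels,$height_pixels,Label,score
60.62410003789310,2024-05-02T16:06:48+00:00,36.976068022980300,121.92875717895700,SONY,DSC-RX1RM2,Baseline,6435.381443,860.939130,271.618557,410.060870,Shark,1
59.8184,2024-05-02T17:13:32+00:00,36.97105967800670,121.91850362801700,SONY,DSC-RX1RM2,Baseline,3827.412371,2777.553623,317.587629,179.446377,Shark,1
"""


def load_boxes(handle, label=None, min_score=0.0):
    """Return box corners for rows that match the label and score filters."""
    boxes = []
    for row in csv.DictReader(handle):
        if label is not None and row["Label"] != label:
            continue
        if float(row["score"]) < min_score:
            continue
        # ASSUMPTION: x/y is the top-left corner, in pixels.
        x = float(row["$x_pixels"])
        y = float(row["$y_pixels"])
        w = float(row["$width_pixels"])
        h = float(row["$height_pixels"])
        boxes.append({
            "label": row["Label"],
            "date": row["(media) date"],
            "xmin": x,
            "ymin": y,
            "xmax": x + w,
            "ymax": y + h,
        })
    return boxes


boxes = load_boxes(io.StringIO(SAMPLE_CSV), label="Shark", min_score=0.5)
for box in boxes:
    print(box["label"], box["date"], round(box["xmax"] - box["xmin"], 1))
```

To read a real export, replace `io.StringIO(SAMPLE_CSV)` with `open("your_export.csv", newline="")`, where the file name is whatever you saved from Tator.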

Example downloads for training

The aidata command line tool downloads data from Tator for machine learning in several formats, such as VOC, CIFAR, and YOLO.
It can also resize images, crop regions of interest (ROIs), and filter by labels, versions, and verification status (with --verified or --unverified).

To use it, you will need a Tator token, which you can obtain from the Tator web interface by clicking on your username in the top right corner and selecting "API Token".

Some examples:

Download all verified Pinniped and Shark data and resize to 224x224 from the UAV project

This is useful for training a classification model. See the classification training page for an example of how to train a classification model with this data.

pip install mbari-aidata
export TATOR_TOKEN=<your_token>
aidata download dataset --crop-roi --resize 224 --labels "Pinniped,Shark" --version Baseline --verified --config https://docs.mbari.org/internal/ai/projects/uav-901902/config_uav.yml

Download all verified data and save to VOC format.

This is useful for training an object detection model.

aidata download dataset --voc --resize 224 --labels "Pinniped" --version Baseline --verified --config https://docs.mbari.org/internal/ai/projects/config/config_uav.yml

Download all verified data and save to YOLO Ultralytics format.

This is useful for training a detection model. Requires two steps: first download the data to VOC, then transform it to YOLO format.

aidata download dataset --voc --resize 224 --labels "Pinniped" --version Baseline --verified --config https://docs.mbari.org/internal/ai/projects/config/config_uav.yml
aidata transform voc-to-yolo --base-path Baseline
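To show what the VOC-to-YOLO step changes, here is a minimal sketch of the coordinate conversion. It is an illustration only and not the aidata implementation: VOC stores absolute pixel corners (xmin, ymin, xmax, ymax), while the Ultralytics YOLO label format stores a class index followed by the box center and size, each normalized by the image dimensions. The function names `voc_to_yolo` and `yolo_line` are hypothetical.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, image_width, image_height):
    """Convert absolute VOC corners to normalized YOLO (cx, cy, w, h)."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    cx = (xmin + xmax) / 2.0 / image_width    # box center, fraction of width
    cy = (ymin + ymax) / 2.0 / image_height   # box center, fraction of height
    w = (xmax - xmin) / image_width           # box width, fraction of width
    h = (ymax - ymin) / image_height          # box height, fraction of height
    return cx, cy, w, h


def yolo_line(class_index, box):
    """Format one label line: class index, then the four normalized values."""
    return f"{class_index} " + " ".join(f"{value:.6f}" for value in box)


# A hypothetical box in a 224x224 image.
box = voc_to_yolo(56, 28, 168, 196, image_width=224, image_height=224)
print(yolo_line(0, box))  # prints "0 0.500000 0.500000 0.500000 0.750000"
```

Because the YOLO values are normalized, the same label line remains valid if the image is later resized without changing its aspect ratio.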

Download all unverified data and save to CIFAR format.

aidata download dataset --cifar --resize 224 --labels "Pinniped" --version Baseline --unverified --config https://docs.mbari.org/internal/ai/projects/config/config_uav.yml
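If you run several of these downloads from a script, it can help to assemble the command in one place. The sketch below is one possible approach, not an aidata API: the helper names `token_is_set` and `build_download_command` are hypothetical, and it only uses flags that appear in the examples above (--voc, --cifar, --crop-roi, --resize, --labels, --version, --verified, --unverified, --config). The labels string is passed through unchanged. The sketch prints the command rather than running it.

```python
import os
import shlex


def token_is_set(env=None):
    """Return True if TATOR_TOKEN is present and non-empty in the environment."""
    env = os.environ if env is None else env
    return bool(env.get("TATOR_TOKEN"))


def build_download_command(config, version, labels, fmt=None, resize=None,
                           crop_roi=False, verified=None):
    """Assemble an `aidata download dataset` argument list.

    fmt is None, "voc", or "cifar"; verified is True (--verified),
    False (--unverified), or None (no filter on verification status).
    """
    if fmt not in (None, "voc", "cifar"):
        raise ValueError(f"unsupported format: {fmt}")
    cmd = ["aidata", "download", "dataset"]
    if fmt:
        cmd.append(f"--{fmt}")
    if crop_roi:
        cmd.append("--crop-roi")
    if resize is not None:
        cmd += ["--resize", str(resize)]
    cmd += ["--labels", labels, "--version", version]
    if verified is True:
        cmd.append("--verified")
    elif verified is False:
        cmd.append("--unverified")
    cmd += ["--config", config]
    return cmd


cmd = build_download_command(
    config="https://docs.mbari.org/internal/ai/projects/config/config_uav.yml",
    version="Baseline",
    labels="Pinniped",
    fmt="voc",
    resize=224,
    verified=True,
)
if not token_is_set():
    print("TATOR_TOKEN is not set; export it before running the command")
print(shlex.join(cmd))
```

To execute the command from Python, pass the list to `subprocess.run(cmd, check=True)`. Building a list instead of a single string avoids shell-quoting problems with label names that contain spaces.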

For more examples of downloading and augmenting data for model training, see the transform command. Augmentation applies transformations to your data to increase the size and diversity of your dataset, which can improve model performance without additional labeled data. We have found this useful for large-format images, such as those from the UAV project.
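As a concrete illustration of augmentation, the sketch below mirrors an image horizontally and adjusts its bounding box to match. This is a generic example and does not describe what the aidata transform command does internally. The image is represented as a plain list of pixel rows, and the function names `hflip_image` and `hflip_box` are hypothetical.

```python
def hflip_image(pixels):
    """Mirror an image, given as a list of rows, left to right."""
    return [list(reversed(row)) for row in pixels]


def hflip_box(xmin, ymin, xmax, ymax, image_width):
    """Mirror a box: the old right edge becomes the new left edge."""
    return image_width - xmax, ymin, image_width - xmin, ymax


# A tiny 4x2 image whose right half contains the object (value 1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
box = (2, 0, 4, 2)  # xmin, ymin, xmax, ymax in pixels

flipped_image = hflip_image(image)
flipped_box = hflip_box(*box, image_width=4)
print(flipped_image)  # prints [[1, 1, 0, 0], [1, 1, 0, 0]]
print(flipped_box)    # prints (0, 0, 2, 2)
```

The key point is that every geometric transformation applied to the image must also be applied to its annotations, otherwise the labels no longer match the pixels.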

🗓️ Updated: 2025-12-17