Download data
💾 Downloading your data for training¶
If you are training a model, the downloaded data must first be converted into a format compatible with your training framework. The aidata tool handles this conversion.
⬇️ Downloading your data for analysis¶
You may download data to a simple CSV directly from the Tator web interface through the metadata button.
Here is a quick video showing how to do this: Download Metadata from Tator
This saves a CSV file whose columns depend on your project configuration. For example, a useful export for the UAV project would be the following:
| (media) altitude | (media) date | (media) latitude | (media) longitude | (media) make | (media) model | $version_name | $x_pixels | $y_pixels | $width_pixels | $height_pixels | Label | score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 60.62410003789310 | 2024-05-02T16:06:48+00:00 | 36.976068022980300 | 121.92875717895700 | SONY | DSC-RX1RM2 | Baseline | 6435.381443 | 860.939130 | 271.618557 | 410.060870 | Shark | 1 |
| 59.8184 | 2024-05-02T17:13:32+00:00 | 36.97105967800670 | 121.91850362801700 | SONY | DSC-RX1RM2 | Baseline | 3827.412371 | 2777.553623 | 317.587629 | 179.446377 | Shark | 1 |
| 59.27389984825490 | 2024-05-02T17:16:14+00:00 | 36.96902484400940 | 121.91568711392600 | SONY | DSC-RX1RM2 | Baseline | 4601.092784 | 2464.950725 | 250.907216 | 400.049275 | Shark | 1 |
| 59.39959973315540 | 2024-05-02T17:17:05+00:00 | 36.96702581099970 | 121.91080038101100 | SONY | DSC-RX1RM2 | Baseline | 6127.958763 | 3807.605797 | 261.041237 | 363.394203 | Shark | 1 |
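Once exported, the CSV can be read with any standard tool. The following is a minimal sketch using only the Python standard library. It assumes the file is comma-delimited, that the column headers match the table above, and that `$x_pixels`/`$y_pixels` give the top-left corner of each box; the inline sample stands in for a real export, whose filename will depend on your download.

```python
import csv
import io
from collections import Counter

# Two rows copied from the example export above, reduced to the columns used here.
# For a real export, pass an open file handle instead, e.g.
# open("metadata.csv", newline="")  (the filename is hypothetical).
SAMPLE = """$x_pixels,$y_pixels,$width_pixels,$height_pixels,Label,score
6435.381443,860.939130,271.618557,410.060870,Shark,1
3827.412371,2777.553623,317.587629,179.446377,Shark,1
"""


def load_boxes(handle):
    """Return one dict per localization with corner coordinates in pixels."""
    boxes = []
    for row in csv.DictReader(handle):
        x = float(row["$x_pixels"])
        y = float(row["$y_pixels"])
        w = float(row["$width_pixels"])
        h = float(row["$height_pixels"])
        boxes.append({
            "label": row["Label"],
            "score": float(row["score"]),
            # Assumption: (x, y) is the top-left corner of the box.
            "xmin": x, "ymin": y, "xmax": x + w, "ymax": y + h,
            "area": w * h,
        })
    return boxes


boxes = load_boxes(io.StringIO(SAMPLE))
label_counts = Counter(box["label"] for box in boxes)
print(label_counts)
for box in boxes:
    print(box["label"], round(box["xmax"]), round(box["ymax"]))
```

Counting labels this way is a quick check that the export contains the classes you expect before you move on to analysis.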
Example downloads for training¶
The aidata command line tool downloads data from Tator for machine learning in various formats, such as VOC, CIFAR, and YOLO.
It can also resize images, crop regions of interest (ROIs), and filter by labels, versions, and verification status (with --verified or --unverified).
To use it, you will need a Tator token, which can be obtained from the Tator web interface by clicking on your username in the top right corner and selecting "API Token".
Some examples:
Download all verified Pinniped and Shark data and resize to 224x224 from the UAV project¶
This is useful for training a classification model. See the classification training page for an example of how to train a classification model with this data.
```shell
pip install mbari-aidata
export TATOR_TOKEN=<your_token>
aidata download dataset --crop-roi --resize 224 --labels "Pinniped,Shark" --version Baseline --verified --config https://docs.mbari.org/internal/ai/projects/uav-901902/config_uav.yml
```
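If you run downloads from a script rather than the shell, the command can be assembled programmatically. The sketch below is a hypothetical helper, not part of aidata. It uses only flags that appear in the examples on this page, and it assumes multiple labels are passed as a single comma-separated string.

```python
import os
import subprocess


def build_download_cmd(config, labels, version, fmt=None, resize=None,
                       crop_roi=False, verified=None):
    """Assemble an `aidata download dataset` argument list (hypothetical helper)."""
    cmd = ["aidata", "download", "dataset"]
    if fmt is not None:
        if fmt not in ("voc", "yolo", "cifar"):
            raise ValueError(f"unsupported format: {fmt}")
        cmd.append(f"--{fmt}")
    if crop_roi:
        cmd.append("--crop-roi")
    if resize is not None:
        cmd += ["--resize", str(resize)]
    # Assumption: several labels are joined with commas into one argument.
    cmd += ["--labels", ",".join(labels)]
    cmd += ["--version", version]
    if verified is True:
        cmd.append("--verified")
    elif verified is False:
        cmd.append("--unverified")
    cmd += ["--config", config]
    return cmd


def run_download(cmd):
    """Run the command, failing early if the Tator token is missing."""
    if not os.environ.get("TATOR_TOKEN"):
        raise RuntimeError("TATOR_TOKEN is not set; export it before downloading")
    subprocess.run(cmd, check=True)


cmd = build_download_cmd(
    config="https://docs.mbari.org/internal/ai/projects/config/config_uav.yml",
    labels=["Pinniped"],
    version="Baseline",
    fmt="voc",
    resize=224,
    verified=True,
)
print(" ".join(cmd))
```

Calling `run_download(cmd)` would then execute the download. Passing the arguments as a list avoids shell quoting problems with label names.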
Download all verified data and save to VOC format.¶
This is useful for training an object detection model.
```shell
aidata download dataset --voc --resize 224 --labels "Pinniped" --version Baseline --verified --config https://docs.mbari.org/internal/ai/projects/config/config_uav.yml
```
Download all verified data and save to YOLO Ultralytics format.¶
This is useful for training a detection model. Requires two steps: first download the data to VOC, then transform it to YOLO format.
```shell
aidata download dataset --yolo --resize 224 --labels "Pinniped" --version Baseline --verified --config https://docs.mbari.org/internal/ai/projects/config/config_uav.yml
aidata transform voc-to-yolo --base-path Baseline
```
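For readers unfamiliar with the two annotation formats, the core of a VOC-to-YOLO conversion is a change of box coordinates. VOC stores pixel corners (xmin, ymin, xmax, ymax), while YOLO stores a normalized center and size. The sketch below illustrates that arithmetic only. It is not the aidata implementation, and the function names are hypothetical.

```python
def voc_to_yolo(box, image_width, image_height):
    """Convert a VOC pixel box (xmin, ymin, xmax, ymax) to YOLO
    (center_x, center_y, width, height), each normalized to [0, 1]."""
    xmin, ymin, xmax, ymax = box
    center_x = (xmin + xmax) / 2.0 / image_width
    center_y = (ymin + ymax) / 2.0 / image_height
    width = (xmax - xmin) / image_width
    height = (ymax - ymin) / image_height
    return center_x, center_y, width, height


def yolo_line(class_id, box, image_width, image_height):
    """Format one line of a YOLO label file: class id followed by the box."""
    values = voc_to_yolo(box, image_width, image_height)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# A 200x200 pixel box in a 1000x800 image.
result = voc_to_yolo((100, 200, 300, 400), 1000, 800)
line = yolo_line(0, (100, 200, 300, 400), 1000, 800)
print(result)
print(line)
```

Because YOLO coordinates are normalized, they remain valid after an image is resized uniformly, whereas VOC pixel coordinates must be rescaled.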
Download all unverified data and save to CIFAR format.¶
```shell
aidata download dataset --cifar --resize 224 --labels "Pinniped" --version Baseline --unverified --config https://docs.mbari.org/internal/ai/projects/config/config_uav.yml
```
For more examples of downloading and augmenting your data for model training, see the transform command. Augmentation applies transformations to your data to increase the size and diversity of your dataset, which can improve model performance without additional labeled data. We have found this useful for large format images, such as those from the UAV project.
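To make the idea of augmentation concrete, here is a minimal sketch of one common transformation, a horizontal flip, applied to an image and its bounding boxes together. It is illustrative only and does not reflect which augmentations the transform command applies. It assumes the image is a list of pixel rows and that boxes are (xmin, ymin, xmax, ymax) pixel edges measured from the left and top of the image.

```python
def hflip(image, boxes):
    """Mirror an image left-to-right and move its boxes to match.

    image: list of rows, each row a list of pixel values.
    boxes: list of (xmin, ymin, xmax, ymax) in pixel edge coordinates.
    """
    width = len(image[0])
    flipped_image = [row[::-1] for row in image]
    # The right edge becomes the new left edge, measured from the other side.
    flipped_boxes = [
        (width - xmax, ymin, width - xmin, ymax)
        for (xmin, ymin, xmax, ymax) in boxes
    ]
    return flipped_image, flipped_boxes


image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
boxes = [(0, 0, 1, 2)]  # covers the first column of the image

flipped_image, flipped_boxes = hflip(image, boxes)
print(flipped_image)
print(flipped_boxes)
```

The key point is that the labels must be transformed along with the pixels; flipping the image alone would leave every box pointing at the wrong region.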
🗓️ Updated: 2025-12-17