aidata Command Line Tool
aidata is a handy command line tool for extracting, transforming, loading, and downloading AI data. We use it across several MBARI projects that involve detection, clustering, or classification. You can find the source code on GitHub.
Installation
The easiest way to get started is with pip:
pip install mbari-aidata
What it does
aidata supports loading from VOC XML files or SDCAT CSVs (which include bounding boxes, saliency scores, and class info). You can also use it to download data directly from the Tator database, although we recommend using the Tator web interface instead unless you need to download data for training a model.
Once you’ve downloaded your data, you can transform it into formats like COCO, CIFAR, or PASCAL VOC, which you’ll need before training your models. aidata also supports data augmentation (like cropping and resizing) to help you get more out of your training sets.
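To make the difference between these annotation formats concrete, here is a minimal sketch of how a PASCAL VOC bounding box (corner coordinates) maps to a COCO bounding box (top-left corner plus width and height). This is not aidata's implementation; the function name voc_to_coco_boxes, the sample XML, and the "jellyfish" label are all hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical VOC annotation for illustration only; not produced by aidata.
VOC_XML = """<annotation>
  <filename>frame_0001.png</filename>
  <size><width>1920</width><height>1080</height><depth>3</depth></size>
  <object>
    <name>jellyfish</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>300</xmax><ymax>500</ymax></bndbox>
  </object>
</annotation>"""


def voc_to_coco_boxes(xml_text):
    """Return a list of (label, [x, y, width, height]) from a VOC XML string."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # VOC stores the two corners of the box.
        xmin = float(bndbox.findtext("xmin"))
        ymin = float(bndbox.findtext("ymin"))
        xmax = float(bndbox.findtext("xmax"))
        ymax = float(bndbox.findtext("ymax"))
        # COCO stores the top-left corner plus width and height.
        boxes.append((label, [xmin, ymin, xmax - xmin, ymax - ymin]))
    return boxes


if __name__ == "__main__":
    print(voc_to_coco_boxes(VOC_XML))
```

In practice aidata handles this kind of conversion for you; the sketch only shows what changes between the two box conventions.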
Commands
aidata download - Download data to COCO, CIFAR, or PASCAL VOC format
aidata load - Load SDCAT data, images, or video from a directory or text file
aidata db - Manage your database
aidata transform - Transform downloaded data and apply augmentations
aidata split - Split your data before training
You can run aidata -h at any time to see the help message, or add -h to any command (like aidata download -h) to see that command's options.
last updated: 2026-02-16