Usage
The m3_download package provides the m3-download command-line tool, which is the main entrypoint.
To see the usage, including the available subcommands, use the following command:
Available Commands
The m3-download tool provides the following subcommands:
| Command | Description |
|---|---|
| `generate` | Download images & extract localizations as Pascal VOC |
| `filter` | Exclude specified concepts from Pascal VOC annotations |
| `remap-voc` | Remap concepts in Pascal VOC annotations |
| `remap-yolo` | Remap class IDs in YOLO annotations |
| `add-taxonomy` | Add taxonomic information to Pascal VOC annotations |
| `voc-to-yolo` | Convert Pascal VOC annotations to YOLO format |
| `yolo-to-voc` | Convert YOLO annotations to Pascal VOC format |
| `yolo-to-json` | Convert YOLO annotations to JSON format |
| `count-localizations` | Count localizations in Pascal VOC annotations |
| `correct-image-dimensions` | Correct image dimensions in Pascal VOC annotations |
| `find-above-iou-threshold` | Find localizations with IoU above a threshold |
| `dedup-voc` | Deduplicate Pascal VOC annotations |
For more detailed information on each command, see the dedicated documentation page or use:
Workflow Examples
Complete Workflow: Download to YOLO Format
# 1. Download data for Sebastes and its descendants
m3-download generate images/ voc_annotations/ --include-concept Sebastes --include-descendants
# 2. Filter out unwanted concepts
m3-download filter voc_annotations/ --exclude "unidentified rockfish" --output-dir filtered_voc/
# 3. Remap similar concepts
m3-download remap-voc remapping.csv filtered_voc/ --output-dir remapped_voc/
# 4. Convert to YOLO format for training
m3-download voc-to-yolo remapped_voc/ --output-dir yolo_dataset/
# 5. Optionally remap YOLO class IDs
m3-download remap-yolo yolo_mapping.csv yolo_dataset/ --output-dir final_yolo_dataset/
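The five steps above can be chained in a small wrapper script so that a failure in one step stops the rest. The following is a minimal sketch, not part of m3-download: `run_step`, `pipeline`, and the `DRY_RUN` variable are hypothetical helpers introduced here for illustration, and the commands themselves are copied unchanged from the steps above. By default it only prints each command; set `DRY_RUN=0` to actually execute them.

```shell
#!/bin/sh
# Sketch: run the workflow steps in order and stop at the first failure.
# run_step, pipeline and DRY_RUN are hypothetical helpers, not m3-download features.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

run_step() {
  # Echo the command, then run it unless this is a dry run.
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

pipeline() {
  run_step m3-download generate images/ voc_annotations/ --include-concept Sebastes --include-descendants
  run_step m3-download filter voc_annotations/ --exclude "unidentified rockfish" --output-dir filtered_voc/
  run_step m3-download remap-voc remapping.csv filtered_voc/ --output-dir remapped_voc/
  run_step m3-download voc-to-yolo remapped_voc/ --output-dir yolo_dataset/
  run_step m3-download remap-yolo yolo_mapping.csv yolo_dataset/ --output-dir final_yolo_dataset/
}

pipeline
```

Keeping each step's output in its own directory, as the original commands do, means a failed step can be rerun without repeating the download.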
Quality Control Workflow
# 1. Correct image dimensions in annotations
m3-download correct-image-dimensions voc_annotations/ images/
# 2. Find overlapping annotations
m3-download find-above-iou-threshold --threshold 0.5 voc_annotations/
# 3. Count annotations per class
m3-download count-localizations voc_annotations/ > class_distribution.txt
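As an independent cross-check on the per-class counts, the `<name>` elements in the Pascal VOC XML files can be tallied directly with standard tools. This is a rough sketch under stated assumptions: `count_voc_names` is a hypothetical helper (not an m3-download command), it assumes `<name>` appears only as the class label of each `<object>`, and its output format is not meant to match that of `count-localizations`.

```shell
# Sketch: tally <name> elements across the Pascal VOC XML files in a directory.
# count_voc_names is a hypothetical helper, not an m3-download command.
# Assumes <name> is used only for the class label of each <object>.
count_voc_names() {
  grep -oh '<name>[^<]*</name>' "$1"/*.xml \
    | sed -e 's/<name>//' -e 's/<\/name>//' \
    | sort \
    | uniq -c \
    | awk '{ n = $1; sub(/^[ \t]*[0-9]+[ \t]+/, ""); print $0 "\t" n }'
}

# Usage: count_voc_names voc_annotations/
# Prints one "class<TAB>count" line per distinct class name.
```

If these totals disagree with `class_distribution.txt`, that may point to annotation files that are malformed or that use `<name>` in other elements.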
Dataset Enhancement Workflow
# 1. Download base dataset
m3-download generate images/ voc_annotations/ --include-concept species_list.txt
# 2. Add taxonomic information
m3-download add-taxonomy voc_annotations/ --output-dir voc_with_taxonomy/
# 3. Convert to multiple formats
m3-download voc-to-yolo voc_with_taxonomy/ --output-dir yolo_dataset/
m3-download yolo-to-json yolo_dataset/ yolo_dataset/yolo.names 1920 1080 annotations.json
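YOLO labels store box centers and sizes as fractions of the image size, which is presumably why the `yolo-to-json` call above takes two numeric arguments. The sketch below assumes those are the image width and height in pixels, in that order (the source does not say so), and shows the standard conversion from a YOLO label line to pixel corner coordinates. `yolo_to_pixels` is a hypothetical helper for illustration, not the tool's implementation.

```shell
# Sketch: convert YOLO label lines ("class x_center y_center width height",
# all normalized to 0..1) into pixel corners "class xmin ymin xmax ymax".
# yolo_to_pixels is a hypothetical helper; width and height are assumed
# to be given in pixels.
yolo_to_pixels() {
  awk -v w="$1" -v h="$2" 'NF == 5 {
    bw = $4 * w                       # box width in pixels
    bh = $5 * h                       # box height in pixels
    cx = $2 * w                       # box center x in pixels
    cy = $3 * h                       # box center y in pixels
    printf "%d %.1f %.1f %.1f %.1f\n", $1, cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2
  }'
}

# Usage: yolo_to_pixels 1920 1080 < some_label_file.txt
```

A centered box covering a quarter of the width and half of the height of a 1920x1080 image, `0 0.5 0.5 0.25 0.5`, maps to corners (720, 270) and (1200, 810).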
Custom Filtering Pipeline
# 1. Download all data
m3-download generate images/ all_annotations/
# 2. Filter by concept
m3-download filter all_annotations/ --exclude "artifact" --output-dir no_artifacts/
# 3. Filter by overlapping annotations (create list first)
m3-download find-above-iou-threshold no_artifacts/ 0.7 > overlapping.txt
# 4. Process the overlapping files manually or with other tools
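When choosing a threshold such as 0.7, it helps to see how intersection over union (IoU) behaves for two axis-aligned boxes. The following is a small sketch of the textbook formula; `iou` is a hypothetical helper, and whether m3-download computes IoU in exactly this way (for example, with or without pixel-inclusive edges) is an assumption.

```shell
# Sketch: IoU of two boxes given as xmin ymin xmax ymax (box A, then box B).
# iou is a hypothetical helper using the textbook formula:
#   IoU = area(A intersect B) / (area(A) + area(B) - area(A intersect B))
iou() {
  awk -v ax1="$1" -v ay1="$2" -v ax2="$3" -v ay2="$4" \
      -v bx1="$5" -v by1="$6" -v bx2="$7" -v by2="$8" 'BEGIN {
    ix1 = (ax1 > bx1) ? ax1 : bx1     # intersection rectangle corners
    iy1 = (ay1 > by1) ? ay1 : by1
    ix2 = (ax2 < bx2) ? ax2 : bx2
    iy2 = (ay2 < by2) ? ay2 : by2
    iw = ix2 - ix1; if (iw < 0) iw = 0   # no overlap gives zero width
    ih = iy2 - iy1; if (ih < 0) ih = 0   # no overlap gives zero height
    inter = iw * ih
    uni = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    printf "%.4f\n", (uni > 0) ? inter / uni : 0
  }'
}

# Usage: iou 0 0 10 10 5 0 15 10
```

Two 10x10 boxes offset by half their width overlap in a 5x10 region, giving 50 / 150, or about 0.33, so they would fall below a 0.7 threshold even though they share half of each box.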
YOLO Dataset Manipulation Workflow
# 1. Convert VOC annotations to YOLO format
m3-download voc-to-yolo voc_annotations/ --output-dir yolo_dataset/
# 2. Remap YOLO class IDs (merge similar classes)
m3-download remap-yolo class_merge.csv yolo_dataset/ --output-dir merged_classes/
# 3. Convert back to VOC format (if needed)
m3-download yolo-to-voc merged_classes/ images/ voc_final/
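After merging classes, a quick sanity check is to compare the number of labels per class ID before and after the remap. The sketch below counts class IDs across YOLO label files. `count_yolo_classes` is a hypothetical helper, and it assumes the label files are `.txt` files directly inside the given directory, with one `class x_center y_center width height` line per box; the actual layout of the tool's output directory may differ.

```shell
# Sketch: count labels per class ID across the YOLO label files in a directory.
# count_yolo_classes is a hypothetical helper, not an m3-download command.
# Assumes *.txt label files sit directly in the directory, one box per line.
count_yolo_classes() {
  cat "$1"/*.txt \
    | awk 'NF == 5 && $1 ~ /^[0-9]+$/ { n[$1]++ }
           END { for (id in n) print id, n[id] }' \
    | sort -n
}

# Usage:
#   count_yolo_classes yolo_dataset/
#   count_yolo_classes merged_classes/
```

The total number of labels should be the same in both directories, and the counts for merged IDs should add up to the count of the ID they were merged into.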