# Triton Video Sampling
High-performance, diversity-aware frame sampling tool for autonomous underwater vehicle (AUV) midwater transect video. Extracts sparse, high-quality frames distributed across varied visual conditions for training computer vision models.
## What is this?
When training computer vision models on underwater video, you need a representative dataset that captures the full range of visual conditions your model will encounter. However:
- Temporal redundancy: Adjacent frames in video are nearly identical
- Limited diversity: Long transects may have repetitive scenes
- Quality issues: Motion blur, poor lighting, and featureless frames reduce training value
Triton Video Sampling solves this by:
- Filtering frames that fail quality checks (brightness, sharpness, information content)
- Selecting a diverse subset that spans the full range of visual conditions
- Enforcing temporal separation to avoid redundant similar frames
## Key Features
- ⚡ High Performance: Parallel processing with OpenMP, metric caching
- 🎯 Quality Gates: Reject dark, blurry, or featureless frames
- 🌈 Diversity Selection: 3D grid-based sampling ensures visual variety
- ⏱️ Temporal Filtering: Enforces minimum time gaps between frames
- 🎨 Interest Scoring: Prioritizes visually rich, sharp, dynamic content
- 🔧 Calibration Tools: Tune thresholds to your data with quantitative feedback
- 💾 Metric Caching: Hash-based caching for fast re-runs
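The hash-based metric cache can be pictured roughly like this (a Python sketch under assumed behavior; the tool's actual cache key scheme and storage format are its own, and the function names here are illustrative):

```python
# Illustrative sketch of hash-based metric caching: key the cache on the
# video's identity plus the sampling rate, so any change invalidates it.
import hashlib
import json
import os

def cache_key(video_path: str, sample_fps: float) -> str:
    """Derive a stable cache key from file identity and sample rate."""
    stat = os.stat(video_path)
    payload = f"{video_path}|{stat.st_size}|{stat.st_mtime_ns}|{sample_fps}"
    return hashlib.sha256(payload.encode()).hexdigest()

def load_cached_metrics(cache_dir: str, key: str):
    """Return cached per-frame metrics, or None on a cache miss."""
    path = os.path.join(cache_dir, key + ".json")
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return None
```

Keying on size and mtime (rather than hashing the full video) keeps re-runs fast even on multi-gigabyte transects.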
## Quick Start
### 1. Build
See Installation for detailed build instructions and dependencies.
### 2. Calibrate
Analyze a sample video to determine appropriate quality thresholds:
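A typical invocation (the video path is illustrative; see the Example Workflow for a concrete one):

```shell
./calibrate /path/to/sample_video.mp4
```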
This reports metric distributions and suggests thresholds for different pass rates (80%, 60%, 40%, 20%).
### 3. Sample
Extract diverse frames from your video collection:
```shell
./sample \
    --root-dir /path/to/videos \
    --camera 1 \
    --max-frames 5000 \
    --min-brightness 12 \
    --min-sharpness 15 \
    --min-entropy 2.5 \
    --output-dir ./frames
```
See Usage for complete command-line reference.
## Algorithm Overview
The sampler uses a three-pass pipeline:
### Pass 1: Quality Filtering
For each video:
- Sample frames at `--sample-fps` (e.g., 1.0 = once per second)
- Compute four quality metrics:
    - Brightness: Mean pixel value (0–255)
    - Sharpness: Laplacian variance (focus quality)
    - Entropy: Shannon entropy in bits (information content)
    - Motion: Frame-to-frame pixel difference
- Reject frames failing any quality gate
- Cache metrics for future runs
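The four metrics above can be sketched in a few lines of NumPy on a grayscale frame (an illustration, not the tool's implementation; function names are ours):

```python
# Illustrative per-frame quality metrics on a grayscale uint8 frame.
import numpy as np

def brightness(gray: np.ndarray) -> float:
    """Mean pixel value, 0-255."""
    return float(gray.mean())

def sharpness(gray: np.ndarray) -> float:
    """Variance of a 4-neighbour Laplacian, a standard focus proxy."""
    lap = (-4.0 * gray
           + np.roll(gray, 1, axis=0) + np.roll(gray, -1, axis=0)
           + np.roll(gray, 1, axis=1) + np.roll(gray, -1, axis=1))
    return float(lap.var())

def entropy(gray: np.ndarray) -> float:
    """Shannon entropy in bits over the 256-bin intensity histogram."""
    hist = np.bincount(gray.astype(np.uint8).ravel(), minlength=256)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def motion(gray: np.ndarray, prev_gray: np.ndarray) -> float:
    """Mean absolute frame-to-frame pixel difference."""
    return float(np.abs(gray.astype(float) - prev_gray.astype(float)).mean())
```

A featureless frame (uniform intensity) scores zero on sharpness and entropy, which is exactly why those gates catch empty midwater shots.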
### Pass 2: Diversity Selection
- Normalize brightness, log(sharpness), entropy to [0,1]
- Bin frames into 3D grid (default: 8³ = 512 cells)
- Rank frames within each cell by interest score: \(\text{score} = \text{entropy} \times \ln(1 + \text{sharpness}) \times (1 + \text{motion})\)
- Select top N frames per cell
- If total exceeds budget, keep highest-scoring frames globally
Temporal filtering enforces `--min-gap` seconds between frames from the same clip.
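The binning, per-cell ranking, and temporal filtering above can be sketched as follows (simplified: metrics are assumed pre-normalized to [0,1], frames come as one flat list with timestamps, and the gap is enforced globally rather than per clip; the real implementation also applies the global budget):

```python
# Simplified sketch of diversity selection: 3D grid binning, per-cell
# ranking by interest score, then a greedy minimum-gap temporal filter.
import math
from collections import defaultdict

def select_diverse(frames, bins=8, per_cell=1, min_gap=0.0):
    """frames: dicts with 'brightness', 'sharpness', 'entropy' in [0,1],
    plus 'motion' and a 'time' in seconds."""
    def cell(f):
        # Map each normalized metric to a bin index in [0, bins-1]
        return tuple(min(int(f[k] * bins), bins - 1)
                     for k in ("brightness", "sharpness", "entropy"))

    def score(f):
        # Interest score from the formula above
        return f["entropy"] * math.log1p(f["sharpness"]) * (1 + f["motion"])

    grid = defaultdict(list)
    for f in frames:
        grid[cell(f)].append(f)

    # Keep the top-scoring frames in each occupied cell
    chosen = []
    for members in grid.values():
        members.sort(key=score, reverse=True)
        chosen.extend(members[:per_cell])

    # Greedy temporal pass: drop frames closer than min_gap seconds
    # to the previously kept frame.
    chosen.sort(key=lambda f: f["time"])
    kept, last_t = [], None
    for f in chosen:
        if last_t is None or f["time"] - last_t >= min_gap:
            kept.append(f)
            last_t = f["time"]
    return kept
```

Because ranking happens per cell, a long stretch of visually similar frames competes only with itself and cannot crowd out rarer conditions elsewhere in the grid.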
### Pass 3: Frame Extraction
Extract selected frames as PNG/JPEG with ISO timestamp filenames:
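For illustration, a timestamped name could be built like this (a hypothetical scheme: the tool's exact filename format is described in the Algorithm page, and `frame_filename` is our own helper):

```python
# Hypothetical ISO-style filename builder: clip start time plus the
# frame's offset into the clip, in a filesystem-safe compact ISO form.
from datetime import datetime, timedelta, timezone

def frame_filename(clip_start: datetime, offset_s: float, ext: str = "png") -> str:
    t = clip_start + timedelta(seconds=offset_s)
    # Colons are not filesystem-safe everywhere, so use the basic ISO form
    return t.strftime("%Y%m%dT%H%M%S") + f".{ext}"

start = datetime(2025, 9, 4, 14, 30, 0, tzinfo=timezone.utc)
print(frame_filename(start, 12.0))  # 20250904T143012.png
```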
See Algorithm for in-depth explanation of each component.
## When to Use This Tool
Good for:
- Training object detection, segmentation, or classification models
- Creating balanced datasets from hours of video
- Ensuring model sees diverse lighting, turbidity, and scene types
- Rapid prototyping with representative subsets
Not ideal for:
- Temporal modeling (LSTM, video transformers) — need sequential frames
- Tracking or optical flow — need consecutive frames
- Exhaustive annotation — better to use active learning
## Documentation Structure
- Installation: Build requirements and setup
- Usage: Command-line reference for `sample` and `calibrate`
- Algorithm: In-depth explanation of metric computation and diversity selection
- Tuning Guide: How to calibrate thresholds and optimize for your data
- Troubleshooting: Common issues and solutions
- API Reference: Core library functions (for developers)
## Example Workflow
```shell
# Step 1: Calibrate on representative footage
./calibrate /data/Triton43_Cam1_sample.mp4
# Output shows:
#   brightness: min=8   p5=15  median=85  p95=201 max=243
#   sharpness:  min=2   p5=12  median=42  p95=187 max=324
#   entropy:    min=1.2 p5=2.1 median=3.8 p95=5.2 max=6.1

# Step 2: Choose thresholds that pass ~60% of frames
./sample \
    --root-dir /data/Triton_20250904 \
    --camera 1 \
    --sample-fps 1.0 \
    --max-frames 5000 \
    --min-brightness 15 \
    --min-sharpness 12 \
    --min-entropy 2.5 \
    --output-dir ./training_data \
    --jobs 8
# Output: 5000 diverse, high-quality frames
```