# Triton Video Sampling
High-performance, diversity-aware frame sampling tool for autonomous underwater vehicle (AUV) midwater transect video. Extracts sparse, high-quality frames distributed across varied visual conditions for training computer vision models.
## What is this?
When training computer vision models on underwater video, you need a representative dataset that captures the full range of visual conditions your model will encounter. However:
- Temporal redundancy: Adjacent frames in video are nearly identical
- Limited diversity: Long transects may have repetitive scenes
- Quality issues: Motion blur, poor lighting, and featureless frames reduce training value
Triton Video Sampling solves this by:
- Filtering frames that fail quality checks (brightness, sharpness, information content)
- Selecting a diverse subset that spans the full range of visual conditions
- Enforcing temporal separation to avoid redundant similar frames
## Key Features
- ⚡ High Performance: Parallel processing with OpenMP, metric caching
- 🎯 Quality Gates: Reject dark, blurry, or featureless frames
- 🌈 Diversity Selection: 3D grid-based sampling ensures visual variety
- ⏱️ Temporal Filtering: Enforces minimum time gaps between frames
- 🎨 Interest Scoring: Prioritizes visually rich, sharp, dynamic content
- 🔧 Calibration Tools: Tune thresholds to your data with quantitative feedback
- 💾 Metric Caching: Hash-based caching for fast re-runs
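The hash-based metric cache can be pictured roughly like this (a Python sketch under assumed behavior; the tool's actual cache key scheme and storage format are its own, and the function names here are illustrative):

```python
# Illustrative sketch of hash-based metric caching: key the cache on the
# video's identity plus the sampling rate, so any change invalidates it.
import hashlib
import json
import os

def cache_key(video_path: str, sample_fps: float) -> str:
    """Derive a stable cache key from file identity and sample rate."""
    stat = os.stat(video_path)
    payload = f"{video_path}|{stat.st_size}|{stat.st_mtime_ns}|{sample_fps}"
    return hashlib.sha256(payload.encode()).hexdigest()

def load_cached_metrics(cache_dir: str, key: str):
    """Return cached per-frame metrics, or None on a cache miss."""
    path = os.path.join(cache_dir, key + ".json")
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return None
```

Keying on size and mtime (rather than hashing the full video) keeps re-runs fast even on multi-gigabyte transects.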
## Quick Start
### 1. Build
See Installation for detailed build instructions and dependencies.
### 2. Calibrate
Analyze a sample video to determine appropriate quality thresholds:
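A typical invocation (the video path is illustrative; see the Example Workflow for a concrete one):

```shell
./calibrate /path/to/sample_video.mp4
```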
This reports metric distributions and suggests thresholds for different pass rates (80%, 60%, 40%, 20%).
### 3. Sample
Extract diverse frames from your video collection:
```shell
./sample \
    --root-dir /path/to/videos \
    --camera 1 \
    --max-frames 5000 \
    --min-brightness 12 \
    --min-sharpness 15 \
    --min-entropy 2.5 \
    --output-dir ./frames
```
See Usage for complete command-line reference.
## Algorithm Overview
The sampler uses a three-pass pipeline:
### Pass 1: Quality Filtering
For each video:
- Sample frames at `--sample-fps` (e.g., 1.0 = once per second)
- Compute four quality metrics:
    - Brightness: Mean pixel value (0–255)
    - Sharpness: Laplacian variance (focus quality)
    - Entropy: Shannon entropy in bits (information content)
    - Motion: Frame-to-frame pixel difference
- Reject frames failing any quality gate
- Cache metrics for future runs
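The four metrics above can be sketched in a few lines of NumPy on a grayscale frame (an illustration, not the tool's implementation; function names are ours):

```python
# Illustrative per-frame quality metrics on a grayscale uint8 frame.
import numpy as np

def brightness(gray: np.ndarray) -> float:
    """Mean pixel value, 0-255."""
    return float(gray.mean())

def sharpness(gray: np.ndarray) -> float:
    """Variance of a 4-neighbour Laplacian, a standard focus proxy."""
    lap = (-4.0 * gray
           + np.roll(gray, 1, axis=0) + np.roll(gray, -1, axis=0)
           + np.roll(gray, 1, axis=1) + np.roll(gray, -1, axis=1))
    return float(lap.var())

def entropy(gray: np.ndarray) -> float:
    """Shannon entropy in bits over the 256-bin intensity histogram."""
    hist = np.bincount(gray.astype(np.uint8).ravel(), minlength=256)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def motion(gray: np.ndarray, prev_gray: np.ndarray) -> float:
    """Mean absolute frame-to-frame pixel difference."""
    return float(np.abs(gray.astype(float) - prev_gray.astype(float)).mean())
```

A featureless frame (uniform intensity) scores zero on sharpness and entropy, which is exactly why those gates catch empty midwater shots.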
### Pass 2: Diversity Selection
- Normalize brightness, log(sharpness), entropy to [0,1]
- Bin frames into 3D grid (default: 8³ = 512 cells)
- Rank frames within each cell by interest score: \(\text{score} = \text{entropy} \times \ln(1 + \text{sharpness}) \times (1 + \text{motion})\)
- Select top N frames per cell
- If total exceeds budget, keep highest-scoring frames globally
Temporal filtering enforces `--min-gap` seconds between frames from the same clip.
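The binning, per-cell ranking, and temporal filtering above can be sketched as follows (simplified: metrics are assumed pre-normalized to [0,1], frames come as one flat list with timestamps, and the gap is enforced globally rather than per clip; the real implementation also applies the global budget):

```python
# Simplified sketch of diversity selection: 3D grid binning, per-cell
# ranking by interest score, then a greedy minimum-gap temporal filter.
import math
from collections import defaultdict

def select_diverse(frames, bins=8, per_cell=1, min_gap=0.0):
    """frames: dicts with 'brightness', 'sharpness', 'entropy' in [0,1],
    plus 'motion' and a 'time' in seconds."""
    def cell(f):
        # Map each normalized metric to a bin index in [0, bins-1]
        return tuple(min(int(f[k] * bins), bins - 1)
                     for k in ("brightness", "sharpness", "entropy"))

    def score(f):
        # Interest score from the formula above
        return f["entropy"] * math.log1p(f["sharpness"]) * (1 + f["motion"])

    grid = defaultdict(list)
    for f in frames:
        grid[cell(f)].append(f)

    # Keep the top-scoring frames in each occupied cell
    chosen = []
    for members in grid.values():
        members.sort(key=score, reverse=True)
        chosen.extend(members[:per_cell])

    # Greedy temporal pass: drop frames closer than min_gap seconds
    # to the previously kept frame.
    chosen.sort(key=lambda f: f["time"])
    kept, last_t = [], None
    for f in chosen:
        if last_t is None or f["time"] - last_t >= min_gap:
            kept.append(f)
            last_t = f["time"]
    return kept
```

Because ranking happens per cell, a long stretch of visually similar frames competes only with itself and cannot crowd out rarer conditions elsewhere in the grid.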
### Pass 3: Frame Extraction
Extract selected frames as PNG/JPEG with ISO timestamp filenames:
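For illustration, a timestamped name could be built like this (a hypothetical scheme: the tool's exact filename format is described in the Algorithm page, and `frame_filename` is our own helper):

```python
# Hypothetical ISO-style filename builder: clip start time plus the
# frame's offset into the clip, in a filesystem-safe compact ISO form.
from datetime import datetime, timedelta, timezone

def frame_filename(clip_start: datetime, offset_s: float, ext: str = "png") -> str:
    t = clip_start + timedelta(seconds=offset_s)
    # Colons are not filesystem-safe everywhere, so use the basic ISO form
    return t.strftime("%Y%m%dT%H%M%S") + f".{ext}"

start = datetime(2025, 9, 4, 14, 30, 0, tzinfo=timezone.utc)
print(frame_filename(start, 12.0))  # 20250904T143012.png
```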
See Algorithm for in-depth explanation of each component.
## When to Use This Tool
Good for:
- Training object detection, segmentation, or classification models
- Creating balanced datasets from hours of video
- Ensuring model sees diverse lighting, turbidity, and scene types
- Rapid prototyping with representative subsets
Not ideal for:
- Temporal modeling (LSTM, video transformers) — need sequential frames
- Tracking or optical flow — need consecutive frames
- Exhaustive annotation — better to use active learning
## Documentation Structure
- Installation: Build requirements and setup
- Usage: Command-line reference for `sample` and `calibrate`
- Algorithm: In-depth explanation of metric computation and diversity selection
- Tuning Guide: How to calibrate thresholds and optimize for your data
- Troubleshooting: Common issues and solutions
- API Reference: Core library functions (for developers)
## Example Workflow
```shell
# Step 1: Calibrate on representative footage
./calibrate /data/Triton43_Cam1_sample.mp4
# Output shows:
#   brightness: min=8   p5=15  median=85  p95=201 max=243
#   sharpness:  min=2   p5=12  median=42  p95=187 max=324
#   entropy:    min=1.2 p5=2.1 median=3.8 p95=5.2 max=6.1

# Step 2: Choose thresholds that pass ~60% of frames
./sample \
    --root-dir /data/Triton_20250904 \
    --camera 1 \
    --sample-fps 1.0 \
    --max-frames 5000 \
    --min-brightness 15 \
    --min-sharpness 12 \
    --min-entropy 2.5 \
    --output-dir ./training_data \
    --jobs 8
# Output: 5000 diverse, high-quality frames
```