
Triton Video Sampling

High-performance, diversity-aware frame sampling tool for autonomous underwater vehicle (AUV) midwater transect video. Extracts sparse, high-quality frames distributed across varied visual conditions for training computer vision models.

What is this?

When training computer vision models on underwater video, you need a representative dataset that captures the full range of visual conditions your model will encounter. However:

  • Temporal redundancy: Adjacent frames in video are nearly identical
  • Limited diversity: Long transects may have repetitive scenes
  • Quality issues: Motion blur, poor lighting, and featureless frames reduce training value

Triton Video Sampling solves this by:

  1. Filtering frames that fail quality checks (brightness, sharpness, information content)
  2. Selecting a diverse subset that spans the full range of visual conditions
  3. Enforcing temporal separation to avoid redundant similar frames

Key Features

  • ⚡ High Performance: Parallel processing with OpenMP and metric caching
  • 🎯 Quality Gates: Reject dark, blurry, or featureless frames
  • 🌈 Diversity Selection: 3D grid-based sampling ensures visual variety
  • ⏱️ Temporal Filtering: Enforces minimum time gaps between frames
  • 🎨 Interest Scoring: Prioritizes visually rich, sharp, dynamic content
  • 🔧 Calibration Tools: Tune thresholds to your data with quantitative feedback
  • 💾 Metric Caching: Hash-based caching for fast re-runs

Quick Start

1. Build

mkdir build && cd build
cmake ..
make -j$(nproc)

See Installation for detailed build instructions and dependencies.

2. Calibrate

Analyze a sample video to determine appropriate quality thresholds:

./calibrate /path/to/sample_video.mp4 --sample-fps 2.0

This reports the distribution of each metric and suggests threshold values for several target pass rates (80%, 60%, 40%, 20%). For a single gate, a threshold at a metric's 20th percentile passes roughly 80% of sampled frames; combining gates passes fewer.

3. Sample

Extract diverse frames from your video collection:

./sample \
  --root-dir /path/to/videos \
  --camera 1 \
  --max-frames 5000 \
  --min-brightness 12 \
  --min-sharpness 15 \
  --min-entropy 2.5 \
  --output-dir ./frames

See Usage for complete command-line reference.

Algorithm Overview

The sampler uses a three-pass pipeline:

Pass 1: Quality Filtering

For each video:

  1. Sample frames at --sample-fps (e.g., 1.0 = once per second)
  2. Compute four quality metrics (see the sketch after this list):
    • Brightness: Mean pixel value (0–255)
    • Sharpness: Laplacian variance (focus quality)
    • Entropy: Shannon entropy in bits (information content)
    • Motion: Frame-to-frame pixel difference
  3. Reject frames failing any quality gate
  4. Cache metrics for future runs
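
The library's internals aren't shown here, but as a rough sketch, the four metrics in step 2 could be computed with OpenCV along these lines (all helper names below are hypothetical, not the tool's actual API):

#include <opencv2/opencv.hpp>
#include <cmath>

// Brightness: mean pixel value of the grayscale frame (0-255).
double brightnessOf(const cv::Mat& gray) {
    return cv::mean(gray)[0];
}

// Sharpness: variance of the Laplacian response (higher = better focus).
double sharpnessOf(const cv::Mat& gray) {
    cv::Mat lap;
    cv::Laplacian(gray, lap, CV_64F);
    cv::Scalar mu, sigma;
    cv::meanStdDev(lap, mu, sigma);
    return sigma[0] * sigma[0];
}

// Entropy: Shannon entropy of the intensity histogram, in bits.
double entropyOf(const cv::Mat& gray) {
    int channels[] = {0}, histSize[] = {256};
    float range[] = {0, 256};
    const float* ranges[] = {range};
    cv::Mat hist;
    cv::calcHist(&gray, 1, channels, cv::Mat(), hist, 1, histSize, ranges);
    hist /= static_cast<double>(gray.total());  // bin counts -> probabilities
    double h = 0.0;
    for (int i = 0; i < 256; ++i) {
        float p = hist.at<float>(i);
        if (p > 0.0f) h -= p * std::log2(p);
    }
    return h;
}

// Motion: mean absolute difference against the previous sampled frame
// (one plausible reading of "frame-to-frame pixel difference").
double motionOf(const cv::Mat& gray, const cv::Mat& prevGray) {
    cv::Mat diff;
    cv::absdiff(gray, prevGray, diff);
    return cv::mean(diff)[0];
}

Each frame's metrics are compared against the --min-* gates, and the computed values can be cached keyed on the video file so re-runs skip decoding.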

Pass 2: Diversity Selection

  1. Normalize brightness, log(sharpness), entropy to [0,1]
  2. Bin frames into 3D grid (default: 8³ = 512 cells)
  3. Rank frames within each cell by interest score: \(\text{score} = \text{entropy} \times \ln(1 + \text{sharpness}) \times (1 + \text{motion})\)
  4. Select top N frames per cell
  5. If the total exceeds the budget, keep the highest-scoring frames globally (steps 2–4 are sketched below)
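
A compact sketch of steps 2–4, with hypothetical types and names (the real tool's data structures may differ):

#include <algorithm>
#include <array>
#include <cmath>
#include <map>
#include <vector>

struct Frame {
    double nBright, nLogSharp, nEntropy;  // normalized to [0,1] for binning
    double entropy, sharpness, motion;    // raw values for scoring
};

// Map a normalized value in [0,1] to one of `bins` cells per axis.
int binOf(double v, int bins) {
    return std::min(bins - 1, static_cast<int>(v * bins));
}

double interestScore(const Frame& f) {
    return f.entropy * std::log1p(f.sharpness) * (1.0 + f.motion);
}

std::vector<Frame> selectDiverse(std::vector<Frame> frames, int bins, int perCell) {
    // Step 2: bin frames into the 3D grid (bins^3 cells).
    std::map<std::array<int, 3>, std::vector<Frame>> grid;
    for (const auto& f : frames) {
        grid[{binOf(f.nBright, bins), binOf(f.nLogSharp, bins),
              binOf(f.nEntropy, bins)}].push_back(f);
    }
    // Steps 3-4: rank within each cell by interest score, keep the top N.
    std::vector<Frame> selected;
    for (auto& [cell, members] : grid) {
        std::sort(members.begin(), members.end(),
                  [](const Frame& a, const Frame& b) {
                      return interestScore(a) > interestScore(b);
                  });
        for (int i = 0; i < perCell && i < static_cast<int>(members.size()); ++i)
            selected.push_back(members[i]);
    }
    return selected;  // step 5 trims to the global budget by score if needed
}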

Temporal filtering enforces --min-gap seconds between frames from the same clip.
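
A minimal sketch of that rule, assuming each candidate carries a clip ID and a timestamp in seconds (names hypothetical):

#include <algorithm>
#include <vector>

struct Pick { int clipId; double timeSec; };

std::vector<Pick> enforceMinGap(std::vector<Pick> picks, double minGapSec) {
    // Process each clip in time order; keep a frame only if it falls at
    // least minGapSec after the previously kept frame from the same clip.
    std::sort(picks.begin(), picks.end(), [](const Pick& a, const Pick& b) {
        return a.clipId != b.clipId ? a.clipId < b.clipId : a.timeSec < b.timeSec;
    });
    std::vector<Pick> kept;
    for (const auto& p : picks) {
        if (!kept.empty() && kept.back().clipId == p.clipId &&
            p.timeSec - kept.back().timeSec < minGapSec)
            continue;  // too close to the last kept frame in this clip
        kept.push_back(p);
    }
    return kept;
}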

Pass 3: Frame Extraction

Extract the selected frames as PNG/JPEG, naming each file with an ISO 8601 timestamp:

Triton43_Cam1_20250904T120315Z_0001234.png
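
The example name encodes the vehicle, camera number, UTC capture time, and what appears to be a zero-padded frame index. As an illustration of the pattern (not the tool's actual code), such a name could be assembled like this:

#include <cstdio>
#include <string>

// <vehicle>_Cam<N>_<ISO 8601 basic UTC timestamp>_<7-digit frame index>.png
std::string frameFilename(const std::string& vehicle, int camera,
                          const std::string& isoUtc, long frameIndex) {
    char buf[128];
    std::snprintf(buf, sizeof(buf), "%s_Cam%d_%s_%07ld.png",
                  vehicle.c_str(), camera, isoUtc.c_str(), frameIndex);
    return buf;
}

// frameFilename("Triton43", 1, "20250904T120315Z", 1234)
//   -> "Triton43_Cam1_20250904T120315Z_0001234.png"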

See Algorithm for in-depth explanation of each component.

When to Use This Tool

Good for:

  • Training object detection, segmentation, or classification models
  • Creating balanced datasets from hours of video
  • Ensuring model sees diverse lighting, turbidity, and scene types
  • Rapid prototyping with representative subsets

Not ideal for:

  • Temporal modeling (LSTM, video transformers) — need sequential frames
  • Tracking or optical flow — need consecutive frames
  • Exhaustive annotation — better to use active learning

Documentation Structure

  • Installation: Build requirements and setup
  • Usage: Command-line reference for sample and calibrate
  • Algorithm: In-depth explanation of metric computation and diversity selection
  • Tuning Guide: How to calibrate thresholds and optimize for your data
  • Troubleshooting: Common issues and solutions
  • API Reference: Core library functions (for developers)

Example Workflow

# Step 1: Calibrate on representative footage
./calibrate /data/Triton43_Cam1_sample.mp4

# Output shows:
#   brightness:  min=8   p5=15   median=85   p95=201  max=243
#   sharpness:   min=2   p5=12   median=42   p95=187  max=324
#   entropy:     min=1.2 p5=2.1  median=3.8  p95=5.2  max=6.1

# Step 2: Choose thresholds that pass ~60% of frames
./sample \
  --root-dir /data/Triton_20250904 \
  --camera 1 \
  --sample-fps 1.0 \
  --max-frames 5000 \
  --min-brightness 15 \
  --min-sharpness 12 \
  --min-entropy 2.5 \
  --output-dir ./training_data \
  --jobs 8

# Output: up to 5000 diverse, high-quality frames