Tuning Guide

This guide helps you optimize sampler parameters for your specific dataset. The goal is to maximize quality and diversity while meeting your frame budget (--max-frames).

Overview

The sampler has two categories of parameters:

  1. Quality thresholds: Define minimum standards (brightness, sharpness, entropy)
  2. Diversity controls: Define how frames are selected (n-bins, max-per-cell, min-gap)

Workflow:

  1. Run calibrate on representative footage → understand metric distributions
  2. Choose quality thresholds → balance pass rate and quality
  3. Choose diversity parameters → balance coverage and redundancy
  4. Run sample on full dataset → inspect results
  5. Iterate if needed

Step 1: Calibration

Run the Calibrate Tool

./calibrate /path/to/sample_video.mp4 \
  --sample-fps 2.0 \
  --min-brightness 10 \
  --max-brightness 245 \
  --min-sharpness 10 \
  --min-entropy 2.0

Choose a representative clip:

  • ✅ Includes variety of conditions (lighting, subjects, motion)
  • ✅ ~5–15 minutes duration (enough frames for statistics)
  • ❌ An outlier clip (e.g., all murky water or all bright shallow water)

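Initial thresholds: Set permissive values (around the 10th percentile you expect for each metric, as in the command above) so that few frames are rejected and the output shows the full distribution.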

Interpret Output

Metric Distributions

Metric distributions (all examined frames):
  brightness    min=8.2   p5=18.3   median=87.2   p95=201.4  max=243.1
  sharpness     min=3.1   p5=12.7   median=45.3   p95=187.9  max=321.5
  entropy       min=1.3   p5=2.2    median=3.9    p95=5.3    max=6.2
  motion        min=0.5   p5=1.8    median=8.4    p95=28.7   max=89.3

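Key percentiles (when used as the minimum threshold of a single gate):

  • p5 (5th percentile): ~95% of frames pass that gate
  • p50 (median): ~50% of frames pass that gate
  • p95 (95th percentile): ~5% of frames pass that gate

The combined pass rate is lower than any single gate's rate, because a frame must clear every gate. With all gates set near p5, roughly 80% of frames pass overall.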

Example: If sharpness p5=12.7, setting --min-sharpness 12.7 rejects the blurriest 5% of frames.
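
To check how a candidate threshold translates into a pass rate on your own numbers, the relationship can be sketched in a few lines of C++. This is only an illustration: `pass_rate` is a hypothetical helper and not part of calibrate, and it assumes a "min" gate keeps values greater than or equal to the threshold.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Fraction of frames whose metric value is >= min_threshold,
// i.e. the pass rate of a single "min" gate.
// Hypothetical helper for illustration; not part of calibrate.
double pass_rate(const std::vector<double>& values, double min_threshold) {
    assert(!values.empty());
    const auto passed = std::count_if(
        values.begin(), values.end(),
        [min_threshold](double v) { return v >= min_threshold; });
    return static_cast<double>(passed) / static_cast<double>(values.size());
}
```

For sharpness values {5, 10, 15, 20}, a threshold of 10 keeps three of the four frames, so `pass_rate` returns 0.75.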

Per-Gate Rejection

Per-gate rejection counts:
  min-brightness (≥10):    125 / 1200 rejected  (10.4%)
  max-brightness (≤245):   43 / 1200 rejected   (3.6%)
  min-sharpness (≥10):     89 / 1200 rejected   (7.4%)
  min-entropy (≥2.0):      156 / 1200 rejected  (13.0%)

Combined quality filter:   312 / 1200 rejected  (26.0%)
                          888 / 1200 passed     (74.0%)

What to look for:

  • High rejection (>50% by one gate): Either that metric is genuinely poor in your data, or the threshold is too strict
  • Low rejection (<5% by all gates): Thresholds are too permissive and are not filtering effectively
  • Target: 40–60% combined pass rate (balanced filtering)
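
Note that the per-gate counts in the sample output add up to 413 (125 + 43 + 89 + 156) while the combined filter rejects only 312 frames: a frame that fails several gates is counted once in the combined figure. The sketch below shows that logic under the assumption that a frame must clear all four gates; the struct and function names are hypothetical and do not come from the tool's source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-frame metrics; field names are illustrative.
struct FrameMetrics {
    double brightness;
    double sharpness;
    double entropy;
};

// Hypothetical threshold set mirroring the four command-line gates.
struct Thresholds {
    double min_brightness;
    double max_brightness;
    double min_sharpness;
    double min_entropy;
};

// A frame passes only if it clears every gate.
bool passes_all_gates(const FrameMetrics& f, const Thresholds& t) {
    assert(t.min_brightness <= t.max_brightness);
    return f.brightness >= t.min_brightness &&
           f.brightness <= t.max_brightness &&
           f.sharpness >= t.min_sharpness &&
           f.entropy >= t.min_entropy;
}

// Number of frames rejected by at least one gate. A frame that fails
// several gates is still counted once.
std::size_t count_rejected(const std::vector<FrameMetrics>& frames,
                           const Thresholds& t) {
    std::size_t rejected = 0;
    for (const auto& f : frames) {
        if (!passes_all_gates(f, t)) ++rejected;
    }
    return rejected;
}
```

A frame that is dark, blurry, and featureless at once adds one to each of three per-gate counts but only one to the combined count.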

Threshold Recommendations

Threshold recommendations (for target pass rates):

  80% pass:  --min-brightness 18  --max-brightness 220  --min-sharpness 12  --min-entropy 2.2
  60% pass:  --min-brightness 28  --max-brightness 205  --min-sharpness 25  --min-entropy 2.8
  40% pass:  --min-brightness 42  --max-brightness 185  --min-sharpness 48  --min-entropy 3.5
  20% pass:  --min-brightness 68  --max-brightness 160  --min-sharpness 92  --min-entropy 4.3

How to use:

  • Choose a row based on desired selectivity (60% is typical)
  • Adjust individual thresholds based on per-gate rejection (e.g., relax sharpness if too strict)

Step 2: Choose Quality Thresholds

Brightness

What it measures: Mean pixel value (0–255 scale)

When to adjust:

  • Dark environment (deep water, night): Lower --min-brightness (e.g., 10–20)
  • Bright environment (shallow water, daylight): Raise --min-brightness (e.g., 30–50)
  • Overexposure issues: Lower --max-brightness (e.g., 200–220)

Recommendations:

Environment                min-brightness   max-brightness
Deep ocean (>200m)         10–20            230
Mid-water (50–200m)        20–40            220
Shallow (<50m), sunlight   40–80            200
Artificial lighting        50–100           210

Visual check: Run calibrate and examine the rejected frames. If many look acceptable, relax the threshold that rejected them (lower --min-brightness or raise --max-brightness).

Sharpness

What it measures: Laplacian variance (focus quality)

When to adjust:

  • Frequent blur (fast-moving vehicle, turbulent water): Lower --min-sharpness (e.g., 10–20)
  • High-quality optics (professional camera, stable platform): Raise --min-sharpness (e.g., 30–50)
  • Training on blurry data: Lower threshold to match test distribution

Recommendations:

Camera Quality            min-sharpness
Low-res or fast-moving    10–20
Standard AUV camera       20–40
High-res or slow-moving   40–80

Visual check: Inspect frames near the threshold. If they look acceptably sharp, threshold is good. If obviously blurry, raise it.
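
For intuition about the Laplacian variance named above, here is a minimal sketch. It assumes an 8-bit grayscale image in row-major order and the common 4-neighbour Laplacian kernel, evaluated on interior pixels only; the sampler's actual kernel, border handling, and scaling may differ, and `laplacian_variance` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Variance of the 4-neighbour Laplacian response over the interior
// pixels of a row-major 8-bit grayscale image. Sharp edges produce
// large positive and negative responses, hence a large variance.
// Assumption: illustrative only; the tool's kernel may differ.
double laplacian_variance(const std::vector<std::uint8_t>& img,
                          std::size_t width, std::size_t height) {
    assert(width >= 3 && height >= 3 && img.size() == width * height);
    double sum = 0.0;
    double sum_sq = 0.0;
    std::size_t n = 0;
    for (std::size_t y = 1; y + 1 < height; ++y) {
        for (std::size_t x = 1; x + 1 < width; ++x) {
            const double center = img[y * width + x];
            const double neighbours =
                static_cast<double>(img[(y - 1) * width + x]) +
                static_cast<double>(img[(y + 1) * width + x]) +
                static_cast<double>(img[y * width + x - 1]) +
                static_cast<double>(img[y * width + x + 1]);
            const double lap = neighbours - 4.0 * center;
            sum += lap;
            sum_sq += lap * lap;
            ++n;
        }
    }
    const double count = static_cast<double>(n);
    const double mean = sum / count;
    return sum_sq / count - mean * mean;
}
```

A uniformly gray image yields a variance of 0, while an image with strong pixel-to-pixel contrast yields a large value, which is why low values are treated as blur.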

Entropy

What it measures: Shannon entropy in bits (information content)

When to adjust:

  • Featureless environment (open ocean, sparse plankton): Lower --min-entropy (e.g., 2.0–2.5)
  • Feature-rich environment (reef, particle clouds): Raise --min-entropy (e.g., 3.0–4.0)
  • Goal is rare events: Lower threshold to avoid rejecting sparse interesting frames

Recommendations:

Environment                       min-entropy
Open ocean (blue water)           2.0–2.5
Mid-water (plankton, particles)   2.5–3.5
Benthic (seafloor, organisms)     3.0–4.5

Visual check: Low-entropy frames are uniform (nearly solid color). If many rejected frames look visually rich, lower the threshold.
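
To make the entropy scale concrete, the sketch below computes Shannon entropy in bits from a 256-bin intensity histogram. It assumes 8-bit grayscale pixels, which bounds the result between 0 and 8 bits; whether the sampler uses exactly this histogram is an assumption, and `shannon_entropy_bits` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Shannon entropy (in bits) of the 256-bin intensity histogram of an
// 8-bit grayscale image: H = -sum(p_i * log2(p_i)) over non-empty bins.
// Assumption: illustrative only; the tool's binning may differ.
double shannon_entropy_bits(const std::vector<std::uint8_t>& pixels) {
    assert(!pixels.empty());
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t p : pixels) ++hist[p];

    const double n = static_cast<double>(pixels.size());
    double entropy = 0.0;
    for (std::size_t count : hist) {
        if (count == 0) continue;  // empty bins contribute nothing
        const double prob = static_cast<double>(count) / n;
        entropy -= prob * std::log2(prob);
    }
    return entropy;
}
```

A solid-color frame scores 0 bits and a frame split evenly between two intensities scores 1 bit, so thresholds of 2.0 to 4.5 bits reject frames whose intensities concentrate in only a handful of values.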

Motion

Not a quality gate: Motion is used for interest scoring, not filtering. Zero motion is acceptable (static scene).

When to tune (via interest score formula, requires code change):

  • Prefer static frames: Reduce motion weight in score
  • Prefer dynamic frames: Increase motion weight in score

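Default formula: entropy × log(1 + sharpness) × (1 + motion). The (1 + motion) factor is 1× for a static frame and grows linearly with motion: about 9× at the median motion of 8.4 in the calibration output above, and about 30× at p95 (28.7).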
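
As a worked example, plugging the median values from the calibration output above (entropy 3.9, sharpness 45.3, motion 8.4) into the default formula, with log taken as the natural logarithm as in std::log1p, gives approximately:

\[\text{score} = 3.9 \times \ln(1 + 45.3) \times (1 + 8.4) \approx 3.9 \times 3.84 \times 9.4 \approx 141\]

With zero motion the same frame would score about 15.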


Step 3: Choose Diversity Parameters

Sample Rate (--sample-fps)

What it does: Determines how many frames are examined per video in Pass 1.

Trade-off:

  • Higher (e.g., 2–5 fps): More candidates → better diversity, longer runtime
  • Lower (e.g., 0.5–1 fps): Fewer candidates → faster, may miss short events

Recommendations:

Video Frame Rate   Recommended --sample-fps
24–30 fps          1.0 (every 1s)
60 fps             2.0 (every 0.5s)
10–15 fps          0.5 (every 2s)

Rule of thumb: Sample ~0.5–2× per second; the recommended rates above stay within this range whatever the video frame rate.
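
The sample rate determines how many video frames are skipped between examined frames. The Pass 1 output shown later in this guide reports interval=30 for fps=29.97 at --sample-fps 1.0, which suggests the interval is fps / sample_fps rounded to a whole frame. The sketch below assumes nearest-integer rounding with a floor of 1; `frame_interval` is a hypothetical helper, not the tool's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Number of video frames to advance between examined frames.
// Assumption: nearest-integer rounding, never less than 1.
long frame_interval(double video_fps, double sample_fps) {
    assert(video_fps > 0.0 && sample_fps > 0.0);
    return std::max(1L, std::lround(video_fps / sample_fps));
}
```

Under these assumptions, a 60 fps video sampled at 2.0 fps and a 29.97 fps video sampled at 1.0 fps both examine every 30th frame.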

When to increase:

  • Short transects (seconds-long events) → Need finer temporal resolution
  • High diversity requirements → Need more candidates to fill grid

When to decrease:

  • Very long videos (hours) → Reduce runtime without losing coverage
  • Limited CPU resources → Faster Pass 1

Frame Budget (--max-frames)

What it does: Target number of output frames (may output fewer if insufficient candidates).

How to choose:

  1. Estimate training dataset size needed (e.g., 5000 frames for small model, 50,000 for large)
  2. Account for annotation effort (e.g., 5000 frames × 30s/frame ≈ 42 hours)
  3. Set --max-frames to target value
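
The annotation arithmetic in step 2 can also be run in reverse, starting from the hours you can afford. A small sketch, with hypothetical helper names and the 30 s/frame figure taken from the example above:

```cpp
#include <cassert>
#include <cmath>

// Annotation effort in hours for a given frame budget.
double annotation_hours(long frames, double seconds_per_frame) {
    assert(frames >= 0 && seconds_per_frame >= 0.0);
    return static_cast<double>(frames) * seconds_per_frame / 3600.0;
}

// Largest frame budget that fits into the available annotation hours.
long max_frames_for_hours(double hours, double seconds_per_frame) {
    assert(hours >= 0.0 && seconds_per_frame > 0.0);
    return static_cast<long>(std::floor(hours * 3600.0 / seconds_per_frame));
}
```

For example, 5000 frames at 30 s each come to about 41.7 hours, and a 40-hour annotation budget at the same pace supports --max-frames 4800.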

Typical values:

  • Small dataset (proof of concept): 500–1,000
  • Medium dataset (single-task model): 5,000–10,000
  • Large dataset (multi-task or fine-grained): 20,000–100,000

Note: If dataset has high diversity (many videos, varied conditions), larger budgets are beneficial. If dataset is repetitive, smaller budgets suffice.

Temporal Gap (--min-gap)

What it does: Minimum seconds between selected frames from the same video.

Trade-off:

  • Higher (e.g., 5–10s): Strong diversity, fewer frames per video
  • Lower (e.g., 0.5–2s): More frames per video, risk of near-duplicates

Recommendations:

Vehicle Speed    Scene Change Rate          Recommended --min-gap
Stationary       Slow (static scene)        5–10s
Slow (<1 m/s)    Moderate (gradual drift)   2–5s
Fast (>1 m/s)    Rapid (transect)           1–2s

When to increase:

  • Static scenes (e.g., benthic landers) → Avoid redundant nearly identical frames
  • Long videos with repetitive content → Spread selections across time

When to decrease:

  • Rapid transects (e.g., fast AUV pass) → Capture short-lived features
  • High diversity within videos → Allow multiple frames from interesting sections

Visual check: If output frames from same video look nearly identical, increase --min-gap.
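
The effect of --min-gap on a single video can be approximated with a greedy filter over candidate timestamps. This sketch assumes candidates are visited in time order and the first one is always kept; the actual sampler may instead prefer frames with higher interest scores, so treat `apply_min_gap` as a hypothetical illustration only.

```cpp
#include <cassert>
#include <vector>

// Greedy temporal filter over candidate timestamps (seconds) from ONE
// video, sorted ascending: keep a frame only if it is at least min_gap
// seconds after the previously kept frame.
// Assumption: time-order greedy selection; the real tool may differ.
std::vector<double> apply_min_gap(const std::vector<double>& sorted_times,
                                  double min_gap) {
    assert(min_gap >= 0.0);
    std::vector<double> kept;
    for (double t : sorted_times) {
        if (kept.empty() || t - kept.back() >= min_gap) {
            kept.push_back(t);
        }
    }
    return kept;
}
```

With candidates at 0, 1, 2, ..., 6 seconds, a 2.0 s gap keeps four frames (0, 2, 4, 6) and a 5.0 s gap keeps two (0, 5), which shows how quickly larger gaps reduce the number of frames per video.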

Grid Resolution (--n-bins)

What it does: Number of bins per feature axis (total cells = n_bins³).

Trade-off:

  • Higher (e.g., 10–12): Finer distinctions, more diversity, risk of sparse cells
  • Lower (e.g., 5–7): Coarser grouping, less diversity, denser cells

Recommendations:

Number of Candidates   Recommended --n-bins   Total Cells
<1,000                 5                      125
1,000–10,000           8                      512
10,000–50,000          10                     1,000
>50,000                12                     1,728

Rule of thumb: Aim for ~10–50 candidates per cell on average.

\[\text{n\_bins} \approx \sqrt[3]{\frac{\text{num\_candidates}}{20}}\]
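
The formula above can be evaluated directly. The sketch below rounds to the nearest integer and uses 20 candidates per cell as the default target, as in the formula; `suggest_n_bins` is a hypothetical helper, and its result is a starting point rather than a replacement for the table.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Suggested bins per axis so that the average cell holds roughly
// `per_cell` candidates: n_bins ~ cbrt(num_candidates / per_cell).
// Assumption: nearest-integer rounding with a floor of 1.
int suggest_n_bins(long num_candidates, double per_cell = 20.0) {
    assert(num_candidates > 0 && per_cell > 0.0);
    const double raw =
        std::cbrt(static_cast<double>(num_candidates) / per_cell);
    return std::max(1, static_cast<int>(std::lround(raw)));
}
```

For 10,000 candidates this gives 8 and for 20,000 it gives 10. Passing a different `per_cell` (anywhere in the 10–50 range) shifts the suggestion toward a finer or coarser grid.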

When to increase:

  • Large candidate pool → Finer diversity resolution
  • Over-sampling common scenes → Split dense cells

When to decrease:

  • Small candidate pool → Avoid empty cells
  • Under-utilizing candidates → Merge sparse cells

Diagnostic: After running sample, check output:

After grid filter (8³ cells, ≤10/cell):  4237  (387 occupied cells)
  • Occupied cells < 20% of total: Consider lowering n_bins (grid too sparse)
  • Occupied cells > 80% of total: Consider raising n_bins (grid too dense)
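
The occupancy check can be scripted from the two numbers in that output line. A minimal sketch using the 20% and 80% cut-offs stated above; the function names and returned strings are hypothetical:

```cpp
#include <cassert>
#include <string>

// Fraction of grid cells that contain at least one candidate.
double occupancy(long occupied_cells, int n_bins) {
    assert(n_bins > 0 && occupied_cells >= 0);
    const long total = static_cast<long>(n_bins) * n_bins * n_bins;
    assert(occupied_cells <= total);
    return static_cast<double>(occupied_cells) / static_cast<double>(total);
}

// Advice following the 20% / 80% cut-offs from this guide.
std::string occupancy_advice(double fraction) {
    if (fraction < 0.20) return "lower n_bins";
    if (fraction > 0.80) return "raise n_bins";
    return "keep n_bins";
}
```

For the line shown above, 387 occupied cells out of 8³ = 512 is about 76%, which falls between the two cut-offs.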

Max Per Cell (--max-per-cell)

What it does: Maximum frames selected from each grid cell.

Default: ceil(max_frames / n_bins³) — ensures cells can collectively reach budget.

When to override:

  • Force uniform distribution: Set low (e.g., 1–5) → each cell contributes equally
  • Prefer high-interest frames: Set high (e.g., 20–50) → allows dense cells to dominate

Trade-offs:

--max-per-cell   Effect
Low (1–5)        Strong diversity, may miss high-interest regions
Medium (10–20)   Balanced diversity and interest
High (50+)       Interest-driven, may over-sample common scenes

Example (5000 frame budget, 8³ = 512 cells):

  • Default max_per_cell = 10: Each cell contributes ≤10, up to 5120 total (trimmed to 5000)
  • Override max_per_cell = 5: Each cell contributes ≤5, up to 2560 total (may under-shoot budget)
  • Override max_per_cell = 50: Each cell contributes ≤50, allows dense cells to provide more
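
The numbers in this example follow from the default formula ceil(max_frames / n_bins³) and from multiplying the per-cell cap by the cell count. A small sketch with hypothetical helper names; the capacity is an upper bound that is only reached if every cell is full:

```cpp
#include <cassert>

// Total number of grid cells for a 3-axis grid.
long total_cells(int n_bins) {
    assert(n_bins > 0);
    return static_cast<long>(n_bins) * n_bins * n_bins;
}

// Default per-cell cap: ceil(max_frames / n_bins^3), using integer math.
long default_max_per_cell(long max_frames, int n_bins) {
    assert(max_frames > 0);
    const long cells = total_cells(n_bins);
    return (max_frames + cells - 1) / cells;
}

// Upper bound on frames the grid filter can emit (all cells full).
long grid_capacity(long max_per_cell, int n_bins) {
    assert(max_per_cell >= 0);
    return max_per_cell * total_cells(n_bins);
}
```

With a 5000-frame budget and 8 bins, the default cap is 10 and the capacity is 5120; overriding the cap to 5 lowers the capacity to 2560, below the budget.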

When to use:

  • Rare event detection: Lower max_per_cell to ensure rare conditions get representation
  • High-interest prioritization: Raise max_per_cell to allow scoring to dominate

Step 4: Run and Inspect

Run the Sampler

./sample \
  --root-dir /path/to/videos \
  --camera 1 \
  --sample-fps 1.0 \
  --max-frames 5000 \
  --min-gap 2.0 \
  --min-brightness 20 \
  --max-brightness 220 \
  --min-sharpness 25 \
  --min-entropy 2.8 \
  --n-bins 8 \
  --output-dir ./frames \
  --jobs 8

Check Output

Pass 1 Summary

video1.mp4
  fps=29.97  interval=30  1200 frames examined  →  720 quality-passed

video2.mp4
  fps=29.97  interval=30  1500 frames examined  →  890 quality-passed

Check:

  • Pass rate: 40–60% ideal (too low → overly strict, too high → too permissive)
  • Interval: Should match fps / sample_fps (e.g., 30 fps / 1.0 fps = 30)

If pass rate too low: Relax quality thresholds (lower min-sharpness, lower min-entropy).

If pass rate too high: Tighten thresholds (raise min-brightness, raise min-sharpness).

Pass 2 Summary

After temporal filter (2.0s gap):  5234
After grid filter (8³ cells, ≤10/cell):  4237  (387 occupied cells)
Trimmed to budget:  5000

Check:

  • Temporal filter rejection: ~10–40% typical (depends on min_gap and video diversity)
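  • Occupied cells: 30–70% of total typical (387/512 ≈ 76% is slightly above this range but still under the 80% mark at which raising n_bins is advised)
  • Budget trim: Applies only when the grid filter leaves more frames than --max-frames. Small trim (<10%) is good; large trim (>50%) suggests n_bins too small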

If too few occupied cells (<20%): Lower n_bins.

If too many occupied cells (>80%): Raise n_bins.

Visual Inspection

Randomly sample 20–50 output frames:

ls ./frames | shuf -n 20 | xargs -I {} eog ./frames/{}

What to look for:

  • ✅ Diversity: Variety of lighting, subjects, turbidity
  • ✅ Quality: No obvious blur, darkness, or featureless frames
  • ✅ Temporal separation: No near-duplicates from same video
  • ❌ Over-representation: One condition (e.g., blue water) dominates
  • ❌ Under-representation: Rare conditions (e.g., organisms) missing
  • ❌ Quality issues: Blur or darkness slipping through

If issues found: Adjust parameters (see below) and re-run.


Step 5: Iterate

Problem: Too Many Dark Frames

Symptoms: Many output frames are darker than desired.

Solutions:

  1. Raise --min-brightness (e.g., 20 → 30)
  2. Check calibrate output: Are dark frames common in your data? If yes, you may need to accept them or filter videos upstream.

Problem: Too Many Blurry Frames

Symptoms: Many output frames are out of focus or motion-blurred.

Solutions:

  1. Raise --min-sharpness (e.g., 25 → 40)
  2. Check if blur is pervasive in source videos (vehicle motion, turbidity) — may need better optics or slower vehicle speed.

Problem: Over-Sampling Blue Water

Symptoms: Majority of output frames are featureless blue/green.

Solutions:

  1. Raise --min-entropy (e.g., 2.8 → 3.5) to reject uniform scenes
  2. Lower --max-per-cell to limit contribution from dense (common) cells
  3. Increase --n-bins to fragment common scenes into more cells

Problem: Missing Rare Events

Symptoms: Known rare conditions (e.g., organisms, particle clouds) under-represented.

Solutions:

  1. Increase --sample-fps (e.g., 1.0 → 2.0) to examine more frames (may catch short events)
  2. Lower --max-per-cell to reserve budget for sparse cells
  3. Manually review calibrate output: Are rare events rejected by quality gates? Relax thresholds if needed.

Problem: Not Reaching Frame Budget

Symptoms: Output has fewer frames than --max-frames.

Causes:

  1. Insufficient candidates after quality filtering (too strict thresholds)
  2. Temporal filter rejecting too many (too large --min-gap)
  3. Not enough source videos

Solutions:

  1. Relax quality thresholds (lower --min-sharpness, --min-entropy)
  2. Reduce --min-gap (e.g., 2.0 → 1.0)
  3. Add more source videos

Problem: Too Many Near-Duplicates

Symptoms: Multiple output frames from same video look nearly identical.

Solutions:

  1. Increase --min-gap (e.g., 1.0 → 3.0)
  2. Check if frames are from different grid cells (same visual appearance can have different metrics due to noise) — not a bug, but may need tighter grid resolution (--n-bins)

Common Scenarios

Scenario 1: Deep-Sea Benthic Survey

Characteristics:

  • Slow-moving vehicle (0.5 m/s)
  • Low light (artificial only)
  • High-interest targets (organisms, geology)
  • Long transects (hours)

Recommended settings:

./sample \
  --sample-fps 0.5 \
  --max-frames 10000 \
  --min-gap 5.0 \
  --min-brightness 15 \
  --max-brightness 230 \
  --min-sharpness 20 \
  --min-entropy 3.0 \
  --n-bins 10

Rationale:

  • Low sample-fps: Long transects, slow speed → sparse sampling OK
  • High min-gap: Slow motion → large gaps avoid redundancy
  • Low min-brightness: Dark environment → accept dim frames
  • High min-entropy: Reject featureless seafloor, keep organisms
  • High n-bins: Large candidate pool → fine diversity resolution

Scenario 2: Mid-Water Plankton Survey

Characteristics:

  • Medium speed (1–2 m/s)
  • Ambient + artificial light (variable)
  • Sparse targets (plankton, particles)
  • Moderate transects (10–30 min)

Recommended settings:

./sample \
  --sample-fps 1.0 \
  --max-frames 5000 \
  --min-gap 2.0 \
  --min-brightness 20 \
  --max-brightness 220 \
  --min-sharpness 15 \
  --min-entropy 2.5 \
  --n-bins 8

Rationale:

  • Moderate sample-fps: Moderate speed → standard sampling
  • Moderate min-gap: Balance redundancy and capture rate
  • Moderate min-brightness: Variable light → middle range
  • Low min-sharpness: Plankton may be blurry (small, fast) → accept softer focus
  • Low min-entropy: Sparse targets in blue water → accept low-texture frames

Scenario 3: Shallow Reef Inspection

Characteristics:

  • Slow/stationary (ROV)
  • Bright natural light
  • High detail (coral, fish)
  • Short clips (minutes)

Recommended settings:

./sample \
  --sample-fps 2.0 \
  --max-frames 2000 \
  --min-gap 3.0 \
  --min-brightness 50 \
  --max-brightness 200 \
  --min-sharpness 30 \
  --min-entropy 3.5 \
  --n-bins 6

Rationale:

  • High sample-fps: Short clips → examine more frames
  • High min-gap: Slow/static → large gaps avoid duplicates
  • High min-brightness: Bright environment → reject dim frames
  • High min-sharpness: High-quality optics → demand sharp focus
  • High min-entropy: Rich detail → reject uniform areas
  • Low n-bins: Small candidate pool (short clips) → coarser grid

Advanced Tuning

Custom Interest Scoring

If default interest score doesn't match your priorities, modify sample.cpp:

Current formula (line ~35):

static double interest_score(const FrameRecord& r) {
    return r.entropy * std::log1p(r.sharpness) * (1.0 + r.motion);
}

Examples:

Prioritize sharpness over entropy:

return std::pow(r.sharpness, 0.5) * r.entropy * (1.0 + r.motion);

Ignore motion (prefer static frames):

return r.entropy * std::log1p(r.sharpness);

Heavily weight motion (prefer dynamic scenes):

return r.entropy * std::log1p(r.sharpness) * std::pow(1.0 + r.motion, 2);

After modifying: Rebuild (make -j$(nproc)) and re-run sample.

Two-Pass Sampling

For very large datasets (>100 videos), run two sampling passes (A and C) with a review step (B) in between:

Pass A: Coarse sampling (get representative frames quickly)

./sample \
  --sample-fps 0.5 \
  --max-frames 10000 \
  --n-bins 6 \
  --output-dir ./coarse_frames

Pass B: Annotate and evaluate coarse frames

Identify under-represented conditions (e.g., organisms).

Pass C: Targeted sampling (focus on gaps)

Filter source videos to those with under-represented conditions, then:

# --min-entropy is raised here to target feature-rich frames
./sample \
  --root-dir /path/to/targeted_videos \
  --sample-fps 2.0 \
  --max-frames 5000 \
  --min-entropy 3.5 \
  --output-dir ./targeted_frames

Combine: Merge coarse + targeted frames for final dataset.


Summary Checklist

Before sampling:

  • [x] Run calibrate on representative clip(s)
  • [x] Examine metric distributions and per-gate rejections
  • [x] Choose quality thresholds for ~50% pass rate
  • [x] Choose --sample-fps based on video frame rate and content
  • [x] Choose --max-frames based on training needs
  • [x] Choose --min-gap based on vehicle speed and scene change rate
  • [x] Choose --n-bins based on expected candidate count

After sampling:

  • [x] Check Pass 1 pass rate (40–60% ideal)
  • [x] Check Pass 2 occupied cells (30–70% ideal)
  • [x] Visually inspect 20–50 random output frames
  • [x] Verify diversity (lighting, subjects, conditions)
  • [x] Verify quality (no blur, darkness, featureless frames)
  • [x] Verify temporal separation (no near-duplicates)

If issues:

  • [x] Adjust parameters and re-run (fast with caching)
  • [x] Document final parameters for reproducibility

Next Steps