Tuning Guide¶

This guide helps you optimize sampler parameters for your specific dataset. The goal is to maximize quality and diversity while meeting your frame budget (--max-frames).

Overview¶

The sampler has two categories of parameters:

Quality thresholds: Define minimum standards (brightness, sharpness, entropy)
Diversity controls: Define how frames are selected (n-bins, max-per-cell, min-gap)

Workflow:

Run calibrate on representative footage → understand metric distributions
Choose quality thresholds → balance pass rate and quality
Choose diversity parameters → balance coverage and redundancy
Run sample on full dataset → inspect results
Iterate if needed

Step 1: Calibration¶

Run the Calibrate Tool¶

./calibrate /path/to/sample_video.mp4 \
  --sample-fps 2.0 \
  --min-brightness 10 \
  --max-brightness 245 \
  --min-sharpness 10 \
  --min-entropy 2.0

Choose a representative clip:

✅ Includes variety of conditions (lighting, subjects, motion)
✅ ~5–15 minutes duration (enough frames for statistics)
❌ Not an outlier (e.g., all murky water or all bright shallow)

Initial thresholds: Set permissive values (for example, near the 10th percentile of each metric) so that few frames are rejected and you can see the full distribution.

Interpret Output¶

Metric Distributions¶

Metric distributions (all examined frames):
  brightness    min=8.2   p5=18.3   median=87.2   p95=201.4  max=243.1
  sharpness     min=3.1   p5=12.7   median=45.3   p95=187.9  max=321.5
  entropy       min=1.3   p5=2.2    median=3.9    p95=5.3    max=6.2
  motion        min=0.5   p5=1.8    median=8.4    p95=28.7   max=89.3

Key percentiles:

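p5 (5th percentile): Used as a minimum threshold, rejects about 5% of frames on that metric; with every gate set near this level, roughly 80% of frames pass overall
p50 (median): Used as a minimum threshold, rejects about half of the frames on that metric
p95 (95th percentile): Used as a maximum threshold (e.g., --max-brightness), rejects about 5% of frames; used as a minimum threshold, it rejects about 95%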

Example: If sharpness p5=12.7, setting --min-sharpness 12.7 rejects the blurriest 5% of frames.
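
As a rough illustration of how a percentile maps to a threshold, the sketch below computes a nearest-rank percentile over per-frame metric values and the fraction of frames a minimum threshold would reject. The function names are hypothetical, and calibrate may use a different percentile definition (for example, interpolation), so treat this as an approximation rather than the tool's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Nearest-rank percentile: the smallest sample such that at least p percent
// of the samples are less than or equal to it. Takes a copy so the caller's
// data is not reordered.
double percentile(std::vector<double> values, double p) {
    assert(!values.empty() && p > 0.0 && p <= 100.0);
    std::sort(values.begin(), values.end());
    const auto rank = static_cast<std::size_t>(
        std::ceil(p / 100.0 * static_cast<double>(values.size())));
    return values[rank - 1];
}

// Fraction of frames whose metric falls below a minimum threshold.
double rejected_fraction(const std::vector<double>& values, double min_threshold) {
    assert(!values.empty());
    const auto rejected = std::count_if(
        values.begin(), values.end(),
        [min_threshold](double v) { return v < min_threshold; });
    return static_cast<double>(rejected) / static_cast<double>(values.size());
}
```

With sharpness values collected from a clip, `percentile(values, 5.0)` gives a candidate for --min-sharpness, and `rejected_fraction` shows how many frames fall below any threshold you are considering.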

Per-Gate Rejection¶

Per-gate rejection counts:
  min-brightness (≥10):    125 / 1200 rejected  (10.4%)
  max-brightness (≤245):   43 / 1200 rejected   (3.6%)
  min-sharpness (≥10):     89 / 1200 rejected   (7.4%)
  min-entropy (≥2.0):      156 / 1200 rejected  (13.0%)

Combined quality filter:   312 / 1200 rejected  (26.0%)
                          888 / 1200 passed     (74.0%)

What to look for:

High rejection (>50% by one gate): Either that metric is genuinely poor in your data, or the threshold is too strict
Low rejection (<5% by every gate): Thresholds are too permissive to filter effectively
Target: 40–60% combined pass rate (balanced filtering)
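
The per-gate counts usually sum to more than the combined count (125 + 43 + 89 + 156 = 413 versus 312 above) because a frame that fails several gates is counted once per gate but only once in the combined total. A minimal sketch of that bookkeeping, using hypothetical struct and function names rather than the tool's own:

```cpp
#include <vector>

// Hypothetical per-frame metrics and gate settings (not the tool's types).
struct FrameMetrics { double brightness, sharpness, entropy; };
struct Thresholds { double min_brightness, max_brightness, min_sharpness, min_entropy; };

struct GateCounts {
    int min_brightness = 0;
    int max_brightness = 0;
    int min_sharpness = 0;
    int min_entropy = 0;
    int combined = 0;  // frames rejected by at least one gate
};

GateCounts count_rejections(const std::vector<FrameMetrics>& frames, const Thresholds& t) {
    GateCounts c;
    for (const auto& f : frames) {
        const bool too_dark = f.brightness < t.min_brightness;
        const bool too_bright = f.brightness > t.max_brightness;
        const bool too_blurry = f.sharpness < t.min_sharpness;
        const bool too_flat = f.entropy < t.min_entropy;
        if (too_dark) ++c.min_brightness;
        if (too_bright) ++c.max_brightness;
        if (too_blurry) ++c.min_sharpness;
        if (too_flat) ++c.min_entropy;
        // A frame failing several gates still counts once here.
        if (too_dark || too_bright || too_blurry || too_flat) ++c.combined;
    }
    return c;
}
```

A large difference between the sum of the per-gate counts and the combined count means the gates overlap heavily, so relaxing a single gate will recover fewer frames than its own count suggests.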

Threshold Recommendations¶

Threshold recommendations (for target pass rates):

  80% pass:  --min-brightness 18  --max-brightness 220  --min-sharpness 12  --min-entropy 2.2
  60% pass:  --min-brightness 28  --max-brightness 205  --min-sharpness 25  --min-entropy 2.8
  40% pass:  --min-brightness 42  --max-brightness 185  --min-sharpness 48  --min-entropy 3.5
  20% pass:  --min-brightness 68  --max-brightness 160  --min-sharpness 92  --min-entropy 4.3

How to use:

Choose a row based on desired selectivity (60% is typical)
Adjust individual thresholds based on per-gate rejection (e.g., relax sharpness if too strict)

Step 2: Choose Quality Thresholds¶

Brightness¶

What it measures: Mean pixel value (0–255 scale)

When to adjust:

Dark environment (deep water, night): Lower --min-brightness (e.g., 10–20)
Bright environment (shallow water, daylight): Raise --min-brightness (e.g., 30–50)
Overexposure issues: Lower --max-brightness (e.g., 200–220)

Recommendations:

Environment	min-brightness	max-brightness
Deep ocean (>200m)	10–20	230
Mid-water (50–200m)	20–40	220
Shallow (<50m), sunlight	40–80	200
Artificial lighting	50–100	210

Visual check: Run calibrate and examine the rejected frames. If many look acceptable, relax the threshold that rejected them (lower --min-brightness or raise --max-brightness).

Sharpness¶

What it measures: Laplacian variance (focus quality)

When to adjust:

Frequent blur (fast-moving vehicle, turbulent water): Lower --min-sharpness (e.g., 10–20)
High-quality optics (professional camera, stable platform): Raise --min-sharpness (e.g., 30–50)
Model will be deployed on blurry footage: Lower --min-sharpness so the training frames match the test distribution

Recommendations:

Camera Quality	min-sharpness
Low-res or fast-moving	10–20
Standard AUV camera	20–40
High-res or slow-moving	40–80

Visual check: Inspect frames near the threshold. If they look acceptably sharp, threshold is good. If obviously blurry, raise it.

Entropy¶

What it measures: Shannon entropy in bits (information content)

When to adjust:

Featureless environment (open ocean, sparse plankton): Lower --min-entropy (e.g., 2.0–2.5)
Feature-rich environment (reef, particle clouds): Raise --min-entropy (e.g., 3.0–4.0)
Goal is rare events: Lower threshold to avoid rejecting sparse interesting frames

Recommendations:

Environment	min-entropy
Open ocean (blue water)	2.0–2.5
Mid-water (plankton, particles)	2.5–3.5
Benthic (seafloor, organisms)	3.0–4.5

Visual check: Low-entropy frames are uniform (nearly solid color). If many rejected frames look visually rich, lower the threshold.

Motion¶

Not a quality gate: Motion is used for interest scoring, not filtering. Zero motion is acceptable (static scene).

When to tune (via interest score formula, requires code change):

Prefer static frames: Reduce motion weight in score
Prefer dynamic frames: Increase motion weight in score

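Default formula: entropy × log(1 + sharpness) × (1 + motion). The motion factor is 1× for a static frame and grows linearly with motion: about 9× at the median motion of 8.4 in the calibration example above, and about 30× at p95 (28.7).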
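
To see how strongly motion influences the default score, the sketch below evaluates the formula for a static and a moving frame. `ScoreInputs` is a hypothetical stand-in for the sampler's frame record, and the metric values are the medians from the example calibration output.

```cpp
#include <cmath>

// Hypothetical stand-in for the sampler's per-frame record.
struct ScoreInputs { double entropy, sharpness, motion; };

// Default formula from this guide: entropy * log(1 + sharpness) * (1 + motion).
double interest_score(const ScoreInputs& r) {
    return r.entropy * std::log1p(r.sharpness) * (1.0 + r.motion);
}

// How many times higher a frame scores than the same frame with zero motion.
double motion_boost(const ScoreInputs& r) {
    const ScoreInputs still{r.entropy, r.sharpness, 0.0};
    return interest_score(r) / interest_score(still);
}
```

For a frame with entropy 3.9, sharpness 45.3, and motion 8.4, `motion_boost` is 9.4, since the factor is simply 1 + motion.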

Step 3: Choose Diversity Parameters¶

Sample Rate (`--sample-fps`)¶

What it does: Determines how many frames are examined per video in Pass 1.

Trade-off:

Higher (e.g., 2–5 fps): More candidates → better diversity, longer runtime
Lower (e.g., 0.5–1 fps): Fewer candidates → faster, may miss short events

Recommendations:

Video Frame Rate	Recommended --sample-fps
24–30 fps	1.0 (every 1s)
60 fps	2.0 (every 0.5s)
10–15 fps	0.5 (every 2s)

Rule of thumb: Sample roughly 0.5–2 frames per second, which works out to examining about one frame in every 20–30 for the frame rates in the table.

When to increase:

Short transects (seconds-long events) → Need finer temporal resolution
High diversity requirements → Need more candidates to fill grid

When to decrease:

Very long videos (hours) → Reduce runtime without losing coverage
Limited CPU resources → Faster Pass 1

Frame Budget (`--max-frames`)¶

What it does: Target number of output frames (may output fewer if insufficient candidates).

How to choose:

Estimate training dataset size needed (e.g., 5,000 frames for a small model, 50,000 for a large one)
Account for annotation effort (e.g., 5,000 frames × 30 s/frame ≈ 42 hours)
Set --max-frames to target value
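
The annotation estimate is simple arithmetic, sketched below so it can be rerun for other budgets. The function names and the per-frame annotation time are illustrative assumptions.

```cpp
#include <cassert>
#include <cmath>

// Total annotation time in hours for a given frame count.
double annotation_hours(long frames, double seconds_per_frame) {
    assert(frames >= 0 && seconds_per_frame > 0.0);
    return static_cast<double>(frames) * seconds_per_frame / 3600.0;
}

// Largest frame count that fits into an annotation budget given in hours.
long frames_for_budget(double hours, double seconds_per_frame) {
    assert(hours >= 0.0 && seconds_per_frame > 0.0);
    return static_cast<long>(std::floor(hours * 3600.0 / seconds_per_frame));
}
```

`annotation_hours(5000, 30.0)` gives about 41.7 hours, and `frames_for_budget(40.0, 30.0)` gives 4800 frames for a 40-hour budget, which is one way to back out a value for --max-frames.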

Typical values:

Small dataset (proof of concept): 500–1,000
Medium dataset (single-task model): 5,000–10,000
Large dataset (multi-task or fine-grained): 20,000–100,000

Note: If dataset has high diversity (many videos, varied conditions), larger budgets are beneficial. If dataset is repetitive, smaller budgets suffice.

Temporal Gap (`--min-gap`)¶

What it does: Minimum seconds between selected frames from the same video.

Trade-off:

Higher (e.g., 5–10s): Strong diversity, fewer frames per video
Lower (e.g., 0.5–2s): More frames per video, risk of near-duplicates

Recommendations:

Vehicle Speed	Scene Change Rate	Recommended --min-gap
Stationary	Slow (static scene)	5–10s
Slow (<1 m/s)	Moderate (gradual drift)	2–5s
Fast (>1 m/s)	Rapid (transect)	1–2s

When to increase:

Static scenes (e.g., benthic landers) → Avoid redundant nearly identical frames
Long videos with repetitive content → Spread selections across time

When to decrease:

Rapid transects (e.g., fast AUV pass) → Capture short-lived features
High diversity within videos → Allow multiple frames from interesting sections

Visual check: If output frames from same video look nearly identical, increase --min-gap.
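
The effect of --min-gap can be pictured as a greedy pass over candidate timestamps within one video. The sketch below is an assumption about the general idea, not the sampler's actual selection logic (which may, for instance, prefer higher-scoring frames); the function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Keep a timestamp (in seconds) only if it lies at least min_gap seconds
// after the previously kept one. Input order does not matter.
std::vector<double> apply_min_gap(std::vector<double> timestamps, double min_gap) {
    assert(min_gap >= 0.0);
    std::sort(timestamps.begin(), timestamps.end());
    std::vector<double> kept;
    for (double t : timestamps) {
        if (kept.empty() || t - kept.back() >= min_gap) {
            kept.push_back(t);
        }
    }
    return kept;
}
```

With candidates at 0, 1, 2, 3, 4, and 5 seconds and a 2-second gap, the frames at 0, 2, and 4 seconds are kept. Doubling the gap roughly halves the number of frames a single video can contribute.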

Grid Resolution (`--n-bins`)¶

What it does: Number of bins per feature axis (total cells = n_bins³).

Trade-off:

Higher (e.g., 10–12): Finer distinctions, more diversity, risk of sparse cells
Lower (e.g., 5–7): Coarser grouping, less diversity, denser cells

Recommendations:

Number of Candidates	Recommended --n-bins	Total Cells
<1,000	5	125
1,000–10,000	8	512
10,000–50,000	10	1,000
>50,000	12	1,728

Rule of thumb: Aim for ~10–50 candidates per cell on average.

\[\text{n\_bins} \approx \sqrt[3]{\frac{\text{num\_candidates}}{20}}\]
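
The rule of thumb translates directly into code. This is a sketch with a hypothetical function name; the default of 20 candidates per cell comes from the formula above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Cube root of (candidates / target candidates per cell), rounded to the
// nearest integer and never less than 1.
int suggest_n_bins(long num_candidates, double per_cell = 20.0) {
    assert(num_candidates > 0 && per_cell > 0.0);
    const double bins = std::cbrt(static_cast<double>(num_candidates) / per_cell);
    return static_cast<int>(std::max(1L, std::lround(bins)));
}
```

For 10,000 candidates this returns 8 (512 cells, about 20 candidates per cell), matching the table. For small pools the formula lands below the table's suggestion (4 rather than 5 for 1,000 candidates), so treat both as starting points.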

When to increase:

Large candidate pool → Finer diversity resolution
Over-sampling common scenes → Split dense cells

When to decrease:

Small candidate pool → Avoid empty cells
Under-utilizing candidates → Merge sparse cells

Diagnostic: After running sample, check output:

After grid filter (8³ cells, ≤10/cell):  4237  (387 occupied cells)

Occupied cells < 20% of total: Consider lowering n_bins (grid too fine, most cells are empty)
Occupied cells > 80% of total: Consider raising n_bins (grid too coarse, cells lump different scenes together)
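
A small helper can turn the reported occupied-cell count into the advice above. The enum and function names are hypothetical; the 20% and 80% cut-offs are the ones given in this guide.

```cpp
#include <cassert>

enum class GridAdvice { LowerBins, Keep, RaiseBins };

// Fraction of the n_bins^3 grid cells that contain at least one frame.
double occupied_fraction(long occupied_cells, int n_bins) {
    assert(n_bins > 0);
    const long total = static_cast<long>(n_bins) * n_bins * n_bins;
    assert(occupied_cells >= 0 && occupied_cells <= total);
    return static_cast<double>(occupied_cells) / static_cast<double>(total);
}

GridAdvice grid_advice(double fraction) {
    if (fraction < 0.20) return GridAdvice::LowerBins;  // mostly empty grid
    if (fraction > 0.80) return GridAdvice::RaiseBins;  // nearly full grid
    return GridAdvice::Keep;
}
```

For the diagnostic line above, `occupied_fraction(387, 8)` is about 0.76, so `grid_advice` returns `Keep`.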

Max Per Cell (`--max-per-cell`)¶

What it does: Maximum frames selected from each grid cell.

Default: ceil(max_frames / n_bins³) — ensures cells can collectively reach budget.

When to override:

Force uniform distribution: Set low (e.g., 1–5) → each cell contributes equally
Prefer high-interest frames: Set high (e.g., 20–50) → allows dense cells to dominate

Trade-offs:

--max-per-cell	Effect
Low (1–5)	Strong diversity, may miss high-interest regions
Medium (10–20)	Balanced diversity and interest
High (50+)	Interest-driven, may over-sample common scenes

Example (5000 frame budget, 8³ = 512 cells):

Default max_per_cell = 10: Each cell contributes ≤10, up to 5120 total (trimmed to 5000)
Override max_per_cell = 5: Each cell contributes ≤5, up to 2560 total (may under-shoot budget)
Override max_per_cell = 50: Each cell contributes ≤50, allows dense cells to provide more
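
The numbers in this example follow from integer arithmetic, sketched below with hypothetical function names. The default is taken from the ceil(max_frames / n_bins³) rule stated above.

```cpp
#include <cassert>

// Total number of grid cells for a cubic grid.
long total_cells(int n_bins) {
    assert(n_bins > 0);
    return static_cast<long>(n_bins) * n_bins * n_bins;
}

// ceil(max_frames / n_bins^3) using integer arithmetic.
long default_max_per_cell(long max_frames, int n_bins) {
    assert(max_frames > 0);
    const long cells = total_cells(n_bins);
    return (max_frames + cells - 1) / cells;
}

// Most frames the grid filter can emit if every cell is occupied and full.
long grid_capacity(long max_per_cell, int n_bins) {
    assert(max_per_cell > 0);
    return max_per_cell * total_cells(n_bins);
}
```

`default_max_per_cell(5000, 8)` is 10 and `grid_capacity(10, 8)` is 5120, while `grid_capacity(5, 8)` is 2560. The capacity is an upper bound that is reached only if every cell is occupied and full, so real output is often lower.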

When to use:

Rare event detection: Lower max_per_cell to ensure rare conditions get representation
High-interest prioritization: Raise max_per_cell to allow scoring to dominate

Step 4: Run and Inspect¶

Run the Sampler¶

./sample \
  --root-dir /path/to/videos \
  --camera 1 \
  --sample-fps 1.0 \
  --max-frames 5000 \
  --min-gap 2.0 \
  --min-brightness 20 \
  --max-brightness 220 \
  --min-sharpness 25 \
  --min-entropy 2.8 \
  --n-bins 8 \
  --output-dir ./frames \
  --jobs 8

Check Output¶

Pass 1 Summary¶

video1.mp4
  fps=29.97  interval=30  1200 frames examined  →  720 quality-passed

video2.mp4
  fps=29.97  interval=30  1500 frames examined  →  890 quality-passed

Check:

Pass rate: 40–60% ideal (too low → overly strict, too high → too permissive)
Interval: Should match fps / sample_fps, rounded to a whole number of frames (e.g., 29.97 fps / 1.0 fps ≈ 30)
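
Both checks can be reproduced by hand. The sketch below assumes the sampler rounds fps / sample_fps to the nearest whole frame, which is consistent with the sample output (29.97 fps at 1.0 fps gives 30) but is not confirmed by it; the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Number of video frames between consecutive examined frames; at least 1.
long sampling_interval(double video_fps, double sample_fps) {
    assert(video_fps > 0.0 && sample_fps > 0.0);
    return std::max(1L, std::lround(video_fps / sample_fps));
}

// Fraction of examined frames that passed the quality gates.
double pass_rate(long passed, long examined) {
    assert(examined > 0 && passed >= 0 && passed <= examined);
    return static_cast<double>(passed) / static_cast<double>(examined);
}
```

For video1.mp4 above, `sampling_interval(29.97, 1.0)` is 30 and `pass_rate(720, 1200)` is 0.6, at the upper edge of the 40–60% band.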

If pass rate too low: Relax quality thresholds (lower min-sharpness, lower min-entropy).

If pass rate too high: Tighten thresholds (raise min-brightness, raise min-sharpness).

Pass 2 Summary¶

After temporal filter (2.0s gap):  5234
After grid filter (8³ cells, ≤10/cell):  4237  (387 occupied cells)
Trimmed to budget:  5000

Check:

Temporal filter rejection: ~10–40% typical (depends on min_gap and video diversity)
Occupied cells: 30–70% of total is typical (387/512 ≈ 76% is slightly above that range but still below the 80% level at which raising n_bins is advised)
Budget trim: Small trim (<10%) is good; large trim (>50%) suggests n_bins too small

If too few occupied cells (<20%): Lower n_bins.

If too many occupied cells (>80%): Raise n_bins.

Visual Inspection¶

Randomly sample 20–50 output frames:

ls ./frames | shuf -n 20 | xargs -I {} eog ./frames/{}

What to look for:

✅ Diversity: Variety of lighting, subjects, turbidity
✅ Quality: No obvious blur, darkness, or featureless frames
✅ Temporal separation: No near-duplicates from same video
❌ Over-representation: One condition (e.g., blue water) dominates
❌ Under-representation: Rare conditions (e.g., organisms) missing
❌ Quality issues: Blur or darkness slipping through

If issues found: Adjust parameters (see below) and re-run.

Step 5: Iterate¶

Problem: Too Many Dark Frames¶

Symptoms: Many output frames are darker than desired.

Solutions:

Raise --min-brightness (e.g., 20 → 30)
Check calibrate output: Are dark frames common in your data? If yes, you may need to accept them or filter videos upstream.

Problem: Too Many Blurry Frames¶

Symptoms: Many output frames are out of focus or motion-blurred.

Solutions:

Raise --min-sharpness (e.g., 25 → 40)
Check if blur is pervasive in source videos (vehicle motion, turbidity) — may need better optics or slower vehicle speed.

Problem: Over-Sampling Blue Water¶

Symptoms: Majority of output frames are featureless blue/green.

Solutions:

Raise --min-entropy (e.g., 2.8 → 3.5) to reject uniform scenes
Lower --max-per-cell to limit contribution from dense (common) cells
Increase --n-bins to fragment common scenes into more cells

Problem: Missing Rare Events¶

Symptoms: Known rare conditions (e.g., organisms, particle clouds) under-represented.

Solutions:

Increase --sample-fps (e.g., 1.0 → 2.0) to examine more frames (may catch short events)
Lower --max-per-cell to reserve budget for sparse cells
Manually review calibrate output: Are rare events rejected by quality gates? Relax thresholds if needed.

Problem: Not Reaching Frame Budget¶

Symptoms: Output has fewer frames than --max-frames.

Causes:

Insufficient candidates after quality filtering (too strict thresholds)
Temporal filter rejecting too many (too large --min-gap)
Not enough source videos

Solutions:

Relax quality thresholds (lower --min-sharpness, --min-entropy)
Reduce --min-gap (e.g., 2.0 → 1.0)
Add more source videos

Problem: Too Many Near-Duplicates¶

Symptoms: Multiple output frames from same video look nearly identical.

Solutions:

Increase --min-gap (e.g., 1.0 → 3.0)
Check whether the near-duplicates fall in different grid cells. Visually similar frames can land in different cells because of metric noise; this is not a bug, but a coarser grid (lower --n-bins) makes such frames share a cell and compete for the same slots.

Common Scenarios¶

Scenario 1: Deep-Sea Benthic Survey¶

Characteristics:

Slow-moving vehicle (0.5 m/s)
Low light (artificial only)
High-interest targets (organisms, geology)
Long transects (hours)

Recommended settings:

./sample \
  --sample-fps 0.5 \
  --max-frames 10000 \
  --min-gap 5.0 \
  --min-brightness 15 \
  --max-brightness 230 \
  --min-sharpness 20 \
  --min-entropy 3.0 \
  --n-bins 10

Rationale:

Low sample-fps: Long transects, slow speed → sparse sampling OK
High min-gap: Slow motion → large gaps avoid redundancy
Low min-brightness: Dark environment → accept dim frames
High min-entropy: Reject featureless seafloor, keep organisms
High n-bins: Large candidate pool → fine diversity resolution

Scenario 2: Mid-Water Plankton Survey¶

Characteristics:

Medium speed (1–2 m/s)
Ambient + artificial light (variable)
Sparse targets (plankton, particles)
Moderate transects (10–30 min)

Recommended settings:

./sample \
  --sample-fps 1.0 \
  --max-frames 5000 \
  --min-gap 2.0 \
  --min-brightness 20 \
  --max-brightness 220 \
  --min-sharpness 15 \
  --min-entropy 2.5 \
  --n-bins 8

Rationale:

Moderate sample-fps: Moderate speed → standard sampling
Moderate min-gap: Balance redundancy and capture rate
Moderate min-brightness: Variable light → middle range
Low min-sharpness: Plankton may be blurry (small, fast) → accept softer focus
Low min-entropy: Sparse targets in blue water → accept low-texture frames

Scenario 3: Shallow Reef Inspection¶

Characteristics:

Slow/stationary (ROV)
Bright natural light
High detail (coral, fish)
Short clips (minutes)

Recommended settings:

./sample \
  --sample-fps 2.0 \
  --max-frames 2000 \
  --min-gap 3.0 \
  --min-brightness 50 \
  --max-brightness 200 \
  --min-sharpness 30 \
  --min-entropy 3.5 \
  --n-bins 6

Rationale:

High sample-fps: Short clips → examine more frames
High min-gap: Slow/static → large gaps avoid duplicates
High min-brightness: Bright environment → reject dim frames
High min-sharpness: High-quality optics → demand sharp focus
High min-entropy: Rich detail → reject uniform areas
Low n-bins: Small candidate pool (short clips) → coarser grid

Advanced Tuning¶

Custom Interest Scoring¶

If default interest score doesn't match your priorities, modify sample.cpp:

Current formula (line ~35):

static double interest_score(const FrameRecord& r) {
    return r.entropy * std::log1p(r.sharpness) * (1.0 + r.motion);
}

Examples:

Prioritize sharpness over entropy:

return std::pow(r.sharpness, 0.5) * r.entropy * (1.0 + r.motion);

Ignore motion (prefer static frames):

return r.entropy * std::log1p(r.sharpness);

Heavily weight motion (prefer dynamic scenes):

return r.entropy * std::log1p(r.sharpness) * std::pow(1.0 + r.motion, 2);

After modifying: Rebuild (make -j$(nproc)) and re-run sample.

Two-Pass Sampling¶

For very large datasets (>100 videos), run two sampling passes with a review step in between:

Pass A: Coarse sampling (get representative frames quickly)

./sample \
  --sample-fps 0.5 \
  --max-frames 10000 \
  --n-bins 6 \
  --output-dir ./coarse_frames

Pass B: Annotate and evaluate coarse frames

Identify under-represented conditions (e.g., organisms).

Pass C: Targeted sampling (focus on gaps)

Filter source videos to those with under-represented conditions, then:

# Higher entropy threshold to target rich frames
./sample \
  --root-dir /path/to/targeted_videos \
  --sample-fps 2.0 \
  --max-frames 5000 \
  --min-entropy 3.5 \
  --output-dir ./targeted_frames

Combine: Merge coarse + targeted frames for final dataset.

Summary Checklist¶

Before sampling:

[x] Run calibrate on representative clip(s)
[x] Examine metric distributions and per-gate rejections
[x] Choose quality thresholds for ~50% pass rate
[x] Choose --sample-fps based on video frame rate and content
[x] Choose --max-frames based on training needs
[x] Choose --min-gap based on vehicle speed and scene change rate
[x] Choose --n-bins based on expected candidate count

After sampling:

[x] Check Pass 1 pass rate (40–60% ideal)
[x] Check Pass 2 occupied cells (30–70% ideal)
[x] Visually inspect 20–50 random output frames
[x] Verify diversity (lighting, subjects, conditions)
[x] Verify quality (no blur, darkness, featureless frames)
[x] Verify temporal separation (no near-duplicates)

If issues:

[x] Adjust parameters and re-run (fast with caching)
[x] Document final parameters for reproducibility

Next Steps¶

Parameter Reference: Complete list of all command-line options
Calibration Workflow: Detailed calibrate tool usage
Common Scenarios: More domain-specific examples
