Tuning Guide
This guide helps you optimize sampler parameters for your specific dataset. The goal is to maximize quality and diversity while meeting your frame budget (--max-frames).
Overview
The sampler has two categories of parameters:
- Quality thresholds: Define minimum standards (brightness, sharpness, entropy)
- Diversity controls: Define how frames are selected (n-bins, max-per-cell, min-gap)
Workflow:
- Run calibrate on representative footage → understand metric distributions
- Choose quality thresholds → balance pass rate and quality
- Choose diversity parameters → balance coverage and redundancy
- Run sample on full dataset → inspect results
- Iterate if needed
Step 1: Calibration
Run the Calibrate Tool
./calibrate /path/to/sample_video.mp4 \
--sample-fps 2.0 \
--min-brightness 10 \
--max-brightness 245 \
--min-sharpness 10 \
--min-entropy 2.0
Choose a representative clip:
- ✅ Includes variety of conditions (lighting, subjects, motion)
- ✅ ~5–15 minutes duration (enough frames for statistics)
- ❌ Not an outlier (e.g., all murky water or all bright shallow)
Initial thresholds: Set permissive values (around the 10th percentile or lower) so that few frames are rejected and the full distribution is visible.
Interpret Output
Metric Distributions
Metric distributions (all examined frames):
brightness min=8.2 p5=18.3 median=87.2 p95=201.4 max=243.1
sharpness min=3.1 p5=12.7 median=45.3 p95=187.9 max=321.5
entropy min=1.3 p5=2.2 median=3.9 p95=5.3 max=6.2
motion min=0.5 p5=1.8 median=8.4 p95=28.7 max=89.3
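Key percentiles (per gate):
- p5 (5th percentile): as a minimum threshold, ~95% of frames pass that gate
- p50 (median): as a minimum threshold, ~50% of frames pass that gate
- p95 (95th percentile): as a minimum threshold, ~5% of frames pass that gate (as a maximum threshold, ~95% pass)
The combined pass rate is lower than any single gate's, because a frame must pass every gate (the 80% row in the threshold recommendations below uses minimums close to p5).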
Example: If sharpness p5=12.7, setting --min-sharpness 12.7 rejects the blurriest 5% of frames.
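The link between a percentile and a single gate's pass rate can be sketched as follows. This is an illustrative sketch, not the calibrate tool's code: the function names are hypothetical, and the nearest-rank percentile used here is an assumption (the tool may interpolate instead).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Nearest-rank percentile: the smallest sample such that at least p% of all
// samples are <= it. (Assumed method; calibrate may compute it differently.)
double percentile(std::vector<double> values, double p) {
    assert(!values.empty() && p > 0.0 && p <= 100.0);
    std::sort(values.begin(), values.end());
    const double n = static_cast<double>(values.size());
    const auto rank = static_cast<std::size_t>(std::ceil(p * n / 100.0));
    return values[rank - 1];
}

// Fraction of frames whose metric meets a minimum threshold (one gate only).
double pass_rate_min(const std::vector<double>& values, double threshold) {
    assert(!values.empty());
    const auto passed = std::count_if(values.begin(), values.end(),
                                      [threshold](double v) { return v >= threshold; });
    return static_cast<double>(passed) / static_cast<double>(values.size());
}
```

For a vector holding the values 1 through 100, percentile(v, 5.0) returns 5 and pass_rate_min(v, 5.0) returns 0.96, so a minimum set at p5 keeps roughly 95% of frames on that one metric.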
Per-Gate Rejection
Per-gate rejection counts:
min-brightness (≥10): 125 / 1200 rejected (10.4%)
max-brightness (≤245): 43 / 1200 rejected (3.6%)
min-sharpness (≥10): 89 / 1200 rejected (7.4%)
min-entropy (≥2.0): 156 / 1200 rejected (13.0%)
Combined quality filter: 312 / 1200 rejected (26.0%)
888 / 1200 passed (74.0%)
What to look for:
- High rejection (>50% by one gate): Either that metric is problematic in your data, or threshold is too strict
- Low rejection (<5% by all gates): Thresholds are too permissive, not filtering effectively
- Target: 30–60% combined pass rate (balanced filtering)
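In the sample output the per-gate counts add up to 413 (125 + 43 + 89 + 156), yet only 312 frames are rejected overall, because a frame that fails several gates is counted once in the combined figure. A minimal sketch of that bookkeeping, with hypothetical type and function names (not taken from the tool's source):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-frame metrics and thresholds; names are illustrative only.
struct FrameMetrics { double brightness, sharpness, entropy; };
struct Thresholds { double min_brightness, max_brightness, min_sharpness, min_entropy; };

struct GateReport {
    // Order: min-brightness, max-brightness, min-sharpness, min-entropy.
    std::array<std::size_t, 4> rejected_by_gate{};
    std::size_t rejected_combined = 0;  // frames failing at least one gate
    std::size_t total = 0;
};

GateReport evaluate_gates(const std::vector<FrameMetrics>& frames, const Thresholds& t) {
    assert(t.min_brightness <= t.max_brightness);
    GateReport report;
    report.total = frames.size();
    for (const FrameMetrics& f : frames) {
        const std::array<bool, 4> fails = {
            f.brightness < t.min_brightness,
            f.brightness > t.max_brightness,
            f.sharpness < t.min_sharpness,
            f.entropy < t.min_entropy,
        };
        bool any_failed = false;
        for (std::size_t g = 0; g < fails.size(); ++g) {
            if (fails[g]) {
                ++report.rejected_by_gate[g];  // counted once per failing gate
                any_failed = true;
            }
        }
        if (any_failed) ++report.rejected_combined;  // counted once per frame
    }
    return report;
}
```

A large gap between the sum of per-gate counts and the combined count means the gates mostly reject the same frames, so relaxing one threshold alone may change the pass rate less than its per-gate percentage suggests.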
Threshold Recommendations
Threshold recommendations (for target pass rates):
80% pass: --min-brightness 18 --max-brightness 220 --min-sharpness 12 --min-entropy 2.2
60% pass: --min-brightness 28 --max-brightness 205 --min-sharpness 25 --min-entropy 2.8
40% pass: --min-brightness 42 --max-brightness 185 --min-sharpness 48 --min-entropy 3.5
20% pass: --min-brightness 68 --max-brightness 160 --min-sharpness 92 --min-entropy 4.3
How to use:
- Choose a row based on desired selectivity (60% is typical)
- Adjust individual thresholds based on per-gate rejection (e.g., relax sharpness if too strict)
Step 2: Choose Quality Thresholds
Brightness
What it measures: Mean pixel value (0–255 scale)
When to adjust:
- Dark environment (deep water, night): Lower --min-brightness (e.g., 10–20)
- Bright environment (shallow water, daylight): Raise --min-brightness (e.g., 30–50)
- Overexposure issues: Lower --max-brightness (e.g., 200–220)
Recommendations:
| Environment | min-brightness | max-brightness |
|---|---|---|
| Deep ocean (>200m) | 10–20 | 230 |
| Mid-water (50–200m) | 20–40 | 220 |
| Shallow (<50m), sunlight | 40–80 | 200 |
| Artificial lighting | 50–100 | 210 |
Visual check: Run calibrate and examine the rejected frames. If many look acceptable, relax the gate that rejected them (lower --min-brightness or raise --max-brightness).
Sharpness
What it measures: Laplacian variance (focus quality)
When to adjust:
- Frequent blur (fast-moving vehicle, turbulent water): Lower --min-sharpness (e.g., 10–20)
- High-quality optics (professional camera, stable platform): Raise --min-sharpness (e.g., 30–50)
- Training on blurry data: Lower threshold to match test distribution
Recommendations:
| Camera Quality | min-sharpness |
|---|---|
| Low-res or fast-moving | 10–20 |
| Standard AUV camera | 20–40 |
| High-res or slow-moving | 40–80 |
Visual check: Inspect frames near the threshold. If they look acceptably sharp, threshold is good. If obviously blurry, raise it.
Entropy
What it measures: Shannon entropy in bits (information content)
When to adjust:
- Featureless environment (open ocean, sparse plankton): Lower --min-entropy (e.g., 2.0–2.5)
- Feature-rich environment (reef, particle clouds): Raise --min-entropy (e.g., 3.0–4.0)
- Goal is rare events: Lower threshold to avoid rejecting sparse interesting frames
Recommendations:
| Environment | min-entropy |
|---|---|
| Open ocean (blue water) | 2.0–2.5 |
| Mid-water (plankton, particles) | 2.5–3.5 |
| Benthic (seafloor, organisms) | 3.0–4.5 |
Visual check: Low-entropy frames are uniform (nearly solid color). If many rejected frames look visually rich, lower the threshold.
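To make the three quality metrics concrete, here is a minimal sketch of how mean brightness, Laplacian variance, and Shannon entropy can be computed on an 8-bit grayscale frame. The GrayImage type, the 4-neighbour Laplacian kernel, and the 256-bin histogram are assumptions for illustration; the sampler's actual implementation may differ, so absolute values need not match its output.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// 8-bit grayscale image in row-major order (hypothetical type for this sketch).
struct GrayImage {
    std::size_t width, height;
    std::vector<std::uint8_t> pixels;
    double at(std::size_t x, std::size_t y) const { return pixels[y * width + x]; }
};

// Brightness: mean pixel value on the 0-255 scale.
double mean_brightness(const GrayImage& img) {
    assert(!img.pixels.empty());
    double sum = 0.0;
    for (std::uint8_t p : img.pixels) sum += p;
    return sum / static_cast<double>(img.pixels.size());
}

// Sharpness: variance of the 4-neighbour Laplacian over interior pixels.
double laplacian_variance(const GrayImage& img) {
    assert(img.width >= 3 && img.height >= 3);
    double sum = 0.0, sum_sq = 0.0;
    std::size_t n = 0;
    for (std::size_t y = 1; y + 1 < img.height; ++y) {
        for (std::size_t x = 1; x + 1 < img.width; ++x) {
            const double lap = img.at(x - 1, y) + img.at(x + 1, y) +
                               img.at(x, y - 1) + img.at(x, y + 1) - 4.0 * img.at(x, y);
            sum += lap;
            sum_sq += lap * lap;
            ++n;
        }
    }
    const double mean = sum / static_cast<double>(n);
    return sum_sq / static_cast<double>(n) - mean * mean;
}

// Entropy: Shannon entropy in bits of the 256-bin intensity histogram.
double shannon_entropy(const GrayImage& img) {
    assert(!img.pixels.empty());
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t p : img.pixels) ++hist[p];
    double h = 0.0;
    for (std::size_t count : hist) {
        if (count == 0) continue;  // empty bins contribute nothing
        const double prob = static_cast<double>(count) / static_cast<double>(img.pixels.size());
        h -= prob * std::log2(prob);
    }
    return h;
}
```

A solid-color frame scores 0 on both sharpness and entropy in this sketch, which is why uniform blue-water frames fall below the minimum gates, while a frame split evenly between two gray levels has an entropy of exactly 1 bit.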
Motion
Not a quality gate: Motion is used for interest scoring, not filtering. Zero motion is acceptable (static scene).
When to tune (via interest score formula, requires code change):
- Prefer static frames: Reduce motion weight in score
- Prefer dynamic frames: Increase motion weight in score
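Default formula: entropy × log(1 + sharpness) × (1 + motion). The (1 + motion) factor leaves static frames unchanged and boosts moving ones, roughly 1–10× for motion values up to the median of the sample distribution above (8.4 gives 9.4×); higher motion values boost the score further.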
Step 3: Choose Diversity Parameters
Sample Rate (--sample-fps)
What it does: Determines how many frames are examined per video in Pass 1.
Trade-off:
- Higher (e.g., 2–5 fps): More candidates → better diversity, longer runtime
- Lower (e.g., 0.5–1 fps): Fewer candidates → faster, may miss short events
Recommendations:
| Video Frame Rate | Recommended --sample-fps |
|---|---|
| 24–30 fps | 1.0 (every 1s) |
| 60 fps | 2.0 (every 0.5s) |
| 10–15 fps | 0.5 (every 2s) |
Rule of thumb: Sample roughly 0.5–2 frames per second; low-frame-rate video sits at the bottom of that range and 60 fps video at the top.
When to increase:
- Short transects (seconds-long events) → Need finer temporal resolution
- High diversity requirements → Need more candidates to fill grid
When to decrease:
- Very long videos (hours) → Reduce runtime without losing coverage
- Limited CPU resources → Faster Pass 1
Frame Budget (--max-frames)
What it does: Target number of output frames (may output fewer if insufficient candidates).
How to choose:
- Estimate training dataset size needed (e.g., 5,000 frames for small model, 50,000 for large)
- Account for annotation effort (e.g., 5,000 frames × 30 s/frame ≈ 42 hours)
- Set --max-frames to target value
Typical values:
- Small dataset (proof of concept): 500–1,000
- Medium dataset (single-task model): 5,000–10,000
- Large dataset (multi-task or fine-grained): 20,000–100,000
Note: If dataset has high diversity (many videos, varied conditions), larger budgets are beneficial. If dataset is repetitive, smaller budgets suffice.
Temporal Gap (--min-gap)
What it does: Minimum seconds between selected frames from the same video.
Trade-off:
- Higher (e.g., 5–10s): Strong diversity, fewer frames per video
- Lower (e.g., 0.5–2s): More frames per video, risk of near-duplicates
Recommendations:
| Vehicle Speed | Scene Change Rate | Recommended --min-gap |
|---|---|---|
| Stationary | Slow (static scene) | 5–10s |
| Slow (<1 m/s) | Moderate (gradual drift) | 2–5s |
| Fast (>1 m/s) | Rapid (transect) | 1–2s |
When to increase:
- Static scenes (e.g., benthic landers) → Avoid redundant nearly identical frames
- Long videos with repetitive content → Spread selections across time
When to decrease:
- Rapid transects (e.g., fast AUV pass) → Capture short-lived features
- High diversity within videos → Allow multiple frames from interesting sections
Visual check: If output frames from same video look nearly identical, increase --min-gap.
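The effect of --min-gap can be illustrated with a greedy filter. This is a sketch under stated assumptions: the Candidate type is hypothetical, and the choice to visit candidates in descending interest order is a guess at a reasonable policy, not a description of what sample does internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical candidate record for this sketch.
struct Candidate {
    std::string video;  // source video identifier
    double time_s;      // timestamp within that video, in seconds
    double interest;    // interest score (higher is better)
};

// Greedy filter: visit candidates from highest to lowest interest and keep one
// only if no already-kept frame from the same video is closer than min_gap_s.
// Quadratic in the number of candidates, which is fine for illustration.
std::vector<Candidate> apply_min_gap(std::vector<Candidate> candidates, double min_gap_s) {
    assert(min_gap_s >= 0.0);
    std::sort(candidates.begin(), candidates.end(),
              [](const Candidate& a, const Candidate& b) { return a.interest > b.interest; });
    std::vector<Candidate> kept;
    for (const Candidate& c : candidates) {
        const bool too_close = std::any_of(kept.begin(), kept.end(), [&](const Candidate& k) {
            return k.video == c.video && std::fabs(k.time_s - c.time_s) < min_gap_s;
        });
        if (!too_close) kept.push_back(c);
    }
    return kept;
}
```

With frames at 0.0 s, 1.0 s, and 2.5 s in one video and a 2.0 s gap, only the highest-scoring of the three survives; dropping the gap to 1.0 s keeps all three. Frames from different videos never block each other.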
Grid Resolution (--n-bins)
What it does: Number of bins per feature axis (total cells = n_bins³).
Trade-off:
- Higher (e.g., 10–12): Finer distinctions, more diversity, risk of sparse cells
- Lower (e.g., 5–7): Coarser grouping, less diversity, denser cells
Recommendations:
| Number of Candidates | Recommended --n-bins | Total Cells |
|---|---|---|
| <1,000 | 5 | 125 |
| 1,000–10,000 | 8 | 512 |
| 10,000–50,000 | 10 | 1,000 |
| >50,000 | 12 | 1,728 |
Rule of thumb: Aim for ~10–50 candidates per cell on average.
When to increase:
- Large candidate pool → Finer diversity resolution
- Over-sampling common scenes → Split dense cells
When to decrease:
- Small candidate pool → Avoid empty cells
- Under-utilizing candidates → Merge sparse cells
Diagnostic: After running sample, check output:
- Occupied cells < 20% of total: Consider lowering n_bins (grid too sparse)
- Occupied cells > 80% of total: Consider raising n_bins (grid too dense)
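The occupancy diagnostic above is simple arithmetic on the occupied-cell count that sample reports. A small sketch, with hypothetical helper names, that applies the 20% and 80% guidelines:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Total cells for a grid with n_bins bins on each of three feature axes.
std::size_t total_cells(std::size_t n_bins) { return n_bins * n_bins * n_bins; }

// Fraction of grid cells that received at least one candidate.
double occupancy(std::size_t occupied_cells, std::size_t n_bins) {
    assert(n_bins > 0 && occupied_cells <= total_cells(n_bins));
    return static_cast<double>(occupied_cells) / static_cast<double>(total_cells(n_bins));
}

// Applies the 20% / 80% guideline from the diagnostic above.
std::string n_bins_advice(std::size_t occupied_cells, std::size_t n_bins) {
    const double frac = occupancy(occupied_cells, n_bins);
    if (frac < 0.20) return "lower n_bins";
    if (frac > 0.80) return "raise n_bins";
    return "keep n_bins";
}
```

For example, 387 occupied cells with --n-bins 8 gives 387 / 512, about 76%, which stays inside the guideline band, while 50 occupied cells on the same grid (under 10%) would suggest a coarser grid.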
Max Per Cell (--max-per-cell)
What it does: Maximum frames selected from each grid cell.
Default: ceil(max_frames / n_bins³) — ensures cells can collectively reach budget.
When to override:
- Force uniform distribution: Set low (e.g., 1–5) → each cell contributes equally
- Prefer high-interest frames: Set high (e.g., 20–50) → allows dense cells to dominate
Trade-offs:
| --max-per-cell | Effect |
|---|---|
| Low (1–5) | Strong diversity, may miss high-interest regions |
| Medium (10–20) | Balanced diversity and interest |
| High (50+) | Interest-driven, may over-sample common scenes |
Example (5000 frame budget, 8³ = 512 cells):
- Default max_per_cell = 10: Each cell contributes ≤10, up to 5120 total (trimmed to 5000)
- Override max_per_cell = 5: Each cell contributes ≤5, up to 2560 total (may under-shoot budget)
- Override max_per_cell = 50: Each cell contributes ≤50, allows dense cells to provide more
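The numbers in this example follow directly from the default formula. A minimal sketch of the arithmetic (function names are hypothetical; only the ceil(max_frames / n_bins³) rule comes from the text):

```cpp
#include <cassert>
#include <cstddef>

// Default per-cell cap: ceil(max_frames / n_bins^3), in integer arithmetic.
std::size_t default_max_per_cell(std::size_t max_frames, std::size_t n_bins) {
    assert(n_bins > 0);
    const std::size_t cells = n_bins * n_bins * n_bins;
    return (max_frames + cells - 1) / cells;
}

// Upper bound on frames the grid filter can emit if every cell is full.
std::size_t grid_capacity(std::size_t max_per_cell, std::size_t n_bins) {
    return max_per_cell * n_bins * n_bins * n_bins;
}
```

With a 5000-frame budget and 8 bins, default_max_per_cell returns 10 and grid_capacity(10, 8) is 5120. This capacity is only an upper bound: empty or thinly populated cells contribute less, so the real output can fall short of it.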
When to use:
- Rare event detection: Lower max_per_cell to ensure rare conditions get representation
- High-interest prioritization: Raise max_per_cell to allow scoring to dominate
Step 4: Run and Inspect
Run the Sampler
./sample \
--root-dir /path/to/videos \
--camera 1 \
--sample-fps 1.0 \
--max-frames 5000 \
--min-gap 2.0 \
--min-brightness 20 \
--max-brightness 220 \
--min-sharpness 25 \
--min-entropy 2.8 \
--n-bins 8 \
--output-dir ./frames \
--jobs 8
Check Output
Pass 1 Summary
video1.mp4
fps=29.97 interval=30 1200 frames examined → 720 quality-passed
video2.mp4
fps=29.97 interval=30 1500 frames examined → 890 quality-passed
Check:
- Pass rate: 40–60% ideal (too low → overly strict, too high → too permissive)
- Interval: Should match fps / sample_fps (e.g., 30 fps / 1.0 fps = 30)
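The expected interval can be checked by hand with a one-line calculation. In the sample output, fps=29.97 is reported with interval=30, which suggests rounding to the nearest whole frame; that rounding mode, the lower bound of 1, and the function name are assumptions in this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Frame step between examined frames: fps / sample_fps, rounded to the
// nearest whole frame and never below 1 (rounding mode is an assumption).
long sampling_interval(double video_fps, double sample_fps) {
    assert(video_fps > 0.0 && sample_fps > 0.0);
    return std::max(1L, std::lround(video_fps / sample_fps));
}
```

For 29.97 fps video, --sample-fps 1.0 gives an interval of 30 and --sample-fps 2.0 gives 15. If the reported interval differs noticeably from this estimate, check that the video's frame rate metadata is what you expect.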
If pass rate too low: Relax quality thresholds (lower min-sharpness, lower min-entropy).
If pass rate too high: Tighten thresholds (raise min-brightness, raise min-sharpness).
Pass 2 Summary
After temporal filter (2.0s gap): 5234
After grid filter (8³ cells, ≤10/cell): 4237 (387 occupied cells)
Trimmed to budget: 5000
Check:
- Temporal filter rejection: ~10–40% typical (depends on min_gap and video diversity)
- Occupied cells: 30–70% of total typical (387/512 ≈ 76% is slightly above that range but still below the 80% level that suggests raising n_bins)
- Budget trim: Small trim (<10%) is good; large trim (>50%) suggests n_bins too small
If too few occupied cells (<20%): Lower n_bins.
If too many occupied cells (>80%): Raise n_bins.
Visual Inspection
Randomly sample 20–50 output frames:
What to look for:
- ✅ Diversity: Variety of lighting, subjects, turbidity
- ✅ Quality: No obvious blur, darkness, or featureless frames
- ✅ Temporal separation: No near-duplicates from same video
- ❌ Over-representation: One condition (e.g., blue water) dominates
- ❌ Under-representation: Rare conditions (e.g., organisms) missing
- ❌ Quality issues: Blur or darkness slipping through
If issues found: Adjust parameters (see below) and re-run.
Step 5: Iterate
Problem: Too Many Dark Frames
Symptoms: Many output frames are darker than desired.
Solutions:
- Raise --min-brightness (e.g., 20 → 30)
- Check calibrate output: Are dark frames common in your data? If yes, you may need to accept them or filter videos upstream.
Problem: Too Many Blurry Frames
Symptoms: Many output frames are out of focus or motion-blurred.
Solutions:
- Raise --min-sharpness (e.g., 25 → 40)
- Check if blur is pervasive in source videos (vehicle motion, turbidity); if so, you may need better optics or slower vehicle speed.
Problem: Over-Sampling Blue Water
Symptoms: Majority of output frames are featureless blue/green.
Solutions:
- Raise --min-entropy (e.g., 2.8 → 3.5) to reject uniform scenes
- Lower --max-per-cell to limit contribution from dense (common) cells
- Increase --n-bins to fragment common scenes into more cells
Problem: Missing Rare Events
Symptoms: Known rare conditions (e.g., organisms, particle clouds) under-represented.
Solutions:
- Increase --sample-fps (e.g., 1.0 → 2.0) to examine more frames (may catch short events)
- Lower --max-per-cell to reserve budget for sparse cells
- Manually review calibrate output: Are rare events rejected by quality gates? Relax thresholds if needed.
Problem: Not Reaching Frame Budget
Symptoms: Output has fewer frames than --max-frames.
Causes:
- Insufficient candidates after quality filtering (too strict thresholds)
- Temporal filter rejecting too many (too large --min-gap)
- Not enough source videos
Solutions:
- Relax quality thresholds (lower --min-sharpness, --min-entropy)
- Reduce --min-gap (e.g., 2.0 → 1.0)
- Add more source videos
Problem: Too Many Near-Duplicates
Symptoms: Multiple output frames from same video look nearly identical.
Solutions:
- Increase --min-gap (e.g., 1.0 → 3.0)
- Check if frames are from different grid cells (same visual appearance can have different metrics due to noise). This is not a bug, but may need tighter grid resolution (--n-bins)
Common Scenarios
Scenario 1: Deep-Sea Benthic Survey
Characteristics:
- Slow-moving vehicle (0.5 m/s)
- Low light (artificial only)
- High-interest targets (organisms, geology)
- Long transects (hours)
Recommended settings:
./sample \
--sample-fps 0.5 \
--max-frames 10000 \
--min-gap 5.0 \
--min-brightness 15 \
--max-brightness 230 \
--min-sharpness 20 \
--min-entropy 3.0 \
--n-bins 10
Rationale:
- Low sample-fps: Long transects, slow speed → sparse sampling OK
- High min-gap: Slow motion → large gaps avoid redundancy
- Low min-brightness: Dark environment → accept dim frames
- High min-entropy: Reject featureless seafloor, keep organisms
- High n-bins: Large candidate pool → fine diversity resolution
Scenario 2: Mid-Water Plankton Survey
Characteristics:
- Medium speed (1–2 m/s)
- Ambient + artificial light (variable)
- Sparse targets (plankton, particles)
- Moderate transects (10–30 min)
Recommended settings:
./sample \
--sample-fps 1.0 \
--max-frames 5000 \
--min-gap 2.0 \
--min-brightness 20 \
--max-brightness 220 \
--min-sharpness 15 \
--min-entropy 2.5 \
--n-bins 8
Rationale:
- Moderate sample-fps: Moderate speed → standard sampling
- Moderate min-gap: Balance redundancy and capture rate
- Moderate min-brightness: Variable light → middle range
- Low min-sharpness: Plankton may be blurry (small, fast) → accept softer focus
- Low min-entropy: Sparse targets in blue water → accept low-texture frames
Scenario 3: Shallow Reef Inspection
Characteristics:
- Slow/stationary (ROV)
- Bright natural light
- High detail (coral, fish)
- Short clips (minutes)
Recommended settings:
./sample \
--sample-fps 2.0 \
--max-frames 2000 \
--min-gap 3.0 \
--min-brightness 50 \
--max-brightness 200 \
--min-sharpness 30 \
--min-entropy 3.5 \
--n-bins 6
Rationale:
- High sample-fps: Short clips → examine more frames
- High min-gap: Slow/static → large gaps avoid duplicates
- High min-brightness: Bright environment → reject dim frames
- High min-sharpness: High-quality optics → demand sharp focus
- High min-entropy: Rich detail → reject uniform areas
- Low n-bins: Small candidate pool (short clips) → coarser grid
Advanced Tuning
Custom Interest Scoring
If default interest score doesn't match your priorities, modify sample.cpp:
Current formula (line ~35):
static double interest_score(const FrameRecord& r) {
return r.entropy * std::log1p(r.sharpness) * (1.0 + r.motion);
}
Examples:
Prioritize sharpness over entropy:
Ignore motion (prefer static frames):
Heavily weight motion (prefer dynamic scenes):
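One hypothetical way to express the three variations listed above is sketched here. The FrameRecord stand-in contains only the three fields the default formula reads, and the specific weightings (square root, log, squared motion factor) are illustrative choices, not formulas taken from sample.cpp.

```cpp
#include <cassert>
#include <cmath>

// Minimal stand-in for the FrameRecord fields used by the score.
struct FrameRecord { double entropy, sharpness, motion; };

// Default formula, as quoted above.
double score_default(const FrameRecord& r) {
    return r.entropy * std::log1p(r.sharpness) * (1.0 + r.motion);
}

// Prioritize sharpness over entropy: damp entropy with a log and let
// sharpness grow faster (sqrt instead of log). Illustrative weighting only.
double score_sharpness_first(const FrameRecord& r) {
    assert(r.entropy >= 0.0 && r.sharpness >= 0.0);
    return std::log1p(r.entropy) * std::sqrt(r.sharpness) * (1.0 + r.motion);
}

// Ignore motion: static and dynamic frames are ranked on texture and focus alone.
double score_ignore_motion(const FrameRecord& r) {
    return r.entropy * std::log1p(r.sharpness);
}

// Heavily weight motion: square the motion factor so dynamic scenes dominate.
double score_motion_heavy(const FrameRecord& r) {
    const double m = 1.0 + r.motion;
    return r.entropy * std::log1p(r.sharpness) * m * m;
}
```

To try one of these, replace the body of interest_score with the chosen expression. Scores are only compared against each other when ranking frames, so the absolute scale of a variant does not matter, only the ordering it produces.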
After modifying: Rebuild (make -j$(nproc)) and re-run sample.
Two-Pass Sampling
For very large datasets (>100 videos), sample in two passes (A and C below) with a review step (B) in between:
Pass A: Coarse sampling (get representative frames quickly)
Pass B: Annotate and evaluate coarse frames
Identify under-represented conditions (e.g., organisms).
Pass C: Targeted sampling (focus on gaps)
Filter source videos to those with under-represented conditions, then:
# Higher --min-entropy threshold targets feature-rich frames
./sample \
--root-dir /path/to/targeted_videos \
--sample-fps 2.0 \
--max-frames 5000 \
--min-entropy 3.5 \
--output-dir ./targeted_frames
Combine: Merge coarse + targeted frames for final dataset.
Summary Checklist
Before sampling:
- [x] Run calibrate on representative clip(s)
- [x] Examine metric distributions and per-gate rejections
- [x] Choose quality thresholds for ~50% pass rate
- [x] Choose --sample-fps based on video frame rate and content
- [x] Choose --max-frames based on training needs
- [x] Choose --min-gap based on vehicle speed and scene change rate
- [x] Choose --n-bins based on expected candidate count
After sampling:
- [x] Check Pass 1 pass rate (40–60% ideal)
- [x] Check Pass 2 occupied cells (30–70% ideal)
- [x] Visually inspect 20–50 random output frames
- [x] Verify diversity (lighting, subjects, conditions)
- [x] Verify quality (no blur, darkness, featureless frames)
- [x] Verify temporal separation (no near-duplicates)
If issues:
- [x] Adjust parameters and re-run (fast with caching)
- [x] Document final parameters for reproducibility
Next Steps
- Parameter Reference: Complete list of all command-line options
- Calibration Workflow: Detailed calibrate tool usage
- Common Scenarios: More domain-specific examples