# Quality Metrics

In-depth explanation of the four quality metrics used to filter frames.

## Overview
The sampler computes four metrics for each examined frame:
- Brightness: Mean pixel intensity (detects dark/overexposed frames)
- Sharpness: Laplacian variance (detects blur/focus issues)
- Entropy: Shannon entropy (detects featureless frames)
- Motion: Frame-to-frame difference (detects camera/subject movement)
All metrics operate on grayscale (color is unreliable in underwater environments).
## 1. Brightness

### Definition

Mean pixel value across the grayscale image:

$$B = \frac{1}{H \cdot W} \sum_{i=1}^{H} \sum_{j=1}^{W} I(i, j)$$

where \(I(i,j) \in [0, 255]\) is the grayscale pixel value and \(H \times W\) is the image size.
### Implementation

Grayscale conversion: Uses OpenCV's weighted average (ITU-R BT.601):

$$Y = 0.299\,R + 0.587\,G + 0.114\,B$$
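For illustration, the same computation can be written without OpenCV. This is a dependency-free sketch — the array layout and function names are ours, not the sampler's API:

```cpp
#include <cstddef>
#include <cstdint>

// ITU-R BT.601 luma from 8-bit BGR components, matching the weights
// OpenCV uses in cvtColor(..., COLOR_BGR2GRAY).
inline double bt601_gray(uint8_t b, uint8_t g, uint8_t r) {
    return 0.114 * b + 0.587 * g + 0.299 * r;
}

// Mean brightness over an interleaved BGR buffer of `n` pixels.
double mean_brightness(const uint8_t* bgr, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += bt601_gray(bgr[3 * i], bgr[3 * i + 1], bgr[3 * i + 2]);
    return n ? sum / n : 0.0;
}
```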
### Interpretation
| Range | Interpretation | Typical Cause |
|---|---|---|
| 0–20 | Very dark | Sensor failure, occlusion, deep water at night |
| 20–50 | Dark | Deep water, low ambient light |
| 50–100 | Moderate | Mid-water, artificial lighting |
| 100–150 | Bright | Shallow water, sunlight |
| 150–220 | Very bright | Strong artificial lighting, reflections |
| 220–255 | Overexposed | Lighting artifacts, sensor saturation |
### Use Cases
Reject dark frames (sensor failure, occlusion):
Reject overexposed frames (blown-out highlights):
Environment-specific tuning:
- Deep ocean (>200m): Accept darker frames (--min-brightness 10)
- Shallow reef: Reject dim frames (--min-brightness 50)
### Limitations
- Uniform across frame: Doesn't detect localized brightness issues (e.g., spotlight in corner)
- Color-blind: Doesn't account for color balance (underwater color shifts)
Workaround: Use entropy to detect featureless dark regions.
## 2. Sharpness (Laplacian Variance)

### Definition

Variance of the Laplacian operator applied to the grayscale image:

$$S = \operatorname{Var}\left(\nabla^2 I\right)$$

The Laplacian detects edges via second derivatives:

$$\nabla^2 I = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2}$$

- High variance → sharp edges → in-focus image
- Low variance → smooth gradients → blurred image
### Implementation

```cpp
cv::Mat gray, laplacian;
cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
cv::Laplacian(gray, laplacian, CV_64F);

cv::Scalar mu, sigma;
cv::meanStdDev(laplacian, mu, sigma);
double sharpness = sigma[0] * sigma[0];  // variance = sigma^2
```

OpenCV Laplacian kernel (default 3×3 aperture):

$$\begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$
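To make the definition concrete, here is a dependency-free sketch that applies the 3×3 kernel by hand and takes the variance of the response. Border handling is simplified to skipping edge pixels (OpenCV replicates the border instead), so the numbers will differ slightly from the OpenCV path:

```cpp
#include <cstddef>
#include <vector>

// Variance of the 3x3 Laplacian response over the interior of a
// row-major grayscale image of width `w` and height `h`.
double laplacian_variance(const std::vector<double>& img, int w, int h) {
    std::vector<double> resp;
    for (int y = 1; y + 1 < h; ++y)
        for (int x = 1; x + 1 < w; ++x) {
            // Kernel [[0,1,0],[1,-4,1],[0,1,0]] centered at (x, y).
            double v = img[(y - 1) * w + x] + img[(y + 1) * w + x]
                     + img[y * w + (x - 1)] + img[y * w + (x + 1)]
                     - 4.0 * img[y * w + x];
            resp.push_back(v);
        }
    if (resp.empty()) return 0.0;
    double mean = 0.0;
    for (double v : resp) mean += v;
    mean /= resp.size();
    double var = 0.0;
    for (double v : resp) var += (v - mean) * (v - mean);
    return var / resp.size();
}
```

A uniform image yields zero variance; an image containing a hard edge yields a large one.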
### Interpretation
| Range | Interpretation | Typical Cause |
|---|---|---|
| 0–10 | Severe blur | Motion blur, out-of-focus, dirty lens |
| 10–30 | Slight blur | Turbidity, slight defocus |
| 30–100 | Acceptable | Normal underwater imaging |
| 100+ | Very sharp | High-quality optics, static scene |
### Use Cases
Reject blurry frames:
High-quality datasets (demand sharp focus):
Accept some blur (fast-moving vehicle, turbid water):
### Why Laplacian?
Alternatives:
| Method | Pros | Cons |
|---|---|---|
| Laplacian variance | Fast, rotation-invariant, correlates with perceptual sharpness | Sensitive to noise |
| Gradient magnitude (Sobel) | Robust to noise | Slower, direction-dependent |
| FFT-based (high-freq energy) | Noise-resistant | Very slow |
| Tenengrad (Sobel variance) | Similar to Laplacian | Slightly slower |
Laplacian variance offers the best trade-off between speed and accuracy for real-time video processing.
### Limitations
- Noise sensitivity: High-frequency noise increases variance (false positive)
- Uniform content: Featureless frames have low variance regardless of focus
Workaround: Combine with entropy (reject featureless + low-sharpness frames).
## 3. Entropy (Shannon Entropy)

### Definition

Shannon entropy of the grayscale histogram:

$$E = -\sum_{k=0}^{255} p_k \log_2(p_k)$$

where \(p_k\) is the probability of pixel value \(k\):

$$p_k = \frac{h_k}{H \cdot W}$$

with \(h_k\) the histogram count for gray level \(k\).
Interpretation: Measures information content / unpredictability.
- High entropy: Pixel values are evenly distributed (rich texture, detail)
- Low entropy: Pixel values are clustered (uniform, featureless)
### Implementation

```cpp
cv::Mat gray;
cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);

// Compute histogram
int hist[256] = {0};
for (int r = 0; r < gray.rows; r++) {
    const uchar* row = gray.ptr<uchar>(r);
    for (int c = 0; c < gray.cols; c++)
        hist[row[c]]++;
}

// Compute entropy
int total = gray.rows * gray.cols;
double entropy = 0.0;
for (int i = 0; i < 256; i++) {
    if (hist[i] > 0) {
        double p = (double)hist[i] / total;
        entropy -= p * std::log2(p);
    }
}
```
### Interpretation
| Range | Interpretation | Example |
|---|---|---|
| 0–2 | Uniform | Solid color, nearly uniform blue water |
| 2–4 | Low texture | Open ocean, sparse plankton |
| 4–6 | Moderate detail | Mid-water particles, distant organisms |
| 6–8 | High texture | Dense particle clouds, complex organisms, seafloor |
Maximum possible: 8 bits (uniform distribution of all 256 gray levels)
### Use Cases
Reject featureless frames (blue water, uniform backgrounds):
High-detail datasets (organisms, geology):
Accept sparse environments (open ocean):
### Mathematical Properties

Uniform distribution (max entropy):

$$p_k = \frac{1}{256} \text{ for all } k \Rightarrow E = -\sum_{k=0}^{255} \frac{1}{256} \log_2\left(\frac{1}{256}\right) = 8 \text{ bits}$$

Single value (min entropy):

$$p_0 = 1, \, p_{k \neq 0} = 0 \Rightarrow E = -1 \cdot \log_2(1) = 0 \text{ bits}$$

Two-value distribution (e.g., 50% black, 50% white):

$$E = -2 \cdot \frac{1}{2} \log_2\left(\frac{1}{2}\right) = 1 \text{ bit}$$
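These properties are easy to verify numerically. A minimal sketch, operating directly on a normalized probability distribution (the helper function is ours, for illustration only):

```cpp
#include <cmath>
#include <vector>

// Shannon entropy (in bits) of a normalized probability distribution.
// Zero-probability bins contribute nothing (lim p->0 of p*log2(p) = 0).
double shannon_entropy(const std::vector<double>& p) {
    double e = 0.0;
    for (double pk : p)
        if (pk > 0.0) e -= pk * std::log2(pk);
    return e;
}
```

Plugging in the three distributions above reproduces 8 bits, 0 bits, and 1 bit respectively.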
### Limitations
- Global measure: Doesn't detect localized features (e.g., small organism on uniform background)
- Noise inflates entropy: Random noise increases histogram spread
Workaround: Combine with sharpness (reject noisy low-detail frames).
## 4. Motion (Frame Differencing)

### Definition

Mean absolute difference between current and previous frame:

$$M = \frac{1}{H \cdot W} \sum_{i,j} \left| I_t(i,j) - I_{t-1}(i,j) \right|$$

where \(I_t\) is the current frame (grayscale) and \(I_{t-1}\) is the previous frame.
### Implementation

```cpp
cv::Mat curr_gray, prev_gray, diff;
cv::cvtColor(curr_frame, curr_gray, cv::COLOR_BGR2GRAY);
cv::cvtColor(prev_frame, prev_gray, cv::COLOR_BGR2GRAY);

cv::absdiff(curr_gray, prev_gray, diff);
double motion = cv::mean(diff)[0];
```
### Interpretation
| Range | Interpretation | Typical Cause |
|---|---|---|
| 0–2 | Static | Fixed camera, static scene |
| 2–10 | Slow drift | Slow vehicle speed, gentle currents |
| 10–30 | Moderate motion | Normal vehicle speed, subject movement |
| 30+ | Rapid motion | Fast vehicle speed, scene change, rapid pan |
### Use Cases
Not a quality gate (by default): Motion can be zero without indicating poor quality.
Used in interest scoring: Prioritizes dynamic content.
Interest score formula:

$$\text{score} = E \times \ln(1 + S) \times (1 + M)$$

Effect:

- Zero motion: 1× multiplier (no penalty)
- Low motion (M=5): 6× multiplier
- High motion (M=30): 31× multiplier
If you want to penalize motion (prefer static frames), modify `interest_score()` in `sample.cpp`.
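A hedged sketch of what such a change might look like. The function name follows the text above, but the exact signature in `sample.cpp` is an assumption, and the "static" variant is one possible choice, not the project's:

```cpp
#include <cmath>

// Default ranking: score = E * ln(1 + S) * (1 + M).
// Motion acts as a bonus multiplier, never a penalty.
double interest_score(double entropy, double sharpness, double motion) {
    return entropy * std::log(1.0 + sharpness) * (1.0 + motion);
}

// Possible variant preferring static frames: divide by (1 + M)
// so that higher motion lowers the score instead of raising it.
double interest_score_static(double entropy, double sharpness, double motion) {
    return entropy * std::log(1.0 + sharpness) / (1.0 + motion);
}
```

With the default formula, a frame at M=5 scores 6× a motionless but otherwise identical frame; the variant inverts that preference.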
### Limitations
- Camera motion vs. subject motion: Can't distinguish (both increase M)
- Scene changes: Transitions between scenes cause spikes (may be desirable or not)
- Temporal ordering: Requires sequential frames (can't be computed for random frames)
Workaround: Use temporal filtering (--min-gap) to avoid adjacent frames regardless of motion.
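For intuition, the effect of a temporal gap can be sketched as a greedy filter over sorted candidate frame indices. These are assumed semantics for illustration; see the Temporal Filtering page for the actual behavior of `--min-gap`:

```cpp
#include <vector>

// Greedy temporal gap filter: keep a candidate frame only if it is at
// least `min_gap` frames after the previously kept one.
// `candidates` must be sorted in ascending order.
std::vector<int> apply_min_gap(const std::vector<int>& candidates, int min_gap) {
    std::vector<int> kept;
    for (int f : candidates)
        if (kept.empty() || f - kept.back() >= min_gap)
            kept.push_back(f);
    return kept;
}
```

This rejects near-adjacent frames regardless of their motion scores, which is why motion itself does not need to act as a gate.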
## Metric Interactions

### Brightness vs. Entropy

- Low brightness, low entropy: dark featureless frame (reject)
- Low brightness, high entropy: dark but textured, e.g., an illuminated organism against a dark background (accept if brightness ≥ min)
- High brightness, low entropy: overexposed uniform frame (reject)
- High brightness, high entropy: well-lit detailed frame (ideal)
Recommendation: Use both gates:
### Sharpness vs. Entropy

- Low sharpness, low entropy: blurry featureless frame (reject)
- Low sharpness, high entropy: blurry but textured, e.g., an out-of-focus organism (marginal quality)
- High sharpness, low entropy: sharp but uniform, e.g., focused on blue water (low value)
- High sharpness, high entropy: sharp and detailed (ideal)
Recommendation: Use both gates:
### Motion vs. Sharpness

- Low motion, low sharpness: static blurry frame, e.g., defocus or turbidity (reject via sharpness)
- Low motion, high sharpness: static sharp frame, e.g., benthic survey (ideal for stationary scenes)
- High motion, low sharpness: motion blur, common during transects (reject via sharpness)
- High motion, high sharpness: sharp despite motion, e.g., well-stabilized camera or slow-moving subject (ideal)
Recommendation: Sharpness gate handles motion blur; motion is used for ranking, not filtering.
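The interaction rules above amount to a conjunction of per-metric gates, with motion deliberately left out. A minimal sketch — the threshold values are illustrative defaults taken from the summary table, not the tool's actual defaults:

```cpp
// Illustrative thresholds (not the sampler's real defaults).
struct QualityThresholds {
    double min_brightness = 20.0;
    double max_brightness = 220.0;
    double min_sharpness  = 20.0;
    double min_entropy    = 2.5;
};

// A frame passes only if every gate passes; motion is excluded
// because it ranks candidates rather than rejecting them.
bool passes_quality_gate(double brightness, double sharpness,
                         double entropy, const QualityThresholds& t) {
    return brightness >= t.min_brightness
        && brightness <= t.max_brightness
        && sharpness  >= t.min_sharpness
        && entropy    >= t.min_entropy;
}
```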
## Tuning Strategy

### Step 1: Examine Distributions

Run `calibrate` on representative footage:

Look at percentiles:

- p5 (5th percentile): ~80% pass threshold
- p50 (median): ~50% pass threshold
- p95 (95th percentile): ~20% pass threshold
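For reference, a nearest-rank percentile over a metric distribution can be computed like this (one common convention; `calibrate` may use a different interpolation scheme):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Nearest-rank percentile (0 <= p <= 100) of a non-empty sample.
// Copies the input so the caller's ordering is preserved.
double percentile(std::vector<double> values, double p) {
    std::sort(values.begin(), values.end());
    std::size_t idx = static_cast<std::size_t>(
        p / 100.0 * (values.size() - 1) + 0.5);
    return values[idx];
}
```

Picking, say, the p5 of observed sharpness as `--min-sharpness` keeps roughly the top 95% of frames by that metric before the other gates are applied.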
### Step 2: Set Initial Thresholds
Conservative (high pass rate, ~70–80%):
Moderate (balanced, ~50–60%):
Strict (low pass rate, ~20–30%):
### Step 3: Iterate

1. Run `sample` with chosen thresholds
2. Visually inspect ~20–50 output frames
3. Adjust thresholds based on issues (see Troubleshooting)
4. Re-run (fast with caching)
## Advanced: Custom Metrics

To add a custom metric (e.g., contrast, color variance, corner density):

1. Define the metric function in `triton_sampling.cpp`
2. Add a field to `FrameRecord` in `triton_sampling.hpp`
3. Compute it in `sample_video_metrics()`
4. Update caching (JSON serialization)
5. Use it in the quality gate or interest score
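As a hypothetical example of step 1, an RMS-contrast metric (the standard deviation of grayscale values) could look like the following; the name and signature are illustrative, not the project's actual API:

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical custom metric: RMS contrast, the standard deviation of
// grayscale pixel values. A perfectly uniform frame scores 0.
double rms_contrast(const uint8_t* gray, std::size_t n) {
    if (n == 0) return 0.0;
    double mean = 0.0;
    for (std::size_t i = 0; i < n; ++i) mean += gray[i];
    mean /= n;
    double var = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        var += (gray[i] - mean) * (gray[i] - mean);
    return std::sqrt(var / n);
}
```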
See API Reference for complete example.
## Summary
| Metric | Purpose | Rejects | Typical Range |
|---|---|---|---|
| Brightness | Lighting quality | Dark/overexposed frames | 20–220 |
| Sharpness | Focus quality | Blurry/motion-blurred frames | 20–100 |
| Entropy | Information content | Featureless/uniform frames | 2.5–6.0 |
| Motion | Dynamic content | (Not a gate, used for ranking) | 0–30 |
All metrics are cheap to compute (~1–5 ms per frame @ 1920×1080).
## Next Steps
- Diversity Selection: How selected frames are chosen from quality-passed candidates
- Temporal Filtering: Enforcing time gaps between frames
- Tuning Guide: Practical parameter selection