This document describes the pipeline implemented in r3sizer-core.
1. Input decoding (r3sizer-io / load.rs)
1.5. Input color-space ingress (color_space.rs) -- experimental, optional
2. sRGB -> linear RGB (color.rs)
2a. Resize (staged shrink) (resize.rs / resize_strategy.rs)
-- bilinear pre-reduce when ratio >= 3x, then Lanczos3
2.5. Region classification (classifier.rs) -- ContentAdaptive only
3. Contrast leveling (contrast.rs) -- optional
4. Baseline measurement (metrics/)
4a. Base resize quality (base_quality.rs)
-- ringing always; edge/texture retention only in Full diagnostics
5. Detail precomputation D = input - blur(input) -- computed once
5a. TwoPass coarse scan (pipeline.rs)
-- early stop after 3+ probes when P0 bracket found
5b. Dense window selection (pipeline.rs)
5c. TwoPass dense refinement (pipeline.rs)
6. Cubic fit (fit.rs)
7. Root solving (solve.rs)
7b. LOO stability check (pipeline.rs)
7c. Robustness flags + fallback reason (pipeline.rs)
8. Final sharpening (sharpen.rs + color.rs)
8.1 Chroma guard override (chroma_guard.rs) -- optional
9. Measure final artifact ratio (metrics/)
9.5. Quality evaluator (evaluator.rs) -- experimental, optional
10. Clamp + output (pipeline.rs + color.rs)
11. Recommendations (recommendations.rs)
12. Save (r3sizer-io / save.rs)
r3sizer-io::load::load_as_linear opens the file via the image crate,
converts to Rgb8, normalises bytes to f32 [0, 1], and immediately applies
the sRGB -> linear transform. The returned LinearRgbImage is already in
linear light.
When params.input_color_space is set, color_space::prepare_input intercepts
the pixel data before any processing:
InputColorSpace |
Behaviour |
|---|---|
Srgb (default / None) |
No-op — IO layer already linearised. |
LinearRgb |
Validates range; records out_of_range_fraction if values exceed [0, 1]. |
RawLinear |
Normalises to [0, 1] by dividing by the per-image maximum; records normalization_scale. |
Diagnostics are stored in AutoSharpDiagnostics::input_ingress (InputIngressDiagnostics).
color::srgb_to_linear applies the IEC 61966-2-1 (sRGB standard) transfer
function piecewise:
v / 12.92 if v <= 0.04045
((v + 0.055) / 1.055) ^ 2.4 otherwise
All subsequent operations are performed in linear light so that averaging and filtering are physically correct.
resize::downscale performs Lanczos3 resampling via fast_image_resize
(SIMD-accelerated on x86-64). The function operates on the raw f32 linear
pixel buffer. No clamping is applied; the output remains in linear f32.
Staged shrink: for shrink ratios >= 3x (max of X and Y axes), a two-stage
path is used. A fast bilinear pre-reduce brings the image to approximately 2x
the target dimensions, then a final Lanczos3 pass produces the output. This
follows the same principle as libvips' gap parameter: the bilinear pass
cheaply removes the bulk of the high-frequency energy that would otherwise
alias through the Lanczos kernel, while the final Lanczos3 pass preserves
edge sharpness at the target scale. For ratios below 3x, a single Lanczos3
pass is used directly. The diagnostic field used_staged_shrink records
which path was taken.
Note: The specific resampling kernel used in the original paper is not confirmed. Lanczos3 is used as a high-quality standard choice.
When params.resize_strategy is set, resize_strategy::downscale_with_strategy
is called instead. Two variants are available:
ResizeStrategy::Uniform { kernel }— single kernel applied to the entire image. Kernels:Lanczos3,MitchellNetravali,CatmullRom,Gaussian.ResizeStrategy::ContentAdaptive { classification, kernel_table }— classifies the source image, assigns a kernel per region class, produces one downsample per kernel, then blends pixels according to the per-pixel class assignment.
Diagnostics are stored in AutoSharpDiagnostics::resize_strategy_diagnostics
(ResizeStrategyDiagnostics): which kernels were used and the pixel-count per kernel.
When params.sharpen_strategy == ContentAdaptive, classifier::classify runs a
four-pass algorithm on the downscaled image to assign each pixel a RegionClass:
| Pass | What it computes |
|---|---|
| 1 | CIE Y luminance extraction |
| 2 | Unnormalized 3×3 Sobel gradient magnitude (max ≈ 5.66 for luma in [0,1]) |
| 3 | Local variance in a configurable square window (default 5×5) |
| 4 | Per-pixel classification via priority rules (see below) |
g >= gradient_high && v >= variance_high → RiskyHaloZone
g >= gradient_high → StrongEdge
v >= variance_high && g < gradient_low → Microtexture
v >= variance_low || g >= gradient_low → Textured
else → Flat
Default ClassificationParams: gradient_low=0.05, gradient_high=0.40,
variance_low=0.001, variance_high=0.010, variance_window=5.
gain_map_from_region_map converts the RegionMap to a GainMap (per-pixel f32
multipliers) using the GainTable. The gain map is consumed in Stage 8.
Per-class pixel counts are stored in AutoSharpDiagnostics::region_coverage
(RegionCoverage).
contrast::apply_contrast_leveling is applied to the downscaled image when
enable_contrast_leveling = true.
Current implementation: per-channel 1st-99th percentile stretch.
Status: placeholder. The paper-exact formula is not yet known.
Before any sharpening, the artifact ratio of the downscaled base image is
measured using the configured ArtifactMetric:
baseline_artifact_ratio = measure(base) // dispatches on ArtifactMetric
This captures any out-of-range values introduced by the resize stage (e.g. Lanczos ringing) independently of sharpening.
The baseline is used by MetricMode::RelativeToBase to isolate
sharpening-induced artifacts from resize-induced artifacts.
When params.evaluation_color_space is set, the measurement uses
chroma_guard::evaluate_in_color_space instead of the standard metric.
Immediately after the resize stage, base_quality::score_base_resize assesses
the downscaled image:
- Ringing score (always computed, no-reference) — sign-flip oscillation
near edges on the resized luma. Drives
envelope_scale:effective_p0 = target_artifact_ratio * envelope_scale. - Edge retention and texture retention (reference-aware, diagnostic-only in v1) — per-pixel Sobel energy ratio and local-variance ratio between the source and resized images. These require expensive O(W*H) Sobel and variance passes on the full-resolution source image.
Performance note: source-side edge/texture retention is computed only when
diagnostics_level == Full. In Summary mode (the default, and what
Fast/Balanced pipeline modes use), these fields are skipped to avoid the
O(W*H) Sobel + local-variance cost on the original input.
For each strength s_i in the probe set, sharpening is applied and artifacts
are measured.
Before the probe loop begins, the detail signal D = input - blur(input) is
computed once from the base image (or base luminance in Lightness mode) using
the configured Gaussian kernel. Each probe then applies sharpening as a
multiply-add output = input + s * D, avoiding a redundant blur per probe.
This collapses the per-probe cost from O(blur) + O(metric) to
O(multiply-add) + O(metric), since blur dominates at typical image sizes.
The default probing strategy uses a two-pass adaptive scheme:
-
Coarse scan —
coarse_countstrengths (default 7) are evaluated at uniform spacing over[coarse_min, coarse_max]. Early stopping: after the first 3 probes, the coarse scan exits as soon as a P0 crossing bracket is found (i.e.metric_value[i] <= P0 && metric_value[i+1] > P0). This typically saves 2-4 probes at the tail of the range. -
Dense refinement — the crossing bracket from the coarse pass is expanded by
window_margin(default 0.5x the bracket width) on each side, thendense_count(default 4) probes are evaluated uniformly within that window. -
Merge — coarse and dense samples are combined, sorted by strength, and near-duplicates (within 1e-5) are removed before fitting.
Both phases share the precomputed detail signal D, so the total cost is
1 x blur + (coarse_used + dense_count) x (multiply-add + metric).
Diagnostics for the two-pass scheme are stored in
AutoSharpDiagnostics::probe_pass_diagnostics (ProbePassDiagnostics),
including the dense window bounds and how many coarse probes were actually
evaluated before early stopping.
For static probe configurations (ProbeConfig::Explicit or
ProbeConfig::Range), all strengths are evaluated in a single pass using
the same detail-precomputation optimization.
SharpenMode::Rgb (legacy):
sharpen::unsharp_mask(base, s_i, sigma)-- unsharp mask on all RGB channels.- Measure artifact ratio on the sharpened image.
SharpenMode::Lightness (default):
color::extract_luminance(base)-- CIE Y luminance:L = 0.2126R + 0.7152G + 0.0722B.- Sharpen luminance only via unsharp mask.
color::reconstruct_rgb_from_lightness(base, sharpened_L)-- multiplicative reconstruction:k = L'/L; R' = k*R, G' = k*G, B' = k*B(with epsilon guard for L near 0).- Measure artifact ratio on the reconstructed RGB image.
The luminance is extracted once before the probe loop and reused across all probes for efficiency.
Each probe produces a raw artifact_ratio (P_total) and a metric_value
that depends on MetricMode:
AbsoluteTotal:metric_value = P_total(s).RelativeToBase(default):metric_value = max(0, P_total(s) - baseline).
The metric_value is what gets fitted and compared against P0.
Engineering proxy — the paper describes the target constraint in terms of the
"fraction of color values outside the valid RGB gamut". Two interpretations are
implemented, selectable via ArtifactMetric:
ChannelClippingRatio (default) — per-channel fraction:
P(s) = (count of channel values v where v < 0.0 or v > 1.0)
-------------------------------------------------------
total channel values = W x H x 3
PixelOutOfGamutRatio — per-pixel fraction:
P(s) = (count of pixels where any channel v < 0.0 or v > 1.0)
--------------------------------------------------------
total pixels = W x H
In both cases, values exactly equal to 0.0 or 1.0 are not counted as artifacts.
output[i] = input[i] + amount * (input[i] - gaussian_blur(input, sigma)[i])
amount is the probe strength s. Values intentionally go outside [0,1].
[0.05, 0.1, 0.2, 0.4, 0.8, 1.5, 3.0]
Non-uniform, denser near zero where the crossing typically occurs. Configurable
via ProbeConfig::Explicit or ProbeConfig::Range.
Each probe produces a MetricBreakdown via metrics::compute_metric_breakdown.
All four components are computed from real signal (none are stubs):
| Component | Method | What it measures |
|---|---|---|
GamutExcursion |
Dispatches on ArtifactMetric |
Fraction of channel values outside [0, 1] |
HaloRinging |
Cross-edge profiles (Sobel + bilinear sampling) | Fraction of edge profiles with ≥2 sign changes in the diff |
EdgeOvershoot |
Same profiles | Mean of max(0, peak_excursion / gradient_magnitude - 1) |
TextureFlattening |
Local 5×5 variance comparison | Mean ` |
The selection_score field equals GamutExcursion and is what the solver fits.
composite_score is a weighted sum (MetricWeights) available for diagnostics
but does not drive s* selection in v0.2.
aggregate is a deprecated alias for selection_score kept for JSON backward
compatibility.
Each ProbeSample stores its breakdown: Option<MetricBreakdown>. In
DiagnosticsLevel::Summary mode (default) per-probe breakdowns are stripped
from the serialized output to reduce JSON size.
fit::fit_cubic_with_quality fits a cubic via Vandermonde normal equations in f64 and
returns both the polynomial and a FitQuality record:
P_hat(s) = a*s^3 + b*s^2 + c*s + d
The function takes generic (x, y) data points. In RelativeToBase mode, the
pipeline prepends a known anchor point (0.0, 0.0) -- since zero sharpening
produces zero added artifacts by definition. This constrains the fit to pass
near the origin.
The 4x4 normal equations (V^T V)*c = V^T*P are solved using Gaussian
elimination with partial pivoting.
Requires >= 4 data points. Returns CoreError::FitFailed if the matrix is
numerically singular (pivot < 1e-14).
After a successful fit, the following quality metrics are computed:
| Field | Description |
|---|---|
r_squared |
Coefficient of determination; 1.0 = perfect fit |
residual_sum_of_squares |
Sum of squared residuals between predicted and actual values |
max_residual |
Largest absolute residual across all data points |
min_pivot |
Smallest pivot magnitude during Gaussian elimination (condition proxy) |
These are stored in diagnostics.fit_quality (None when fit is skipped or fails).
solve::find_sharpness solves P_hat(s*) = P0 via Cardano's formula
(depressed cubic via substitution s = t - b/(3a)):
- Discriminant > 0: three distinct real roots (trigonometric method).
- Discriminant = 0: repeated root (special case: triple root when p = q = 0).
- Discriminant < 0: one real root (Cardano).
Root selection: the largest root in [s_min, s_max] is chosen, maximising
sharpness while staying within the artifact budget.
Fallback (when no algebraic root is in range):
- Among probe samples with
metric_value <= P0, pick the largest strength. - If none qualify, pick the sample with the smallest
metric_value.
Budget reachability: The pipeline checks whether the target is structurally
achievable. In AbsoluteTotal mode, if baseline > P0, the budget is marked
unreachable before the solver even runs.
The solver reports how the strength was chosen via SelectionMode:
| Mode | Meaning |
|---|---|
PolynomialRoot |
s* found from the cubic polynomial root |
BestSampleWithinBudget |
No polynomial root; largest sample within P0 |
LeastBadSample |
All samples exceed P0; picked minimum metric value |
BudgetUnreachable |
Budget structurally unreachable (e.g. baseline > P0) |
After fitting and before solving, several robustness checks are performed. Results
are stored in diagnostics.robustness as RobustnessFlags:
fit::check_monotonicity scans the sorted probe samples and counts inversions
(cases where metric_value[i+1] < metric_value[i]).
| Flag | Meaning |
|---|---|
monotonic |
Zero inversions — P(s) is non-decreasing |
quasi_monotonic |
At most one inversion |
The pipeline refits the cubic N times, each time dropping one probe sample, re-solves for the root, and measures the maximum relative change in s*:
max_loo_root_change = max_i |s*_full - s*_drop_i| / s*_full
loo_stable = true when max_loo_root_change < 0.25 (25% relative shift).
N=7 refits of a 4x4 system is negligible cost.
| Flag | Threshold |
|---|---|
r_squared_ok |
R² > 0.85 |
well_conditioned |
min_pivot > 1e-8 |
If robustness checks indicate unreliable fitting (non-monotonic data or LOO
instability), the pipeline falls back to direct search and records the appropriate
FallbackReason (see below).
When the pipeline does not use the polynomial root (i.e. selection_mode != PolynomialRoot),
a typed FallbackReason is recorded in diagnostics.fallback_reason:
| Reason | Trigger |
|---|---|
BudgetTooStrictForContent |
Budget structurally unreachable (baseline > P0) |
DirectSearchConfigured |
FitStrategy::DirectSearch selected by caller |
FitFailed |
Polynomial fit returned an error |
MetricNonMonotonic |
Probe data is not monotonically increasing |
FitUnstable |
LOO cross-validation shows unstable root |
RootOutOfRange |
Fit succeeded but no root in [s_min, s_max] |
Priority order: first matching reason wins (listed in evaluation order above).
When selection_mode == PolynomialRoot, fallback_reason is None.
The selected strength s* is applied once more using the same sharpening mode
(RGB or lightness) as the probe stage.
Standard unsharp mask at s* — same as a single probe step.
The gain map computed in Stage 2.5 modulates the effective strength per pixel:
effective_strength(x, y) = s* * gain_map(x, y). In lightness mode, the
detail signal (luma - gaussian_blur(luma)) is computed once and applied via
sharpen::apply_adaptive_lightness_from_detail.
Backoff loop: if the adaptive result exceeds the artifact budget P0, the
pipeline iteratively scales down the global strength by backoff_scale_factor
(default 0.8) for up to max_backoff_iterations (default 4) attempts. The
outcome is reported via AdaptiveValidationOutcome:
| Outcome | Meaning |
|---|---|
PassedDirect |
Budget met on first try |
PassedAfterBackoff |
Budget met after N backoff iterations |
FailedBudgetExceeded |
Best result after all iterations still exceeds budget |
Stored in AutoSharpDiagnostics::adaptive_validation.
When params.experimental_sharpen_mode = Some(LumaPlusChromaGuard { max_chroma_shift })
(the default since v0.5), chroma_guard::sharpen_with_chroma_guard replaces the
output from Stage 8.
The chroma guard sharpens luminance, then monitors the per-pixel chroma shift
(in Cb/Cr space). Where the shift exceeds max_chroma_shift (default 10%), a
soft clamp brings chroma back toward the original value.
Diagnostics are stored in AutoSharpDiagnostics::chroma_guard (ChromaGuardDiagnostics):
| Field | Description |
|---|---|
pixels_clamped_fraction |
Fraction of pixels where chroma clamping was applied |
mean_chroma_shift |
Mean chroma shift magnitude across all pixels |
max_chroma_shift |
Maximum chroma shift magnitude |
Note: This stage is on by default since v0.5. To disable, set
experimental_sharpen_mode: None explicitly.
The artifact ratio on the final sharpened image (pre-clamp) is measured and recorded:
measured_artifact_ratio— P_total(s*) raw.measured_metric_value— mode-adjusted (relative or absolute).metric_components— fullMetricBreakdownof the final output.
When params.evaluator_config = Some(Heuristic) (the default since v0.5),
evaluator::HeuristicEvaluator runs after final sharpening and produces an
advisory quality prediction. It does not change the selected s*.
The evaluator extracts ImageFeatures from the sharpened image:
| Feature | Description |
|---|---|
edge_density |
Fraction of pixels with Sobel magnitude > threshold |
mean_gradient_magnitude |
Mean Sobel magnitude across all pixels |
gradient_variance |
Variance of gradient magnitudes |
mean_local_variance |
Mean 5×5 local variance |
local_variance_variance |
Variance of local variances (texture heterogeneity) |
laplacian_variance |
Variance of the Laplacian response (frequency content proxy) |
luminance_histogram_entropy |
Shannon entropy of 64-bin luminance histogram |
Output is stored in AutoSharpDiagnostics::evaluator_result (QualityEvaluation):
predicted_quality_score, confidence, optional suggested_strength, and the raw
features. All values are advisory.
Note: On by default since v0.5. To disable, set evaluator_config: None.
ClampPolicy::Clamp (default): every channel is clamped to [0, 1].
ClampPolicy::Normalize: image is divided by its maximum value.
color::image_linear_to_srgb applies the inverse IEC 61966-2-1 function.
The result is scaled to [0, 255] u8 and written to disk.
recommendations::generate_recommendations inspects the completed
AutoSharpDiagnostics and emits a list of Recommendation objects.
Each recommendation contains:
kind(RecommendationKind): what type of change is suggested (e.g.SwitchToContentAdaptive,LowerStrongEdgeGain,RaiseArtifactBudget,SwitchToLightness,WidenProbeRange,LowerSigma).severity(Severity):Info,Suggestion, orWarning.confidence: float in [0, 1].reason: human-readable explanation.patch(ParamPatch): self-contained partial parameter update the UI can apply and re-run.
Stored in AutoSharpDiagnostics::recommendations.
Every pipeline invocation records wall-clock microsecond timing via StageTiming:
| Field | What it measures |
|---|---|
resize_us |
Lanczos3 or strategy-based downscale (includes staged shrink when applicable) |
contrast_us |
Contrast leveling (0 when disabled) |
baseline_us |
Baseline artifact measurement |
base_quality_us |
Base resize quality scoring (ringing + optional edge/texture retention) |
probing_us |
Entire probe loop (detail precomputation + all probe strengths) |
fit_us |
Cubic polynomial fitting + solve |
robustness_us |
Monotonicity + LOO stability checks |
final_sharpen_us |
Final sharpening at selected s* (including chroma guard) |
clamp_us |
Clamping + final measurement |
total_us |
End-to-end pipeline time |
classification_us |
Region classification (None when Uniform) |
adaptive_validation_us |
Adaptive backoff (None when Uniform) |
ingress_us |
Input color-space ingress (None when not configured) |
evaluator_us |
Quality evaluator (None when not configured) |
Timing is always collected (no feature flag needed).
Three runtime performance-quality tradeoff modes are available. Each mode
adjusts probe budgets, adaptive complexity, and diagnostic depth while
preserving user-chosen P0, sigma, target dimensions, metric, and sharpen mode.
Apply via PipelineMode::apply before passing params to the pipeline (or set
params.pipeline_mode and call params.resolved()).
Minimal probing for lowest latency. Trades some quality on complex images for speed.
| Setting | Value |
|---|---|
| TwoPass coarse/dense | 4 / 2 |
| Sharpen strategy | Uniform (no classification, no gain map) |
| Chroma guard | Disabled |
| Evaluator | Disabled |
Current default behaviour — no overrides. Content-adaptive sharpening, two-pass probing (7 coarse / 4 dense), chroma guard, heuristic evaluator.
Extended probing with a wider dense window and full diagnostics.
| Setting | Value |
|---|---|
| TwoPass coarse/dense | 9 / 6 |
| TwoPass window margin | 0.7 |
| Sharpen strategy | ContentAdaptive (ensured; up to 6 backoff iterations) |
| Chroma guard | Ensured on |
| Evaluator | Ensured on |
AutoSharpDiagnostics (serialisable to JSON) contains:
| Field | Description |
|---|---|
input_size / output_size |
Image dimensions |
sharpen_mode |
Rgb or Lightness |
metric_mode |
AbsoluteTotal or RelativeToBase |
artifact_metric |
ChannelClippingRatio or PixelOutOfGamutRatio |
target_artifact_ratio |
P0 |
baseline_artifact_ratio |
P(base) before sharpening |
probe_samples |
All (s_i, P_total(s_i), metric_value(s_i), breakdown) tuples |
fit_status |
Success, Failed { reason }, or Skipped |
fit_coefficients |
Cubic [a, b, c, d] if fit succeeded |
fit_quality |
FitQuality (R², RSS, max residual, min pivot) or None |
crossing_status |
Found, NotFoundInRange, or NotAttempted |
robustness |
RobustnessFlags (monotonicity, LOO, condition) or None |
selected_strength |
s* applied |
selection_mode |
How s* was chosen (see Selection modes above) |
fallback_reason |
FallbackReason enum or None (see Stage 7c) |
budget_reachable |
Whether the target P0 is achievable |
measured_artifact_ratio |
P_total(s*) after final sharpen |
measured_metric_value |
Mode-adjusted metric of the final output |
metric_components |
MetricBreakdown of the final output (all four components) |
metric_weights |
MetricWeights used for composite score computation |
region_coverage |
Per-class pixel counts (None when Uniform) |
adaptive_validation |
AdaptiveValidationOutcome (None when Uniform) |
timing |
StageTiming — per-stage microsecond wall-clock times |
input_ingress |
InputIngressDiagnostics (None when not configured) |
resize_strategy_diagnostics |
Kernels used (None when default Lanczos3) |
chroma_guard |
ChromaGuardDiagnostics (None when not configured) |
evaluator_result |
QualityEvaluation advisory result (None when not configured) |
recommendations |
List of Recommendation objects (empty when none triggered) |
AutoSharpParams::default() is the canonical starting point. Notable defaults:
| Parameter | Default |
|---|---|
probe_strengths |
[0.05, 0.1, 0.2, 0.4, 0.8, 1.5, 3.0] |
target_artifact_ratio |
0.001 (0.1%) |
sharpen_sigma |
1.0 |
sharpen_mode |
Lightness |
metric_mode |
RelativeToBase |
artifact_metric |
ChannelClippingRatio |
fit_strategy |
Cubic |
output_clamp |
Clamp |
sharpen_strategy |
Uniform |
diagnostics_level |
Summary |
experimental_sharpen_mode |
Some(LumaPlusChromaGuard { max_chroma_shift: 0.10 }) (on by default) |
evaluator_config |
Some(Heuristic) (on by default) |
input_color_space |
None (Srgb) |
resize_strategy |
None (Lanczos3) |
evaluation_color_space |
None (Rgb) |
The last two rows (chroma guard and evaluator) changed in v0.5 — callers that need
the minimal baseline should set both to None explicitly.