Skip to content

test(rasterize): close public-API coverage gaps (deep-sweep pass 1)#2022

Merged
brendancol merged 3 commits into
mainfrom
deep-sweep-test-coverage-rasterize-2026-05-17
May 18, 2026
Merged

test(rasterize): close public-API coverage gaps (deep-sweep pass 1)#2022
brendancol merged 3 commits into
mainfrom
deep-sweep-test-coverage-rasterize-2026-05-17

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Closes documented but untested public-API surface in xrspatial.rasterize flagged by the test-coverage sweep on 2026-05-17. Adds 34 tests in xrspatial/tests/test_rasterize_coverage_2026_05_17.py. Sweep is test-only; no source files modified.

Coverage gaps closed:

  • Cat 3 HIGH -- 1x1 single-pixel raster across all four backends. test_rasterize.py covers 1xN strips and Nx1 strips but never the width=1 AND height=1 case, so the polygon scanline, line Bresenham, and point-burn kernels ship without the single-cell degenerate case.
  • Cat 4 HIGH -- like= template-raster parameter (rasterize.py:2038) is implemented by _extract_grid_from_like (line 1930) but no test exercises it. Pins dtype/bounds/coords inheritance, the three override branches (dtype, bounds, width/height), the three validation branches (not-DataArray, 3D, wrong dim names), and like= on all four backends. Mutation against rasterize.py:2183-2184 flipped the inheritance test red.
  • Cat 4 HIGH -- resolution= happy path. Only the oversize-rejection error path was tested (test_oversize_resolution_rejected); pins the scalar branch, the tuple branch, the ceil-and-clamp-to-1 semantics, and resolution= on all four backends.
  • Cat 4 HIGH -- non-empty GeometryCollection unpacking is documented at rasterize.py:1995 and implemented by _classify_geometries_loop (line 228) but only the empty-GC case was tested. Pins polygon+point and polygon+line+point collections plus parity across cupy / dask+numpy / dask+cupy.
  • Cat 1 MEDIUM -- eager cupy all_touched=True parity vs eager numpy. The existing parity test only covered dask+cupy all_touched, leaving the direct GPU all_touched kernel untested.
  • Cat 2 MEDIUM -- int32 dtype with default NaN fill silently casts to the int32-min sentinel. Pin the cast so any future raises-instead switch is visible as a code-review diff.

Test plan

  • All 34 new tests pass locally on a CUDA host (numpy + cupy + dask+numpy + dask+cupy).
  • Pre-existing 143 passing + 2 skipped tests in test_rasterize.py remain green.
  • Mutation against the like= dtype inheritance branch (elif like_dtype is not None: final_dtype = like_dtype) flips the inheritance test red, confirming the new coverage catches a real regression.
  • CI runs the cupy / dask+cupy branches on the GPU host (locally exercised but CI re-validates).

Add 34 tests in test_rasterize_coverage_2026_05_17.py closing documented
but untested surface flagged by the test-coverage sweep on 2026-05-17.

- Cat 3 HIGH: 1x1 single-pixel raster across all four backends.
  test_rasterize.py covers 1xN strips and Nx1 strips but never the
  width=1 AND height=1 case, so the polygon scanline, line Bresenham,
  and point-burn kernels ship without the single-cell degenerate case.
- Cat 4 HIGH: like= template-raster parameter (rasterize.py:2038) is
  implemented by _extract_grid_from_like (line 1930) but no test
  exercises it. Pins dtype/bounds/coords inheritance, the three
  override branches, the three validation branches, and like= on all
  four backends. Mutation against rasterize.py:2183-2184 flipped the
  inheritance test red.
- Cat 4 HIGH: resolution= happy path. Only the oversize-rejection
  error path was tested; pins the scalar branch, the tuple branch,
  the ceil-and-clamp-to-1 semantics, and resolution= on all four
  backends.
- Cat 4 HIGH: non-empty GeometryCollection unpacking is documented at
  rasterize.py:1995 and implemented by _classify_geometries_loop
  (line 228) but only the empty-GC case was tested. Pins polygon+point
  and polygon+line+point collections plus parity across cupy /
  dask+numpy / dask+cupy.
- Cat 1 MEDIUM: eager cupy all_touched=True parity vs eager numpy.
  The existing parity test only covered dask+cupy all_touched.
- Cat 2 MEDIUM: int32 dtype with default NaN fill silently casts to
  the int32-min sentinel. Pin the cast so any future raises-instead
  switch is visible as a code-review diff.

Sweep is test-only: no source files modified. Pre-existing 143 passing
plus 2 skipped tests in test_rasterize.py remain untouched.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 18, 2026
The int32-default-NaN-fill test pinned a literal INT32_MIN sentinel.
That value is platform-dependent: x86 yields INT32_MIN, Apple Silicon
yields 0.  Both are unspecified by C and by numpy, so the test now
derives the expected sentinel from a runtime numpy cast and asserts
rasterize emits the same value.  CI on macos-latest, 3.14 was failing
because the value there is 0, not INT32_MIN.
@brendancol
Copy link
Copy Markdown
Contributor Author

PR Review: test(rasterize): close public-API coverage gaps (deep-sweep pass 1)

Reviewed locally on the worktree. This is a true test-only PR (only xrspatial/tests/test_rasterize_coverage_2026_05_17.py plus the sweep bookkeeping CSV). All 34 new tests pass in 1.84s on a CUDA host. Backend matrix is honest: every gap that documents a parity claim runs on numpy / cupy / dask+numpy / dask+cupy.

Blockers

None.

Suggestions

  • test_polygon_eager_cupy (line 134) and test_polygon_dask_numpy (line 145) anchor the cupy / dask 1x1 result against an eager-numpy run recomputed inside the test rather than the literal 7.0. If both backends regressed in the same direction (e.g. the polygon kernel wrote fill instead of the value on a 1x1 grid), the parity tests would silently pass. Add a positive value pin: assert _as_numpy(cp_r)[0, 0] == 7.0. Same comment applies to test_polygon_dask_cupy (line 157) and the resolution / GeometryCollection parity tests at lines 340-372 and 430-460.

  • test_like_bounds_override (line 223-234) only asserts r.coords['x'].values.min() >= 0.0 and <= 2.0. With template width=6 and overridden bounds 0..2, expected x centres are [1, 3, 5, 7, 9, 11] / 6. A regression that placed coords at e.g. [0.1, 0.2, ..., 0.6] would still pass. Replace the range bounds with np.testing.assert_allclose(r.coords['x'].values, np.array([1, 3, 5, 7, 9, 11]) / 6.).

  • TestIntegerDtypeNanFill only runs on numpy. The cast np.full(..., nan).astype(int) is allocator-side and the cupy / dask paths allocate their own backing arrays (cupy.full(...).astype(int) differs from numpy's behaviour on some CUDA versions). Either parametrize across backends, or note in the class docstring that this is a numpy-only pin.

Nits

  • import dask.array as da # noqa: F401 (line 62) is only used for the import-availability probe. import dask is enough and avoids the noqa.

  • test_like_rejects_3d (line 283) and test_like_rejects_wrong_dim_names (line 289) hit the same if like.ndim != 2 or 'y' not in like.dims or 'x' not in like.dims: branch. Coverage is equivalent across them. Worth a one-line note in the class docstring that they target the same raise for documentation purposes.

  • test_polygon_line_point_in_collection (line 412) uses a horizontal line at y=5.5, x in [5.5, 9.5]. The comment cites Bresenham, but a horizontal line is the trivial degenerate of the algorithm. If the test is meant to exercise Bresenham proper, prefer a diagonal line.

  • Test naming drift: test_collection_eager_cupy_matches_numpy vs test_polygon_eager_cupy. The explicit _matches_numpy suffix is clearer. Optional: apply it consistently to cross-backend parity tests.

  • The module header (lines 9-29) cites source-file line numbers (rasterize.py:2038 etc.) which drift as the source evolves. Function-name-only references (_extract_grid_from_like, _classify_geometries_loop) stay valid longer.

What looks good

  • Real cross-backend coverage on every HIGH gap. Cupy and dask paths round-trip through actual kernel paths, not just construction smoke tests.
  • Validation-branch coverage on _extract_grid_from_like (TypeError, ValueError x2) is complete.
  • TestIntegerDtypeNanFill derives the sentinel from numpy itself rather than hard-coding INT32_MIN. Portable across x86 / Apple Silicon (follow-up commit 3ee15bc makes this explicit).
  • Test isolation: no tmpdir, no fixture leakage, no parametrize-ID collisions.
  • Runtime: 1.84s for 34 tests. No slow marker needed.
  • No production code touched. Sweep is genuinely test-only.

Checklist

  • All implemented backends produce consistent results (numpy / cupy / dask / dask+cupy)
  • NaN handling pinned for integer-dtype edge case
  • Edge cases covered (1x1 raster, empty GC already covered elsewhere, non-empty GC newly covered)
  • Dask chunk boundaries handled (chunks=(1,1) and (2,2) variants)
  • No premature materialization
  • Tests deterministic
  • Test runtime acceptable
  • Cross-backend pin on integer-dtype-NaN-cast (suggestion above)
  • Positive value pins on cupy / dask parity tests (suggestion above)

Pin absolute burn values on cupy / dask / dask+cupy parity tests so a
co-regression on both backends can't slip past a pure parity check.
Tighten test_like_bounds_override to compare expected cell centres
instead of just value ranges, and swap the GeometryCollection
polygon+line+point line for a 45-degree diagonal so Bresenham steps
in both axes. Document TestIntegerDtypeNanFill as numpy-only.
Plus rename parity tests with a _matches_numpy suffix, cross-reference
the two like= validation tests (same compound guard), trim brittle
source-file line numbers from the module docstring, and drop the
unused dask.array as da alias.
@brendancol brendancol merged commit bdfb286 into main May 18, 2026
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant