Description
Author of Proposal: @brendan
Reason or Problem
_follow traces every polygon boundary twice -- once to count vertices, then again to fill the coordinate array. This doubles boundary-tracing cost, which is already the dominant phase for rasters with many small regions.
The dask chunk-merge path runs _point_in_ring, _simplify_ring, and _signed_ring_area as plain Python loops. On chunked rasters with many boundary-crossing polygons, these show up in profiles.
_to_geopandas builds Polygon objects one at a time. Shapely 2.0 has a batch polygons() function (compiled C) that does this faster.
Proposal
Design:
No API changes -- these are all internal.
Rewrite _follow to trace in a single pass using a dynamically grown numpy buffer. Start at 64 points, double the capacity when full, trim after the loop. Doubling means only O(log n) reallocations and O(n) total copying -- cheap next to a full second traversal of the boundary.
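As a rough sketch of the grow-and-trim pattern (append_point and trace are hypothetical stand-ins, not datashader's actual _follow, which walks raster cells rather than consuming an iterable):

```python
import numpy as np

def append_point(buf, n, x, y):
    """Write (x, y) at row n, doubling the buffer first when it is full."""
    if n == len(buf):
        grown = np.empty((2 * len(buf), 2), dtype=buf.dtype)
        grown[:n] = buf  # one O(n) copy per doubling
        buf = grown
    buf[n, 0] = x
    buf[n, 1] = y
    return buf, n + 1

def trace(points):
    """Single-pass collection: start at 64 rows, double on demand, trim at the end."""
    buf = np.empty((64, 2), dtype=np.float64)
    n = 0
    for x, y in points:
        buf, n = append_point(buf, n, x, y)
    return buf[:n].copy()  # trim to the vertices actually written
```

The trailing `.copy()` releases the oversized backing array once tracing is done.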
Add @ngjit to _point_in_ring, _simplify_ring, and _signed_ring_area. Numeric loops over numpy arrays, no Python objects -- numba handles them fine.
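To illustrate why these helpers JIT cleanly, here is a sketch of two of them as tight numeric loops -- the names mirror the helpers but are simplified stand-ins, and the njit fallback keeps the sketch runnable even without numba installed:

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fall back to plain Python if numba is absent
    def njit(*args, **kwargs):
        if args and callable(args[0]):
            return args[0]
        return lambda func: func

@njit
def point_in_ring(x, y, xs, ys):
    """Ray-casting point-in-polygon test over ring vertex arrays."""
    inside = False
    j = len(xs) - 1
    for i in range(len(xs)):
        if (ys[i] > y) != (ys[j] > y):
            x_cross = (xs[j] - xs[i]) * (y - ys[i]) / (ys[j] - ys[i]) + xs[i]
            if x < x_cross:
                inside = not inside
        j = i
    return inside

@njit
def signed_ring_area(xs, ys):
    """Shoelace formula; positive for counter-clockwise rings."""
    area = 0.0
    j = len(xs) - 1
    for i in range(len(xs)):
        area += xs[j] * ys[i] - xs[i] * ys[j]
        j = i
    return 0.5 * area
```

Both functions touch only scalars and contiguous float arrays, which is exactly the shape of code numba compiles without object-mode fallbacks.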
In _to_geopandas, batch-construct hole-free polygons with shapely.polygons(). Polygons with holes keep using the scalar constructor. Shapely < 2.0 gets the current code path via a version check.
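The version gate can be a module-level flag checked once at import -- HAS_BATCH_POLYGONS and build_exteriors below are hypothetical names for a sketch of the dispatch, not the actual _to_geopandas code:

```python
def _shapely_major_version():
    """Return shapely's major version, or 0 if shapely is not installed."""
    try:
        import shapely
    except ImportError:
        return 0
    return int(shapely.__version__.split(".")[0])

HAS_BATCH_POLYGONS = _shapely_major_version() >= 2

def build_exteriors(rings):
    """Construct hole-free Polygons from a list of (n, 2) coordinate arrays."""
    if HAS_BATCH_POLYGONS:
        import shapely
        # batch construction through shapely 2.0's compiled creation functions
        return list(shapely.polygons([shapely.linearrings(r) for r in rings]))
    # scalar fallback for shapely < 2.0
    from shapely.geometry import Polygon
    return [Polygon(r) for r in rings]
```

A further refinement, not shown here, would pass flat coordinate and index arrays to shapely.linearrings so ring creation is also a single call.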
Usage:
No user-facing changes. Existing polygonize() calls get faster.
Value:
The tracing rewrite speeds up all four backends. JIT-compiling the merge helpers mainly benefits dask, where boundary merging can consume a substantial fraction of total runtime. Batched shapely construction speeds up geopandas output.
Stakeholders and Impacts
Internal only. The existing benchmark suite (benchmarks/benchmarks/polygonize.py) covers before/after measurement.
Drawbacks
The single-pass buffer growth in _follow is harder to read than the current two-pass logic. The shapely batch path needs shapely >= 2.0 (Dec 2022), gated by a runtime check.
Alternatives
Vertex counts could be pre-computed from region perimeters, but that is itself another pass over the data. Numba typed lists would avoid manual buffer management, yet their per-append overhead is worse than doubling a flat numpy array.
Unresolved Questions
None.