Changelog
[3.4.0] - 2026-07-01
Added
InOutStabilizer (polar_high.decomposition): a domain-agnostic in-out separation point picker (Ben-Ameur & Neto 2007) for cutting-plane / Benders drivers. Cutting-plane methods that generate each cut at the raw master vertex f_out tail off badly when the recourse is flat in the coupling variable (dual-degenerate slopes): the master wanders among cost-equivalent vertices and the bound closes very slowly. The stabilizer instead returns an interior separation point f_sep = λ·centre + (1-λ)·f_out at which to generate the cut, giving better-centred cuts, no wandering, and faster bound closure at zero extra subproblem solves. It is domain-free (operates only on {col_id -> value} point dicts and scalar weights) and is constructed ONE PER REGION. λ = 0.0 is a verbatim no-op (returns its input unchanged, so byte-parity with exact Benders holds by construction). Convergence guarantee: the moment a region's cut fails to separate its f_out, the next separation_point returns the master point verbatim (λ = 0 ⇒ exact Benders), and the null-step weight-shrink bottoms out at a forced 0 rather than a positive floor. Stateful, deterministic, side-effect-free.
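The separation-point formula above can be sketched on plain {col_id -> value} dicts. This is an illustration only, not the InOutStabilizer implementation: the function name in_out_point is hypothetical, and falling back to the master value for a column missing from the centre is an assumption made here.

```python
from typing import Dict

# {col_id -> value}: the point shape the entry above describes.
Point = Dict[int, float]


def in_out_point(centre: Point, f_out: Point, lam: float) -> Point:
    """Return lam * centre + (1 - lam) * f_out, column by column.

    Hypothetical helper for illustration; not the library's API.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if lam == 0.0:
        # Exact Benders: hand the master point back unchanged.
        return f_out
    return {
        # Assumption: a column absent from the centre keeps its master value.
        col: lam * centre.get(col, value) + (1.0 - lam) * value
        for col, value in f_out.items()
    }
```

With lam = 0.0 the input object itself is returned, which mirrors the verbatim no-op described in the entry.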
[3.3.0] - 2026-07-01
Added
StallMonitor / StallVerdict (polar_high.decomposition): a domain-agnostic tail-off / stall detector for cutting-plane / Benders drivers (the outer loop that consumes WarmProblem.add_cut_row / add_recourse_col). It notices when the outer iteration has stopped making progress and lets the driver bail out with a diagnostic instead of silently burning the iteration cap. The monitor knows only the two scalars every Benders-style loop already maintains (a lower bound and a best-so-far upper bound) plus one caller-supplied reference_scale (a "sane objective magnitude" the driver computes from its own problem); it carries no domain concepts. A run is declared stalled only when a CONJUNCTION holds over a full trailing window: the relative gap exceeds gap_floor (far from converged), the best upper bound has not improved by more than min_rel (incumbent frozen), and the best upper bound is still above blowup_mult * reference_scale (frozen far above any sane magnitude, i.e. the penalty / complete-recourse regime). The conjunction is what separates a genuine tail-off from the benign frozen windows a converging run exhibits (early blow-up that shrinks fast, short flat stretches at a sane magnitude). Stateful (bounded deque), deterministic, and side-effect-free: it only reports.
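A minimal sketch of the three-way conjunction, assuming one plausible reading of the entry above. The class StallSketch is hypothetical and is not the library's StallMonitor; the relative-gap formula and the way each test is evaluated across the window are assumptions.

```python
from collections import deque


class StallSketch:
    """Hypothetical stand-in for illustration; not polar_high's StallMonitor."""

    def __init__(self, window, gap_floor, min_rel, blowup_mult, reference_scale):
        self.window = window
        self.gap_floor = gap_floor
        self.min_rel = min_rel
        self.ceiling = blowup_mult * reference_scale
        self.history = deque(maxlen=window)  # bounded: (lower, best_upper) pairs

    @staticmethod
    def _rel(numerator, reference):
        # Assumed normalisation: divide by |reference|, guarded against zero.
        return numerator / max(abs(reference), 1e-12)

    def update(self, lower, best_upper):
        """Record one outer iteration; True only when all three tests hold."""
        self.history.append((lower, best_upper))
        if len(self.history) < self.window:
            return False  # a full trailing window is required first
        gap_open = all(
            self._rel(ub - lb, ub) > self.gap_floor for lb, ub in self.history
        )
        first_ub = self.history[0][1]
        last_ub = self.history[-1][1]
        frozen = self._rel(first_ub - last_ub, first_ub) <= self.min_rel
        blown_up = all(ub > self.ceiling for _, ub in self.history)
        return gap_open and frozen and blown_up
```

A window whose upper bound is frozen at a sane magnitude fails the third test, so it is not reported as a stall.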
[3.2.0] - 2026-06-30
Changed
- Autoscale Layer 3 now centres the objective over HiGHS' cost comfort zone. When a cost coefficient would trip a HiGHS warning (smallest |c| < 1e-4, "excessively small costs", or worst |c| > 1e+4), Layer 3 picks the power-of-two user_objective_scale exponent that places the band's geometric centre sqrt(min·max) at the zone's geometric centre (1.0), instead of clamping the offending end to a boundary. For a band narrower than the zone both ends land inside [1e-4, 1e+4]; for a band wider than the zone the unavoidable violation falls symmetrically on both ends in log-space. Bands already inside the zone are untouched (N = 0), so well-scaled models are never perturbed. The exponent is a power of two and HiGHS unscales the objective and duals on output, so the reported solution is unchanged; only the magnitudes the simplex pivots on change. The new audit tag is center.
- The previously-unhandled small-cost case (HiGHS' "Problem has some excessively small costs" warning) is now corrected: it was silently ignored before, when only large costs were scaled.
- The pure large-cost case now centres as well (previously it clamped the worst |c| to the 1e+4 working ceiling). The result is a slightly larger-magnitude down-scale; being power-of-two it remains output-invariant.
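The centring rule can be illustrated with a few lines of arithmetic. This is a sketch under stated assumptions, not Layer 3's code: the function name centering_exponent is hypothetical, and rounding the log2 centre to the nearest integer is an assumption about how the power-of-two exponent is chosen.

```python
import math


def centering_exponent(min_abs_cost, max_abs_cost, low=1e-4, high=1e4):
    """Exponent N such that 2**N * sqrt(min * max) lands close to 1.0.

    Hypothetical helper; low/high are the comfort-zone bounds quoted above.
    """
    if min_abs_cost <= 0.0 or max_abs_cost < min_abs_cost:
        raise ValueError("expected 0 < min_abs_cost <= max_abs_cost")
    if low <= min_abs_cost and max_abs_cost <= high:
        return 0  # band already inside the zone: leave the model untouched
    # log2 of the geometric centre sqrt(min * max).
    centre_log2 = 0.5 * (math.log2(min_abs_cost) + math.log2(max_abs_cost))
    # Assumption: nearest integer exponent that moves the centre to 2**0.
    return -round(centre_log2)
```

For a band running from 1 to 2**20 the centre sits at 2**10, so the sketch returns -10 and the scaled band becomes [2**-10, 2**10], which fits inside the zone.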
[3.1.0] - 2026-06-25
Added
Solution.max_primal_infeasibility and Solution.primal_feasibility_tolerance: expose the solver's achieved unscaled primal slack and its primal feasibility tolerance, so a caller that hand-checks a constraint on the returned solution (e.g. a Benders master coupling self-check) can size its tolerance from the solver's own achieved feasibility instead of a hard-coded magic constant. HiGHS enforces feasibility on the internally-scaled problem, so the unscaled slack reported here can exceed the nominal scaled tolerance, a normal solver artifact callers must allow for. Both return 0.0 for a synthesised Solution with no live HiGHS handle.
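One way a caller might size a self-check tolerance from these two values is sketched below. The helper names, the safety multiplier, and the absolute floor are all hypothetical choices for illustration; only the two input quantities come from the entry above.

```python
def self_check_tolerance(max_primal_infeasibility, primal_feasibility_tolerance,
                         safety=10.0, floor=1e-9):
    """Tolerance derived from what the solver actually achieved.

    safety and floor are illustrative assumptions, not library defaults.
    """
    achieved = max(max_primal_infeasibility, primal_feasibility_tolerance)
    # The floor covers a synthesised Solution, where both inputs are 0.0.
    return max(safety * achieved, floor)


def coupling_holds(lhs, rhs, tol):
    """Hand-written check of lhs >= rhs, allowing for the solver's slack."""
    return lhs >= rhs - tol
```

In use, the two arguments would be read from the returned solution (solution.max_primal_infeasibility and solution.primal_feasibility_tolerance) before checking each coupling row.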
[3.0.0] - 2026-06-25
Refocuses regional decomposition: the built-in dual-subgradient
LagrangianProblem driver is removed and replaced by small, generic
cutting-plane primitives, on top of which the caller builds a Benders
decomposition. Problem / WarmProblem / Param and all engine + solver
exports are otherwise unchanged. (The subgradient driver shipped only in the
unreleased 2.7–2.9 line; those tags never reached PyPI, so 3.0.0 is the first
published release to carry this decomposition rework — a 2.x → 3.0.0
upgrade gains the primitives, not just loses the driver.)
Removed
LagrangianProblem and the whole subgradient decomposition (lagrangian.py: LagrangianProblem, CouplingSpec, CouplingEntry, LagrangianSolution, and the parallel-subsolve pool), plus their top-level exports. Its one consumer (FlexTool) moved to a Benders decomposition built from the cut-append / warm-restart / parallel primitives below, which give a tight bound without a subgradient tail. Pin polar-high<3 for the old driver.
Added
- WarmProblem.add_cut_row(col_ids, coefs, lower) -> int: append an optimality-cut row Σ coefs·x >= lower to the live (already-built) master and return its row_id; the cut dual is read by Solution.row_dual[row_id]. Plus add_recourse_col(name, cost, lower, upper) for lazy recourse columns. Both are post-build live edits that deliberately bypass the build-time DSL lock (it only guards the fixed-size autoscale side vectors). No auto-scaling for appended rows: keep the master autoscale off, or pre-scale coefs so the cut lives on the built columns' scale.
- WarmProblem.solve(*, retry_on_unknown=False): warm re-solve after a cut append. The retained basis lets dual simplex hot-start across the appended >= row, falling back to a single clearSolver cold re-presolve only if the warm run fails to certify kOptimal (kUnknown / transient kSolveError / spurious kUnbounded off a stale basis). Default False is byte-identical to before. On a well-scaled master the warm path holds with no fallback, removing the super-linear cold-presolve cost that dominated the Benders master at scale.
- polar_high.parallel (exported at top level): solve_indexed_parallel (fan fn(i) across a thread pool, collect into per-index slots so the result is timing-independent; requires every WarmProblem already built, raises on an unbuilt one), prewarm_global_scheduler (pin the process-global HiGHS scheduler to one thread ONCE so concurrent run() calls stay single-threaded and deterministic), and resolve_worker_count. workers <= 1 keeps a sequential path byte-identical to a plain for loop. Recovers the scheduler-pin pattern from the removed subgradient pool; enables parallel Benders region solves.
- WarmProblem.set_output_flag(enabled): enable/disable HiGHS' native solve log for this problem; the preference persists on the handle across cold / warm / retry solves (applied immediately if built, else at the initial build, before the log routing, so False also suppresses the HiGHS version banner). Lets a driver that fans out many sub-solves (Benders regions) mute the per-sub-solve output and show its own concise log instead.
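The "per-index slots" idea behind solve_indexed_parallel can be shown with the standard library alone. The function below is a hypothetical sketch of the pattern, not the library's implementation; it knows nothing about WarmProblem or the HiGHS scheduler pin.

```python
from concurrent.futures import ThreadPoolExecutor


def indexed_parallel(fn, n, workers):
    """Run fn(0), ..., fn(n - 1) and return the results in index order.

    Hypothetical helper illustrating the slot pattern only.
    """
    if workers <= 1 or n <= 1:
        # Sequential path: exactly a plain loop.
        return [fn(i) for i in range(n)]
    slots = [None] * n
    with ThreadPoolExecutor(max_workers=min(workers, n)) as pool:
        futures = {i: pool.submit(fn, i) for i in range(n)}
        for i, future in futures.items():
            # Each result is stored by index, so completion order is irrelevant.
            slots[i] = future.result()  # re-raises a worker's exception
    return slots
```

Because every result is written to the slot of its own index, the returned list does not depend on which thread finishes first.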
Changed
- Coverage for two live WarmProblem methods (update_obj_coef_array, fix_cols) that the deleted Lagrangian tests exercised is relocated into test_warm_problem.py.
- Docs: the "Lagrangian" guide is replaced by "Decomposition building blocks" (cut-append / warm-restart / parallel); the API reference and cross-links are updated to the new public surface.
[2.9.0] - 2026-06-24
Adds opt-in thread-parallel subsolves and a per-subsolve callback to the Lagrangian driver, with HiGHS-log silencing for that path. Default behaviour is unchanged: with no new kwargs the solve stays sequential, fires no per-subsolve callback, and keeps today's verbose native log.
Added
- LagrangianProblem.solve(max_workers=...): optional cap on the number of worker threads used to solve subproblems concurrently within each barrier (initial build, per-iteration, primal recovery). None (default) or 1 keeps the fully sequential path. The effective count is clamped to [1, n_subproblems]. When > 1, every subsolve is forced to threads=1 so each h.run() is deterministic (HiGHS is non-deterministic with threads > 1) and the box is not oversubscribed; two parallel solves with different max_workers are byte-identical. The COLD initial build also parallelizes ACROSS regions: the process-global HiGHS scheduler is pre-pinned to a single thread ONCE up front (_prewarm_global_scheduler), after which the per-region first solves fan out concurrently WITHOUT passing threads (so no per-instance resetGlobalScheduler). Parallelism is across regions only: each individual solve stays single-threaded on the pinned pool. If the one-time prewarm fails, the build falls back to a sequential cold loop on the calling thread (threads=1 per first solve pins the scheduler) and the warm iterations still parallelize; the cold-parallel and cold-sequential builds are bit-identical. The executor is shut down on every exit path, including the no-coupling early return and any raised exception (fail-fast on the lowest non-optimal subproblem index, queued siblings cancelled).
- LagrangianProblem.solve(subsolve_callback=...): optional callable invoked at the start and finish of every individual subproblem solve. It fires from worker threads when max_workers > 1 and MUST be thread-safe; exceptions are suppressed so a faulty observer can never abort the solve. The callback dict schema (pinned):
  - start: {"event": "start", "iter": <int>, "subproblem": <int>, "phase": <"initial"|"iterate"|"recovery">}
  - finish: {"event": "finish", "iter": <int>, "subproblem": <int>, "phase": <same>, "obj": <float>}; "obj" is present only when that subsolve reached optimality.
  phase is "initial" for the build solve (iter == 0), "iterate" for outer iterations (iter >= 1), and "recovery" for the primal-recovery solve (iter == -1).
Changed
- When the caller uses the new functionality (max_workers > 1 or a subsolve_callback), the per-subsolve HiGHS native log is silenced. Set POLAR_HIGH_LAGRANGIAN_VERBOSE=1 to force the verbose native log back on. Plain existing callers keep today's verbose native log unchanged.
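Since the 2.9.0 subsolve_callback may fire from worker threads, an observer needs its own locking. The recorder below is a hypothetical example of a thread-safe callable that consumes the pinned start/finish dict schema; the class and method names are illustrative and not part of the library.

```python
import threading


class SubsolveRecorder:
    """Collects callback events under a lock so worker threads can share it."""

    def __init__(self):
        self._lock = threading.Lock()
        self.events = []

    def __call__(self, event):
        with self._lock:
            # Store a copy rather than the dict object the caller passed in.
            self.events.append(dict(event))

    def objectives(self):
        """Map (iter, subproblem) -> obj for finish events that carry "obj"."""
        with self._lock:
            return {
                (e["iter"], e["subproblem"]): e["obj"]
                for e in self.events
                if e["event"] == "finish" and "obj" in e
            }
```

An instance would be passed as subsolve_callback=recorder on the 2.9 line; finish events without "obj" (non-optimal subsolves) are simply left out of the objectives map.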
[2.8.0] - 2026-06-17
Retains each region's recovered-primal column values in the Lagrangian result so a downstream caller can reconstruct every subproblem's primal (e.g. investment-variable values) after the solve. Opt-in and backward-compatible: the new field defaults to an empty list and existing callers are unaffected.
Added
LagrangianSolution.subproblem_col_values: one numpy float64 col_value array per subproblem (region), in subproblem order, each the region's FINAL recovered-primal column values. Populated on every return path of LagrangianProblem.solve(): the main subgradient/primal-recovery path (a region whose recovery solve is skipped/non-optimal falls back to its most recent iterate, so the list is always full-length and index-aligned) and the trivial no-coupling early-return path (from each subproblem's initial solve). Each array is layout-aligned with that region's own subproblems[i]._vars[name].frame['col_id'], so a caller can index those col_ids into entry i to assemble a whole-system primal from the per-region solves.
[2.7.0] - 2026-06-17
Adds opt-in live progress reporting to the Lagrangian driver. Default behaviour is unchanged: solve() stays silent unless a callback is passed.
Added
LagrangianProblem.solve(progress_callback=...): an optional callable invoked once per outer subgradient iteration with that iteration's log dict (iter, alpha_k, max_abs_residual, total_obj), and once at the end with the final-summary dict (iter == -1, carrying best_dual_total / recovered_total). Lets callers stream live decomposition progress instead of only inspecting iteration_log after the fact. None (default) is a no-op and preserves the prior silent behaviour; callback exceptions are suppressed so a faulty observer can never abort the solve.
[2.6.0] - 2026-06-16
Adds an opt-in small-coefficient cutoff. Default behaviour is unchanged: byte-identical to 2.5.1 unless a caller sets the new threshold.
Added
Problem.coef_zero_threshold (default 0.0 = off): floors any LP matrix coefficient or RHS row-bound whose absolute value is below the threshold to exactly 0.0, narrowing the numerical range (conditioning) of the LP for callers that opt in. Applied at every coefficient/RHS finalize point so it is independent of the build path: initial build (_solve_streaming, _build_canonical_matrix, including row_lb / row_ub) and warm in-place updates (WarmProblem.update_param, update_rhs). ±inf / NaN sentinels are preserved and entries are replaced, never dropped, so matrix structure and determinism are unchanged; a threshold of 0.0 short-circuits to a no-op.
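The flooring rule can be illustrated on a plain list of floats. This is a sketch of the described behaviour only: floor_small is a hypothetical name, and the library applies the rule to its own numpy arrays rather than Python lists.

```python
import math


def floor_small(values, threshold):
    """Replace finite entries with |v| < threshold by 0.0.

    Infinite and NaN sentinels pass through, and no entry is dropped,
    so the output has the same length and positions as the input.
    """
    if threshold == 0.0:
        return list(values)  # off: nothing is touched
    return [
        0.0 if (math.isfinite(v) and abs(v) < threshold) else v
        for v in values
    ]
```

Entries are replaced in place rather than removed, which is what keeps the sparsity structure unchanged in the description above.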
[2.5.1] - 2026-06-11
Hardens the 2.5.0 HiGHS-log routing so it can never lose the log. The LP build, autoscale, and solve numerics are unchanged from 2.5.0.
Fixed
route_highs_log_to_stdout suppressed HiGHS' native console write (log_to_console=False) and re-emitted via the sys.stdout callback on every routed solve. Suppressing native logging is a bet that the callback fires, and some highspy builds register the kCallbackLogging callback but never deliver a message, so suppress-native + silent-callback dropped the entire HiGHS log (observed in a Linux GUI subprocess on one machine, while an otherwise-identical machine with a different highspy was fine). The routing now suppresses native logging only when sys.stdout is not backed by the native stdout fd (fd 1). When the sink already is fd 1 (a terminal, a pipe such as a GUI reading a subprocess, or a file on fd 1), HiGHS' own native log already reaches it, so native logging is left intact and the callback is skipped (new helper _sink_is_native_stdout). Routing + suppression now happens only where sys.stdout genuinely diverges from fd 1 (Windows Basic Console OutStream, redirect_stdout, pytest capsys), i.e. where the native write is unreachable anyway. POLAR_HIGH_NATIVE_LOG=1 still opts out everywhere.
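The "is the sink already fd 1" decision can be approximated with a descriptor check. The function below is a hypothetical sketch; how the real _sink_is_native_stdout helper decides is not stated in this changelog, so the fileno() comparison is an assumption.

```python
import io
import sys


def sink_is_fd1(stream=None):
    """True when the stream reports file descriptor 1 (hypothetical check)."""
    stream = sys.stdout if stream is None else stream
    try:
        return stream.fileno() == 1
    except (AttributeError, OSError, ValueError):
        # In-memory captures such as StringIO have no usable descriptor.
        return False


print(sink_is_fd1(io.StringIO()))  # False: an in-memory capture is not fd 1
```

Under this reading, a redirect_stdout or capsys capture would take the routed path, while a terminal or pipe on fd 1 would keep HiGHS' native logging.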
[2.5.0] - 2026-06-10
Routes the HiGHS solver log through Python's sys.stdout so it is visible in
consoles that only capture the Python-level stream (not the native fd-1 write).
The LP build, autoscale, and solve numerics are byte-identical to 2.4.5; only
where the log appears changes.
Added
route_highs_log_to_stdout(h, *, stream=None) (new module polar_high._log_routing): registers a HiGHS kCallbackLogging callback that re-emits each message through sys.stdout (resolved lazily, so it follows later redirection such as ipykernel's) and suppresses the duplicate native console write (log_to_console=False) once the callback is confirmed registered. Idempotent (safe on a reused Highs), a no-op on silent solves (output_flag false), and fully defensive: any highspy error leaves the native logging path untouched rather than risking a lost log.
Changed
- The in-process solve sites (solvers/_highs.py, the streaming Problem.solve, and WarmProblem._initial_build) now route the HiGHS log through sys.stdout by default, right after options are applied so the version banner is captured. This makes the solver log visible under the Jupyter / Spine-Toolbox Basic Console on Windows (where ipykernel only redirects fd 1 on POSIX, so the entire HiGHS log was previously lost) and under redirect_stdout / pytest capsys. Set POLAR_HIGH_NATIVE_LOG=1 to opt out and keep HiGHS' native fd-1 logging.
[2.4.5] - 2026-06-02
Docs-polish release. No runtime or public-API changes: the LP build, autoscale, and solve paths are byte-identical to 2.4.4.
Changed
- Loading-data Memory section rewritten to lead with what keeps the footprint down, ahead of the long-format constant factor: the integer-keyed coefficient matrix (col_id / row id / float64, no string labels) and the section-by-section streaming build that releases each constraint family's input frames before the next. Notes that HiGHS column names embed the dim labels and carry their own cost (~1.1 GB at the 3000² grid), shed via save_memory=True or write_mps(emit_names=False). Conclusion updated to match the current benchmarks: polar-high matches or beats linopy/xarray peak memory on the irregular network LP and on the dense N × N LP with save_memory.
- Scaling guide: the "falsely infeasible" risk is now scoped to badly-scaled models (eight or nine decades) rather than implied for any wide spread; the "who this is for" bullet list is replaced with an inline sentence; and a new section explains the three scaling layers (detection, semantic rewrites, HiGHS-native global scaling) and how they compose with HiGHS' own simplex_scale_strategy.
- Performance guide: the Threading section is rewritten. Raising the thread count does speed up the build, with the trade-offs being per-thread scratch memory and fewer concurrent runs; the default of 1 is tuned for the "many independent solves" deployment. Scaling is added to the solver-options list.
Fixed
- env-vars guide: dropped the retired POLAR_HIGH_RANGES_MAX_FAMILY_ROWS "Workload tuning" knob. It was retired when autoscale range detection moved to the per-term walk that bounds peak memory regardless of family size; the env var is no longer read and setting it is a no-op. Replaced the dead FlexTool dev/env_vars documentation link (404) with the repository.
- Benchmark harness: sorted the pulp_net import block (ruff I001).
[2.4.4] - 2026-06-02
Docs-polish release. No runtime or public-API changes: the LP build, autoscale, and solve paths are byte-identical to 2.4.3.
Changed
- Problem(dense_axes=...) now has substantive documentation: a new section in the Performance guide covers what the block-COO arm does (slice the dense suffix of each Var's frame as a zero-copy numpy view, multiply with ufuncs), the row-sort contract the caller signs up to, suffix-matching against each Var's dims, and POLAR_HIGH_DISABLE_BLOCK_COO=1 for A/B rollback. The Concepts page picks up a short mention pointing readers in. API reference was already covered via the Problem.__init__ docstring.
- Enum dtype alignment: the depth that lived in the README is rebuilt under docs/guide/loading-data.md with a concrete side-by-side example (capacity_df on a subset Enum vocab, cost_df on the full vocab) before the alignment-table contract. The README section is trimmed; the documentation index already links to the Guide.
- Benchmark page picks up the v2.4 numbers and a clarification that the save_memory trade-off axis is "how much work HiGHS does", not model size.
- mkdocs.yml drops a legacy CDN script from extra_javascript (MathJax 3 renders on every current browser without it).
Fixed
- Benchmark plot harness: polar_da rows fold into the polar line on the dense plots so the published figure tracks the forthcoming block-COO-by-default behaviour without a redundant overlapping series. The network plot keeps polar_da_net as a distinct line where the irregular topology surfaces a small visible delta.
[2.4.3] - 2026-06-01
Benchmark-methodology + docs release. No runtime or public-API changes: the LP build, autoscale, and solve paths are byte-identical to 2.4.1.
Note on 2.4.2. The v2.4.2 git tag was pushed before the pyproject.toml version bump landed, so the published 2.4.2 wheel carries version = "2.4.1" in its metadata. 2.4.2 has been yanked on PyPI; use 2.4.3, which contains the same source as 2.4.2 plus the correct version string. Pinned polar-high==2.4.2 installs are unaffected, because yanking does not remove the wheel.
Changed
- Benchmark harness wraps each cell in a fresh systemd-run --user --scope. The worker reads cgroup v2 memory.peak from its own cgroup and emits it as a new cgroup_peak_mb column: the kernel-level peak the OOM killer would charge against a budget, less noisy across reps than the process-level ru_maxrss we previously plotted. Auto-falls back to plain subprocess on hosts where systemd-run --user is unavailable; --no-cgroup-scope forces the fallback. In the fallback path cgroup_peak_mb is NaN and peak_rss_mb continues to work.
- Benchmark grows two new polar variants alongside the existing polar / polar_net tools:
  - polar_sm / polar_sm_net: exercise save_memory=True so the harness produces directly comparable regular-mode and one-shot-mode numbers on the same hardware.
  - polar_da / polar_da_net: exercise the explicit Problem(dense_axes=...) contract on the dense and network LPs.
- docs/compare/benchmark.md refreshed against v2.4.x:
  - headline cells show both polar modes side-by-side with the cgroup peaks;
  - "Measuring memory" leads with cgroup_peak_mb as the canonical peak metric and demotes peak_rss_mb to a process-level note;
  - dense full-solve N=3000 peak drops 38.1 GB → 33.2 GB on polar regular (the autoscale memory-bounding work from 2.4.0 shows up here);
  - network-LP threading speedup at N=10 000 ticks up from 1.35× to 1.40×; other rows refreshed accordingly.
[2.4.1] - 2026-06-01
CI / test-tooling release. No runtime or public-API changes: the LP build, autoscale, and solve paths are byte-identical to 2.4.0.
Fixed
psutil added to the test optional-dependencies. The bounded-memory branch-fired profile tests detect which evaluation branch ran via the env-gated profile lines on stderr, and those lines are psutil-gated: without psutil installed, autoscale/_ranges.py sets _profile=False and the signals never print, so the four branch-fired tests (test_ranges_block_coo_branch_fired, test_ranges_rhs_bound_branch_fired, test_walk_fallback_profile_signal_fires, test_rhs_walk_fallback_no_skip_and_fires) skipped their assertions in CI's lean pip install -e ".[test]" environment. Declaring psutil as a test dependency makes CI exercise the profiling signals. The bounded-memory feature itself was always correct and is unaffected: the byte-identical parity gate passes with or without psutil.
Changed
ruff lint + format cleanup of src/ and tests/ (import sorting, an unnecessary open(..., "r") mode argument, a stray unused local, and a whole-tree ruff format reflow) so the lint CI job passes under the latest ruff. No behavioural change.
[2.4.0] - 2026-06-01
Block-COO coefficient evaluation + bounded-memory autoscale
This release makes LP build and autoscale memory-bounded: no constraint or objective family can spike RAM by materialising a wide coefficient product. Two pillars:
Block-COO evaluation (Phase C). A polars frame sorted with its dense
axis trailing is physically a sequence of contiguous Arrow blocks; the
new path slices each block as a zero-copy numpy view and multiplies the
factors with ufuncs — no wide relational join, no wide intermediate. The
client declares its dense trailing axes once via the new
Problem(dense_axes=...) / Problem.declare_dense_axes(...) contract
(verified cheaply, O(n), no re-sort). Sum-wrapped chains are evaluated
in one pass via a captured SumBlockMeta reconstruction recipe, with a
relabel fast-path (reduce dims ⊆ var dims ⇒ a pure relabel, bit-identical
to the polars reduce) — the memory win for Sum-heavy families. Genuine
coefficient-combining sums stay bit-equivalent (FP reassociation ~1e-9).
Bounded-memory autoscale (Phase D). The autoscale Layer-1 range
readout and Layer-2 magnitude-bucketing used to materialise the wide
coefficient product just to read a statistic. Both now route through a
new general primitive, bounded_coefficient_walk, which iterates the
constraint/column spine in fixed row-batches and rebuilds each batch's
(rid/col_id, coef) via the block-COO builders + a prune-down backstop —
never holding more than one batch's product. Pluggable reducers compute
min/max (byte-identical, order-free) and the log2-magnitude histogram
(exponents may shift ±1 = an objective-invariant scaling change). The
reconstruction recipe is forwarded through post-Sum Expr-algebra
(scalar/Param mul+div, negation, subtraction, Where, and
set_objective's collapse-all Sum) so every wide-product term — the
objective and negated-Sum constraints included — routes through the
walk instead of a materialising collect. The size-blind family-row skip
is retired: every shape is now bounded, with no silent coverage gaps.
Validated on a 9-roll rolling-horizon LP (FlexTool DES / RETO-Africa): autoscale priv_dirty peak 46 → 23 GB, all objectives byte-identical, ~15% faster. The DSL is unchanged.
Added
- Problem(dense_axes=...) / Problem.declare_dense_axes(...): declare the pre-sorted dense trailing axes that enable block-COO evaluation.
- bounded_coefficient_walk + CoefWalkRecipe (.from_term, .from_rhs_chain, .from_rhs_param, .is_buildable) + MinMaxAbsReducer / Log2HistogramReducer in autoscale/_coef_walk.py.
- POLAR_HIGH_DISABLE_BLOCK_COO=1: fall every term back to the polars path (A/B rollback).
- POLAR_HIGH_BLOCK_COO_PROFILE / POLAR_HIGH_RANGES_PROFILE / POLAR_HIGH_LAYER2_PROFILE: env-gated instrumentation (no-op when off).
Changed
- Autoscale Layer-1 (_ranges.py) and Layer-2 (_layer2.py) read coefficient ranges / magnitudes via the bounded walk. The ranges-PRE pass no longer gates the walk on the (not-yet-installed) Layer-2 side vectors, so it bounds every family there too.
- The Sum block-COO path defers map-effect Where via _Term.where_map_frames and forwards sum_block_meta through Expr-algebra; a re-reducing outer Sum still correctly drops the recipe.
Removed
_skip_unbounded_over_cap and POLAR_HIGH_RANGES_MAX_FAMILY_ROWS: the size-blind family-row cap, superseded by the bounded walk.
[2.3.0] - 2026-05-29
Where pushdown (added in the same release window as the prune-down work below)
Where(expr, frame) with a pure-filter shape (frame columns are a
subset of the expression's open dims — no map effect introducing new
dims) now defers the filter into a new _Term.where_frames slot
instead of inner-joining frame into t.lazy eagerly. The LHS
prune-down (_build_lhs_pruned_plan) then applies each recorded
frame at the Var leaf AND at every Param atomic during chain rebuild —
mirror of the existing row_index pre-prune pattern. Net effect: the
filter narrows every intermediate, not just the final result.
Pure-filter Where now also PRESERVES var_source / param_sources /
coef_scalar on its output term (today's path cleared them). This
closes the Where leg of the "Sum/Where/Lag wrapping" limitation
flagged after the 2026-05-28 audit — LHS prune-down now fires on
Where-wrapped terms in addition to bare Var × Param × … chains.
Sum / Lag bake where_frames into t.lazy before consuming it
(they change row identity); the Sum leg remains future work.
Behaviour-preserving on every tested scenario: the LP matrix is byte-identical between the deferred-pushdown path and the env-var-disabled (eager-join) path. Validated on FlexTool's DES (RETO-Africa) scenario: 5.6M rows × 4.6M cols, same presolve reductions, same coefficient ranges. DES itself does not exercise any of the pushdown-eligible families (no per-process-profiles, no commodity ladder, no investment, no reserves), so no measurable RSS delta is expected there; richer scenarios with pure-filter Wheres over multi-atomic chains will see the win.
Added - Where pushdown
- POLAR_HIGH_DISABLE_WHERE_PUSHDOWN=1: safety fallback env var. When set, every Where call eagerly inner-joins frame into t.lazy and clears the leaf metadata exactly as the pre-v2.3.0 path. Use as an opt-out if a model surfaces unexpected drift on the pushdown path.
- _Term.where_frames: tuple[pl.LazyFrame, ...] | None slot: opt-in metadata recording pure-filter Where frames so they can be applied at the leaves during chain rebuild. Internal; no public API change.
- _apply_where_frames(lazy, dims, where_frames) helper: used by Sum / Lag to bake pending filters before consuming t.lazy, and by consumer fallback paths in _build_canonical_matrix / _solve_streaming / WarmProblem._initial_build when leaf-rebuild prune-down can't fire.
- tests/test_where_pushdown_parity.py (11 tests): parity coverage for pure-filter, map-effect, nested Where, Where-after-Sum, Sum-after-Where, anonymous-Param-chain, scalar-fold, Where-then-mul-Param, disable-guard, shared-empty-extras-nonempty, and RHS-Where-preserved-through-negation.
Fixed (latent, exposed by the parity sweep)
Where(expr, frame) with shared == [] and extras != () now explicitly cross-joins frame instead of silently claiming the extras columns on _Term.dims without ever producing them in t.lazy. Pre-fix this was a corruption waiting to happen; no known caller relied on the broken behaviour.
Removed (dead code)
- The elif isinstance(rhs, (Var, Expr)) / standalone-block negation patterns in _build_canonical_matrix, _solve_streaming, and WarmProblem._initial_build are deleted. Problem.add_cstr folds Var/Expr RHS into the LHS via Expr.__sub__ before storing in _CstrProto, so proto.rhs only ever reaches those sites as a Param or scalar; the elif branches were unreachable.
Prune-down (initial v2.3.0 work, merged earlier in the same release window)
Memory cliff fix for Param-chain RHS / LHS in _build_canonical_matrix
plus matching coverage in _solve_streaming and WarmProblem._initial_build.
On FlexTool's DES (RETO-Africa) scenario the canonicalise stage of the
first solve stalled inside profile_flow_upper_limit (1.5 M rows, RHS
= profile_value × process_existing_count × process_availability) — the
chained inner joins produced a ~2.6 billion-row Cartesian intermediate
before the row_index semi-join could prune it. The fix walks each
chain's named atomic Params and pre-prunes them against the constraint's
row_index keys (projected onto each atomic's own dim subset), bounding
the intermediate to the constraint row count.
Behaviour-preserving: every solve that completed before still produces identical numerics (verified against the FlexTool scenario parity suite, 139 polar_high tests + the previously-failing flextool scenarios). LP matrices are byte-identical between the prune-down path and the original merged-lazy path on every covered chain shape; the difference is solely intermediate peak memory.
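The prune-down idea can be shown without polars. In the sketch below, each Param is first cut down to the keys the constraint's row index actually needs (projected onto that Param's own dims) and only then multiplied in, so the accumulator never grows past the constraint row count. All names are hypothetical, the data are plain dicts, and the engine's real path works on lazy frames with semi-joins.

```python
def semi_join(rows, keys, dims):
    """Keep the rows whose projection onto dims appears in keys."""
    return [r for r in rows if tuple(r[d] for d in dims) in keys]


def pruned_product(row_index, params):
    """Multiply Param values per constraint row, pre-pruning each Param.

    row_index: list of dicts, one per constraint row.
    params: list of (dims, rows); each row holds its dims plus "value".
    """
    acc = [dict(r, value=1.0) for r in row_index]
    for dims, rows in params:
        # Project the constraint keys onto this Param's own dim subset.
        wanted = {tuple(r[d] for d in dims) for r in row_index}
        lookup = {
            tuple(r[d] for d in dims): r["value"]
            for r in semi_join(rows, wanted, dims)
        }
        # Inner-join semantics: rows without a matching Param entry drop out.
        acc = [
            dict(a, value=a["value"] * lookup[tuple(a[d] for d in dims)])
            for a in acc
            if tuple(a[d] for d in dims) in lookup
        ]
    return acc
```

The accumulator starts at the constraint row count and can only shrink, which is the bound the prune-down path provides in place of the chained-join intermediate.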
Added
- POLAR_HIGH_SOLVE_PROFILE=1: env-var-gated stderr profile lines covering every meaningful sub-step of Problem._solve_streaming (cold path, 27 checkpoints) and WarmProblem._initial_build (18 checkpoints, including the per-family LP-build loop and the HiGHS handoff). Tab-separated [solve profile] phase=… rss_gb=… delta_gb=±… wall_s=… format mirroring the POLAR_HIGH_WRITE_MPS_PROFILE precedent. Zero overhead when unset.
- POLAR_HIGH_DISABLE_PRUNE_DOWN=1: safety fallback env var. When set, every multi-atomic Param chain in RHS / LHS handling falls through to the merged-lazy semi-join path. Use as an opt-out if a future model surfaces an unexpected numerical drift on the prune-down path.
- Param._value_scalar and _Term.coef_scalar slots: accumulate scalar folds (Param * float, Var * float, Expr.__neg__, Expr.__sub__, etc.) so the prune-down rebuild can seed its accumulator with the correct multiplicative constant. Without this tracking the rebuild would silently drop scalar factors that the merged-lazy path carries in the value / coef column. Internal; no public API change.
- Per-family and per-term checkpoints inside _build_canonical_matrix (gated by POLAR_HIGH_WRITE_MPS_PROFILE=1). New labels: family_rhs_evaluated, family_rhs_l2baked, family_senses_built, family_rownames_built, family_term_plans_built, family_term_collect_start / family_term_collected (per LHS term), family_rhs_pruned_down (new prune-down path), family_lhs_scattered. Each emits family= and family_idx= extras so per-family slicing is trivial.
- Reference tests:
  - tests/test_canonicalise_param_chain_prune.py (3 tests): RHS prune-down parity vs merged-lazy path on synthetic 3-Param chain with disjoint-but-shared dims; covers Param.__truediv__.
  - tests/test_lhs_param_chain_prune.py (3 tests): LHS prune-down parity at all three call sites (_build_canonical_matrix, _solve_streaming, WarmProblem._initial_build).
  - tests/test_prune_down_scalar_anonymous_fix.py (6 tests): anonymous-Param-in-chain handling, scalar-fold tracking, sign propagation through Expr.__neg__ / __sub__, and the disable env-var fallback.
  - tests/test_lp_view.py test for to_csr post-vectorisation (zero-copy CSR round-trip parity).
Changed¶
- `_build_canonical_matrix` RHS handling (`engine.py`): when `rhs._sources` is a chain of length ≥ 2 and the composite has dim columns, walk the atomics one at a time. Each atomic is semi-joined against the running accumulator's key projection (semi-join order: acc keys → atomic, NOT the other way around; the atomic frame is the pre-pruned side). The final accumulator collects via the existing streaming fallback chain. Single-`Param` / scalar / `Var`-or-`Expr`-on-RHS branches unchanged.
- `_build_lhs_pruned_plan` (new helper) + three LHS call sites (`_build_canonical_matrix` L1664-1692, `_solve_streaming` L2738-2763, `WarmProblem._initial_build` L3969-4002): when `term.param_sources` has length ≥ 2 AND `term.var_source` is set (i.e. the term has a direct Var anchor, not wrapped in `Sum` / `Where` / `Lag`, which clear `var_source` to preserve safety), rebuild the LHS plan as `row_index ⋈ pruned_var ⋈ pruned_param_1 ⋈ pruned_param_2 …` with each factor pre-pruned via semi-join. Sum / Where / Lag wrapped terms fall back to the original path.
- `_lp_view.to_csr` row index construction: replaced a Python `for c in range(n_cols): col_of[a_start[c]:a_start[c+1]] = c` loop with `np.repeat(np.arange(n_cols), np.diff(a_start))`. Output identical (verified in `test_lp_view.py`); about 100× faster on a sparse 5 M-row LP.
- `_build_canonical_matrix` variable loop: merged the two consecutive loops over `self._vars.values()` (col bounds / integrality and `col_names` construction) into a single pass. Each `v.frame["col_id"].to_numpy()` materialises once instead of twice.
- ~32 `.astype(np.int64)` / `.astype(np.float64)` call sites on freshly-allocated numpy arrays (from `.to_numpy()`, `np.where`, `np.repeat`, `np.tile`, `np.concatenate`) gained `copy=False`. Affected: `_build_canonical_matrix` per-family scatter (`dim` and `scalar` branches), global / family dedup, objective-term collect, HiGHS bound translations in `_build_lp_arrays` / `_solve_streaming` / `_initial_build`, RHS `np.where` row_lb/row_ub translations, tracked-source scatter in `WarmProblem`, and `_lp_view.from_problem` bound round-trip.
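
The `to_csr` change above can be checked on a toy input. The sketch below is not the library's code: `a_start` here is a made-up CSC column-pointer array, and the snippet only confirms that the loop form and the `np.repeat` form quoted in the entry produce the same per-nonzero column index.

```python
import numpy as np

# Hypothetical column-pointer array: 4 columns holding 2, 0, 3 and 1 nonzeros.
a_start = np.array([0, 2, 2, 5, 6], dtype=np.int64)
n_cols = a_start.size - 1

# Loop form quoted in the entry: fill each column's slice with its index.
col_of_loop = np.empty(int(a_start[-1]), dtype=np.int64)
for c in range(n_cols):
    col_of_loop[a_start[c]:a_start[c + 1]] = c

# Vectorised form: repeat each column index by that column's nonzero count.
col_of_vec = np.repeat(np.arange(n_cols), np.diff(a_start))

assert np.array_equal(col_of_loop, col_of_vec)
print(col_of_vec.tolist())  # [0, 0, 2, 2, 2, 3]
```

Empty columns (a zero in `np.diff(a_start)`) simply contribute no entries, which is why the two forms agree without special-casing.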
Fixed¶
- Anonymous `Param` instances (no `name`, no `_sources`) were silently dropped from `_merge_param_sources` output when participating in a chain with named Params. The prune-down rebuild then walked only the named atomics, missing the anonymous one's contribution. Fixed: `_sources_for_propagation` now returns `[(self, +1)]` for anonymous atomics so the chain rebuild walks every constituent.
- Scalar folds (`Param * float`, `Var * float`, `Expr * float`, `Expr.__neg__`, `Expr.__sub__`) collapsed constants into the `value` / `coef` column without recording them in `_sources` / `param_sources`. The prune-down rebuild had no way to see the scalars and produced numerically different LP coefficients (DES scenario parity tests caught this as a ~2 % objective drift on `test_fullYear_roll_matches_v3320_golden`). Fixed via the new `Param._value_scalar` / `_Term.coef_scalar` slots; affected algebra ops propagate the scalar through to the prune-down accumulator.
Performance¶
- Canonicalise of FlexTool DES (RETO-Africa) on a 64 GB box:
  - Before: stalled inside `profile_flow_upper_limit` RHS evaluation after ~10 seconds, RSS climbing from 12.7 GB toward a peak of ~38 GB that exceeded available memory in some configurations.
  - After: all 9 constraint families canonicalise in 27 seconds total; `_initial_build` exits at 49 seconds; peak RSS during the in-process HiGHS solve path is 23.3 GB. With `--save-memory` (subprocess HiGHS), peak is 15.2 GB.
- Wall-time impact of the perf quick-wins (`copy=False`, vectorised `to_csr`, merged var-loop): 5-15 % wall-time win on the canonicalise + HiGHS-handoff portion of large LPs. The memory peak win is small (5-10 %); these changes are separate from the prune-down fix and apply per-cell rather than per-chain.
Notes¶
- The fix is behaviour-preserving on every currently-tested scenario. If a future model surfaces a numerical drift, set `POLAR_HIGH_DISABLE_PRUNE_DOWN=1` as a workaround and report the scenario so the engine can be fixed.
- The LHS prune-down activates only on terms with a direct Var anchor (not wrapped in `Sum` / `Where` / `Lag`). Sum-wrapped LHS chains fall back to the merged-lazy path; this is intentional for safety. See `specs/block_coo_evaluation_handoff.md` for the planned follow-on that handles those cases via a different mechanism.
- See `specs/where_pushdown_handoff.md` for the next architectural step (push `Where(...)` filter keys through the lazy plan tree the same way row_index keys are pushed today).
[2.2.0] — 2026-05-28¶
GLPK-style refactor of the Layer 2 scaling pipeline and the matrix emit path. Two architectural changes that, together, eliminate the "every consumer rebuilds the matrix from scratch" pattern that the v2.1.x line was tactically patching.
Added¶
- `Problem.canonicalise()`: lazily builds and caches a single canonical CSC representation of the LP on `Problem._matrix` (a new `_CanonicalMatrix` slot-dataclass carrying `col_ptr` / `row_idx` / `val` plus per-row `lb` / `ub` / `sense` and per-col `obj` / `lb` / `ub` / `integrality` and names). Idempotent: repeat calls return the cached matrix unless `_canonical_dirty` is set. `add_var` / `add_cstr` set the dirty flag; the cached matrix is released by `_release_python_lp_inputs` and `write_mps(release=True)`.
- `Problem._layer2_col_factor` / `Problem._layer2_row_factor`: numpy side vectors written by `flextool`'s `apply_layer2`. The col-factor vector stores `1 / cf_math` (inverse); the row-factor vector stores `rf_math` (forward). At canonicalise time the vectors are baked into `_matrix.val` / `_matrix.col_obj` / `_matrix.row_lb` / `_matrix.row_ub` so consumers read pre-scaled values directly. `Problem._layer2_locked` prevents post-Layer-2 `add_var` / `add_cstr` that would invalidate the side-vector sizing.
- Regression tests `tests/test_layer2_side_vector_emit.py` and `tests/autoscale/test_ranges.py::test_ranges_via_streaming_honors_side_vectors`: exercise every emit-site branch with explicit fake side vectors so missed multiply sites or indexing offsets fail fast without depending on flextool's bit-for-bit integration test.
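
The caching contract described for `canonicalise()` (build once, reuse until a mutation sets the dirty flag) can be illustrated with a toy class. Everything below is a stand-in for illustration only: `CachedMatrixOwner` and `build_count` are hypothetical, and the "matrix" is just a tuple rather than a CSC structure.

```python
class CachedMatrixOwner:
    """Toy stand-in for the dirty-flag caching pattern (not polar_high's Problem)."""

    def __init__(self):
        self._rows = []
        self._matrix = None
        self._canonical_dirty = True
        self.build_count = 0  # hypothetical counter, only here to observe rebuilds

    def add_cstr(self, row):
        # Any mutation invalidates the cached representation.
        self._rows.append(row)
        self._canonical_dirty = True

    def canonicalise(self):
        # Rebuild only when there is no cache or a mutation happened since the last build.
        if self._matrix is None or self._canonical_dirty:
            self._matrix = tuple(self._rows)
            self._canonical_dirty = False
            self.build_count += 1
        return self._matrix


p = CachedMatrixOwner()
p.add_cstr("r0")
first = p.canonicalise()
second = p.canonicalise()   # cache hit: same object, no rebuild
p.add_cstr("r1")
third = p.canonicalise()    # dirty flag forces one rebuild
```

In this sketch two consecutive calls return the very same object, and the rebuild count only advances after `add_cstr`.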
Changed¶
- `Problem.write_mps` (Stage B1): now calls `canonicalise()` and walks `_matrix.col_ptr` column-by-column. The previous per-family triple-list build, group_by dedup, concat, and global sort are consolidated into `_build_canonical_matrix` (runs once per state-version, shared across all consumers). Cross-consumer workflows (write_mps then solve, or any combination of the four canonical-consuming sinks) now family-walk exactly once.
- `Problem._build_lp_arrays` (Stage B2): reduced to ~30 LoC. Reads from `_matrix` and applies the `±inf → kHighsInf` substitution. Per-family LHS walk, per-family RHS, Stage A multiply-at-emit, global dedup + sort all moved into `_build_canonical_matrix`. Back-compat shim parameters (`n_cols`, `col_lb`, `col_ub`) removed from the signature; the two callers (`LpView.from_problem`, `_ranges_via_passmodel`) updated.
- `WarmProblem._initial_build` (Stage B3): bulk LP build (LHS / RHS / obj / bounds) now reads from `_matrix`. Tracked-source bookkeeping for `WarmProblem.update_param` keeps a small separate walk over `_cstrs` filtered to terms with `param_sources` set; these terms re-collect and apply the same Stage A multiply-at-emit so the cached `_param_cells` factors remain the SCALED coef (matches the pre-refactor formula that `update_param` relies on). Skipped entirely when `self._mutable_params` is empty.
- `LpView.from_problem`: reads `m.col_obj` from the canonical matrix instead of walking `_obj_terms` and applying its own Stage A multiply-at-emit. After this commit, the ONLY remaining Stage A multiply-at-emit consumer is `Problem._solve_streaming`. This is intentional: its per-family CSR memory bound exists specifically to avoid full-matrix materialisation.
- `polar_high.autoscale._ranges._ranges_via_streaming` honours the side vectors. When called post-Layer-2 it multiplies per-term `abs(coef)` by `|row_factor| * |col_factor|` after the polars collect (numpy, in place, no lazy plan modification) so the readout sees the same effective magnitudes the consumers will emit. No-op when the side vectors are `None` (the pre-Layer-2 readout pattern is unchanged).
GLPK-likeness scorecard¶
| Property | GLPK | Pre-v2.2.0 | Post-v2.2.0 |
|---|---|---|---|
| Matrix is canonical (one copy) | ✓ | ✗ (per-family lazy + 2-3 transient copies during emit) | ✓ |
| Scaling lives separately from coefs | ✓ | ✗ (rewrote lazy plans via flextool's `_layer2`) | ✓ |
| Objective excluded from row scaling | ✓ | n/a (no row scaling) | ✓ |
| Build/scale is O(m+n+nnz) | ✓ | ✗ (transient triple copies + dedup hash + global sort coexisted) | Mostly ✓ (canonicalise still has transient peak ~3× nnz; consumers no longer rewalk) |
The remaining "Mostly" on build/scale is the transient peak during
_build_canonical_matrix itself: per-family triples → global concat
→ polars group_by dedup → sort → final CSC. A per-family
streaming canonicalisation (consumers process families one at a
time, dedup per-family, merge sorted CSC chunks) would close this
gap by exploiting the disjoint-row-range property each family
already has. Future work.
[2.1.3] — 2026-05-27¶
Fixed¶
- `Problem._build_lp_arrays`, `Problem._solve_streaming`, and `WarmProblem._initial_build` now use the same semi-join + per-term streaming-collect pattern that v2.1.2 added to `Problem.write_mps`. Plain-inner-join + `pl.collect_all` previously materialised deep multi-Param LHS chains and multiplied the peak via parallel collect. Per-term peak is now bounded by `row_count × cols-per-row` instead of the upstream Param-product cardinality.
- `polar_high.autoscale.detect_ranges` on a pre-solve `Problem` now bypasses `Problem._build_lp_arrays` entirely. New `_ranges_via_streaming` walks objective + constraint terms one at a time with a semi-join + per-row `abs(coef)` collect (the shape `write_mps` proves on the same chains) and numpy-reduces to min/max. Avoids materialising the full COO triple list + global dedup that the legacy `_ranges_via_passmodel` ran for the same readout. Legacy code stays for back-compat but is no longer reached from production callers.
Added¶
- `POLAR_HIGH_RANGES_MAX_FAMILY_ROWS` env var (default `1000000`, `0` to disable): skips constraint families above the threshold in `_ranges_via_streaming`'s RHS + matrix readout. Background: polars' streaming engine intermittently fails to push the row-key semi-join into deep multi-Param product chains on very large families, so a single term collect can allocate >30 GB before failing on workloads like FlexTool's DES LP (`profile_flow_upper_limit` at 1.5 M rows × multi-Param rhs). When families are skipped, the range report reflects only the families that were read.
- `POLAR_HIGH_BUILD_LP_PROFILE=1` and `POLAR_HIGH_RANGES_PROFILE=1` diagnostic env vars: per-family / per-phase `psutil` RSS deltas to stderr from `Problem._build_lp_arrays` and from the autoscale range detectors respectively. Zero overhead when unset.
- Regression test `test_detect_ranges_param_chain_does_not_explode`: synthetic 200k-row Var × 3-Param chain. Pre-fix peak 515 MB on this shape; post-fix under 300 MB (test asserts <300 MB).
[2.1.2] — 2026-05-27¶
Fixed¶
- `Problem.write_mps` per-term collect on LHS Param-chain terms (e.g. `Var * Param₁ * Param₂ * ...`) used to materialise the join chain's wide intermediate before the final row alignment. On a 9.9 M-row LP a single such family allocated +26 GB during one `term.lazy.collect()`, pushing `write_mps` peak to ~43 GB despite the spec target of 2-3 GB. Retrofitted the same anti-explosion pattern the RHS path has used since v2.0.0 (`_align_enum_join_keys` → semi-join against the row-index key set → `collect(engine="streaming")` with `streaming=True` and plain-`.collect()` fallbacks). Synthetic 100 k-row test: 527 MB peak → 178 MB after the fix. Coefficients byte-identical with the pre-fix path; HiGHS objective unchanged.
[2.1.1] — 2026-05-27¶
Added¶
- `docs/guide/performance.md`: new section "Writing MPS without HiGHS" covering `Problem.write_mps`: API, the ~20× peak-memory advantage over `Highs.writeModel`, `release=True` semantics, cross-solver roundtrip coverage, and the `POLAR_HIGH_WRITE_MPS_PROFILE=1` diagnostic env var. Cross-linked from `docs/guide/scaling.md` and `docs/guide/solvers.md`.
Fixed¶
- Replaced a stale reference to a fictional `Problem.solve(write_mps=...)` kwarg in `docs/guide/solvers.md` with the real `Problem.write_mps` link.
- Ruff lint warnings in `tests/_bench_write_mps_parallel.py` (intentional late polars import) and in the `RHS` section of `Problem.write_mps` (unused tuple element renamed `_row_count`).
[2.1.0] — 2026-05-27¶
Added¶
- `Problem.write_mps(path, *, free_format=True, column_order_strict=True, emit_names=True, release=False, name="POLAR_HIGH")`: direct polars→MPS writer that bypasses `highspy.Highs.writeModel`. Mirrors the per-family streaming pattern from `_solve_streaming`, performs one streaming sort by `(col_id, row_id)`, and chunked-streams the COLUMNS section with `INTORG` / `INTEND` integer markers. Target peak is ~2-3 GB on a 10 M-row / 5 M-col / 20 M-nz LP, about 20× lower than `Highs.writeModel`'s transient. `release=True` reuses the same `_release_python_lp_inputs` teardown as `solve(save_memory=True)` so callers driving an out-of-process solver can drop the polar-side LP source immediately after the write.
- `POLAR_HIGH_WRITE_MPS_PROFILE=1` env var: when set, `Problem.write_mps` emits per-phase and per-constraint-family `psutil` RSS deltas to stderr for diagnosing memory hot spots. Zero overhead when unset (no `psutil` import, no closure call sites entered).
- `tests/_bench_write_mps_parallel.py`: synthetic-LP bench for `write_mps` peak memory across single-family / multi-family topologies and polars thread counts.
Changed¶
- Wrapper-driven MPS roundtrip harness (`tests/test_mps_fallback_wrapper.py`) is now parametrized over both the legacy `LpView`-based writer and the new `Problem.write_mps` direct writer, so the HiGHS / Gurobi / CPLEX / Xpress readback tests exercise both code paths.
[2.0.2] — 2026-05-26¶
Added¶
docs/guide/scaling.md— user-facing guide for thepolar_high.autoscalepackage: when to use it, the typicaldetect_ranges → recommend_scaling → apply_scalingpattern,ScalingMode/ScalingConfigknobs, the precedence rules, the min-floor guard + geometric-centring escape branch, and migration from the retiredauto_user_bound_scale=Trueflag. Wired intomkdocs.yml's Guide section between Solvers and Warm-starting.
Changed¶
- Stripped proper-name callouts of specific caller-side LPs from source comments and CHANGELOG entries. Replaced with generic scenario descriptions ("a full-year LP with RHS=(1.84e-3, 2.02e+8)" etc.) so the technical narrative survives without leaking caller-side LP names.
[2.0.1] — 2026-05-26¶
Fixed¶
- CI `ruff check` and `ruff format --check` failures inherited from the v2.0.0 commits. Sorted / removed unused imports, ran ruff format, and migrated `ScalingMode(str, Enum)` → `ScalingMode(enum.StrEnum)` to clear the UP042 hint. Behaviour difference: `str(ScalingMode.OFF)` now returns `"off"` instead of `"ScalingMode.OFF"`; no code in `src/` or `tests/` stringifies the enum, so this is invisible at the API boundary.
Changed¶
- Cross-solver MPS-fallback tests now have sharpened skip strings that distinguish "wrapper-installed-but-CLI-binary-missing" from "solver wholly absent", and point at the new wrapper-driven test file for parallel coverage when only the Python wrapper is present.
Added¶
tests/test_mps_fallback_wrapper.py: for each commercial solver whose Python wrapper is installed (Gurobi, CPLEX, Xpress), writes the polar-high MPS file, reads it back into the wrapper, solves, and asserts the objective matches a direct in-memory HiGHS solve. Catches MPS-format issues end-to-end without needing the standalone CLI binary. COPT is intentionally out of scope here due to the in-process COPT/HiGHS native-symbol conflict documented insolvers/_copt.py.
[2.0.0] — 2026-05-26¶
Headline: much-improved automatic LP scaling via a new
polar_high.autoscale package. The previous one-shot
auto_user_bound_scale=True constructor flag is retired and
replaced by a richer caller-driven API that detects bound / cost /
RHS / matrix ranges and recommends user_bound_scale and
user_objective_scale exponents independently. The new path also
adds a min-floor guard that catches a class of false-infeasibility
results HiGHS' own suggestScaling can produce on wide-spread LPs.
See the Scaling guide
for the full caller story.
Added — autoscale¶
- `polar_high.autoscale` package with three pieces:
  - `detect_ranges(problem_or_solution, config)` returns a `RangeReport` with the four `(abs_min, abs_max)` tuples (`matrix`, `cost`, `col_bound`, `row_bound`) plus per-category samples of smallest / largest contributors, usable on a built `Problem` or on a returned `Solution` (re-uses `Solution.streamed_lp_ranges` when available).
  - `recommend_scaling(ranges, config)` returns a `Layer3Plan` with `user_bound_scale` and `user_objective_scale` integer exponents, derived from HiGHS' own `suggestScaling` formula. Preserves the geometric-centring escape branch for severe asymmetric-bound LPs, now guarded by a min-floor check (see Fixed).
  - `ScalingMode` enum (`OFF` / `SOLVER_ONLY` / `BASIC` / `FULL`) with helper predicates so library callers can decide policy per mode rather than per-call kwarg.
- Precedence check: an axis whose `user_*_scale` is already set by the caller (via `set_solver_options` or per-call `options=`) is skipped by `recommend_scaling`. The caller's explicit value always wins.
- `Problem.set_solver_option(name, value)` and `Problem.get_solver_option(name)` accessors as the clean surface the precedence check reads from.
Removed (breaking)¶
- `Problem(auto_user_bound_scale: bool = ...)` constructor option. The flag's one-shot, col-bound-only heuristic is superseded by `autoscale.recommend_scaling()`, which considers all four ranges independently and is configurable per-mode. Callers should:
  - Build the `Problem` as before.
  - Call `detect_ranges(p, config)` and then `recommend_scaling(ranges, config)` for the chosen `ScalingMode`.
  - Apply the returned `Layer3Plan` via `Problem.set_solver_option`.

  See the autoscale package docstring for the migration pattern.
- The internal `_recommend_user_bound_scale` helper that backed the retired flag.
Fixed¶
- False-infeasibility from over-aggressive scaling. Because HiGHS' own `suggestScaling` looks only at the max of `(bound_max, rhs_max)`, it can pick a `user_bound_scale` exponent that crushes the min below `kExcessivelySmallBoundValue` (1e-4). HiGHS' presolve then mis-handles the near-zero rows and the LP comes back infeasible. Observed on a full-year LP with RHS=(1.84e-3, 2.02e+8): the formula picked N=-8 → scaled RHS min 7.2e-6 → spurious infeasibility. The new `recommend_scaling` adds a min-floor guard: when the proposed delta would drag the scaled min below the threshold, the current scale is returned unchanged.
- Duplicate-key rhs Param fan-out. The left-join from `row_index` against an upstream Param with duplicate `(on=)` keys used to surface as an opaque `ValueError: operands could not be broadcast together with shapes (X,) (Y,)` deep inside the solver adapter. `_build_lp_arrays` (and the chunked / WarmProblem variants) now raise immediately at the join boundary, naming the offending constraint plus a sample of the duplicate keys.
- `--highs-threads N>1` silently ignored. HiGHS' `setOptionValue("threads", N)` is a no-op once HiGHS' global task scheduler has been initialised (which happens at default `threads=16`). We now call `Highs.resetGlobalScheduler(False)` before applying the user's options so the requested thread count actually takes effect.
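
The arithmetic behind the false-infeasibility entry can be reproduced in a few lines. This is a simplified sketch, not HiGHS' or polar-high's code: `max_only_exponent` and `guarded_exponent` are hypothetical names, only the scale-down branch is modelled, and the "current scale" is assumed to be 0. The two thresholds are the values quoted in this changelog.

```python
import math

SMALL = 1e-4  # kExcessivelySmallBoundValue, as quoted in this changelog
LARGE = 1e6   # kExcessivelyLargeBoundValue, as quoted in the 1.4.0 entry


def max_only_exponent(abs_max):
    """Exponent N so that abs_max * 2**N is at or below LARGE (0 if already there)."""
    if abs_max <= LARGE:
        return 0
    return math.floor(math.log2(LARGE / abs_max))


def guarded_exponent(abs_min, abs_max):
    """Same rule, but refuse a scale-down that drags the scaled min below SMALL."""
    n = max_only_exponent(abs_max)
    if n < 0 and abs_min * 2.0 ** n < SMALL:
        return 0  # assumption: the current scale is 0, so 'unchanged' means 0
    return n


rhs_min, rhs_max = 1.84e-3, 2.02e8      # the range quoted in the entry
n = max_only_exponent(rhs_max)          # -8
scaled_min = rhs_min * 2.0 ** n         # about 7.2e-6, below SMALL
guarded = guarded_exponent(rhs_min, rhs_max)
print(n, scaled_min, guarded)
```

With the quoted range, the max-only rule yields -8 and a scaled minimum of roughly 7.2e-6, matching the numbers in the entry, while the guarded variant leaves the scale at 0.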
Notes¶
- The 1.5.x releases (sidecar RSS sampler,
save_memory=Trueone-shot mode, chunked LP-range accumulator) are subsumed under 2.0.0; their entries remain below as the detailed history.
[1.5.1] — 2026-05-24¶
Changed¶
docs/compare/benchmark.md: trim the "how this differs from earlier versions" methodology paragraph (covered in the 1.5.0 changelog entry) and minor wording cleanup.
[1.5.0] — 2026-05-24¶
Added¶
Problem.solve(save_memory: bool = False)opt-in one-shot mode for benchmark-style single solves. WhenTrue, polar-high drops its Python-side LP source-of-truth (term lazy plans, Param frames, caller-side column-bound / cost arrays, and thecol_names/row_nameslists) once HiGHS has copied them, and then writes the model to a temp MPS file, clears the originalHighsinstance, callsmalloc_trim(0)to return glibc arenas to the OS, and creates a freshHighsthat reads the model back beforeh.run(). The disk roundtrip resets HiGHS' incremental-addRowsallocator slack — at N=3000 dense full-solve it drops peak RSS from ~38 GB to ~28 GB at the cost of ~+90 s wall time (the MPS write + read). A subsequentProblem.solve()on a Problem that has been released raises a clearRuntimeError; WarmProblem-style incremental updates and re-solves are unavailable aftersave_memory=True. Cold-start rolling-horizon loops that rebuild theProblemfrom scratch each iteration are unaffected and benefit from the per-iteration memory drop. DefaultFalsepreserves the warm-restart-capable behaviour.
Changed¶
_running_finite_nonzero_min_max(used by the streaming LP-range accumulator) now scans in chunks of 1 M float64s instead of materialisingnp.abs(arr[finite])for the whole array. On a 36 M-nonzero constraint family that cuts the transient temp allocation from ~576 MB to ~16 MB. Functionally identical output._solve_streamingno longer concatenatescol_lb_hwithcol_ub_h(orrow_lbwithrow_ubper family) for range accumulation — each is scanned in place. Eliminates a 2·n_cols (and 2·n_rows-per-family) transient copy each._solve_streamingdropscol_lb_h/col_ub_h/col_obj_himmediately after the LP-range accumulation completes — HiGHS has its own internal copies fromaddCols, so the originals are not needed through the family loop andh.run(). ~432 MB at N=3000 dense.- Column-array construction (
col_lb/col_ub/col_obj/col_int/col_names) moved fromProblem.solve()into_solve_streamingso the caller's frame doesn't pin those arrays through the entire family loop andh.run()call. Combined with the drop above, this removes ~864 MB of caller-side residue at N=3000 dense. benchmark/run_one.pynow starts a sidecar thread that samplesVmRSSfrom/proc/self/statusat 25 ms cadence whilesolve()runs, and callsmalloc_trim(0)aftergc.collect()at the post-build and post-solve checkpoints. New CSV columns:rss_after_build_trim_mb,rss_after_solve_trim_mb,rss_solve_min_mb,rss_solve_p50_mb,rss_solve_p95_mb,rss_solve_max_mb,n_samples. Thepeak_rss_mbcolumn stays as before (ru_maxrss, the unavoidable high-water mark including transient HiGHS-setup scratch). Old 10-field CSV rows still parse throughplot.py.docs/compare/benchmark.mdrewritten: new memory-measurement methodology section explainspeak_rss_mbvsrss_solve_p50_mbvsrss_after_solve_trim_mb; new section on the regular vssave_memorymodes with a side-by-side polar-high comparison; headline tables updated to usesave_memory=Truefor the cross-tool comparison (matches linopy'sio_api="lp"file-handoff pattern). Threading-benefit numbers updated — speedup at N=10 000 is now 1.18× rather than 1.33× because the MPS roundtrip is a serial step that doesn't scale with thread count.
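
The chunked range scan described for `_running_finite_nonzero_min_max` can be sketched as follows. This is an illustrative reimplementation under assumptions, not the library's function: the name `finite_nonzero_abs_min_max` and the `None` return for "nothing to report" are choices made here.

```python
import numpy as np


def finite_nonzero_abs_min_max(arr, chunk=1_000_000):
    """Min/max of |x| over finite, nonzero entries, scanning `chunk` values at a time."""
    lo, hi = np.inf, 0.0
    for start in range(0, arr.size, chunk):
        # Temporaries are bounded by the chunk size, not by the whole array.
        block = np.abs(arr[start:start + chunk])
        block = block[np.isfinite(block) & (block > 0.0)]
        if block.size:
            lo = min(lo, float(block.min()))
            hi = max(hi, float(block.max()))
    return (lo, hi) if hi > 0.0 else None


values = np.array([0.0, -3.0, np.inf, 0.5, np.nan, 2e3])
ranges = finite_nonzero_abs_min_max(values, chunk=2)   # tiny chunk to exercise the loop
empty = finite_nonzero_abs_min_max(np.zeros(5))
```

With 1 M float64s per chunk each temporary is 8 MB, so two live temporaries come to about 16 MB, which is consistent with the figure quoted above.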
[1.4.0] — 2026-05-22¶
Removed¶
Problem.peek_lp_ranges(). The method rebuilt the full LP into numpy arrays via the non-streaming path purely to extract coefficient ranges — duplicate work the streaming solve already does. The same four(abs_min, abs_max)tuples (matrix,cost,col_bound,row_bound) are now populated automatically on everysolve()and exposed asSolution.streamed_lp_ranges. Callers that needed range inspection should read from theSolutioninstead; there is no more pre-solve range-inspection API.
Added¶
Problem(auto_user_bound_scale: bool = False)constructor option. WhenTrue, the streaming solve accumulates LP coefficient ranges during the family loop (at no extra allocation cost — it walks the per-family arrays we already build) and applies auser_bound_scalerecommendation viasetOptionValuebeforeHighs.run(), but only when the caller has not already setuser_bound_scalevia the options dict /set_solver_options. The embedded heuristic_recommend_user_bound_scale(bound_range, rhs_range)is a direct port of HiGHS' ownsuggestScalinglambda atHighsSolve.cpp:570-607: it pullsmax(bound_max, rhs_max)into HiGHS'[kExcessivelySmallBoundValue, kExcessivelyLargeBoundValue]=[1e-4, 1e+6]comfort zone using outer-rounded log2, and reproduces the integer HiGHS prints in its"Consider setting the user_bound_scale option to <N>"recommendation byte-for-byte.Solution.streamed_lp_ranges: dict | Nonefield. Populated by every solve that flows through_solve_streaming(which is the default path) with the four(abs_min, abs_max) | Nonerange tuples.Noneon solves that don't go through streaming (e.g. the non-streamingsolve(streaming=False)path).
Changed¶
_solve_streamingnow performs running min/max accumulation overcol_obj_h,col_lb_h/col_ub_h, and per-familyval64/row_lb/row_ubnumpy arrays. Cost is a handful of O(n) scans with no new allocations. Used to driveauto_user_bound_scaleand exposed onSolution.streamed_lp_ranges.- When
auto_user_bound_scale=True, the decision is now reported on stdout so the run log shows what scaling (if any) was applied — one of:applying user_bound_scale=N (bound …, rhs …; HiGHS' own kExcessively[Small|Large]BoundValue formula),no scaling -- max(bound, rhs) already within HiGHS' [1e-4, 1e+6] comfort zone (bound …, rhs …),no scaling -- no finite bound or RHS entries to evaluate, orcaller override in place (user_bound_scale=N).
[1.3.0] — 2026-05-22¶
Added¶
- Generic Enum-dtype alignment on every internal join site. When two frames are joined on a column that is `pl.Enum` on both sides but with different categorical vocabularies, polar-high now up-casts the narrower side to the wider Enum (provided one's categories are a subset of the other's). Enum-vs-`pl.Utf8` mismatches are resolved by casting the string side to the Enum dtype. Two Enums with neither-subset vocabs raise a clear `ValueError` pointing the caller to cast to `pl.Utf8` or build a union Enum. The behaviour is exposed as the internal helper `polar_high.engine._align_enum_join_keys` and exercised by every internal `.join` call site (operator joins, `Where`, `Sum`, `Lag`, constraint-emission, `WarmProblem` updates).
- `tests/test_enum_dtype_align.py`: unit + end-to-end coverage of the new alignment behaviour, including the disjoint-vocab raise path and an end-to-end `Problem.add_cstr` / `solve` with a narrower-vocab rhs Param.
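
The alignment rule above is a three-way decision, sketched here without polars so it runs anywhere. `resolve_join_categories` is a hypothetical helper (it is not `_align_enum_join_keys`): each side is modelled as a list of Enum categories, or `None` for a plain-string column, and the return value is the vocabulary both sides would be cast to.

```python
def resolve_join_categories(left, right):
    """Pick the target Enum vocabulary for a join key, or None if both sides are strings."""
    if left is None and right is None:
        return None                 # string vs string: nothing to align
    if left is None:
        return list(right)          # string side is cast to the Enum side
    if right is None:
        return list(left)
    if set(left) <= set(right):
        return list(right)          # narrower side is up-cast to the wider Enum
    if set(right) <= set(left):
        return list(left)
    raise ValueError(
        "Enum vocabularies are not subsets of each other; "
        "cast to pl.Utf8 or build a union Enum"
    )


widened = resolve_join_categories(["a", "b"], ["a", "b", "c"])
from_string = resolve_join_categories(None, ["x", "y"])
try:
    resolve_join_categories(["a"], ["b"])
    raised = False
except ValueError:
    raised = True
```

The sketch compares vocabularies as sets; whether category order also matters in the real helper is not stated in the entry.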
Changed¶
- README "Enum dtype handling" subsection documenting the subset-up-cast rule and the raise-for-no-subset behaviour. No DSL surface change: existing models keep building unchanged; mixed-vocab models that previously needed per-site casts in caller code no longer do.
- `engine.py`: when a constraint's rhs is a `Param` (or a chain of `Param * Param * ...`), pre-filter the rhs lazy plan with a semi-join against `row_index`'s join keys and collect via the streaming engine before the left-join into the constraint frame. Polars' optimiser doesn't always propagate the implicit row-set restriction through a multi-way Param product, so the intermediate buffers could blow up by orders of magnitude relative to the final row count. On FlexTool's South Africa 1-week PES-Hydro-dispatch case (a `p_profile_value * p_process_existing_count * p_process_availability` product), solver-finished ΔRSS drops from +28.77 GB to +9.40 GB (-67%) and the section runtime drops from 57.7 s to 17.5 s. Objective and total cost match the baseline byte-for-byte. Applied to all three rhs-Param call sites: the non-streaming `Problem.add_cstr` path, `_solve_streaming`, and `WarmProblem.solve`. Falls back to `collect(streaming=True)` on polars < 1.x.
- README: quickstart code is now inlined (GitHub/PyPI don't render `pymdownx.snippets` includes); the cross-product index is split into reusable `unit_index` / `time_index` sets; `cap` is built per-unit then concatenated; `v_idx` renamed to `composite_index` (the `v_` prefix is reserved for variables); `_idx` → `_index` throughout.
- `Problem.add_cstr` arg order in README and quickstart fixture reordered to `lhs_terms` before `sense`, which reads more naturally as lhs sense rhs. No API change (these are keyword args).
Removed¶
- Breaking: `Problem.peek_lp_ranges()` removed. The method rebuilt the full LP into numpy arrays via the non-streaming path purely to extract coefficient ranges, duplicate work the streaming solve already does. Stream-time range accumulation now populates `Solution.streamed_lp_ranges` with the same four `(abs_min, abs_max)` tuples (`matrix`, `cost`, `col_bound`, `row_bound`) at zero extra cost on every `solve()` that goes through `_solve_streaming` (the default). Callers that previously relied on `peek_lp_ranges()` for diagnostics should read `sol.streamed_lp_ranges` after `solve()` returns; the module helper `polar_high.engine._recommend_user_bound_scale` consumes the `(lo, hi)` of the `col_bound` entry for the geo-midpoint heuristic. The `top_k > 0` per-coefficient name-lookup variant of `peek_lp_ranges` has no streaming-time replacement; if needed, build the LP via the non-streaming path (`solve(streaming=False)`) and inspect via the solver-specific HiGHS diagnostics.
[1.2.0] — 2026-05-12¶
Added¶
polar_high.solversmodule: multi-solver dispatch behind a singlesolve(problem, solver_name=..., io_api=..., env=..., **options)entry point. HiGHS remains the default; Gurobi, CPLEX, FICO Xpress, and COPT are supported on a bring-your-own-license basis (we ship no binaries and no licenses).polar_high.solvers.available_solvers: runtime registry of installed solver Python wrappers, populated at import time. Tells you which wrappers are installed; license checks fire inside the adapter.IOMode.MPSfile-based fallback for users with a solver's CLI binary onPATHbut no matching Python wrapper. Writes a temp MPS viahighspy, invokes the CLI, parses the resulting.solfile. Coversgurobi_cl,cplex, Xpressoptimizer, andcopt_cmd.polar_high.solvers._lp_view.LpView: frozen, solver-agnostic extraction surface that every adapter consumes. Engine-private attribute access (Problem._build_lp_arraysetc.) is confined to this single module.- Optional install extras:
polar-high[gurobi],polar-high[cplex],polar-high[xpress],polar-high[copt]. Each pulls only the vendor's Python wrapper (plusscipywhere vectorized loads need it). docs/guide/solvers.md: user-facing guide covering detection, per-solver install, theio_api='mps'escape hatch, theenv=passthrough (Gurobi WLS example), and license troubleshooting.
Changed¶
- `Problem.solve(streaming=False)` now routes through `polar_high.solvers._highs.run`. Behaviour and return type unchanged; `streaming=True` retains the existing HiGHS-only per-family `addRows` path.
- COPT adapter auto-routes through the `copt_cmd` CLI fallback whenever `highspy` is already loaded in the interpreter. COPT 8.x's native core conflicts with HiGHS in-process (`Highs.run()` segfaults once `coptpy` is imported); the auto-route keeps both solvers usable from the same `polar-high` venv at the cost of a per-solve MPS write + subprocess invocation. Requires `copt_cmd` on PATH (not shipped by the `coptpy` pip wheel); a clean `SolverNotAvailableError` is raised when it is missing. Details in `docs/guide/solvers.md`.
[1.1.4] — 2026-05-11¶
Added¶
Problem.peek_lp_ranges(): build the LP into numpy arrays and return the abs-value ranges of finite non-zero entries on each axis (matrix, cost, bounds, rhs) — same numbers HiGHS prints in its "Coefficient ranges" diagnostic, but available beforepassModel()runs. Optionaltop_kreturns the worst offenders per axis as(abs_value, col_name, row_name_or_side)triples. Lets callers pickuser_bound_scale/user_cost_scaleor refuse to solve a catastrophically scaled LP without paying for a full solve. Usesnp.argpartitionso the cost isO(n_nonzeros)..github/dependabot.yml: weekly dependency PRs for GitHub Actions and Python (pip) ecosystems. The initial commit (c3836f5) was the GitHub-provided template with an emptypackage-ecosystem; this release fills it in so the bot actually opens PRs.
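
The `top_k` note above relies on `np.argpartition` to find the largest magnitudes without a full sort. The sketch below shows that selection step only; `top_k_abs` is a hypothetical helper, not part of `peek_lp_ranges`, and it returns indices rather than the `(abs_value, col_name, row_name_or_side)` triples the entry describes.

```python
import numpy as np


def top_k_abs(values, k):
    """Indices of the k largest |values|, largest first."""
    mags = np.abs(values)
    k = min(k, mags.size)
    if k <= 0:
        return np.empty(0, dtype=np.intp)
    # Partition so the k largest magnitudes occupy the tail, in no particular order.
    tail = np.argpartition(mags, mags.size - k)[mags.size - k:]
    # Only the k selected entries are sorted, so the total cost stays linear in n for small k.
    return tail[np.argsort(mags[tail])[::-1]]


coefs = np.array([0.5, -200.0, 3.0, 1e-6, 40.0])
worst = top_k_abs(coefs, 2)
print(worst.tolist())  # [1, 4]
```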
Changed¶
- `engine.py`: factor the non-streaming LP-build out of `solve()` into a private `_build_lp_arrays()` helper. `solve()` and `peek_lp_ranges()` now share the same arrays, so diagnostics are byte-for-byte what HiGHS sees.
- `engine.py`: for constraint families with > 50 000 rows, collect term plans one at a time instead of `pl.collect_all`. Peak memory drops from `O(n_terms × frame)` to `O(frame)`, preventing stalls under memory pressure on large network models.
- `engine.py`: HiGHS no longer suppressed via `h.silent()`; solver progress and the "Coefficient ranges" line now print to stdout by default. Pass `options={"output_flag": False}` to silence.
[1.1.3] — 2026-05-07¶
Changed¶
- docs/guide/debugging.md: expanded with worked examples; doc snippets are now wired to test fixtures (tests/fixtures/debug_example.py, tests/fixtures/lagrangian_example.py, tests/fixtures/quickstart_example.py) so they're exercised by the test suite and can't silently rot.
- mkdocs.yml: drop dedent_sections from the snippets pymdownx config; it is incompatible with the multi-fixture snippet layout.
[1.1.2] — 2026-05-05¶
Added¶
- docs/guide/loading-data.md: new guide page on going from CSV / parquet / database tables to Param and Var, including the long-format vs. wide-format trade-off and how column names become dimension names.
Changed¶
- docs.yml: drop the dev alias deploy on main pushes; only tagged releases publish a versioned doc site.
[1.1.1] — 2026-05-05¶
Fixed¶
- pyproject.toml: add Python 3.13 classifier. CI's test matrix already covers 3.13; the classifier was missing so the pyversions badge was reading "3.11 | 3.12" only.
- release.yml: skip-existing: true on the PyPI publish step. Re-tagging the same version now no-ops on PyPI's duplicate-file rejection instead of showing the run as failed.
[1.1.0] — 2026-05-05¶
Changed¶
- BREAKING: renamed package polar-high-opt → polar-high (Python module polar_high_opt → polar_high). All imports, PyPI install name, repo and docs URLs move with it.
- BREAKING: Problem.solve() defaults changed: streaming=True (per-family addRows instead of one big passModel; lower peak memory; numerically identical) and keep_solver=False (the live highspy.Highs is dropped after primal/dual extraction; pass keep_solver=True to retain it for post-solve inspection like sol.highs.writeModel(...)).
- BREAKING: polar_high sets POLARS_MAX_THREADS=1 at import. Rayon coordination overhead exceeds the parallel speedup on typical LP-build workloads (see benchmark page). Override by setting the env var before import polar_high.
- COO row/column indices use int32 when nnz < 2^31, falling back to int64 only when needed. Cuts working-set memory in the matrix-assembly phase.
- _Term.frame cache is no longer populated during Problem.solve(); the lazy plan is collected into a local that goes out of scope per family. Re-solves rebuild from the lazy plan as before.
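The index-width rule for COO arrays can be sketched as below. This is an illustration only: the helper name coo_index_dtype is hypothetical, and the threshold simply restates the nnz < 2^31 condition from the entry above.

```python
import numpy as np

INT32_LIMIT = 2**31  # threshold quoted in the changelog entry


def coo_index_dtype(nnz):
    """Pick the narrowest index dtype for a COO matrix with `nnz` non-zeros."""
    return np.int32 if nnz < INT32_LIMIT else np.int64


# Two index arrays (row, col) per matrix: int32 halves their footprint vs int64.
nnz = 1_000
rows = np.zeros(nnz, dtype=coo_index_dtype(nnz))
cols = np.zeros(nnz, dtype=coo_index_dtype(nnz))
index_bytes = rows.nbytes + cols.nbytes  # 8 bytes per non-zero instead of 16
```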
Added¶
- Benchmark suite under benchmark/: dense N×N LP (replicates linopy's benchmark) and a sparse network-flow LP with irregular edge→node topology. Reproducible via subprocess-isolated cells in benchmark/run.py; figures rendered by benchmark/plot.py.
- New docs/compare/benchmark.md with five figures and the story for each (build-only headline, threads scaling at fixed N, threading benefit on the network LP, network LP, linopy-format replication).
- Threading section in docs/guide/performance.md documenting the default-1 choice and the override pattern.
- Tiny dispatch LP (wind + coal × 3 hours) replaces the abstract i / j placeholder in README and docs/quickstart.md.
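The thread-count override mentioned in this release comes down to ordering: the environment variable has to be set before the package is imported. A minimal sketch follows; the value "4" is an arbitrary example, and the import is wrapped in a try/except only so the snippet also runs where polar_high is not installed.

```python
import os

# Must happen before the import below; "4" is an arbitrary example value.
os.environ["POLARS_MAX_THREADS"] = "4"

try:
    import polar_high  # noqa: F401  (imported after the override is in place)
except ImportError:
    polar_high = None  # package not installed in this environment
```

Setting the variable after the import has no effect on the already-initialised thread pool, which is why the order matters.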
[1.0.1] — 2026-05-05¶
Added¶
- GitHub Actions: tests on push/PR (Python 3.11–3.13), docs deploy on main + tag (mike), PyPI release on tag (trusted publishing).
- Ruff lint + format configured in pyproject.toml; [lint] optional-dependency added.
- README badges: PyPI version, Python versions, license, tests CI, docs CI, ruff.
Changed¶
- Repo / docs URLs moved from jkiviluo/polar-high to nodal-tools/polar-high; documentation site is hosted at https://nodal-tools.fi/polar-high/.
- One-time ruff format reflow across the source tree.
Fixed¶
- Dead intra-doc anchor link in guide/performance.md (the vars-and-params.md "Param × Param" heading slugifies to a single hyphen, not two).
Removed¶
- Two dangling unused locals (engine.py and test_warm_problem.py).
[1.0.0] — 2026-05-05¶
First public release.
Added¶
- Var, Param, Expr: building blocks for indexed expressions expressed as polars DataFrames.
- Sum, Where, Lag: aggregation, filtering, and time-shift primitives that compile to LP rows efficiently.
- Problem: assemble an LP/MIP and solve via HiGHS (highspy).
- WarmProblem: re-solve with parameter / RHS / objective updates while preserving the basis.
- LagrangianProblem: generic dual-subgradient driver for Lagrangian decomposition of coupled subproblems.
- Solution: primal values, constraint duals, reduced costs, and a live highspy.Highs handle for advanced post-solve inspection.
- MkDocs + mike documentation site under docs/.
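For readers unfamiliar with the dual-subgradient scheme that LagrangianProblem is described as driving, here is a library-free toy sketch. It does not use or mirror the polar_high API: the function names, the two-unit dispatch data, and the 1/k step rule are all assumptions chosen for illustration. The coupled problem is to minimise total cost subject to the units jointly covering a demand; relaxing that coupling row with a multiplier lam makes each unit's subproblem independent.

```python
def solve_subproblem(cost, upper, lam):
    """Minimise (cost - lam) * x over 0 <= x <= upper; the optimum sits at a bound."""
    return upper if cost - lam < 0 else 0.0


def dual_subgradient(costs, uppers, demand, iterations=200):
    """Maximise the Lagrangian dual of: min sum(c*x) s.t. sum(x) >= demand."""
    lam = 0.0
    best_bound = float("-inf")
    for k in range(1, iterations + 1):
        xs = [solve_subproblem(c, u, lam) for c, u in zip(costs, uppers)]
        # The Lagrangian value at lam is a valid lower bound on the coupled optimum.
        value = lam * demand + sum((c - lam) * x for c, x in zip(costs, xs))
        best_bound = max(best_bound, value)
        # Subgradient = violation of the relaxed coupling constraint.
        subgradient = demand - sum(xs)
        # Diminishing step 1/k, projected onto lam >= 0.
        lam = max(0.0, lam + subgradient / k)
    return best_bound, lam


# Toy data: cheap unit (cost 1, cap 4), expensive unit (cost 3, cap 10), demand 6.
bound, lam = dual_subgradient(costs=[1.0, 3.0], uppers=[4.0, 10.0], demand=6.0)
```

For this toy instance the coupled optimum is 10 (4 units from the cheap source, 2 from the expensive one), so the returned bound can approach but never exceed that value.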