neml2.es

Python-native equation-system assembly for implicit updates.

Layered:

class neml2.es.AssembledMatrix(row_layout, col_layout, tensors=<factory>)[source]

Bases: object

2D grid of per-(row_group, col_group) tensor blocks.

Per-block storage decision:

  • both row group AND col group dense -> shape (*dyn, row_storage_with_sub_folded, col_storage_with_sub_folded); base_ndim=2, sub_batch_ndim=0.

  • either side "block" -> that side’s sub_batch is preserved as intermediate (intmd) dims. Shape (*dyn, *intmd_row, *intmd_col, row_storage, col_storage); base_ndim=2, sub_batch_ndim = len(intmd_row) + len(intmd_col).
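
The storage rule above can be sketched as plain shape arithmetic. This is a minimal illustration, not the library's code: block_shape and its arguments are hypothetical, each group is given a single sub-batch shape for simplicity, and it assumes that a dense side paired with a block side still folds its own sub-batch into base.

```python
from math import prod

def block_shape(dyn, row_sub, col_sub, row_size, col_size,
                row_structure, col_structure):
    """Return (shape, sub_batch_ndim) for one (row_group, col_group) block.

    row_size / col_size are per-site storage sizes (hypothetical inputs).
    """
    intmd = ()
    if row_structure == "block":
        intmd += tuple(row_sub)        # keep the row sub-batch as intmd dims
    else:
        row_size *= prod(row_sub)      # fold the row sub-batch into base
    if col_structure == "block":
        intmd += tuple(col_sub)        # keep the col sub-batch as intmd dims
    else:
        col_size *= prod(col_sub)      # fold the col sub-batch into base
    return (*dyn, *intmd, row_size, col_size), len(intmd)

dense_dense = block_shape((5,), (4,), (3,), 6, 2, "dense", "dense")
block_block = block_shape((5,), (4,), (3,), 6, 2, "block", "block")
```

In both cases the trailing two axes are the base axes, so base_ndim stays 2; only the number of intmd dims changes.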

matmul:

  • mm(aik, bkj) contracts the trailing two base axes.

  • if col_layout.structure[k] == "block" (the inner group is block), a sub_batch reduction (sum over the intmd dims contributed by group k) runs after the per-block mm. This is the v2-parity “block intmd_sum after mm” pattern.
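
A small worked case may help with the "sum after mm" pattern. The sketch below uses nested lists rather than Tensors, drops the dynamic batch, and takes the outer groups i and j as dense and the inner group k as block with a single sub-batch axis of size S. All names are illustrative. It checks that the per-site product followed by a sum over S matches the fully folded dense product.

```python
def matmul(a, b):
    """Plain (m, n) @ (n, p) product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def add(a, b):
    """Elementwise sum of two equally shaped nested-list matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def block_mm(a_sites, b_sites):
    """Per-site mm, then a sum over the inner group's sub-batch axis."""
    per_site = [matmul(a, b) for a, b in zip(a_sites, b_sites)]
    total = per_site[0]
    for p in per_site[1:]:
        total = add(total, p)
    return total

# S = 2 sub-batch sites for the inner group k; m = 2, n = 2, p = 1.
a_ik = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # shape (S, m, n)
b_kj = [[[1], [0]], [[0], [2]]]               # shape (S, n, p)
result = block_mm(a_ik, b_kj)

# Dense equivalent: fold S into the contracted axis.
a_dense = [a_ik[0][r] + a_ik[1][r] for r in range(2)]   # shape (m, S*n)
b_dense = b_kj[0] + b_kj[1]                             # shape (S*n, p)
```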

Parameters:
col_layout: AxisLayout
disassemble()[source]

Unpack per-block tensors into a SparseMatrix.

Each cell cells[row_var][col_var] is a typed dynamic-base Tensor that carries the block’s batch_ndim / sub_batch_ndim verbatim. For "dense" blocks the sub_batch axes have been folded into base, so the slices are flat in row + col. Boundary callers (e.g. the pyzag interface) read the underlying dict-of-dict via .cells and unwrap to raw .data at their own framework boundary.

Return type:

SparseMatrix

group(i, j)[source]
Parameters:
Return type:

AssembledMatrix

row_layout: AxisLayout
static select_blocks(row_layout, col_layout, blocks)[source]

Build an AssembledMatrix from per-(row_var, col_var) typed blocks.

Inverse of disassemble(). Each blocks[row_name][col_name] is a typed dynamic-base Tensor with base_ndim=2 and base_shape == (row_storage, col_storage) sized to match _var_storage() for the given layout structure. Missing (row_var, col_var) pairs are zero-filled.

Only the dense-x-dense case is supported (no intmd sub_batch dims on either side) – that’s the only path the pyzag interface and the round-trip test currently exercise. Block-structure layouts would need the corresponding sub_batch axes to live in each block’s storage and the cat axes shifted accordingly; the method raises for such layouts so callers don’t silently get the wrong shape.

Parameters:
Return type:

AssembledMatrix

tensors: list[list[Tensor]]
class neml2.es.AssembledVector(layout, tensors)[source]

Bases: object

Per-group dense vector blocks.

Each group’s tensor follows the SubBatchStructure of the owning AxisLayout:

  • "block" -> (*dyn, *sub_batch, group_storage_size), base_ndim=1, sub_batch_ndim=len(sub_batch_shape).

  • "dense" -> (*dyn, group_storage_size_with_sub_folded), base_ndim=1, sub_batch_ndim=0.

Arithmetic and dot products forward to Tensor ops.

Parameters:
disassemble()[source]

Unpack per-group tensors back into a SparseVector.

Inverse of from_dict() – the per-group split point is the per-variable storage size; for "dense" groups the sub_batch is unfolded back to the declared shape before re-typing.

Returns a SparseVector (the typed dual of this object). The underlying name -> typed-wrapper dict is on .values for callers that need the raw mapping.

Return type:

SparseVector

classmethod from_dict(layout, values)[source]

Pack typed-wrapper values (one per variable) into per-group tensors.

For a "block" group, the per-variable wrappers are flattened to base then concatenated along base (last) axis – sub_batch axes stay as intermediate dims on the resulting Tensor.

For a "dense" group, each wrapper’s sub_batch is folded INTO the variable’s flat base storage before concat: per-variable contribution becomes (*dyn, sub_total * base_size), then all contributions concat to (*dyn, sum_var (sub_total * base_size)).
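
The two packing rules can be compared through the resulting group shapes. This is a sketch under stated assumptions: group_shape is a hypothetical helper, each variable is described by a (base_size, sub_batch_shape) pair, and a block group is assumed to have already passed the shared-shape check.

```python
from math import prod

def group_shape(dyn, variables, structure):
    """Shape of one packed group tensor.

    variables: list of (base_size, sub_batch_shape) pairs, in group order.
    """
    if structure == "block":
        sub = variables[0][1]   # shared by every variable in a block group
        return (*dyn, *sub, sum(base for base, _ in variables))
    # "dense": fold each variable's sub-batch into its flat base storage
    return (*dyn, sum(prod(sub) * base for base, sub in variables))

variables = [(6, (4,)), (1, (4,))]
as_block = group_shape((10,), variables, "block")
as_dense = group_shape((10,), variables, "dense")
```

The same variables give (10, 4, 7) as a block group and (10, 28) as a dense group: the element count is identical, only the axis on which the sub-batch lives differs.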

Per CLAUDE.md rule 1: values is strictly typed – raw torch.Tensor is rejected. External boundaries wrap with the appropriate TensorWrapper subclass at the construction site.

Parameters:
Return type:

AssembledVector

group(index)[source]
Parameters:

index (int)

Return type:

AssembledVector

layout: AxisLayout
tensors: list[Tensor]
class neml2.es.AxisLayout(groups, specs, sub_batch_shapes=None, structure=None)[source]

Bases: object

Ordered variable groups, their tensor types, per-variable sub-batch shapes, and per-group SubBatchStructure.

Parameters:
block_size()[source]

Per-(dynamic-batch, sub-batch-site) storage size.

Return type:

int

group_size(index)[source]
Parameters:

index (int)

Return type:

int

group_sub_batch_shape(index)[source]

Common sub-batch shape across every variable in group index.

For a "block" group every variable must share the same sub_batch_shape (otherwise the group cannot be stored as a single rectangular tensor); the method raises on disagreement. For a "dense" group the per-variable shapes may differ, since they are folded into base on assembly, and this method returns the first variable’s shape for shape inference only.
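
The rule reads naturally as a short function. The version below is a standalone sketch, not the method itself: it takes the per-variable shapes directly, and the exception type is an assumption since the documentation only says that the method raises.

```python
def group_sub_batch_shape(shapes, structure):
    """shapes: per-variable sub_batch_shape tuples, in group order."""
    first = tuple(shapes[0])
    if structure == "block" and any(tuple(s) != first for s in shapes[1:]):
        # A block group must be one rectangular tensor.
        raise ValueError(f"mismatched sub_batch_shapes in block group: {shapes}")
    # Dense groups may disagree; the first shape is used for inference only.
    return first

shared = group_sub_batch_shape([(4,), (4,)], "block")
mixed = group_sub_batch_shape([(4,), ()], "dense")
```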

Parameters:

index (int)

Return type:

Size

groups: tuple[tuple[str, ...], ...]
property ngroup: int
property nvar: int
specs: dict[str, type[TensorWrapper]]
storage_size()[source]
Return type:

int

structure: tuple[Literal['block', 'dense'], ...]
sub_batch_shape(name)[source]

Per-variable sub-batch shape (empty when the var is sub-batch-trivial).

Parameters:

name (str)

Return type:

Size

sub_batch_shapes: dict[str, Size]
sub_layout(index)[source]

Single-group sub-layout containing only self.groups[index].

Parameters:

index (int)

Return type:

AxisLayout

type_of(name)[source]
Parameters:

name (str)

Return type:

type[TensorWrapper]

var_size(name)[source]
Parameters:

name (str)

Return type:

int

vars()[source]
Return type:

tuple[str, ...]

with_sub_batch_shapes(sub_batch_shapes)[source]

Return a new layout with updated sub-batch shapes (frozen replacement).
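
"Frozen replacement" usually means the object is immutable and an update returns a copy. The sketch below shows that pattern with a hypothetical Layout stand-in. It assumes a frozen dataclass and wholesale replacement of the mapping; whether AxisLayout is implemented this way, or merges with the existing shapes, is not stated here.

```python
from dataclasses import dataclass, field, replace

@dataclass(frozen=True)
class Layout:  # hypothetical stand-in, not neml2.es.AxisLayout
    groups: tuple
    sub_batch_shapes: dict = field(default_factory=dict)

    def with_sub_batch_shapes(self, sub_batch_shapes):
        # Build a new instance; the original is left untouched.
        return replace(self, sub_batch_shapes=dict(sub_batch_shapes))

old = Layout(groups=(("strain", "stress"),))
new = old.with_sub_batch_shapes({"stress": (4,)})
```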

Parameters:

sub_batch_shapes (dict[str, Size])

Return type:

AxisLayout

class neml2.es.IFT(system, linear_solver, selected_pairs=None)[source]

Bases: _SystemModule

Exportable IFT Jacobian $du/dg = -A^{-1} B$ for a converged ImplicitUpdate.

Contract: (*u_groups, *g_groups) -> *blocks where each block is one per-variable-pair (unknown, given) entry of -du_dg, emitted in unknown_names (outer) × given_names (inner) order – matching the jacobian_pairs metadata in _compile_implicit_segment().

The equation system assembles the dense A / B (via the model’s per-variable chain rule), applies the IFT solve, then disassemble()s the resulting AssembledMatrix into per-(unknown, given) blocks. Each block keeps its natural per-structure shape ("dense" -> (*B, u_storage, g_storage); a BLOCK side stays block-diagonal-compact, no N² fold). The C++ runtime then composes these blocks against dg_dmaster exactly like a forward segment’s per-pair Jacobian blocks – one uniform per-pair path for forward and implicit.
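
A scalar example makes the formula concrete. The sketch assumes that A is the derivative of the residual with respect to the unknown and B the derivative with respect to the given variable, which this page does not define explicitly. It uses a made-up residual r(u, g) = u³ + u − g and compares the IFT value with a central finite difference.

```python
def residual(u, g):
    return u**3 + u - g

def solve(g, u=0.0, tol=1e-12, max_iter=50):
    """Newton solve of residual(u, g) = 0 for u."""
    for _ in range(max_iter):
        r = residual(u, g)
        if abs(r) < tol:
            break
        u -= r / (3 * u**2 + 1)
    return u

g = 2.0
u = solve(g)              # converged unknown
A = 3 * u**2 + 1          # dr/du at the converged state (assumed meaning of A)
B = -1.0                  # dr/dg (assumed meaning of B)
du_dg = -B / A            # IFT: du/dg = -A^{-1} B

h = 1e-6
fd = (solve(g + h) - solve(g - h)) / (2 * h)   # finite-difference check
```

The point of the IFT route is that du_dg comes from one linear solve at the converged state, without differentiating through the Newton iterations.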

Parameters:
emitted_pairs()[source]

The (unknown, given) pairs this graph emits, in emission order.

Return type:

list[tuple[str, str]]

forward(*args)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters:

args (Tensor)

Return type:

tuple[Tensor, ...]

class neml2.es.LinearSystem[source]

Bases: object

Base class for systems with assembled operators.

A()[source]
Return type:

AssembledMatrix

A_and_B()[source]
Return type:

tuple[AssembledMatrix, AssembledMatrix]

A_and_B_and_b()[source]
Return type:

tuple[AssembledMatrix, AssembledMatrix, AssembledVector]

A_and_b()[source]
Return type:

tuple[AssembledMatrix, AssembledVector]

SECTION = 'EquationSystems'

HIT section for neml2-syntax classification – inherited by every registered subclass (ModelNonlinearSystem lives under [EquationSystems] in the input file).

assemble(need_A, need_B, need_b)[source]
Parameters:
Return type:

tuple[AssembledMatrix | None, AssembledMatrix | None, AssembledVector | None]

b()[source]
Return type:

AssembledVector

property blayout: AxisLayout
g()[source]
Return type:

AssembledVector

property glayout: AxisLayout
set_g(g)[source]
Parameters:

g (AssembledVector | SparseVector)

Return type:

None

set_u(u)[source]
Parameters:

u (AssembledVector | SparseVector)

Return type:

None

setup_blayout(sub_batch_shapes=None)[source]
Parameters:

sub_batch_shapes (Mapping[str, Size] | None)

Return type:

AxisLayout

setup_glayout(sub_batch_shapes=None)[source]
Parameters:

sub_batch_shapes (Mapping[str, Size] | None)

Return type:

AxisLayout

setup_ulayout(sub_batch_shapes=None)[source]
Parameters:

sub_batch_shapes (Mapping[str, Size] | None)

Return type:

AxisLayout

u()[source]
Return type:

AssembledVector

property ulayout: AxisLayout
class neml2.es.ModelNonlinearSystem(model, unknowns, residuals=None, structure=None)[source]

Bases: NonlinearSystem

A nonlinear system defined by a Model.

Parameters:
assemble(need_A, need_B, need_b)[source]
Parameters:
Return type:

tuple[AssembledMatrix | None, AssembledMatrix | None, AssembledVector | None]

classmethod from_hit(node, factory)[source]
Parameters:
  • node (nmhit.Node)

  • factory (_NativeInputFile)

Return type:

ModelNonlinearSystem

g()[source]
Return type:

AssembledVector

hit = <neml2.schema.HitSchema object>
initialize(*, u, g, dyn_shape=())[source]

Set the state and per-variable layout from typed SparseVector inputs.

u and g carry their own AxisLayout, which pins each variable’s sub_batch_shape. The system trusts these layouts directly – no separate sub_batch_ndim dict is needed because the wrappers + layout already encode the same information consistently.

Each SparseVector’s values may hold typed wrappers (preferred per the wrapper-discipline rule) or raw torch.Tensor, which is auto-wrapped with the input_spec’s type as a convenience for construction sites that haven’t migrated yet. The wrapping happens on entry, not on exit, so no metadata is lost.

Call sites that have raw dicts + a sub_batch_ndim count dict should funnel through to_sparse() to construct the typed SparseVector pair.

Parameters:
Return type:

None

set_g(g)[source]
Parameters:

g (AssembledVector | SparseVector)

Return type:

None

set_u(u)[source]
Parameters:

u (AssembledVector | SparseVector)

Return type:

None

set_u_from_group_raws(u_raws)[source]

Commit per-group raw unknown tensors (the solver boundary).

Inverse of _vector_to_per_group_raws(self.u()): each per-group raw is re-typed via wrap_group_raw() using the unknown layout’s structure, then committed through set_u(). Used by the C++-backed Newton solver to write the converged iterate back into the system state.

Parameters:

u_raws (list[Tensor])

Return type:

None

setup_blayout(sub_batch_shapes=None)[source]
Parameters:

sub_batch_shapes (Mapping[str, Size] | None)

Return type:

AxisLayout

setup_glayout(sub_batch_shapes=None)[source]
Parameters:

sub_batch_shapes (Mapping[str, Size] | None)

Return type:

AxisLayout

setup_ulayout(sub_batch_shapes=None)[source]
Parameters:

sub_batch_shapes (Mapping[str, Size] | None)

Return type:

AxisLayout

to(*args, **kwargs)[source]

Move the underlying Model and any populated state to a new device / dtype.

Matches torch’s nn.Module.to signature and convention: forwards *args / **kwargs to self.model.to(...) (covering to(device='cuda'), to(dtype=torch.float32), to('cuda', non_blocking=True) etc.) and additionally walks self._state – the per-variable typed wrappers populated by initialize() – moving each one through TensorWrapper.to(). Returns self so call chains like system = neml2.load_nonlinear_system(...).to('cuda') work the same way they do for nn.Module.

ModelNonlinearSystem is intentionally not an nn.Module (it composes one rather than being one), so torch’s Module.to semantics don’t reach the system automatically – this method is the bridge.

Return type:

ModelNonlinearSystem

to_sparse(u, g, sub_batch_ndim=None)[source]

Convert typed (u_dict, g_dict, sub_batch_ndim) into a (u_sv, g_sv) SparseVector pair ready to pass to initialize().

Use this helper at internal call sites that already build per-variable typed-wrapper dicts + a per-variable sub_batch_ndim count dict (the classic ImplicitUpdate / pyzag adapter shape). Per-variable sub_batch_shapes are derived from each wrapper’s trailing batch dims so the resulting SparseVector layouts encode the same sub-batch structure that the wrappers carry.

Per CLAUDE.md rule 1, raw torch.Tensor inputs are rejected – wrap at the boundary first. For user-facing surfaces with no sub-batch (the typical test / notebook shape), construct SparseVector(system.ulayout, {name: typed_wrapper, ...}) directly instead – no helper needed.

Parameters:
Return type:

tuple[SparseVector, SparseVector]

u()[source]
Return type:

AssembledVector

class neml2.es.NewtonStep(system, linear_solver)[source]

Bases: _SystemModule

Exportable Newton step-direction graph.

Contract: (*u_groups, *g_groups) -> (*du_groups, *b_groups) where du_groups are the per-unknown-group step directions (in ulayout.groups order) and b_groups are the per-residual-group b = -r(u) at the current iterate (in blayout.groups order). For line-search trials, the C++ runtime applies u_groups[i] = u_groups[i] + alpha * du_groups[i] per group and evaluates each trial with a cheap RHS evaluation.
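
The division of labour between the step graph and the residual graph can be sketched on a scalar problem. Everything here is illustrative: the residual is made up, A is assumed to be the derivative of the residual with respect to the unknown, and the halving rule is a generic backtracking scheme, not necessarily the one the C++ runtime uses.

```python
def rhs(u, g):
    """b = -r(u) for the scalar residual r(u) = u**3 + u - g."""
    return -(u**3 + u - g)

def newton_step(u, g):
    """Return (du, b), where du solves A du = b with A = dr/du."""
    b = rhs(u, g)
    A = 3 * u**2 + 1
    return b / A, b

def solve(u, g, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        du, b = newton_step(u, g)
        if abs(b) < tol:
            break
        alpha = 1.0
        # Backtracking: each trial costs one RHS evaluation, no new solve.
        while abs(rhs(u + alpha * du, g)) >= abs(b) and alpha > 1e-4:
            alpha *= 0.5
        u = u + alpha * du
    return u

root = solve(0.0, 2.0)
```

Because b = -r(u), the update is u + alpha * du with a plus sign.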

Parameters:

system (ModelNonlinearSystem)

forward(*args)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters:

args (Tensor)

Return type:

tuple[Tensor, ...]

class neml2.es.NonlinearSystem[source]

Bases: LinearSystem

Nonlinear system with C++-matching Newton sign convention.

class neml2.es.RHS(system)[source]

Bases: _SystemModule

Exportable residual graph.

Contract: (*u_groups, *g_groups) -> (*b_groups) – per-group raw tensors. b_group = -r_group for each residual group; the C++ runtime computes a per-batch convergence norm by reducing each group tensor over its trailing sub_batch + base axes and summing across groups (no per-variable narrow on the hot path).
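
The per-batch reduction can be illustrated with nested lists standing in for group tensors. The sketch assumes a single leading batch axis and a Euclidean norm (sum of squares, then square root); the exact norm used by the runtime is not spelled out in this contract, and per_batch_norm is a hypothetical name.

```python
import math

def flatten(x):
    """Yield every scalar in a nested list."""
    if isinstance(x, list):
        for item in x:
            yield from flatten(item)
    else:
        yield x

def per_batch_norm(groups):
    """groups: nested lists, each with the same leading batch axis."""
    nbatch = len(groups[0])
    sq = [0.0] * nbatch
    for group in groups:
        for i in range(nbatch):
            # Reduce over all trailing (sub_batch + base) axes of this group.
            sq[i] += sum(v * v for v in flatten(group[i]))
    return [math.sqrt(s) for s in sq]

b_block = [[[3.0, 0.0], [0.0, 0.0]],
           [[1.0, 1.0], [1.0, 1.0]]]   # shape (batch=2, sub=2, base=2)
b_dense = [[4.0], [0.0]]               # shape (batch=2, base=1)
norms = per_batch_norm([b_block, b_dense])
```

Each group is reduced as a whole, so no per-variable slicing is needed.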

Parameters:

system (ModelNonlinearSystem)

forward(*args)[source]

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters:

args (Tensor)

Return type:

tuple[Tensor, ...]

class neml2.es.SparseMatrix(row_layout, col_layout, cells)[source]

Bases: object

Per-(row_var, col_var) cell map; the typed dual of AssembledMatrix.

cells[row_var][col_var] is the assembled per-cell Tensor block that AssembledMatrix.tensors[i][j] holds – K has already been folded into trailing base via _tangent_block_to_trailing_k(). Construction validates that the outer keys cover every row variable in row_layout; missing inner (row_var, col_var) entries are allowed and become zero blocks at assembly time (per-block sparsity is normal in chain-rule derivatives).
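
The zero-fill behaviour can be shown with a plain dict-of-dict. The sketch below uses nested lists for the cells and a hypothetical sizes mapping for per-variable storage. It covers only the flat dense case with a single group per axis and is not the library's assembly routine.

```python
def assemble(row_vars, col_vars, sizes, cells):
    """Pack cells[row][col] blocks into one dense nested-list matrix.

    sizes maps a variable name to its flat storage size (hypothetical input).
    """
    rows = []
    for r in row_vars:
        for i in range(sizes[r]):
            line = []
            for c in col_vars:
                block = cells[r].get(c)   # outer key required, inner optional
                if block is None:
                    line.extend([0.0] * sizes[c])   # missing pair -> zero block
                else:
                    line.extend(block[i])
            rows.append(line)
    return rows

sizes = {"a": 2, "b": 1}
cells = {
    "a": {"a": [[1.0, 2.0], [3.0, 4.0]]},    # no ("a", "b") entry
    "b": {"a": [[5.0, 6.0]], "b": [[7.0]]},
}
dense = assemble(["a", "b"], ["a", "b"], sizes, cells)
```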

Parameters:
assemble()[source]

Walk row x col groups and pack into an AssembledMatrix.

Delegates to AssembledMatrix.select_blocks(), which handles per-block sparsity (missing entries -> zero blocks).

Return type:

AssembledMatrix

cells: Mapping[str, Mapping[str, Tensor]]
col_layout: AxisLayout
row_layout: AxisLayout
to(*args, **kwargs)[source]

Move every cell to a new device / dtype; returns a new SparseMatrix.

Return type:

SparseMatrix

class neml2.es.SparseVector(layout, values)[source]

Bases: object

Per-variable typed-wrapper vector; the typed dual of AssembledVector.

values maps each variable name listed in layout.vars() to its typed value (a TensorWrapper subclass instance). Construction validates that every layout variable is covered.

Per CLAUDE.md rule 1: values is strictly typed – raw torch.Tensor is rejected. External boundaries that have raw tensors (the pyzag adapter, AOTI tracer fixtures, user code in notebooks / tests) wrap with the appropriate TensorWrapper subclass at the construction site, not by handing raw tensors to an internal neml2 helper.

Parameters:
assemble()[source]

Stack values into per-group tensors via the layout’s grouping.

Return type:

AssembledVector

items()[source]
Return type:

ItemsView[str, TensorWrapper]

keys()[source]
layout: AxisLayout
to(*args, **kwargs)[source]

Move every value to a new device / dtype; returns a new SparseVector.

Return type:

SparseVector

values: Mapping[str, TensorWrapper]
neml2.es.norm(v)[source]

Batched Euclidean norm over all assembled vector groups.

Parameters:

v (AssembledVector)

Return type:

Tensor

neml2.es.norm_sq(v)[source]

Batched squared Euclidean norm over all assembled vector groups.

Parameters:

v (AssembledVector)

Return type:

Tensor