neml2.types

Typed Python tensor wrappers.

Each class is a @dataclass(frozen=True, eq=False) over a single data: torch.Tensor field, registered as a pytree node so torch.export flattens cleanly at the export boundary while the authoring surface stays strongly typed inside nn.Module.forward().

Operator overloads (+, -, *, @, -x, abs(x), x ** n) live on the wrapper classes along with constructors. Shape-manipulation ops are exposed through region-view properties (t.batch, t.dynamic_batch, t.sub_batch, t.base) so the intent is unambiguous — e.g. t.sub_batch.unsqueeze(-1) or t.base.transpose(-2, -1). Everything else (invariants, decompositions, transcendentals, math-bearing type conversions like euler_rodrigues(Rot) -> R2) lives in neml2.types.functions as free functions, matching how the C++ side exposes them.

class neml2.types.MillerIndex(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]

Bases: PrimitiveTensor

Wraps a torch.Tensor of shape (..., 3) carrying Miller indices.

Inherits all arithmetic and zeros/ones/full/empty/fill factories from PrimitiveTensor. No class-specific overrides — the C++ analogue has the same minimal surface.

Parameters:
BASE_NDIM: ClassVar[int] = 1
BASE_SHAPE: ClassVar[tuple[int, ...]] = (3,)
data: torch.Tensor
k_ndim: int = 0
k_pairing: tuple = ()
k_state: tuple = ()
sub_batch_meta: tuple = ()
sub_batch_ndim: int = 0
sub_batch_state: tuple = ()
class neml2.types.PrimitiveTensor(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]

Bases: TensorWrapper

Fixed-base-shape typed tensor with inherited arithmetic and factories.

Parameters:
classmethod empty(*batch, dtype=None, device=None)[source]

Wrapper of given shape with undefined data.

Parameters:
Return type:

Self

classmethod fill(*components, dtype=None, device=None)[source]

Build a wrapper from prod(BASE_SHAPE) scalar components.

Components are taken in row-major order and reshaped to cls.BASE_SHAPE. Subclasses with non-trivial packing semantics (e.g. SR2 with Mandel √2 shear scaling, or short-form 1/3-component overloads) override this.

Parameters:
Return type:

Self

classmethod full(*batch, fill_value, dtype=None, device=None)[source]

Wrapper of given shape filled with fill_value.

Parameters:
Return type:

Self

classmethod ones(*batch, dtype=None, device=None)[source]

Ones-filled wrapper of dynamic shape batch and base cls.BASE_SHAPE.

Parameters:
Return type:

Self

classmethod zeros(*batch, dtype=None, device=None)[source]

Zero-filled wrapper of dynamic shape batch and base cls.BASE_SHAPE.

Parameters:
Return type:

Self

classmethod zeros_like(template, *, sub_batch_shape=None)[source]

Zero-filled wrapper inheriting template’s K + dynamic batch layout.

sub_batch_shape (defaulting to template.sub_batch_shape) overrides the sub-batch region, useful when the caller needs a zero tail of a different cell-axis length than template to splice into a typed cat / arithmetic. The K metadata (k_ndim / k_state / k_pairing) is carried from template so the result aligns rank-by-rank for downstream chain-rule binary ops (zeros are direction-agnostic in K, so inheriting template’s state is the only choice that keeps the leading K axes positionally consistent with the operand the result will combine with).

Parameters:
Return type:

Self

class neml2.types.R2(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]

Bases: PrimitiveTensor

Wraps a torch.Tensor of shape (..., 3, 3).

Parameters:
BASE_NDIM: ClassVar[int] = 2
BASE_SHAPE: ClassVar[tuple[int, ...]] = (3, 3)
data: torch.Tensor
classmethod identity(*, dtype=None, device=None)[source]
Parameters:
Return type:

R2

k_ndim: int = 0
k_pairing: tuple = ()
k_state: tuple = ()
sub_batch_meta: tuple = ()
sub_batch_ndim: int = 0
sub_batch_state: tuple = ()
class neml2.types.Rot(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]

Bases: PrimitiveTensor

Wraps a torch.Tensor of shape (..., 3) in MRP packing.

Parameters:
BASE_NDIM: ClassVar[int] = 1
BASE_SHAPE: ClassVar[tuple[int, ...]] = (3,)
data: torch.Tensor
classmethod identity(*, dtype=None, device=None)[source]

The identity rotation — the zero MRP vector.

Parameters:
Return type:

Rot

k_ndim: int = 0
k_pairing: tuple = ()
k_state: tuple = ()
sub_batch_meta: tuple = ()
sub_batch_ndim: int = 0
sub_batch_state: tuple = ()
class neml2.types.SR2(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]

Bases: PrimitiveTensor

Wraps a torch.Tensor of shape (..., 6) in Mandel packing.

Parameters:
BASE_NDIM: ClassVar[int] = 1
BASE_SHAPE: ClassVar[tuple[int, ...]] = (6,)
data: torch.Tensor
classmethod fill(*components, dtype=None, device=None)[source]

Build an SR2 from 1, 3, or 6 tensor components (mirrors C++ SR2::fill).

  • 1 value a -> diag(a, a, a).

  • 3 values -> the three diagonal entries, zero shear.

  • 6 values s11 s22 s33 s23 s13 s12 -> the full symmetric tensor; the three shear entries are scaled by sqrt(2) into Mandel storage.

Overrides the generic PrimitiveTensor.fill() to handle the short forms and the Mandel √2 scaling. The 6-component form is not a raw tensor([...]).reshape((6,)) — the shear scaling matters.
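
The three input forms above can be sketched on plain floats. This is a minimal illustration, not the library implementation: the helper name mandel_fill is hypothetical, and it returns a Python list instead of an SR2.

```python
import math


def mandel_fill(*components):
    """Hypothetical stand-in for SR2.fill on plain floats (not the neml2 API)."""
    if len(components) == 1:
        a = components[0]
        return [a, a, a, 0.0, 0.0, 0.0]          # diag(a, a, a), zero shear
    if len(components) == 3:
        return [*components, 0.0, 0.0, 0.0]      # three diagonal entries, zero shear
    if len(components) == 6:
        s11, s22, s33, s23, s13, s12 = components
        root2 = math.sqrt(2.0)
        # Shear entries are scaled by sqrt(2) when stored in Mandel form.
        return [s11, s22, s33, root2 * s23, root2 * s13, root2 * s12]
    raise ValueError("expected 1, 3, or 6 components")


full = mandel_fill(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
```

The last three entries of full differ from the raw inputs by the factor sqrt(2), which is why a plain reshape of the six values would not give the same storage.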

Parameters:
Return type:

SR2

classmethod identity(*, dtype=None, device=None)[source]
Parameters:
Return type:

SR2

k_ndim: int = 0
k_pairing: tuple = ()
k_state: tuple = ()
sub_batch_meta: tuple = ()
sub_batch_ndim: int = 0
sub_batch_state: tuple = ()
class neml2.types.SSR4(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]

Bases: PrimitiveTensor

Wraps a torch.Tensor of shape (..., 6, 6) in Mandel packing.

Parameters:
BASE_NDIM: ClassVar[int] = 2
BASE_SHAPE: ClassVar[tuple[int, ...]] = (6, 6)
data: torch.Tensor
classmethod identity(*, dtype=None, device=None)[source]

Full identity δ_{ij}δ_{kl} in Mandel packing.

In the Mandel basis this is a 3x3 block of 1.0 in the top-left (volumetric) corner with all off-diagonal/deviatoric entries zero, i.e. it acts as A -> tr(A) * I rather than A -> A. Distinct from identity_sym() (which is the (6,6) eye).

Parameters:
Return type:

SSR4

classmethod identity_C1(*, dtype=None, device=None)[source]

Cubic-symmetry projector $I_C1$: top-left 3x3 identity block of the Mandel (6, 6) matrix; selects the cubic on-diagonal normal-stress coefficient (matches C++ SSR4::identity_C1).

Parameters:
Return type:

SSR4

classmethod identity_C2(*, dtype=None, device=None)[source]

Cubic-symmetry projector $I_C2$: the three normal-stress off-diagonal ones in the top-left 3x3 Mandel block; selects the cubic off-diagonal normal-stress coefficient (matches C++ SSR4::identity_C2).

Parameters:
Return type:

SSR4

classmethod identity_C3(*, dtype=None, device=None)[source]

Cubic-symmetry projector $I_C3$: bottom-right 3x3 identity block of the Mandel (6, 6) matrix; selects the cubic shear coefficient (matches C++ SSR4::identity_C3).
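
Reading the three cubic projector descriptions literally, they can be written out as plain 6x6 nested lists. The sketch below is an illustration under that reading (the builder names are hypothetical and no tensors are involved); it also checks two relations that follow from the descriptions in this section: I_C1 + I_C3 is the (6,6) eye, and I_C1 + I_C2 is the top-left block of ones described for identity().

```python
def zeros66():
    return [[0.0] * 6 for _ in range(6)]


def identity_C1():
    m = zeros66()
    for i in range(3):
        m[i][i] = 1.0                 # top-left 3x3 identity block
    return m


def identity_C2():
    m = zeros66()
    for i in range(3):
        for j in range(3):
            if i != j:
                m[i][j] = 1.0         # off-diagonal ones in the top-left 3x3 block
    return m


def identity_C3():
    m = zeros66()
    for i in range(3, 6):
        m[i][i] = 1.0                 # bottom-right 3x3 identity block
    return m


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


eye6 = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
ones_block = [[1.0 if (i < 3 and j < 3) else 0.0 for j in range(6)] for i in range(6)]
```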

Parameters:
Return type:

SSR4

classmethod identity_dev(*, dtype=None, device=None)[source]

Deviatoric projection dev(A) = I_dev : A — equals I_sym - I_vol.

Parameters:
Return type:

SSR4

classmethod identity_sym(*, dtype=None, device=None)[source]

Symmetric identity I_sym : A = A for any SR2 $A$ — the (6,6) eye.

Parameters:
Return type:

SSR4

classmethod identity_vol(*, dtype=None, device=None)[source]

Volumetric projection vol(A) = I_vol : A.
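
The relation I_dev = I_sym - I_vol can be checked on a Mandel-packed 6-vector. The sketch below assumes vol(A) = tr(A)/3 * I, so that I_vol is 1/3 in the top-left 3x3 block; that definition is an assumption, not something stated in this section, and the code uses plain lists rather than SSR4.

```python
def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


# Symmetric identity: the (6, 6) eye.
I_sym = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
# Assumed volumetric projector: 1/3 in the top-left 3x3 block.
I_vol = [[1.0 / 3.0 if (i < 3 and j < 3) else 0.0 for j in range(6)] for i in range(6)]
# Deviatoric projector as stated above: I_sym - I_vol.
I_dev = [[s - v for s, v in zip(rs, rv)] for rs, rv in zip(I_sym, I_vol)]

a = [1.0, 2.0, 3.0, 0.5, -0.25, 0.75]   # Mandel components of some symmetric tensor
vol_a = matvec(I_vol, a)                 # tr(a)/3 on the three normal entries
dev_a = matvec(I_dev, a)                 # trace-free part, shear entries unchanged
```

Applying I_dev twice gives the same result as applying it once, which is the defining property of a projector.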

Parameters:
Return type:

SSR4

k_ndim: int = 0
k_pairing: tuple = ()
k_state: tuple = ()
sub_batch_meta: tuple = ()
sub_batch_ndim: int = 0
sub_batch_state: tuple = ()
class neml2.types.Scalar(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=(), *, dtype=None, device=None)[source]

Bases: PrimitiveTensor

Wraps a torch.Tensor of base shape () (i.e., one number per batch entry).

Parameters:
BASE_NDIM: ClassVar[int] = 0
BASE_SHAPE: ClassVar[tuple[int, ...]] = ()
classmethod arange(start, end=None, step=1, *, dtype=None, device=None)[source]

Like torch.arange: arange(N) -> [0, …, N-1], arange(a, b, s) -> [a, a+s, …] up to (excluding) b.

Parameters:
Return type:

Scalar

data: Tensor
classmethod from_value(x, *, like)[source]

Construct a Scalar inheriting dtype/device from an existing wrapper.

Parameters:
Return type:

Scalar

classmethod full(*shape, fill_value, dtype=None, device=None)[source]

Wrapper of given shape filled with fill_value.

Parameters:
Return type:

Scalar

k_ndim: int = 0
k_pairing: tuple = ()
k_state: tuple = ()
classmethod linspace(start, end, steps, *, dtype=None, device=None)[source]

steps values uniformly spaced from start to end inclusive.

Parameters:
Return type:

Scalar

classmethod ones(*shape, dtype=None, device=None)[source]

Ones-filled wrapper of dynamic shape batch and base cls.BASE_SHAPE.

Parameters:
Return type:

Scalar

sub_batch_meta: tuple = ()
sub_batch_ndim: int = 0
sub_batch_state: tuple = ()
classmethod zeros(*shape, dtype=None, device=None)[source]

Zero-filled wrapper of dynamic shape batch and base cls.BASE_SHAPE.

Parameters:
Return type:

Scalar

class neml2.types.Tensor(data, batch_ndim=0, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]

Bases: object

Dynamic-base-shape tensor with a (batch, sub_batch, base) runtime split.

Parameters:
as_typed(cls)[source]

View this Tensor as a TensorWrapper subclass.

Requires the trailing dims of data to match cls.BASE_SHAPE exactly. sub_batch_ndim and sub_batch_labels are preserved; any leading batch_ndim becomes the wrapper’s dynamic batch.

Parameters:

cls (type[TensorWrapper])

Return type:

TensorWrapper

property base: _BaseView

View into the trailing base region. Unlike TensorWrapper, the base region is mutable on a dynamic-base Tensor: expand / unsqueeze / squeeze / cat all work.

property base_ndim: int
property base_shape: Size
property batch: _BatchView

View into the leading (dynamic) batch region. Ops preserve sub_batch_ndim. Supports unsqueeze / squeeze / expand / broadcast_to / cat.

batch_ndim: int = 0
property batch_shape: Size
data: Tensor
property device: device
property dtype: dtype
flatten_base()[source]

Collapse all base axes into one trailing axis; result has base_ndim=1.

Return type:

Tensor

flatten_sub_batch()[source]

Absorb sub-batch axes into a NEW leading base axis.

Result has sub_batch_ndim=0 and base_ndim increased by 1 (the new leading base axis equals prod(sub_batch_shape)). Use to materialise a block-diagonal-storage Tensor as one flat matrix when the downstream consumer demands base-only storage.

Return type:

Tensor

flatten_sub_batch_into_first_base_axis()[source]

Fold sub_batch axes INTO (not beside) the first base axis.

Result has sub_batch_ndim=0 and base_ndim unchanged; the first base axis grows from base_shape[0] to prod(sub_batch) * base_shape[0]. Distinct from flatten_sub_batch() which inserts the collapsed sub_batch as a NEW leading base axis.

This is the inverse of the assembly’s “BLOCK-compact preserves per-grain structure” decision: when downstream code needs the fully-unfolded form (e.g. a global-row matmul against per-grain col blocks), call this. Requires base_ndim >= 1.
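
The difference between the two flattening methods is easiest to see on shapes alone. The sketch below only tracks (batch, sub_batch, base) shape tuples; the function names are hypothetical and no data is moved.

```python
from math import prod


def flatten_sub_batch_shape(batch, sub_batch, base):
    """Collapsed sub-batch becomes a NEW leading base axis (base_ndim + 1)."""
    return tuple(batch), (), (prod(sub_batch), *base)


def fold_into_first_base_axis_shape(batch, sub_batch, base):
    """Sub-batch is folded INTO the first base axis (base_ndim unchanged)."""
    if len(base) < 1:
        raise ValueError("requires base_ndim >= 1")
    return tuple(batch), (), (prod(sub_batch) * base[0], *base[1:])


beside = flatten_sub_batch_shape((8,), (4,), (3, 3))
into = fold_into_first_base_axis_shape((8,), (4,), (3, 3))
```

With batch (8,), sub-batch (4,) and base (3, 3), the first variant yields base (4, 3, 3) while the second yields base (12, 3).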

Return type:

Tensor

fold_preserving(preserved_idx, canonical_order, other_idx, fold_size)[source]

Permute preserved sub_batch axes to leading position (in canonical_order), then fold (remaining_sub + base) into a single trailing axis.

Result has batch_ndim unchanged, sub_batch_ndim = len(canonical_order), sub_batch_labels = canonical_order, base_ndim=1, base[0] = fold_size.

Used by from_dict()’s preserved-label storage path to keep declared per-axis sub_batch dims as real sub_batch on the assembled slab while folding the rest.

Parameters:
Return type:

Tensor

classmethod from_typed(wrapper, *, batch_ndim=None)[source]

Construct a Tensor view of an existing TensorWrapper.

The wrapper’s sub_batch_ndim, BASE_NDIM, and sub_batch_labels are carried over verbatim; batch_ndim defaults to data.ndim - sub_batch_ndim - BASE_NDIM (i.e. everything not claimed by sub-batch or base). Pass an explicit batch_ndim to override – useful when the caller knows the dyn rank but the wrapper’s leading axes encode a different layout (e.g. the JVP K-dim).

Parameters:
Return type:

Tensor

k_ndim: int = 0
k_pairing: tuple = ()
k_state: tuple = ()
property ndim: int
new_zeros(*, batch_shape=None, sub_batch_shape=None, base_shape=None)[source]

Allocate a same-dtype, same-device zero Tensor with the given region shapes.

Any region whose shape is omitted defaults to self’s shape for that region – handy for building a zero block that mirrors an existing block’s dyn / sub-batch layout.

Parameters:
Return type:

Tensor

property shape: Size
solve(other)[source]

Solve self @ x = other for x via torch.linalg.solve.

self.base_ndim == 2 (square matrix); other.base_ndim in (1, 2). (batch, sub_batch) axes are leading-broadcast through torch’s batched solve – so a block-diagonal system (*B, L, n, n) \ (*B, L, n) solves L independent dense LUs per batch element without any explicit dispatch. That is the whole point of the dynamic-base design.

Sub-batch dispatch mirrors __matmul__():

  • Same number of BLOCK axes on both sides: standard batched solve.

  • RHS has BLOCK axes the LHS lacks AND folding them into RHS’s first base axis would align A’s last base with the resulting column dim: do the unfold (the case where A was a dense per-grain coupling already folded by assembly and b is per-grain BLOCK).

  • DENSE sub_batch on either operand at this point is a bug: assembly should have folded it.

Parameters:

other (Tensor)

Return type:

Tensor

property sub_batch: _SubBatchView

View into the per-site sub-batch region. Ops adjust sub_batch_ndim.

sub_batch_meta: tuple = ()
sub_batch_ndim: int = 0
property sub_batch_shape: Size
sub_batch_state: tuple = ()
to(*args, **kwargs)[source]
Return type:

Tensor

unflatten_base(*shape)[source]

Reshape the base region to shape. Total base storage must match.

Parameters:

shape (int)

Return type:

Tensor

classmethod zeros(*, batch_shape=(), sub_batch_shape=(), base_shape=(), dtype=None, device=None)[source]

Zero-filled Tensor with explicit region shapes.

sub_batch_labels optionally attaches per-axis labels (length must match len(sub_batch_shape)). Used by the BLOCK-aware _build_group_block to build zero blocks at the cell’s canonical sub_batch shape with the cell’s preserved labels.

Parameters:
Return type:

Tensor

classmethod zeros_like(other)[source]
Parameters:

other (Tensor)

Return type:

Tensor

class neml2.types.TensorWrapper(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]

Bases: object

Shared shape decomposition + region views + _rewrap machinery for the typed tensor wrappers.

Subclasses (@dataclass(frozen=True, eq=False)) declare the seven instance fields listed in the module docstring; the trailing BASE_NDIM dimensions of data have shape BASE_SHAPE (in Mandel packing where applicable).

Parameters:
BASE_NDIM: ClassVar[int]
BASE_SHAPE: ClassVar[tuple[int, ...]]
property base: BaseView[Self]
property base_shape: Size
property batch: BatchView[Self]
property batch_shape: Size

All non-K, non-base dims, i.e. dynamic + sub-batch.

data: Tensor
property device: device
property dtype: dtype
property dynamic_batch: DynamicBatchView[Self]
property dynamic_batch_shape: Size
property k: KView[Self]

Read-only view over the K region (leading k_ndim axes).

k_ndim: int
k_pairing: tuple[int | None, ...]
property k_shape: Size

The K-region storage shape (raw data.shape[:k_ndim]).

"broadcast" axes here are size 1 in storage; the logical extent of a paired-broadcast K axis equals the paired sub axis’s extent (recovered via sub_batch_shape and k_pairing).

k_state: tuple[Literal['full', 'broadcast'], ...]
materialize()[source]

Force every "broadcast" sub-batch axis into "full" storage.

Cheap when sub_batch_state is already empty or all-"full" (no-op return). Does NOT touch the K region – K materialisation is the job of fullify().

Parameters:

self (Self)

Return type:

Self

property ndim: int

Total tensor rank (len(shape)) – equals k_ndim + dynamic_batch_ndim + sub_batch_ndim + BASE_NDIM.
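
As a rough picture of this rank decomposition, the sketch below splits a raw shape into its four regions. It assumes the axis order K, dynamic batch, sub-batch, base (K leading and base trailing, as described in this section); split_regions is a hypothetical helper, not part of the package.

```python
def split_regions(shape, k_ndim, sub_batch_ndim, base_ndim):
    """Split a raw shape into (K, dynamic batch, sub-batch, base) regions."""
    dyn_ndim = len(shape) - k_ndim - sub_batch_ndim - base_ndim
    if dyn_ndim < 0:
        raise ValueError("shape has too few axes for the declared regions")
    k_end = k_ndim
    dyn_end = k_end + dyn_ndim
    sub_end = dyn_end + sub_batch_ndim
    return {
        "k": tuple(shape[:k_end]),
        "dynamic_batch": tuple(shape[k_end:dyn_end]),
        "sub_batch": tuple(shape[dyn_end:sub_end]),
        "base": tuple(shape[sub_end:]),
    }


# An SR2-like layout: one K axis, one dynamic axis, one sub-batch axis, base (6,).
regions = split_regions((2, 100, 12, 6), k_ndim=1, sub_batch_ndim=1, base_ndim=1)
```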

property shape: Size
property sub_batch: SubBatchView[Self]
sub_batch_meta: tuple[int, ...]
sub_batch_ndim: int
property sub_batch_shape: Size
sub_batch_state: tuple[Literal['full', 'broadcast'], ...]
to(*args, **kwargs)[source]
Return type:

Self

with_sub_batch_ndim(sub_batch_ndim)[source]

Return a structurally-identical wrapper with the declared sub_batch_ndim overridden.

Same storage, same K layout. Per-axis sub_batch_state / sub_batch_meta are reset to empty since the axis count is changing – callers that need a labelled state must re-attach it.

Use this to normalize a wrapper that crossed a layer boundary with the wrong sub axis count (e.g. a predictor output that lost its declared per-site axes) instead of unwrapping to raw data and re-constructing.

Parameters:

sub_batch_ndim (int)

Return type:

Self

class neml2.types.Vec(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]

Bases: PrimitiveTensor

Wraps a torch.Tensor of shape (..., 3).

Parameters:
BASE_NDIM: ClassVar[int] = 1
BASE_SHAPE: ClassVar[tuple[int, ...]] = (3,)
data: torch.Tensor
k_ndim: int = 0
k_pairing: tuple = ()
k_state: tuple = ()
sub_batch_meta: tuple = ()
sub_batch_ndim: int = 0
sub_batch_state: tuple = ()
class neml2.types.WR2(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]

Bases: PrimitiveTensor

Wraps a torch.Tensor of shape (..., 3) storing the axial vector.

Parameters:
BASE_NDIM: ClassVar[int] = 1
BASE_SHAPE: ClassVar[tuple[int, ...]] = (3,)
data: torch.Tensor
classmethod identity(*, dtype=None, device=None)[source]

The zero skew tensor — the additive identity (no canonical ‘unit’ skew).

Parameters:
Return type:

WR2

k_ndim: int = 0
k_pairing: tuple = ()
k_state: tuple = ()
sub_batch_meta: tuple = ()
sub_batch_ndim: int = 0
sub_batch_state: tuple = ()
neml2.types.abs(a)[source]

Element-wise absolute value.

Parameters:

a (_TW)

Return type:

_TW

neml2.types.align_sub_batch(*wrappers)[source]

Pad each wrapper’s sub_batch_ndim up to the common max.

Direct Python analogue of C++ utils::align_intmd_dim. For each input wrapper whose sub_batch_ndim < smax, returns a view whose data has (smax - sub_batch_ndim) size-1 axes inserted at the start of its sub-batch region.

Returns (aligned, smax).
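
The padding rule can be illustrated on shapes only. In this sketch each operand is a hypothetical (shape, sub_batch_ndim, base_ndim) triple standing in for a wrapper, and the function returns the padded shapes together with smax.

```python
def align_sub_batch_shapes(*operands):
    """Pad each shape with size-1 axes at the start of its sub-batch region."""
    smax = max(sub_ndim for _, sub_ndim, _ in operands)
    aligned = []
    for shape, sub_ndim, base_ndim in operands:
        start = len(shape) - base_ndim - sub_ndim    # first sub-batch axis
        pad = (1,) * (smax - sub_ndim)
        aligned.append(tuple(shape[:start]) + pad + tuple(shape[start:]))
    return tuple(aligned), smax


aligned, smax = align_sub_batch_shapes(
    ((5, 4, 3), 1, 1),       # one sub-batch axis of length 4, base (3,)
    ((5, 2, 4, 3), 2, 1),    # two sub-batch axes (2, 4), base (3,)
)
```

After alignment both shapes have two sub-batch axes, so the first operand broadcasts against the second along the new size-1 axis.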

Parameters:

wrappers (TensorWrapper)

Return type:

tuple[tuple[TensorWrapper, …], int]

neml2.types.bilinear_interpolation(arg1, arg2, abscissa1, abscissa2, ordinate)[source]

Bilinear interpolation of ordinate over a 2-D rectilinear grid.

Mirrors the C++ BilinearInterpolation::set_value evaluator with the fixed _dim=0 convention used by every in-tree test fixture. Accepts any ordinate typed wrapper (Scalar, Vec, SR2) shaped (..., N1, N2, *base) with sub_batch_ndim=2; the abscissae carry sub_batch_ndim=1 and the arguments carry sub_batch_ndim=0. The returned wrapper has sub_batch_ndim=0 and shape (..., *base).

Parameters:
Return type:

_TW

neml2.types.bilinear_interpolation_slopes(arg1, arg2, abscissa1, abscissa2, ordinate)[source]

Return (dP/dx1, dP/dx2) at the query point, one wrapper per axis.

The bilinear cell evaluates as $P = Y00 + (Y10-Y00) xi + (Y01-Y00) eta + (Y11-Y10-Y01+Y00) xi eta$ where $xi = (x1-X10)/(X11-X10)$, $eta = (x2-X20)/(X21-X20)$. Differentiating once in each argument gives the two slope wrappers returned here, sized like the ordinate’s base.
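
The cell formula and its two slopes can be written out for a single cell on plain floats. This is a sketch of the stated formula only; the function name is hypothetical and the cell lookup (finding which grid cell contains the query point) is left out.

```python
def bilinear_cell(x1, x2, X10, X11, X20, X21, Y00, Y10, Y01, Y11):
    """Evaluate P and (dP/dx1, dP/dx2) inside one rectilinear cell."""
    xi = (x1 - X10) / (X11 - X10)
    eta = (x2 - X20) / (X21 - X20)
    c = Y11 - Y10 - Y01 + Y00                      # coefficient of the xi*eta term
    P = Y00 + (Y10 - Y00) * xi + (Y01 - Y00) * eta + c * xi * eta
    dP_dx1 = ((Y10 - Y00) + c * eta) / (X11 - X10)  # chain rule through xi
    dP_dx2 = ((Y01 - Y00) + c * xi) / (X21 - X20)   # chain rule through eta
    return P, dP_dx1, dP_dx2


cell = dict(X10=0.0, X11=2.0, X20=1.0, X21=3.0, Y00=1.0, Y10=3.0, Y01=2.0, Y11=6.0)
P, dP_dx1, dP_dx2 = bilinear_cell(0.5, 2.0, **cell)
```

At the cell corners the formula reproduces the four ordinates, which is a quick way to check the index convention (first digit follows x1, second follows x2).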

Parameters:
Return type:

tuple[_TW, _TW]

neml2.types.cat(views, dim=-1)[source]

Concatenate wrappers along a region-relative axis.

Each element of views must be the same region kind over wrappers that share batch_ndim / sub_batch_ndim. The cat-axis size is the only thing allowed to vary. Works uniformly on dynamic-base Tensor views (.batch / .sub_batch / .base) and on static-base TensorWrapper region views with the same axis convention (the static-base .base is fixed and therefore not cat-able, but .batch / .dynamic_batch / .sub_batch work).

Example

>>> a = Tensor.zeros(batch_shape=(2,), base_shape=(3,))
>>> b = Tensor.zeros(batch_shape=(2,), base_shape=(4,))
>>> cat([a.base, b.base]).data.shape
torch.Size([2, 7])
Parameters:

dim (int)

neml2.types.clamp(a, lo=None, hi=None)[source]

Element-wise clamp, matching neml2::clamp.

Either bound may be None to leave that side unbounded. Bounds are plain scalars (the C++ clamp overload used by leaves takes scalar endpoints); preserves wrapper type and sub_batch_ndim.

Parameters:
Return type:

_TW

neml2.types.compose(r1, r2)[source]

Compose two MRP rotations: r1 * r2 (apply r2 first, then r1).

Matches operator*(const Rot&, const Rot&) in src/neml2/tensors/Rot.cxx. The result is again an MRP; the formula handles the standard MRP composition with denominator $1 + ||r1||^2 ||r2||^2 - 2 r1·r2$.
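
A plain-float sketch of the composition follows. Only the denominator is taken from the text above; the numerator is the standard MRP composition law written for the "apply r2 first, then r1" convention, and is an assumption about what the implementation computes. The function name compose_mrp is hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def compose_mrp(r1, r2):
    """MRP of the rotation 'r2 first, then r1' (assumed numerator, stated denominator)."""
    n1, n2 = dot(r1, r1), dot(r2, r2)
    c = cross(r1, r2)
    denom = 1.0 + n1 * n2 - 2.0 * dot(r1, r2)
    return [((1.0 - n2) * r1[i] + (1.0 - n1) * r2[i] + 2.0 * c[i]) / denom
            for i in range(3)]


t = math.tan(math.pi / 8.0)       # MRP magnitude tan(theta / 4) for theta = 90 degrees
rx = [t, 0.0, 0.0]                # 90 degrees about x
ry = [0.0, t, 0.0]                # 90 degrees about y
r = compose_mrp(rx, ry)           # 120 degrees about (1, 1, 1)
```

The result has all three components equal to 1/3, i.e. magnitude tan(30 degrees) along (1, 1, 1), which is the MRP of the 120 degree rotation obtained by applying the y rotation and then the x rotation.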

Parameters:
Return type:

Rot

neml2.types.cosh(s)[source]

Hyperbolic cosine. Matches neml2::cosh(const Scalar&).

Parameters:

s (Scalar)

Return type:

Scalar

neml2.types.det(A)[source]

Determinant of a (…, 3, 3) second-order tensor wrapper.

Accepts R2 (full 3x3) or SR2 (Mandel-packed). Returns a Scalar over the wrapper’s batch + sub-batch axes. Mirrors neml2::det.

Parameters:

A (TensorWrapper)

Return type:

Scalar

neml2.types.dev(A)[source]

Deviatoric part of a symmetric rank-2 tensor: A - vol(A).

Matches neml2::dev(const SR2&) in src/neml2/tensors/functions/dev.cxx.

Parameters:

A (SR2)

Return type:

SR2

neml2.types.dexp_map(w)[source]

Derivative ∂(exp_map(w))/∂w, returned as a 3x3 R2.

Mirrors WR2::dexp_map in src/neml2/tensors/WR2.cxx.

Parameters:

w (WR2)

Return type:

R2

neml2.types.diff(view, n=1, dim=0)[source]

n-th order finite difference along an axis of a region view.

The selected axis length shrinks by n; the wrapper type and sub_batch_ndim are preserved (torch.diff does not collapse the axis, so this is not a reduction). view must be t.dynamic_batch or t.sub_batch.

Matches neml2::intmd_diff in src/neml2/tensors/functions/diff.cxx.

Parameters:
  • view (DynamicBatchView[_TW] | SubBatchView[_TW])

  • n (int)

  • dim (int)

Return type:

_TW

neml2.types.drotate(r1, r2)[source]

d(r2 * r1) / d(r2) where the composition is r2 * r1.

Mirrors Rot::drotate in src/neml2/tensors/Rot.cxx. Returns R2.

Parameters:
Return type:

R2

neml2.types.drotate_self(r1, r2)[source]

d(r2 * r1) / d(r1) where the composition is r2 * r1.

Mirrors Rot::drotate_self in src/neml2/tensors/Rot.cxx. The naming follows the C++ convention: $r1.rotate(r2) == r2 * r1$ (apply r1 first then r2), and r1.drotate_self(r2) is the derivative of that composed rotation w.r.t. the receiver r1. Returns R2.

Parameters:
Return type:

R2

neml2.types.euler_rodrigues(r)[source]

Convert an MRP rotation to its 3x3 rotation matrix.

Mirrors Rot::euler_rodrigues in src/neml2/tensors/Rot.cxx:

R = (1+rr)^-2 * ( (1+rr)^2 * I + 4(1-rr) W + 8 W^2 )

where $W$ is the skew-symmetric matrix of $r$ (via R2::skew) and $rr = ||r||^2$.

Implementation: closed-form per-element, no W @ W matmul. Since W = [[0,-w2,w1],[w2,0,-w0],[-w1,w0,0]] is skew, W^2 is symmetric with only 6 independent components – computed below from raw w0/w1/w2 to skip the aten.bmm lowering, which triggers a PyTorch Inductor codegen bug under dynamic-batch export (int_array_0 referenced in the generated wrapper without ever being declared – see benchmark/scpdecoup for the failing pattern).
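
The formula above can be evaluated per element on plain floats, using the identity W^2 = r rᵀ - rr I so that no matrix product is needed. This is an illustrative sketch on a single 3-vector, not the batched implementation.

```python
import math


def euler_rodrigues(r):
    """Rotation matrix of an MRP vector via the closed-form expression above."""
    w0, w1, w2 = r
    rr = w0 * w0 + w1 * w1 + w2 * w2
    s2 = (1.0 + rr) ** 2
    W = [[0.0, -w2, w1], [w2, 0.0, -w0], [-w1, w0, 0.0]]
    # For a skew matrix built from r: W^2 = r r^T - rr * I (symmetric).
    W2 = [[r[i] * r[j] - (rr if i == j else 0.0) for j in range(3)] for i in range(3)]
    return [[((s2 if i == j else 0.0) + 4.0 * (1.0 - rr) * W[i][j] + 8.0 * W2[i][j]) / s2
             for j in range(3)] for i in range(3)]


# 90 degrees about z: MRP magnitude is tan(theta / 4) = tan(pi / 8).
R = euler_rodrigues([0.0, 0.0, math.tan(math.pi / 8.0)])
expected = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
```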

Parameters:

r (Rot)

Return type:

R2

neml2.types.exp(s)[source]
Parameters:

s (Scalar)

Return type:

Scalar

neml2.types.exp_map(w)[source]

Exponential of a skew axial vector — yields an MRP rotation.

Mirrors WR2::exp_map in src/neml2/tensors/WR2.cxx. Uses a Taylor series near $||w||^2 ≈ 0$ to avoid the singularity at the origin; the other singularity at $||w||^2 = 2π$ is unavoidable and shared with the C++ implementation.

Parameters:

w (WR2)

Return type:

Rot

neml2.types.gt(a, b)[source]
Parameters:
Return type:

_TW

neml2.types.heaviside(a)[source]

Element-wise Heaviside step function $H(a) = (sign(a) + 1) / 2$.

Matches neml2::heaviside in src/neml2/tensors/functions/heaviside.cxx (which uses the same (sign(a) + 1) / 2 form), preserving wrapper type and sub_batch_ndim.

Parameters:

a (_TW)

Return type:

_TW

neml2.types.inner(A, B)[source]

Frobenius inner product A : B over the wrappers’ base axes.

Mirrors neml2::inner(const Tensor&, const Tensor&) in src/neml2/tensors/functions/inner.cxx: contract every base component of A against the matching base component of B, leaving a Scalar over the shared batch + sub-batch axes.

For SR2 the Mandel packing’s per-component weights are already absorbed into the storage (off-diagonal entries carry the sqrt(2) factor), so the contraction collapses to a plain dot product over the 6-component vector and naturally agrees with the full-tensor Frobenius product. For R2 (or any non-symmetric wrapper) the contraction sums over all base axes directly.
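
The claim that the Mandel dot product agrees with the full Frobenius product is easy to check numerically. The sketch below uses plain nested lists and a hypothetical to_mandel helper with the component order s11 s22 s33 s23 s13 s12 used by SR2.fill.

```python
import math


def to_mandel(m):
    """Pack a symmetric 3x3 matrix into Mandel form (shear scaled by sqrt(2))."""
    root2 = math.sqrt(2.0)
    return [m[0][0], m[1][1], m[2][2],
            root2 * m[1][2], root2 * m[0][2], root2 * m[0][1]]


def frobenius(a, b):
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))


A = [[1.0, 2.0, 3.0], [2.0, 4.0, 5.0], [3.0, 5.0, 6.0]]
B = [[7.0, -1.0, 0.5], [-1.0, 2.0, 3.0], [0.5, 3.0, -4.0]]

mandel_inner = sum(x * y for x, y in zip(to_mandel(A), to_mandel(B)))
full_inner = frobenius(A, B)
```

Each off-diagonal pair appears twice in the full sum and once in the Mandel sum, where the two sqrt(2) factors multiply to 2, so both values agree.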

Parameters:
Return type:

Scalar

neml2.types.inv(A)[source]

Matrix inverse of a (…, 3, 3) second-order tensor wrapper.

For R2 returns the full inverse as an R2; for SR2 returns the inverse repacked into Mandel form (the inverse of a symmetric tensor is symmetric). Mirrors neml2::inv.

Parameters:

A (_TW)

Return type:

_TW

neml2.types.jvp_compose(r1, r2, *, dr1=None, dr2=None)[source]

Pushforward of compose() (compose(r1, r2)) along its operands.

$d(compose(r1, r2)) = (∂/∂r1)·dr1 + (∂/∂r2)·dr2$ with the operand derivatives given by the established drotate() / drotate_self() convention ($∂compose(r1, r2)/∂r1 = drotate(r2, r1)$, $∂compose(r1, r2)/∂r2 = drotate_self(r2, r1)$ — both 3×3 R2 linear maps from a 3-vector tangent to a 3-vector tangent). Pass only the operands that vary; a None tangent means that operand is held fixed.

Parameters:
Return type:

Rot

neml2.types.jvp_euler_rodrigues(r, dr)[source]

Pushforward of euler_rodrigues() (Rot→R2) along the tangent dr.

Closed-form via the body-frame angular rate. Rot is the Modified Rodrigues Parameter (MRP) form (Rot.cxx), for which the MRP-rate / body-rate kinematic relation (Schaub & Junkins, Analytical Mechanics of Space Systems) inverts to:

ω_b = (4 / s²) · [ (1 − r·r) v − 2 (r × v) + 2 (r·v) r ]
D R[r]{v} = R(r) · [ω_b]_×

with $s = 1 + r·r$ and v = dr. The action collapses to one 3×3 skew matrix multiply — no (..., 3, 3, 3) derivative kernel is ever formed. (The simpler $ω_b = (2/s)(v − r × v)$ you may have seen applies to the classical Rodrigues form $R = I + (2/s)([r]_× + [r]_ײ)$, not the MRP form NEML2 uses.)
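
The body-rate formula can be sanity-checked against a central finite difference of the Euler-Rodrigues matrix. The sketch below works on plain floats, reuses the R(r) expression given under euler_rodrigues(), and uses hypothetical helper names; it is a numerical check, not the library code.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def skew_matrix(w):
    return [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def euler_rodrigues(r):
    rr = dot(r, r)
    s2 = (1.0 + rr) ** 2
    W = skew_matrix(r)
    W2 = matmul(W, W)
    return [[((s2 if i == j else 0.0) + 4.0 * (1.0 - rr) * W[i][j] + 8.0 * W2[i][j]) / s2
             for j in range(3)] for i in range(3)]


def jvp_euler_rodrigues(r, v):
    """D R[r]{v} = R(r) [omega_b]_x with the MRP body-rate formula above."""
    rr = dot(r, r)
    s = 1.0 + rr
    rxv, rv = cross(r, v), dot(r, v)
    omega_b = [4.0 / (s * s) * ((1.0 - rr) * v[i] - 2.0 * rxv[i] + 2.0 * rv * r[i])
               for i in range(3)]
    return matmul(euler_rodrigues(r), skew_matrix(omega_b))


r, v, h = [0.1, -0.2, 0.3], [0.5, 0.25, -1.0], 1e-6
R_plus = euler_rodrigues([ri + h * vi for ri, vi in zip(r, v)])
R_minus = euler_rodrigues([ri - h * vi for ri, vi in zip(r, v)])
fd = [[(R_plus[i][j] - R_minus[i][j]) / (2.0 * h) for j in range(3)] for i in range(3)]
dR = jvp_euler_rodrigues(r, v)
max_err = max(abs(fd[i][j] - dR[i][j]) for i in range(3) for j in range(3))
```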

Parameters:
Return type:

R2

neml2.types.jvp_exp_map(w, dw)[source]

Pushforward of exp_map() (WR2→Rot) along the tangent dw.

Closed-form rank-1-plus-identity: $dexp_map(w) = a(|w|²) I + b(|w|²) w wᵀ$, so the action is $dr = a·dw + b·(w·dw)·w$: two vector ops, no 3×3 matrix materialised. $a$ and $b$ are the same scalar coefficients dexp_map() builds the matrix from, with the same Taylor branch near $||w||² ≈ 0$ to avoid the origin singularity.

Parameters:
Return type:

Rot

neml2.types.jvp_linear_interpolation(argument, abscissa, ordinate, dargument)[source]

Differential pushforward of linear_interpolation() along dargument.

Returns slope · dargument where slope is the piecewise-constant dy/dx at argument. The leading-K seed axis of a chain-rule tangent rides through naturally as a broadcast batch dim on dargument. Hides the searchsorted + gather behind a typed-function boundary so leaves stay in pure typed-wrapper algebra, the same pattern as jvp_compose(), jvp_exp_map(), etc.

Parameters:
Return type:

Scalar

neml2.types.jvp_rotate(x, R, dR)[source]

Pushforward of rotate() along the tangent dR.

Overloaded on x’s type. The forward is always linear in x, so only the R-direction needs an explicit primitive; the x-direction is just rotate(dx, R) and can be expressed directly.

neml2.types.linear_interpolation(argument, abscissa, ordinate)[source]

Piecewise-linear interpolation of a Scalar table.

abscissa and ordinate may carry their own leading batch (e.g. per-sample interpolation tables introduced by pyzag-style parameter calibration); the broadcast-safe gather in _gather_along_last() handles those naturally.
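
A scalar, unbatched sketch of the table lookup and of the matching pushforward (slope times tangent) is shown below. It uses bisect in place of searchsorted, and it extrapolates with the end segments outside the table range; that out-of-range behaviour is an assumption, since this section does not say what the library does there.

```python
from bisect import bisect_right


def linear_interpolation(x, xs, ys):
    """Return (value, slope) of the piecewise-linear table (xs, ys) at x."""
    # Index of the segment [xs[i], xs[i + 1]] containing x, clamped to valid segments.
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
    return ys[i] + slope * (x - xs[i]), slope


def jvp_linear_interpolation(x, xs, ys, dx):
    """Pushforward along dx: the piecewise-constant slope times the tangent."""
    _, slope = linear_interpolation(x, xs, ys)
    return slope * dx


xs, ys = [0.0, 1.0, 3.0], [0.0, 2.0, 6.0]
y, slope = linear_interpolation(2.0, xs, ys)
dy = jvp_linear_interpolation(2.0, xs, ys, 0.5)
```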

Parameters:
Return type:

Scalar

neml2.types.log(s)[source]

Natural logarithm. Matches neml2::log(const Scalar&).

Parameters:

s (Scalar)

Return type:

Scalar

neml2.types.log10(s)[source]

Base-10 logarithm. Matches neml2::log10(const Scalar&).

Parameters:

s (Scalar)

Return type:

Scalar

neml2.types.lt(a, b)[source]
Parameters:
Return type:

_TW

neml2.types.macaulay(a)[source]

Element-wise Macaulay bracket $<a>_+ = a * H(a) = a * (sign(a) + 1) / 2$.

Matches neml2::macaulay in src/neml2/tensors/functions/macaulay.cxx; preserves wrapper type and sub_batch_ndim.
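
On plain numbers the two definitions reduce to a few lines. The sketch assumes sign(0) = 0, as torch.sign returns, which makes H(0) = 1/2 under the stated (sign(a) + 1) / 2 form; the functions here are scalar stand-ins, not the typed-wrapper versions.

```python
def sign(a):
    # -1, 0, or +1; sign(0) = 0 is assumed here.
    return (a > 0) - (a < 0)


def heaviside(a):
    return (sign(a) + 1) / 2


def macaulay(a):
    # <a>_+ = a * H(a): passes positive values, zeroes out negative ones.
    return a * heaviside(a)


values = [-2.0, 0.0, 3.0]
steps = [heaviside(a) for a in values]
brackets = [macaulay(a) for a in values]
```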

Parameters:

a (_TW)

Return type:

_TW

neml2.types.mean(view, dim=0)[source]

Mean over one axis of a region view.

view must be t.dynamic_batch or t.sub_batch. Always collapses the axis (no keepdim); reducing a sub-batch axis drops sub_batch_ndim by 1. Returns the same wrapper type as the view’s underlying wrapper.

Like sum(), materialises any "broadcast" sub-batch axis before reducing so the per-site mean counts every site.

Parameters:
  • view (DynamicBatchView[_TW] | SubBatchView[_TW])

  • dim (int)

Return type:

_TW

neml2.types.norm(A, eps=0.0)[source]

Euclidean / Frobenius norm depending on the input type.

  • norm(A: SR2, eps=0.0) – Frobenius norm of a symmetric rank-2 tensor in Mandel packing. The eps regularizer keeps the result differentiable at A == 0; matches neml2::norm on the v2 C++ side.

  • norm(t.base) – sqrt(sum_over_base(t * t)) on a Tensor. Returns a Tensor with base_ndim=0; batch / sub_batch axes are preserved.

Parameters:

eps (float)

neml2.types.opaque_pow(a, n)[source]

Element-wise power routed through the neml2::opaque_pow custom op.

Inductor treats the custom op as a fusion barrier, which prevents the pow from being inlined into a downstream reduction’s per-output recompute. Profiled wins so far:

  • PowerLawSlipRule -> SumSlipRates -> K-tangent (scpcoup CUDA B=8192: 4.95 s without the barrier vs 2.14 s with it, 2.3x; across the CP suite 2-3x).

  • PerzynaPlasticFlowRate (isoharden CUDA B=8192: 117 ms without the barrier vs 102 ms with it, 1.15x).

The barrier costs a real fusion opportunity at small batches – the same isoharden case at B=1024 is 90 ms transparent vs 100 ms opaque, so opaque is a net loss when the reduction redundancy doesn’t dominate. opaque_pow is leaf-specific opt-in: profile the leaf, pick whichever is faster on the batches you care about.

Parameters:
Return type:

_TW

neml2.types.outer(a, b)[source]

Tensor product a ⊗ b of two SR2s, producing an SSR4 in Mandel packing.

Parameters:
Return type:

SSR4

neml2.types.pow(a, n)[source]

Element-wise power. Calls torch.pow directly.

Transparent to Inductor: fuses with surrounding pointwise ops. The sensible default for new leaves. If profiling on a representative benchmark shows the pow being recomputed redundantly inside a fused reduction kernel (Triton can do this even when the reduction itself looks small), switch the call site to opaque_pow() and re-time.

Parameters:
Return type:

_TW

neml2.types.r2_from_sr2(s)[source]

Unpack an SR2 (Mandel) into a full R2 (..., 3, 3).

Matches the C++ R2(const SR2&) constructor / mandel_to_full.

NOTE: an earlier attempt rewrote this as s.data @ P_unpack (a (6, 9) projection matmul). The op count dropped from ~10 to 2, but the change regressed CUDA wall-time by ~9 % at small batch because Inductor doesn’t fuse a separate matmul kernel with the surrounding pointwise ops, while it does fuse the explicit select+stack chain below into a single Triton kernel. The pointwise form is the deliberate CUDA-favourable choice — see the note in the README under the CP single-crystal bench.

Parameters:

s (SR2)

Return type:

R2
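
For reference, the index mapping behind the unpack can be written out in plain Python. This sketch only illustrates the mapping, not the torch select+stack implementation discussed in the note above, and it assumes the Mandel order (xx, yy, zz, √2·yz, √2·xz, √2·xy); mandel_to_full here is an illustrative stand-in:

```python
import math

INV_SQRT2 = 1.0 / math.sqrt(2.0)


def mandel_to_full(s):
    """Unpack a Mandel 6-vector into a symmetric 3x3 nested list."""
    xx, yy, zz = s[0], s[1], s[2]
    # Shear slots carry a √2 factor in Mandel packing; divide it back out.
    yz, xz, xy = (INV_SQRT2 * c for c in s[3:6])
    return [[xx, xy, xz],
            [xy, yy, yz],
            [xz, yz, zz]]


full = mandel_to_full([1.0, 2.0, 3.0,
                       4.0 * math.sqrt(2.0),
                       5.0 * math.sqrt(2.0),
                       6.0 * math.sqrt(2.0)])
```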

neml2.types.r2_from_wr2(w)[source]

Unpack a WR2 axial vector into a full skew-symmetric R2.

Matches R2(const WR2&) / skew_to_full — the convention is the same as R2::skew(Vec): W = [[0,-w2,w1],[w2,0,-w0],[-w1,w0,0]]. See r2_from_sr2() for why the explicit select+stack chain is preferred over a matmul-against-fixed-projection form (CUDA fusion).

Parameters:

w (WR2)

Return type:

R2
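
With the convention stated above, applying W to a vector v gives the cross product w × v. A small plain-Python check of that property; skew_to_full, matvec and cross are illustrative helpers, not the neml2 implementation:

```python
def skew_to_full(w):
    """Build W = [[0,-w2,w1],[w2,0,-w0],[-w1,w0,0]] from an axial vector."""
    w0, w1, w2 = w
    return [[0.0, -w2, w1],
            [w2, 0.0, -w0],
            [-w1, w0, 0.0]]


def matvec(m, v):
    """Product of a 3x3 nested list with a length-3 list."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def cross(a, b):
    """Cross product a × b of two length-3 lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


w = [1.0, 2.0, 3.0]
v = [4.0, 5.0, 6.0]
W = skew_to_full(w)
```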

neml2.types.rotate(x, R)[source]

Rotate a typed tensor by an R2 rotation matrix.

Overloaded on the operand type:

  • SR2 -> SR2: sym(R S Rᵀ), packed back to Mandel.

  • WR2 -> WR2: skew(R W Rᵀ), packed back to an axial vector.

  • R2 -> R2: the full asymmetric R A Rᵀ (no projection).

  • SSR4 -> SSR4: the 6×6 Mandel basis rotation Q(R) T Q(R)ᵀ.
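
The R2 case is the building block for the SR2 and WR2 overloads, which project the same product back to symmetric or skew form. A plain-Python sketch of R A Rᵀ on nested lists; matmul, transpose and rotate_full are illustrative names, not the neml2 API:

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    """Transpose of a 3x3 nested list."""
    return [[a[j][i] for j in range(3)] for i in range(3)]


def rotate_full(A, R):
    """Return R A Rᵀ with no symmetric or skew projection."""
    return matmul(matmul(R, A), transpose(R))


# Quarter turn about the z axis: maps the x axis onto the y axis.
R = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
A = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.0, 3.0]]
rotated = rotate_full(A, R)
```

Here the x and y diagonal entries of A swap places, while the zz entry is untouched.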

neml2.types.sign(a)[source]

Element-wise sign.

Parameters:

a (_TW)

Return type:

_TW

neml2.types.sinh(s)[source]

Hyperbolic sine. Matches neml2::sinh(const Scalar&).

Parameters:

s (Scalar)

Return type:

Scalar

neml2.types.skew(t)[source]

Skew part of an R2 packed as an axial vector.

$w0 = (m[2,1] - m[1,2]) / 2$, $w1 = (m[0,2] - m[2,0]) / 2$, $w2 = (m[1,0] - m[0,1]) / 2$ (matches the W = [[0,-w2,w1], [w2,0,-w0],[-w1,w0,0]] convention).

Parameters:

t (R2)

Return type:

WR2
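
The component formulas above can be checked on a matrix with a known skew part. A plain-Python sketch; skew_axial is an illustrative name and the test matrix is made up for the example:

```python
def skew_axial(m):
    """Axial vector (w0, w1, w2) of the skew part of a 3x3 nested list."""
    return [(m[2][1] - m[1][2]) / 2.0,
            (m[0][2] - m[2][0]) / 2.0,
            (m[1][0] - m[0][1]) / 2.0]


# M = W + S, where W is the skew matrix of w = (1, 2, 3) in the convention
# W = [[0,-w2,w1],[w2,0,-w0],[-w1,w0,0]] and S is an arbitrary symmetric matrix.
M = [[5.0, -2.0, 3.0],
     [4.0, 6.0, -1.0],
     [-1.0, 1.0, 7.0]]
w = skew_axial(M)
```

The symmetric part S drops out of every difference, so only the axial vector of W survives.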

neml2.types.sqrt(s)[source]

Element-wise square root of a Scalar.

Parameters:

s (Scalar)

Return type:

Scalar

neml2.types.stack(views, dim=0)[source]

Stack values along a NEW axis inside a chosen region view.

Dispatches on view type:

  • DynamicBatchView / SubBatchView over a fixed-base TensorWrapper -> typed wrapper output.

  • _RegionView over a dynamic-base Tensor -> Tensor output.

All views must be the same kind over operands sharing region ndims and (apart from the new axis) data shape.

Example

>>> v0 = Vec.fill(6.0, 4.0, 0.0)
>>> v1 = Vec.fill(8.0, 5.0, 0.0)
>>> stack([v0.dynamic_batch, v1.dynamic_batch]).data.shape
torch.Size([2, 3])

Parameters:

dim (int)

neml2.types.sum(view, dims=0, keepdim=False)[source]

Sum over axes of a region view.

Two view families are supported:

  • TensorWrapper (t.dynamic_batch or t.sub_batch). When summing over a sub-batch axis with keepdim=False, the result’s sub_batch_ndim drops by the number of reduced axes. Returns the same wrapper type as the view’s underlying wrapper. When reducing over a sub-batch axis that’s currently stored "broadcast" (size 1 with logical extent in sub_batch_meta), the wrapper is materialised first so the sum sees every per-site copy.

  • Tensor base (t.base on a Tensor). dims=None means “sum over every base axis”, collapsing to base_ndim=0; an explicit dims reduces the named region-relative axes. keepdim follows torch semantics.

t.batch and TensorWrapper t.base are rejected: the former would straddle dynamic/sub-batch; the latter would change the wrapper type (use Tensor.base for dynamic-base reductions).

Parameters:

neml2.types.sym(t)[source]

Symmetric part of an R2 packed in Mandel form.

Equivalent to full_to_mandel((T + Tᵀ) / 2): diagonals are kept as-is, off-diagonals carry the symmetric average times sqrt(2).

Parameters:

t (R2)

Return type:

SR2
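
The packing rule above can be sketched in plain Python. The component order (xx, yy, zz, yz, xz, xy) is an assumption, and sym_mandel is an illustrative name rather than the neml2 implementation:

```python
import math

SQRT2 = math.sqrt(2.0)


def sym_mandel(t):
    """Mandel 6-vector of (T + Tᵀ) / 2 for a 3x3 nested list."""
    def avg(i, j):
        # Symmetric average of the (i, j) and (j, i) entries.
        return 0.5 * (t[i][j] + t[j][i])

    # Diagonals are kept as-is; off-diagonal averages are scaled by √2.
    return [t[0][0], t[1][1], t[2][2],
            SQRT2 * avg(1, 2), SQRT2 * avg(0, 2), SQRT2 * avg(0, 1)]


T = [[5.0, -2.0, 3.0],
     [4.0, 6.0, -1.0],
     [-1.0, 1.0, 7.0]]
packed = sym_mandel(T)
```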

neml2.types.tanh(s)[source]

Hyperbolic tangent. Matches neml2::tanh(const Scalar&).

Parameters:

s (Scalar)

Return type:

Scalar

neml2.types.tr(A)[source]

Trace of a symmetric rank-2 tensor.

Matches neml2::tr(const SR2&) in src/neml2/tensors/functions/tr.cxx.

Parameters:

A (SR2)

Return type:

Scalar

neml2.types.unit(A, eps=0.0)[source]

Normalize $A$ by its Frobenius norm. eps regularizes at A == 0.

Parameters:
Return type:

SR2
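
In Mandel packing the Frobenius norm of the tensor equals the Euclidean norm of the 6-vector, so normalization needs no extra shear weights. A plain-Python sketch; placing eps under the square root is an assumption about how the regularization enters, and unit_mandel is an illustrative name:

```python
import math


def unit_mandel(a, eps=0.0):
    """Normalize a Mandel 6-vector by its Frobenius norm.

    Assumption: eps is added to the squared norm before the square root.
    With eps == 0 and a zero input this raises ZeroDivisionError.
    """
    norm = math.sqrt(sum(c * c for c in a) + eps)
    return [c / norm for c in a]


u = unit_mandel([3.0, 4.0, 0.0, 0.0, 0.0, 0.0])
z = unit_mandel([0.0] * 6, eps=1e-12)
```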

neml2.types.vec_component(v, i)[source]

Extract the i-th Scalar component of a Vec (i in 0, 1, 2).

Mirrors the C++ Vec::operator()(int) slot access used by leaves like VecComponents that decompose a Vec into per-axis Scalars. Preserves sub-batch metadata; works inside a leaf’s forward without dropping out of wrapper algebra.

Parameters:
Return type:

Scalar

neml2.types.vec_from_scalars(s0, s1, s2)[source]

Assemble a Vec from three Scalar components.

Mirrors the C++ Vec::fill(Scalar, Scalar, Scalar) factory: stacks the three Scalar values along a fresh trailing axis to produce a (..., 3) Vec. All three inputs must share dtype/device; sub-batch alignment flows through align_sub_batch() so per-sub-batch and global Scalars combine cleanly.

Parameters:

  • s0 (Scalar)

  • s1 (Scalar)

  • s2 (Scalar)

Return type:

Vec

neml2.types.vol(A)[source]

Volumetric part of a symmetric rank-2 tensor: tr(A)/3 * I.

Matches neml2::vol(const SR2&) in src/neml2/tensors/functions/vol.cxx.

Parameters:

A (SR2)

Return type:

SR2
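
In Mandel form the identity has ones in the three normal slots and zeros in the shear slots, so the volumetric part only touches the first three components. A plain-Python sketch, assuming the normal components occupy slots 0 to 2; tr_mandel and vol_mandel are illustrative names:

```python
def tr_mandel(a):
    """Trace of a symmetric tensor from its Mandel 6-vector."""
    return a[0] + a[1] + a[2]


def vol_mandel(a):
    """Volumetric part tr(A)/3 * I as a Mandel 6-vector."""
    mean = tr_mandel(a) / 3.0
    return [mean, mean, mean, 0.0, 0.0, 0.0]


A = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
A_vol = vol_mandel(A)
# The remainder A - vol(A) is the deviatoric part and is trace-free.
A_dev = [x - y for x, y in zip(A, A_vol)]
```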

neml2.types.where(c, a, b)[source]

Element-wise select, matching neml2::where.

Parameters:
Return type:

_TW