neml2.types¶
Typed Python tensor wrappers.
Each class is a @dataclass(frozen=True, eq=False) over a single
data: torch.Tensor field, registered as a pytree node so torch.export
flattens cleanly at the export boundary while the authoring surface stays
strongly typed inside nn.Module.forward().
Operator overloads (+, -, *, @, -x, abs(x), x ** n)
live on the wrapper classes along with constructors. Shape-manipulation
ops are exposed through region-view properties (t.batch,
t.dynamic_batch, t.sub_batch, t.base) so the intent is
unambiguous — e.g. t.sub_batch.unsqueeze(-1) or
t.base.transpose(-2, -1). Everything else (invariants,
decompositions, transcendentals, math-bearing type conversions like
euler_rodrigues(Rot) -> R2) lives in neml2.types.functions
as free functions, matching how the C++ side exposes them.
- class neml2.types.MillerIndex(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]¶
Bases: PrimitiveTensor
Wraps a torch.Tensor of shape (..., 3) carrying Miller indices.
Inherits all arithmetic and zeros / ones / full / empty / fill factories from PrimitiveTensor. No class-specific overrides: the C++ analogue has the same minimal surface.
- Parameters:
- data: torch.Tensor¶
- class neml2.types.PrimitiveTensor(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]¶
Bases: TensorWrapper
Fixed-base-shape typed tensor with inherited arithmetic and factories.
- Parameters:
- classmethod empty(*batch, dtype=None, device=None)[source]¶
Wrapper of given shape with undefined data.
- Parameters:
batch (int)
dtype (torch.dtype | None)
device (torch.device | str | None)
- Return type:
Self
- classmethod fill(*components, dtype=None, device=None)[source]¶
Build a wrapper from prod(BASE_SHAPE) scalar components.
Components are taken in row-major order and reshaped to cls.BASE_SHAPE. Subclasses with non-trivial packing semantics (e.g. SR2 with Mandel √2 shear scaling, or short-form 1/3-component overloads) override this.
- Parameters:
components (float)
dtype (torch.dtype | None)
device (torch.device | str | None)
- Return type:
Self
- classmethod full(*batch, fill_value, dtype=None, device=None)[source]¶
Wrapper of given shape filled with fill_value.
- Parameters:
batch (int)
fill_value (float)
dtype (torch.dtype | None)
device (torch.device | str | None)
- Return type:
Self
- classmethod ones(*batch, dtype=None, device=None)[source]¶
Ones-filled wrapper of dynamic shape batch and base cls.BASE_SHAPE.
- Parameters:
batch (int)
dtype (torch.dtype | None)
device (torch.device | str | None)
- Return type:
Self
- classmethod zeros(*batch, dtype=None, device=None)[source]¶
Zero-filled wrapper of dynamic shape batch and base cls.BASE_SHAPE.
- Parameters:
batch (int)
dtype (torch.dtype | None)
device (torch.device | str | None)
- Return type:
Self
- classmethod zeros_like(template, *, sub_batch_shape=None)[source]¶
Zero-filled wrapper inheriting template’s K + dynamic batch layout.
sub_batch_shape (defaulting to template.sub_batch_shape) overrides the sub-batch region, useful when the caller needs a zero tail of a different cell-axis length than template to splice into a typed cat / arithmetic. The K metadata (k_ndim / k_state / k_pairing) is carried from template so the result aligns rank-by-rank for downstream chain-rule binary ops. Zeros are direction-agnostic in K, so inheriting template’s state is the only choice that keeps the leading K axes positionally consistent with the operand the result will combine with.
- Parameters:
template (TensorWrapper)
- Return type:
Self
- class neml2.types.R2(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]¶
Bases: PrimitiveTensor
Wraps a torch.Tensor of shape (..., 3, 3).
- Parameters:
- data: torch.Tensor¶
- class neml2.types.Rot(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]¶
Bases: PrimitiveTensor
Wraps a torch.Tensor of shape (..., 3) in MRP packing.
- Parameters:
- data: torch.Tensor¶
- class neml2.types.SR2(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]¶
Bases: PrimitiveTensor
Wraps a torch.Tensor of shape (..., 6) in Mandel packing.
- Parameters:
- data: torch.Tensor¶
- classmethod fill(*components, dtype=None, device=None)[source]¶
Build an SR2 from 1, 3, or 6 tensor components (mirrors C++ SR2::fill).
1 value a -> diag(a, a, a).
3 values -> the three diagonal entries, zero shear.
6 values s11 s22 s33 s23 s13 s12 -> the full symmetric tensor; the three shear entries are scaled by sqrt(2) into Mandel storage.
Overrides the generic PrimitiveTensor.fill() to handle the short forms and the Mandel √2 scaling. The 6-component form is not a raw tensor([...]).reshape((6,)): the shear scaling matters.
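The three packing rules above can be sketched in plain Python. This is only an illustration on floats: `mandel_fill` is a hypothetical helper name, not part of neml2, and it returns a list rather than an SR2.

```python
import math


def mandel_fill(*components):
    """Pack 1, 3, or 6 floats into a 6-entry Mandel vector (hypothetical helper)."""
    n = len(components)
    if n == 1:
        # One value a -> diag(a, a, a), zero shear.
        a = float(components[0])
        return [a, a, a, 0.0, 0.0, 0.0]
    if n == 3:
        # Three values -> the diagonal entries, zero shear.
        return [float(c) for c in components] + [0.0, 0.0, 0.0]
    if n == 6:
        # Six values s11 s22 s33 s23 s13 s12 -> shear scaled by sqrt(2).
        s11, s22, s33, s23, s13, s12 = (float(c) for c in components)
        root2 = math.sqrt(2.0)
        return [s11, s22, s33, root2 * s23, root2 * s13, root2 * s12]
    raise ValueError(f"expected 1, 3, or 6 components, got {n}")


print(mandel_fill(2.0))
print(mandel_fill(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
```

The last three entries of the second result differ from the raw inputs by the factor sqrt(2), which is why a plain reshape of six numbers is not equivalent.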
- class neml2.types.SSR4(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]¶
Bases:
PrimitiveTensor
Wraps a torch.Tensor of shape (..., 6, 6) in Mandel packing.
- Parameters:
- data: torch.Tensor¶
- classmethod identity(*, dtype=None, device=None)[source]¶
Full identity δ_{ij}δ_{kl} in Mandel packing.
In the Mandel basis this is a 3x3 block of 1.0 in the top-left (volumetric) corner with all off-diagonal/deviatoric entries zero, i.e. it acts as A -> tr(A) * I rather than A -> A. Distinct from identity_sym() (which is the (6,6) eye).
- classmethod identity_C1(*, dtype=None, device=None)[source]¶
Cubic-symmetry projector $I_{C1}$: top-left 3x3 identity block of the Mandel (6, 6) matrix; selects the cubic on-diagonal normal-stress coefficient (matches C++ SSR4::identity_C1).
- classmethod identity_C2(*, dtype=None, device=None)[source]¶
Cubic-symmetry projector $I_{C2}$: the three normal-stress off-diagonal ones in the top-left 3x3 Mandel block; selects the cubic off-diagonal normal-stress coefficient (matches C++ SSR4::identity_C2).
- classmethod identity_C3(*, dtype=None, device=None)[source]¶
Cubic-symmetry projector $I_{C3}$: bottom-right 3x3 identity block of the Mandel (6, 6) matrix; selects the cubic shear coefficient (matches C++ SSR4::identity_C3).
- classmethod identity_dev(*, dtype=None, device=None)[source]¶
Deviatoric projection dev(A) = I_dev : A, equal to I_sym - I_vol.
- classmethod identity_sym(*, dtype=None, device=None)[source]¶
Symmetric identity I_sym : A = A for any SR2 $A$, i.e. the (6,6) eye.
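How these projectors act on a Mandel vector can be sketched with plain 6x6 lists. The sketch assumes I_vol is the usual tr(A)/3 projector (1/3 in the top-left 3x3 block), which is consistent with vol(A) = tr(A)/3 * I; the helper names are hypothetical and nothing here calls neml2.

```python
def block6(value_top_left, diag_rest):
    """6x6 matrix: constant top-left 3x3 block, diag_rest on the remaining diagonal."""
    M = [[0.0] * 6 for _ in range(6)]
    for i in range(3):
        for j in range(3):
            M[i][j] = value_top_left
    for i in range(3, 6):
        M[i][i] = diag_rest
    return M


def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(6)) for i in range(6)]


I_sym = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
I_full = block6(1.0, 0.0)        # delta_ij delta_kl: A -> tr(A) * I
I_vol = block6(1.0 / 3.0, 0.0)   # assumed: A -> tr(A)/3 * I
I_dev = [[I_sym[i][j] - I_vol[i][j] for j in range(6)] for i in range(6)]

A = [1.0, 2.0, 6.0, 0.5, -1.0, 2.0]  # Mandel vector with trace 9
print(matvec(I_full, A))  # normal entries all equal tr(A), shear zero
print(matvec(I_dev, A))   # traceless normal part, shear untouched
```

Applying I_dev leaves the three shear entries unchanged and removes the mean of the three normal entries, so the result has zero trace.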
- class neml2.types.Scalar(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=(), *, dtype=None, device=None)[source]¶
Bases: PrimitiveTensor
Wraps a torch.Tensor of base shape () (i.e., one number per batch entry).
- Parameters:
data (Tensor)
sub_batch_ndim (int)
sub_batch_state (tuple)
sub_batch_meta (tuple)
k_ndim (int)
k_state (tuple)
k_pairing (tuple)
dtype (torch.dtype | None)
device (torch.device | str | None)
- classmethod arange(start, end=None, step=1, *, dtype=None, device=None)[source]¶
Like torch.arange: arange(N) -> [0, …, N-1], arange(a, b, s) -> [a, a+s, …] up to (excluding) b.
- classmethod from_value(x, *, like)[source]¶
Construct a Scalar inheriting dtype/device from an existing wrapper.
- Parameters:
like (TensorWrapper)
- Return type:
- classmethod full(*shape, fill_value, dtype=None, device=None)[source]¶
Wrapper of given shape filled with fill_value.
- classmethod linspace(start, end, steps, *, dtype=None, device=None)[source]¶
steps values uniformly spaced from start to end inclusive.
- class neml2.types.Tensor(data, batch_ndim=0, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]¶
Bases: object
Dynamic-base-shape tensor with a (batch, sub_batch, base) runtime split.
- Parameters:
- as_typed(cls)[source]¶
View this Tensor as a TensorWrapper subclass.
Requires the trailing dims of data to match cls.BASE_SHAPE exactly. sub_batch_ndim and sub_batch_labels are preserved; any leading batch_ndim becomes the wrapper’s dynamic batch.
- Parameters:
cls (type[TensorWrapper])
- Return type:
- property base: _BaseView¶
View into the trailing base region. Unlike TensorWrapper, the base region is mutable on a dynamic-base Tensor: expand / unsqueeze / squeeze / cat all work.
- property batch: _BatchView¶
View into the leading (dynamic) batch region. Ops preserve sub_batch_ndim. Supports unsqueeze / squeeze / expand / broadcast_to / cat.
- flatten_base()[source]¶
Collapse all base axes into one trailing axis; result has base_ndim=1.
- Return type:
- flatten_sub_batch()[source]¶
Absorb sub-batch axes into a NEW leading base axis.
Result has sub_batch_ndim=0 and base_ndim increased by 1 (the new leading base axis has length prod(sub_batch_shape)). Use to materialise a block-diagonal-storage Tensor as one flat matrix when the downstream consumer demands base-only storage.
- Return type:
- flatten_sub_batch_into_first_base_axis()[source]¶
Fold sub_batch axes INTO (not beside) the first base axis.
Result has sub_batch_ndim=0 and base_ndim unchanged; the first base axis grows from base_shape[0] to prod(sub_batch) * base_shape[0]. Distinct from flatten_sub_batch(), which inserts the collapsed sub_batch as a NEW leading base axis.
This is the inverse of the assembly’s “BLOCK-compact preserves per-grain structure” decision: when downstream code needs the fully-unfolded form (e.g. a global-row matmul against per-grain col blocks), call this. Requires base_ndim >= 1.
- Return type:
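The difference between the two flattening methods is easiest to see as shape arithmetic. The sketch below only computes the resulting storage shapes from the region shapes, assuming the (batch, sub_batch, base) axis order stated for Tensor; both function names are hypothetical and no tensors are involved.

```python
from math import prod


def flatten_sub_batch_shape(batch, sub_batch, base):
    """Sub-batch axes collapse into a NEW leading base axis (base_ndim grows by 1)."""
    return tuple(batch) + (prod(sub_batch),) + tuple(base)


def flatten_into_first_base_axis_shape(batch, sub_batch, base):
    """Sub-batch axes fold INTO the first base axis (base_ndim unchanged)."""
    if len(base) < 1:
        raise ValueError("requires base_ndim >= 1")
    return tuple(batch) + (prod(sub_batch) * base[0],) + tuple(base[1:])


# 8 batch entries, 4 grains, a 3x3 block per grain.
print(flatten_sub_batch_shape((8,), (4,), (3, 3)))             # (8, 4, 3, 3)
print(flatten_into_first_base_axis_shape((8,), (4,), (3, 3)))  # (8, 12, 3)
```

In the first case the storage shape is unchanged for a single sub-batch axis and only the region split moves; in the second the per-grain rows are stacked into one long row axis.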
- fold_preserving(preserved_idx, canonical_order, other_idx, fold_size)[source]¶
Permute preserved sub_batch axes to leading position (in canonical_order), then fold (remaining_sub + base) into a single trailing axis.
Result has batch_ndim unchanged, sub_batch_ndim = len(canonical_order), sub_batch_labels = canonical_order, base_ndim=1, base[0] = fold_size.
Used by from_dict()’s preserved-label storage path to keep declared per-axis sub_batch dims as real sub_batch on the assembled slab while folding the rest.
- classmethod from_typed(wrapper, *, batch_ndim=None)[source]¶
Construct a Tensor view of an existing TensorWrapper.
The wrapper’s sub_batch_ndim, BASE_NDIM, and sub_batch_labels are carried over verbatim; batch_ndim defaults to data.ndim - sub_batch_ndim - BASE_NDIM (i.e. everything not claimed by sub-batch or base). Pass an explicit batch_ndim to override; this is useful when the caller knows the dyn rank but the wrapper’s leading axes encode a different layout (e.g. the JVP K-dim).
- Parameters:
wrapper (TensorWrapper)
batch_ndim (int | None)
- Return type:
- new_zeros(*, batch_shape=None, sub_batch_shape=None, base_shape=None)[source]¶
Allocate a same-dtype, same-device zero Tensor with the given region shapes.
Any region whose shape is omitted defaults to self’s shape for that region, which is handy for building a zero block that mirrors an existing block’s dyn / sub-batch layout.
- solve(other)[source]¶
Solve self @ x = other for x via torch.linalg.solve.
self.base_ndim == 2 (square matrix); other.base_ndim in (1, 2). (batch, sub_batch) axes are leading-broadcast through torch’s batched solve, so a block-diagonal system (*B, L, n, n) \ (*B, L, n) solves L independent dense LUs per batch element without any explicit dispatch. That is the whole point of the dynamic-base design.
Sub-batch dispatch mirrors __matmul__():
Same number of BLOCK axes on both sides: standard batched solve.
RHS has BLOCK axes the LHS lacks AND folding them into RHS’s first base axis would align A’s last base with the resulting column dim: do the unfold (the case where A was a dense per-grain coupling already folded by assembly and b is per-grain BLOCK).
DENSE sub_batch on either operand at this point is a bug: assembly should have folded it.
- property sub_batch: _SubBatchView¶
View into the per-site sub-batch region. Ops adjust sub_batch_ndim.
- classmethod zeros(*, batch_shape=(), sub_batch_shape=(), base_shape=(), dtype=None, device=None)[source]¶
Zero-filled Tensor with explicit region shapes.
sub_batch_labels optionally attaches per-axis labels (length must match len(sub_batch_shape)). Used by the BLOCK-aware _build_group_block to build zero blocks at the cell’s canonical sub_batch shape with the cell’s preserved labels.
- class neml2.types.TensorWrapper(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]¶
Bases: object
Shared shape decomposition + region views + _rewrap machinery for the typed tensor wrappers.
Subclasses (@dataclass(frozen=True, eq=False)) declare the seven instance fields listed in the module docstring; the trailing BASE_NDIM dimensions of data have shape BASE_SHAPE (in Mandel packing where applicable).
- Parameters:
- property base: BaseView[Self]¶
- property batch: BatchView[Self]¶
- property dynamic_batch: DynamicBatchView[Self]¶
- property k: KView[Self]¶
Read-only view over the K region (leading k_ndim axes).
- property k_shape: Size¶
The K-region storage shape (raw data.shape[:k_ndim]).
"broadcast" axes here are size 1 in storage; the logical extent of a paired-broadcast K axis equals the paired sub axis’s extent (recovered via sub_batch_shape and k_pairing).
- materialize()[source]¶
Force every "broadcast" sub-batch axis into "full" storage.
Cheap when sub_batch_state is already empty or all-"full" (no-op return). Does NOT touch the K region; K materialisation is the job of fullify().
- Parameters:
self (Self)
- Return type:
Self
- property ndim: int¶
Total tensor rank (len(shape)); equals k_ndim + dynamic_batch_ndim + sub_batch_ndim + BASE_NDIM.
- property sub_batch: SubBatchView[Self]¶
- with_sub_batch_ndim(sub_batch_ndim)[source]¶
Return a structurally-identical wrapper with the declared sub_batch_ndim overridden.
Same storage, same K layout. Per-axis sub_batch_state / sub_batch_meta are reset to empty since the axis count is changing; callers that need a labelled state must re-attach it.
Use this to normalize a wrapper that crossed a layer boundary with the wrong sub axis count (e.g. a predictor output that lost its declared per-site axes) instead of unwrapping to raw data and re-constructing.
- Parameters:
sub_batch_ndim (int)
- Return type:
Self
- class neml2.types.Vec(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]¶
Bases: PrimitiveTensor
Wraps a torch.Tensor of shape (..., 3).
- Parameters:
- data: torch.Tensor¶
- class neml2.types.WR2(data, sub_batch_ndim=0, sub_batch_state=(), sub_batch_meta=(), k_ndim=0, k_state=(), k_pairing=())[source]¶
Bases: PrimitiveTensor
Wraps a torch.Tensor of shape (..., 3) storing the axial vector.
- Parameters:
- data: torch.Tensor¶
- neml2.types.align_sub_batch(*wrappers)[source]¶
Pad each wrapper’s sub_batch_ndim up to the common max.
Direct Python analogue of C++ utils::align_intmd_dim. For each input wrapper whose sub_batch_ndim < smax, returns a view whose data has (smax - sub_batch_ndim) size-1 axes inserted at the start of its sub-batch region.
Returns (aligned, smax).
- Parameters:
wrappers (TensorWrapper)
- Return type:
tuple[tuple[TensorWrapper, …], int]
- neml2.types.bilinear_interpolation(arg1, arg2, abscissa1, abscissa2, ordinate)[source]¶
Bilinear interpolation of ordinate over a 2-D rectilinear grid.
Mirrors the C++ BilinearInterpolation::set_value evaluator with the fixed _dim=0 convention used by every in-tree test fixture. Accepts any ordinate typed wrapper (Scalar, Vec, SR2) shaped (..., N1, N2, *base) with sub_batch_ndim=2; the abscissae carry sub_batch_ndim=1 and the arguments carry sub_batch_ndim=0. The returned wrapper has sub_batch_ndim=0 and shape (..., *base).
- neml2.types.bilinear_interpolation_slopes(arg1, arg2, abscissa1, abscissa2, ordinate)[source]¶
Return (dP/dx1, dP/dx2) at the query point, one wrapper per axis.
The bilinear cell evaluates as $P = Y_{00} + (Y_{10}-Y_{00})\,\xi + (Y_{01}-Y_{00})\,\eta + (Y_{11}-Y_{10}-Y_{01}+Y_{00})\,\xi\eta$ where $\xi = (x_1-X_{10})/(X_{11}-X_{10})$ and $\eta = (x_2-X_{20})/(X_{21}-X_{20})$. Differentiating once in each argument gives the two slope wrappers returned here, sized like the ordinate’s base.
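As a worked instance of the cell formula and its two derivatives, here is a minimal scalar sketch for a single grid cell. The function name and argument layout are hypothetical (the real functions take typed wrappers and locate the cell themselves); Y[i][j] is taken to be the ordinate at (X1[i], X2[j]).

```python
def bilinear_cell(x1, x2, X1, X2, Y):
    """Value and slopes of the bilinear interpolant on one cell.

    X1 = (X10, X11) and X2 = (X20, X21) are the cell corners along each axis.
    """
    xi = (x1 - X1[0]) / (X1[1] - X1[0])
    eta = (x2 - X2[0]) / (X2[1] - X2[0])
    Y00, Y01, Y10, Y11 = Y[0][0], Y[0][1], Y[1][0], Y[1][1]
    cross = Y11 - Y10 - Y01 + Y00
    P = Y00 + (Y10 - Y00) * xi + (Y01 - Y00) * eta + cross * xi * eta
    # Chain rule: d(xi)/d(x1) = 1 / (X11 - X10), d(eta)/d(x2) = 1 / (X21 - X20).
    dP_dx1 = ((Y10 - Y00) + cross * eta) / (X1[1] - X1[0])
    dP_dx2 = ((Y01 - Y00) + cross * xi) / (X2[1] - X2[0])
    return P, dP_dx1, dP_dx2


# Sample f(x, y) = 2 + 3x + 4y + 5xy at the corners of [0, 2] x [1, 3].
f = lambda x, y: 2 + 3 * x + 4 * y + 5 * x * y
X1, X2 = (0.0, 2.0), (1.0, 3.0)
Y = [[f(a, b) for b in X2] for a in X1]
print(bilinear_cell(0.5, 2.0, X1, X2, Y))
```

Because f is itself bilinear, the interpolant reproduces it exactly inside the cell, including both partial derivatives.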
- neml2.types.cat(views, dim=-1)[source]¶
Concatenate wrappers along a region-relative axis.
Each element of views must be the same region kind over wrappers that share batch_ndim / sub_batch_ndim. The cat-axis size is the only thing allowed to vary. Works uniformly on dynamic-base Tensor views (.batch / .sub_batch / .base) and on static-base TensorWrapper region views from the same axis convention (the static-base .base is fixed and not in the cat-able set, but .batch / .dynamic_batch / .sub_batch work).
Example
>>> a = Tensor.zeros(batch_shape=(2,), base_shape=(3,))
>>> b = Tensor.zeros(batch_shape=(2,), base_shape=(4,))
>>> cat([a.base, b.base]).data.shape
torch.Size([2, 7])
- Parameters:
dim (int)
- neml2.types.clamp(a, lo=None, hi=None)[source]¶
Element-wise clamp, matching neml2::clamp.
Either bound may be None to leave that side unbounded. Bounds are plain scalars (the C++ clamp overload used by leaves takes scalar endpoints); preserves wrapper type and sub_batch_ndim.
- neml2.types.compose(r1, r2)[source]¶
Compose two MRP rotations: r1 ∘ r2 (apply r2 first, then r1).
Matches operator*(const Rot&, const Rot&) in src/neml2/tensors/Rot.cxx. The result is again an MRP; the formula handles the standard MRP composition with denominator $1 + \|r_1\|^2 \|r_2\|^2 - 2\, r_1 \cdot r_2$.
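A plain-Python sketch of the standard MRP composition rule may help make the denominator concrete. The numerator used here is the textbook one obtained from the quaternion product for "apply r2 first, then r1"; it is an assumption that Rot.cxx uses the same arrangement, and `mrp_compose` is a hypothetical name.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def mrp_compose(r1, r2):
    """MRP of the rotation 'r2 first, then r1' (textbook formula, assumed)."""
    s1, s2, d = dot(r1, r1), dot(r2, r2), dot(r1, r2)
    c = cross(r1, r2)
    denom = 1.0 + s1 * s2 - 2.0 * d
    return [((1.0 - s1) * r2[i] + (1.0 - s2) * r1[i] + 2.0 * c[i]) / denom
            for i in range(3)]


# Two rotations about z by 0.4 rad and 0.7 rad; an MRP has magnitude tan(angle / 4).
r1 = [0.0, 0.0, math.tan(0.4 / 4)]
r2 = [0.0, 0.0, math.tan(0.7 / 4)]
print(mrp_compose(r1, r2)[2], math.tan(1.1 / 4))
```

For rotations about a common axis the cross term vanishes and the rule reduces to the tangent addition formula, so the angles simply add.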
- neml2.types.det(A)[source]¶
Determinant of a (…, 3, 3) second-order tensor wrapper.
Accepts R2 (full 3x3) or SR2 (Mandel-packed). Returns a Scalar over the wrapper’s batch + sub-batch axes. Mirrors neml2::det.
- Parameters:
A (TensorWrapper)
- Return type:
- neml2.types.dev(A)[source]¶
Deviatoric part of a symmetric rank-2 tensor: A - vol(A).
Matches neml2::dev(const SR2&) in src/neml2/tensors/functions/dev.cxx.
- neml2.types.dexp_map(w)[source]¶
Derivative ∂(exp_map(w))/∂w, returned as a 3x3 R2.
Mirrors WR2::dexp_map in src/neml2/tensors/WR2.cxx.
- neml2.types.diff(view, n=1, dim=0)[source]¶
n-th order finite difference along an axis of a region view.
The selected axis length shrinks by n; the wrapper type and sub_batch_ndim are preserved (torch.diff does not collapse the axis, so this is not a reduction). view must be t.dynamic_batch or t.sub_batch.
Matches neml2::intmd_diff in src/neml2/tensors/functions/diff.cxx.
- neml2.types.drotate(r1, r2)[source]¶
d(r2 ∘ r1) / d(r2) where the composition is r2 * r1.
Mirrors Rot::drotate in src/neml2/tensors/Rot.cxx. Returns R2.
- neml2.types.drotate_self(r1, r2)[source]¶
d(r2 ∘ r1) / d(r1) where the composition is r2 * r1.
Mirrors Rot::drotate_self in src/neml2/tensors/Rot.cxx. The naming follows the C++ convention: r1.rotate(r2) == r2 * r1 (apply r1 first, then r2), and r1.drotate_self(r2) is the derivative of that composed rotation w.r.t. the receiver r1. Returns R2.
- neml2.types.euler_rodrigues(r)[source]¶
Convert an MRP rotation to its 3x3 rotation matrix.
Mirrors Rot::euler_rodrigues in src/neml2/tensors/Rot.cxx:
R = (1+rr)^-2 * ( (1+rr)^2 * I + 4(1-rr) W + 8 W^2 )
where $W$ is the skew-symmetric matrix of $r$ (via R2::skew) and $rr = ||r||^2$.
Implementation: closed-form per-element, no W @ W matmul. Since W = [[0,-w2,w1],[w2,0,-w0],[-w1,w0,0]] is skew, W^2 is symmetric with only 6 independent components. These are computed from raw w0 / w1 / w2 to skip the aten.bmm lowering, which triggers a PyTorch Inductor codegen bug under dynamic-batch export (int_array_0 referenced in the generated wrapper without ever being declared; see benchmark/scpdecoup for the failing pattern).
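The formula can be checked numerically with a few lines of plain Python. This sketch forms W^2 with an explicit 3x3 product purely for readability, so it does not reproduce the matmul-free implementation described above; the function name is reused only for illustration.

```python
import math


def euler_rodrigues(r):
    """Rotation matrix of an MRP r, as nested lists (illustrative sketch)."""
    w0, w1, w2 = r
    rr = w0 * w0 + w1 * w1 + w2 * w2
    W = [[0.0, -w2, w1], [w2, 0.0, -w0], [-w1, w0, 0.0]]
    W2 = [[sum(W[i][k] * W[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    s = (1.0 + rr) ** 2
    return [[((s if i == j else 0.0) + 4.0 * (1.0 - rr) * W[i][j] + 8.0 * W2[i][j]) / s
             for j in range(3)] for i in range(3)]


# A 90 degree rotation about z has MRP tan(90deg / 4) * z = tan(pi / 8) * z.
R = euler_rodrigues([0.0, 0.0, math.tan(math.pi / 8)])
for row in R:
    print([round(x, 12) for x in row])
```

The result maps the x axis onto the y axis, as expected for a counterclockwise quarter turn about z.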
- neml2.types.exp_map(w)[source]¶
Exponential of a skew axial vector — yields an MRP rotation.
Mirrors WR2::exp_map in src/neml2/tensors/WR2.cxx. Uses a Taylor series near $||w||^2 ≈ 0$ to avoid the singularity at the origin; the other singularity at $||w|| = 2π$ (a full turn, where the MRP magnitude $\tan(||w||/4)$ diverges) is unavoidable and shared with the C++ implementation.
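A scalar sketch of the map, assuming it follows directly from the MRP definition (rotation angle |w| about w/|w| gives the MRP tan(|w|/4) w/|w|). The switch-over tolerance and the use of only the leading Taylor term are illustrative choices, not values taken from WR2.cxx.

```python
import math


def exp_map(w, tol=1e-12):
    """MRP of the rotation with axial vector w (illustrative sketch)."""
    n2 = sum(c * c for c in w)
    if n2 < tol:
        # tan(x/4)/x -> 1/4 as x -> 0: leading Taylor term, no division by zero.
        scale = 0.25
    else:
        n = math.sqrt(n2)
        scale = math.tan(n / 4.0) / n
    return [scale * c for c in w]


print(exp_map([0.0, 0.0, math.pi / 2]))  # z component equals tan(pi/8)
print(exp_map([0.0, 0.0, 0.0]))          # origin handled by the Taylor branch
```

As |w| approaches 2π the tangent argument approaches π/2 and the result grows without bound, which is the second singularity mentioned above.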
- neml2.types.gt(a, b)[source]¶
- Parameters:
a (_TW)
b (TensorWrapper | float | int)
- Return type:
_TW
- neml2.types.heaviside(a)[source]¶
Element-wise Heaviside step function $H(a) = (sign(a) + 1) / 2$.
Matches neml2::heaviside in src/neml2/tensors/functions/heaviside.cxx (which uses the same (sign(a) + 1) / 2 form), preserving wrapper type and sub_batch_ndim.
- Parameters:
a (_TW)
- Return type:
_TW
- neml2.types.inner(A, B)[source]¶
Frobenius inner product A : B over the wrappers’ base axes.
Mirrors neml2::inner(const Tensor&, const Tensor&) in src/neml2/tensors/functions/inner.cxx: contract every base component of A against the matching base component of B, leaving a Scalar over the shared batch + sub-batch axes.
For SR2 the Mandel packing’s per-component weights are already absorbed into the storage (off-diagonal entries carry the sqrt(2) factor), so the contraction collapses to a plain dot product over the 6-component vector and naturally agrees with the full-tensor Frobenius product. For R2 (or any non-symmetric wrapper) the contraction sums over all base axes directly.
- Parameters:
A (TensorWrapper)
B (TensorWrapper)
- Return type:
- neml2.types.inv(A)[source]¶
Matrix inverse of a (…, 3, 3) second-order tensor wrapper.
For R2 returns the full inverse as an R2; for SR2 returns the inverse repacked into Mandel form (the inverse of a symmetric tensor is symmetric). Mirrors neml2::inv.
- Parameters:
A (_TW)
- Return type:
_TW
- neml2.types.jvp_compose(r1, r2, *, dr1=None, dr2=None)[source]¶
Pushforward of compose() (compose(r1, r2)) along its operands.
$d(compose(r1, r2)) = (∂/∂r1)·dr1 + (∂/∂r2)·dr2$ with the operand derivatives given by the established drotate() / drotate_self() convention ($∂compose(r1, r2)/∂r1 = drotate(r2, r1)$, $∂compose(r1, r2)/∂r2 = drotate_self(r2, r1)$; both are 3×3 R2 linear maps from a 3-vector tangent to a 3-vector tangent). Pass only the operands that vary; a None tangent means that operand is held fixed.
- neml2.types.jvp_euler_rodrigues(r, dr)[source]¶
Pushforward of euler_rodrigues() (Rot → R2) along the tangent dr.
Closed-form via the body-frame angular rate. Rot is the Modified Rodrigues Parameter (MRP) form (Rot.cxx), for which the MRP-rate / body-rate kinematic relation (Schaub & Junkins, Analytical Mechanics of Space Systems) inverts to:
ω_b = (4 / s²) · [ (1 − r·r) v − 2 (r × v) + 2 (r·v) r ]
D R[r]{v} = R(r) · [ω_b]_×
with $s = 1 + r·r$ and v = dr. The action collapses to one 3×3 skew matrix multiply; no (..., 3, 3, 3) derivative kernel is ever formed. (The simpler $ω_b = (2/s)(v − r × v)$ you may have seen applies to the classical Rodrigues form $R = I + (2/s)([r]_× + [r]_ײ)$, not the MRP form NEML2 uses.)
- neml2.types.jvp_exp_map(w, dw)[source]¶
Pushforward of exp_map() (WR2 → Rot) along the tangent dw.
Closed-form rank-1-plus-identity: $dexp_map(w) = a(|w|²) I + b(|w|²) w wᵀ$, so the action is $dr = a·dw + b·(w·dw)·w$, which takes two vector ops and materialises no 3×3 matrix. $a$ and $b$ are the same scalar coefficients dexp_map() builds the matrix from, with the same Taylor branch near ||w||² ≈ 0 to avoid the origin singularity.
- neml2.types.jvp_linear_interpolation(argument, abscissa, ordinate, dargument)[source]¶
Differential pushforward of linear_interpolation() along dargument.
Returns slope · dargument where slope is the piecewise-constant dy/dx at argument. The leading-K seed axis of a chain-rule tangent rides through naturally as a broadcast batch dim on dargument. Hides the searchsorted + gather behind a typed-function boundary so leaves stay in pure typed-wrapper algebra, the same pattern as jvp_compose(), jvp_exp_map(), etc.
- neml2.types.jvp_rotate(x, R, dR)[source]¶
Pushforward of rotate() along the tangent dR.
Overloaded on x’s type. The forward is always linear in x, so only the R-direction needs an explicit primitive; the x-direction is just rotate(dx, R) and can be expressed directly.
- neml2.types.linear_interpolation(argument, abscissa, ordinate)[source]¶
Piecewise-linear interpolation of a Scalar table.
abscissa and ordinate may carry their own leading batch (e.g. per-sample interpolation tables introduced by pyzag-style parameter calibration); the broadcast-safe gather in _gather_along_last() handles those naturally.
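For a single unbatched table, piecewise-linear interpolation and the matching slope-times-tangent pushforward can be sketched with the standard library. The helper names are hypothetical, and the out-of-range behaviour (extending the first or last segment) is an assumption of this sketch, not a statement about neml2.

```python
from bisect import bisect_right


def interp_with_slope(x, abscissa, ordinate):
    """Return (y, dy/dx) of the piecewise-linear table at x."""
    # Index of the segment [abscissa[i], abscissa[i+1]] containing x,
    # clamped so out-of-range queries reuse the end segments (assumption).
    i = bisect_right(abscissa, x) - 1
    i = max(0, min(i, len(abscissa) - 2))
    slope = (ordinate[i + 1] - ordinate[i]) / (abscissa[i + 1] - abscissa[i])
    return ordinate[i] + slope * (x - abscissa[i]), slope


def jvp_interp(x, abscissa, ordinate, dx):
    """Pushforward: slope at x times the tangent dx."""
    _, slope = interp_with_slope(x, abscissa, ordinate)
    return slope * dx


xs, ys = [0.0, 1.0, 3.0], [0.0, 2.0, 3.0]
print(interp_with_slope(0.5, xs, ys))  # first segment, slope 2
print(interp_with_slope(2.0, xs, ys))  # second segment, slope 0.5
print(jvp_interp(2.0, xs, ys, 4.0))
```

The slope is constant within each segment, which is why the pushforward needs only the segment lookup and one multiply.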
- neml2.types.lt(a, b)[source]¶
- Parameters:
a (_TW)
b (TensorWrapper | float | int)
- Return type:
_TW
- neml2.types.macaulay(a)[source]¶
Element-wise Macaulay bracket $<a>_+ = a * H(a) = a * (sign(a) + 1) / 2$.
Matches neml2::macaulay in src/neml2/tensors/functions/macaulay.cxx; preserves wrapper type and sub_batch_ndim.
- Parameters:
a (_TW)
- Return type:
_TW
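On plain floats, the sign-based Heaviside form and the Macaulay bracket built from it look like this. The function names are reused for illustration only; the real functions act element-wise on typed wrappers.

```python
def sign(a):
    """-1, 0, or +1 depending on the sign of a."""
    return (a > 0) - (a < 0)


def heaviside(a):
    """H(a) = (sign(a) + 1) / 2, so H(0) = 0.5 in this form."""
    return (sign(a) + 1) / 2


def macaulay(a):
    """<a>_+ = a * H(a): zero for negative a, a itself for positive a."""
    return a * heaviside(a)


for a in (-2.0, 0.0, 3.0):
    print(a, heaviside(a), macaulay(a))
```

One consequence of the sign-based form is the value 1/2 at exactly zero, which the Macaulay bracket then multiplies by zero.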
- neml2.types.mean(view, dim=0)[source]¶
Mean over one axis of a region view.
view must be t.dynamic_batch or t.sub_batch. Always collapses the axis (no keepdim); reducing a sub-batch axis drops sub_batch_ndim by 1. Returns the same wrapper type as the view’s underlying wrapper.
Like sum(), materialises any "broadcast" sub-batch axis before reducing so the per-site mean counts every site.
- Parameters:
view (DynamicBatchView[_TW] | SubBatchView[_TW])
dim (int)
- Return type:
_TW
- neml2.types.norm(A, eps=0.0)[source]¶
Euclidean / Frobenius norm depending on the input type.
norm(A: SR2, eps=0.0): Frobenius norm of a symmetric rank-2 tensor in Mandel packing. The eps regularizer keeps the result differentiable at A == 0; matches neml2::norm on the v2 C++ side.
norm(t.base): sqrt(sum_over_base(t * t)) on a Tensor. Returns a Tensor with base_ndim=0; batch / sub_batch axes are preserved.
- Parameters:
eps (float)
- neml2.types.opaque_pow(a, n)[source]¶
Element-wise power routed through the neml2::opaque_pow custom op.
Inductor treats the custom op as a fusion barrier, which prevents the pow from being inlined into a downstream reduction’s per-output recompute. Profiled wins so far:
PowerLawSlipRule -> SumSlipRates -> K-tangent (scpcoup CUDA B=8192: 4.95 s without the barrier vs 2.14 s with it, 2.3x; across the CP suite 2-3x).
PerzynaPlasticFlowRate (isoharden CUDA B=8192: 117 ms without the barrier vs 102 ms with it, 1.15x).
The barrier costs a real fusion opportunity at small batches: the same isoharden case at B=1024 is 90 ms transparent vs 100 ms opaque, so opaque is a net loss when the reduction redundancy doesn’t dominate.
opaque_pow is leaf-specific opt-in: profile the leaf, pick whichever is faster on the batches you care about.
- neml2.types.outer(a, b)[source]¶
Tensor product a ⊗ b of two SR2s, producing an SSR4 in Mandel packing.
- neml2.types.pow(a, n)[source]¶
Element-wise power. Calls torch.pow directly.
Transparent to Inductor: fuses with surrounding pointwise ops. The sensible default for new leaves. If profiling on a representative benchmark shows the pow being recomputed redundantly inside a fused reduction kernel (Triton can do this even when the reduction itself looks small), switch the call site to opaque_pow() and re-time.
- neml2.types.r2_from_sr2(s)[source]¶
Unpack an SR2 (Mandel) into a full R2 (..., 3, 3).
Matches the C++ R2(const SR2&) constructor / mandel_to_full.
NOTE: an earlier attempt rewrote this as s.data @ P_unpack (a (6, 9) projection matmul). The op count dropped from ~10 to 2, but the change regressed CUDA wall-time by ~9 % at small batch because Inductor doesn’t fuse a separate matmul kernel with the surrounding pointwise ops, while it does fuse the explicit select+stack chain into a single Triton kernel. The pointwise form is the deliberate CUDA-favourable choice; see the note in the README under the CP single-crystal bench.
- neml2.types.r2_from_wr2(w)[source]¶
Unpack a WR2 axial vector into a full skew-symmetric R2.
Matches R2(const WR2&) / skew_to_full. The convention is the same as R2::skew(Vec): W = [[0,-w2,w1],[w2,0,-w0],[-w1,w0,0]]. See r2_from_sr2() for why the explicit select+stack chain is preferred over a matmul-against-fixed-projection form (CUDA fusion).
- neml2.types.rotate(x, R)[source]¶
Rotate a typed tensor by an R2 rotation matrix.
Overloaded on the operand type:
SR2 -> SR2: sym(R S Rᵀ) packed back to Mandel.
WR2 -> WR2: skew(R W Rᵀ) packed back to an axial vector.
R2 -> R2: the full asymmetric R A Rᵀ (no projection).
SSR4 -> SSR4: the 6×6 Mandel basis rotation Q(R) T Q(R)ᵀ.
- neml2.types.skew(t)[source]¶
Skew part of an R2 packed as an axial vector.
$w0 = (m[2,1] - m[1,2]) / 2$, $w1 = (m[0,2] - m[2,0]) / 2$, $w2 = (m[1,0] - m[0,1]) / 2$ (matches the W = [[0,-w2,w1],[w2,0,-w0],[-w1,w0,0]] convention).
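The three component formulas and the W convention are inverse to each other, which a small nested-list sketch can confirm. Both helper names are hypothetical stand-ins for the typed skew() and r2_from_wr2() functions.

```python
def skew_axial(m):
    """Axial vector of the skew part of a 3x3 matrix m (nested lists)."""
    return [(m[2][1] - m[1][2]) / 2,
            (m[0][2] - m[2][0]) / 2,
            (m[1][0] - m[0][1]) / 2]


def axial_to_full(w):
    """Skew-symmetric matrix W with W v = w x v."""
    w0, w1, w2 = w
    return [[0.0, -w2, w1], [w2, 0.0, -w0], [-w1, w0, 0.0]]


w = [0.5, -1.0, 2.0]
print(skew_axial(axial_to_full(w)))                      # recovers w
print(skew_axial([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 5.0],
                  [3.0, 5.0, 6.0]]))                     # symmetric input -> zeros
```

A symmetric matrix has no skew part, so its axial vector is zero.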
- neml2.types.stack(views, dim=0)[source]¶
Stack values along a NEW axis inside a chosen region view.
Dispatches on view type:
DynamicBatchView / SubBatchView over a fixed-base TensorWrapper -> typed wrapper output.
_RegionView over a dynamic-base Tensor -> Tensor output.
All views must be the same kind over operands sharing region ndims and (apart from the new axis) data shape.
Example
>>> v0 = Vec.fill(6.0, 4.0, 0.0)
>>> v1 = Vec.fill(8.0, 5.0, 0.0)
>>> stack([v0.dynamic_batch, v1.dynamic_batch]).data.shape
torch.Size([2, 3])
- Parameters:
dim (int)
- neml2.types.sum(view, dims=0, keepdim=False)[source]¶
Sum over axes of a region view.
Two view families are supported:
TensorWrapper (t.dynamic_batch or t.sub_batch). When summing over a sub-batch axis with keepdim=False, the result’s sub_batch_ndim drops by the number of reduced axes. Returns the same wrapper type as the view’s underlying wrapper. When reducing over a sub-batch axis that’s currently stored "broadcast" (size 1 with logical extent in sub_batch_meta), the wrapper is materialised first so the sum sees every per-site copy.
Tensor base (t.base on a Tensor). dims=None means “sum over every base axis”, collapsing to base_ndim=0; an explicit dims reduces the named region-relative axes. keepdim follows torch semantics.
t.batch and TensorWrapper t.base are rejected: the former would straddle dynamic/sub-batch; the latter would change the wrapper type (use Tensor.base for dynamic-base reductions).
- neml2.types.sym(t)[source]¶
Symmetric part of an R2 packed in Mandel form.
Equivalent to $full_to_mandel((T + T^T) / 2)$: diagonals are kept as-is, off-diagonals carry the symmetric average times sqrt(2).
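A nested-list sketch of that packing, assuming the same s11 s22 s33 s23 s13 s12 component order that SR2.fill uses; `sym_mandel` is a hypothetical name.

```python
import math


def sym_mandel(T):
    """Mandel vector of (T + T^T) / 2 for a 3x3 matrix T (nested lists)."""
    root2 = math.sqrt(2.0)
    avg = lambda i, j: 0.5 * (T[i][j] + T[j][i])
    return [T[0][0], T[1][1], T[2][2],
            root2 * avg(1, 2), root2 * avg(0, 2), root2 * avg(0, 1)]


T = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
print(sym_mandel(T))  # diagonal 1, 5, 9; shear averages 7, 5, 3 times sqrt(2)
```

The diagonal passes through unchanged, while each shear slot holds the average of the two mirrored entries scaled by sqrt(2).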
- neml2.types.tr(A)[source]¶
Trace of a symmetric rank-2 tensor.
Matches neml2::tr(const SR2&) in src/neml2/tensors/functions/tr.cxx.
- neml2.types.unit(A, eps=0.0)[source]¶
Normalize $A$ by its Frobenius norm.
eps regularizes at A == 0.
- neml2.types.vec_component(v, i)[source]¶
Extract the i-th Scalar component of a Vec (i in 0, 1, 2).
Mirrors the C++ Vec::operator()(int) slot access used by leaves like VecComponents that decompose a Vec into per-axis Scalars. Preserves sub-batch metadata; works inside a leaf’s forward without dropping out of wrapper algebra.
- neml2.types.vec_from_scalars(s0, s1, s2)[source]¶
Assemble a Vec from three Scalar components.
Mirrors the C++ Vec::fill(Scalar, Scalar, Scalar) factory: stacks the three Scalar values along a fresh trailing axis to produce a (..., 3) Vec. All three inputs must share dtype/device; sub-batch alignment flows through align_sub_batch() so per-sub-batch and global Scalars combine cleanly.
- neml2.types.vol(A)[source]¶
Volumetric part of a symmetric rank-2 tensor: tr(A)/3 * I.
Matches neml2::vol(const SR2&) in src/neml2/tensors/functions/vol.cxx.
- neml2.types.where(c, a, b)[source]¶
Element-wise select, matching neml2::where.
- Parameters:
c (TensorWrapper)
a (_TW)
b (_TW)
- Return type:
_TW