neml2.models.model¶
Model — base class for Python-native NEML2 models.
Most subclasses declare a class-level hit = HitSchema(...); the base class
derives input_spec / output_spec (variable name → type) from its
input(...) / output(...) fields and provides a default from_hit.
Dynamic-I/O models may still set input_spec / output_spec directly and
override from_hit. Every subclass implements forward().
Unified forward contract:
# Pure forward — returns outputs only.
out = model(*inputs)
# Forward + Jacobian pushforward —
# ``v[in_name][leaf_name]`` is a typed wrapper with leading seed axis K:
# data shape ``(K, *B, *sub, *base_in)``.
# Returns ``(*outputs, v_out_dict)`` where ``v_out_dict[out_name][leaf_name]``
# is the output's typed wrapper with the same leading K. No explicit
# Jacobian block is materialised inside the chain-rule graph.
*vals, v_out = model(*inputs, v={"name": {"leaf": sensitivity_matrix, ...}})
# Forward + first- and second-order Jacobian pushforward (opt-in) —
# ``v2[in_name][seed_a][seed_b]`` is a typed wrapper with two leading seed
# axes ``(N_a, N_b, *B, *sub, *base_in)``.
# Returns ``(*outputs, v_out, v2_out)``. Only models that may appear inside
# a Normality wrap implement v2; callers passing v2 must also pass v.
*vals, v_out, v2_out = model(*inputs, v=..., v2=...)
Variable names in input_spec / output_spec are plain strings with no
hierarchical prefix, e.g. "strain", "stress", "plastic_strain".
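The calling convention above can be illustrated with a toy stand-in. This is a minimal sketch, not NEML2 code: `ToyScale` is a hypothetical class, plain floats replace typed wrappers, and the leading seed axis K is omitted.

```python
# Toy stand-in for the forward contract. ToyScale is a hypothetical name,
# plain floats replace typed wrappers, and the seed axis K is left out.
class ToyScale:
    input_spec = {"strain": float}
    output_spec = {"stress": float}

    def __init__(self, modulus):
        self.modulus = modulus

    def __call__(self, strain, v=None):
        stress = self.modulus * strain
        if v is None:
            return stress  # pure forward: outputs only
        # Push each incoming tangent through d(stress)/d(strain) = modulus.
        tangents = v.get("strain", {})
        v_out = {"stress": {leaf: self.modulus * t for leaf, t in tangents.items()}}
        return stress, v_out


model = ToyScale(modulus=3.0)
out = model(2.0)
*vals, v_out = model(2.0, v={"strain": {"strain": 1.0}})
```

Here `vals` collects the outputs and `v_out` is keyed first by output name, then by seed leaf, matching the shape of the dict described above.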
- neml2.models.model.ChainRuleAction¶
alias of
Callable[[…],TensorWrapper]
- class neml2.models.model.Model(**hit_values)[source]¶
Base class for Python-native NEML2 models with declared variable names.
Subclasses declare input_spec and output_spec as class-level dicts (static models) or as instance attributes set in __init__ (models whose output count depends on constructor arguments). input_spec key order matches forward()’s positional argument order; output_spec key order matches the return-tuple / return-value order.
When called without v, forward returns the outputs directly (a single typed wrapper or a tuple thereof). When called with v, it additionally returns the sensitivity dict as the final element of the tuple.
- SECTION: ClassVar[str] = 'Models'¶
HIT section every registered subclass belongs to. Inherited; subclasses that deliberately live elsewhere (none today) can override.
- SUPPORTS_SECOND_ORDER: bool = False¶
Opt-in flag: True iff this Model’s forward accepts v2 and vh kwargs and propagates them via apply_chain_rule_2(). Default False: most leaves only implement the first-order chain rule and don’t need v2. A leaf must set this to True if it may appear inside a Normality wrap (directly or transitively through a ComposedModel); Normality’s constructor walks the inner chain and raises if any leaf has this flag unset.
Plain attribute (not ClassVar) so ComposedModel can shadow it on instances based on whether all its children support v2.
- apply_chain_rule(v, output_name, actions, *, output=None)[source]¶
Apply local chain-rule actions and accumulate by seed leaf.
actions maps each input variable name to a function that transforms an incoming sensitivity block for that input into its contribution to output_name. Missing input/leaf sensitivities are structural zeros.
When two actions contribute to the same seed leaf but with different sub-batch structure (typical of a per-crystal output that mixes a global input, e.g. d, w, and a per-crystal input, e.g. dp, e), the accumulator pads the lower-ndim contribution with singleton axes at the start of its sub-batch region so the sum broadcasts correctly. This mirrors the C++ chain_rule helper’s du_dx_f.intmd_unsqueeze(...) step (src/neml2/tensors/functions/chain_rule.cxx).
When output (the leaf’s forward result wrapper) is supplied, each accumulated contribution is retagged with output.sub_batch_ndim so the action body never has to declare sub-batch metadata explicitly. This is the foundational-op equivalent of the C++ side encoding sub-batch entirely in data.shape: the leaf does math, the accumulator owns the metadata.
Tangents are ordinary typed wrappers with K as the leading batch dim. Seeds that arrive as raw tensors (tests / export seeding) are wrapped as the input variable’s type. Accumulation is plain typed + (align_sub_batch under the hood).
- Parameters:
v (dict[str, dict[str, TensorWrapper]])
output_name (str)
actions (Mapping[str, Callable[[...], TensorWrapper]])
output (TensorWrapper | None)
- Return type:
dict[str, dict[str, TensorWrapper]]
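The accumulate-by-seed-leaf behaviour can be sketched with scalars. This is an illustrative assumption-laden sketch, not the library's implementation: `accumulate_chain_rule` is a hypothetical helper, floats stand in for typed wrappers, and the sub-batch padding and retagging steps are omitted.

```python
def accumulate_chain_rule(v, output_name, actions):
    """Sum per-input contributions by seed leaf; absent entries add nothing."""
    acc = {}
    for input_name, action in actions.items():
        # A missing input behaves as a structural zero: nothing to iterate.
        for leaf, tangent in v.get(input_name, {}).items():
            contribution = action(tangent)
            acc[leaf] = acc[leaf] + contribution if leaf in acc else contribution
    return {output_name: acc}


# y = 2*a + 5*b, so the local actions scale incoming tangents by 2 and 5.
actions = {"a": lambda t: 2.0 * t, "b": lambda t: 5.0 * t}
# Both inputs depend on seed "x"; only "a" depends on seed "z".
v = {"a": {"x": 1.0, "z": 3.0}, "b": {"x": 10.0}}
v_out = accumulate_chain_rule(v, "y", actions)
only_a = accumulate_chain_rule({"a": {"x": 1.0}}, "y", actions)
```

Seed `x` receives contributions from both inputs, while seed `z` only receives the one routed through `a`.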
- apply_chain_rule_2(v, v2, output_name, actions_1, actions_2, vh=None)[source]¶
Propagate a second-order JVP through this leaf’s local Jacobian.
Implements $(g \circ f)''[a, b] = g''(f) \cdot (f'[a], f'[b]) + g' \cdot f''[a, b]$:
- The g'' term iterates input pairs (i, j) and combines incoming first-order tangents v[i][a] (slot 1) and vh[j][b] (slot 2) into a two-leading-axis typed output tangent via actions_2[(i, j)]. When vh is None it defaults to v (the original symmetric all-pairs behaviour). When vh is provided, only (v[i] × vh[j]) pairs are iterated; Normality uses this to compute Hessian-applied-to-outer directly without materialising the full Hessian.
- The g' term applies the existing first-order action to incoming second-order tangents v2[i][a][b]. It re-uses actions_1 so the inner-input → outer-seed contraction matches the first-order path exactly.
Missing input pairs in actions_2 are treated as g''=0 (this leaf has no second-order term for that pair); missing entries in v / vh / v2 are structural zeros. The resulting dict carries one outer key (output_name).
- Parameters:
- Return type:
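The second-order chain-rule formula for this method can be checked on scalars. The sketch below assumes f(x) = x² upstream and g(y) = y³ as the local leaf with unit seeds; `second_order_pushforward` is a hypothetical helper, not part of the API.

```python
def second_order_pushforward(g1, g2, f1_a, f1_b, f2_ab):
    """Scalar form of g''(f) * f'[a] * f'[b] + g'(f) * f''[a, b]."""
    return g2 * f1_a * f1_b + g1 * f2_ab


x = 2.0
f = x ** 2           # upstream value f(x) = x**2
f1 = 2.0 * x         # incoming first-order tangent f'[a] = f'[b]
f2 = 2.0             # incoming second-order tangent f''[a, b]
g1 = 3.0 * f ** 2    # local first derivative of g(y) = y**3
g2 = 6.0 * f         # local second derivative of g

result = second_order_pushforward(g1, g2, f1, f1, f2)
expected = 30.0 * x ** 4  # second derivative of (g o f)(x) = x**6
```

The first term corresponds to what `actions_2` contributes and the second to re-applying `actions_1` to the incoming second-order tangent.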
- call_by_name(state)[source]¶
Call forward() (pure, no v) with values keyed by variable name.
Accepts typed wrappers (preferred per rule 1) or raw tensors (wrapped via the input_spec for caller convenience). Always returns typed wrappers, so consumers never have to re-attach metadata.
- Parameters:
state (Mapping[str, TensorWrapper | Tensor])
- Return type:
- declare_typed_buffer(name, spec, type_cls, *, factory=None)[source]¶
Resolve spec as a constant value and register it as a typed buffer.
Buffer-flavored sibling of declare_typed_parameter(). Accepts the same literal / [Tensors] cross-ref spec shapes (modes 1 and 2) but not the input-promotion modes (3 / 4): a buffer is a constant baked into the model, so promoting it to a chain-rule input would contradict its semantics.
Resolution order:
- TensorWrapper / torch.Tensor / float / int: wrap as type_cls and register via register_typed_buffer().
- str:
  - Try to parse as a whitespace-separated list of floats; if that succeeds, register as a typed buffer (HIT literal).
  - If a factory is available, try factory.get_tensor(spec); if that succeeds, register as a typed buffer ([Tensors] cross-ref).
Raises ValueError on any string spec that resolves to neither a literal nor a [Tensors] entry.
- Parameters:
name (str)
type_cls (type[TensorWrapper])
factory (_NativeInputFile | None)
- Return type:
None
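The resolution order for constant specs can be sketched in plain Python. This is a hedged illustration only: `resolve_constant_spec` is a hypothetical function, a dict stands in for the factory's [Tensors] lookup, and lists of floats replace typed wrappers.

```python
def resolve_constant_spec(spec, tensors=None):
    """Return (mode, values) following the literal-then-cross-reference order.

    `tensors` is a hypothetical dict standing in for the factory lookup.
    """
    if isinstance(spec, (int, float)):
        return 1, [float(spec)]  # already-loaded numeric value
    if not isinstance(spec, str):
        raise TypeError(f"unsupported spec type: {type(spec).__name__}")
    tokens = spec.split()
    if tokens:
        try:
            return 1, [float(tok) for tok in tokens]  # literal list of floats
        except ValueError:
            pass  # not a literal; fall through to the cross-reference lookup
    if tensors is not None and spec in tensors:
        return 2, tensors[spec]  # named cross-reference
    raise ValueError(f"{spec!r} is neither a float literal nor a known tensor name")


literal = resolve_constant_spec("1 2.5 3")
cross_ref = resolve_constant_spec("E", tensors={"E": [210000.0]})
```

A string that is neither a literal nor a known name raises `ValueError`, matching the buffer case where no input promotion is allowed.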
- declare_typed_parameter(name, spec, type_cls, *, factory=None, allow_nonlinear=False)[source]¶
Resolve spec and register it as a parameter or promote it to an input.
Python mirror of C++ ParameterStore::declare_parameter (src/neml2/models/ParameterStore.cxx). Resolution order:
- TensorWrapper / torch.Tensor / float / int: wrap as type_cls and call register_typed_parameter() (mode 1 with an already-loaded literal/batched value).
- str:
  - Try to parse as a whitespace-separated list of floats; if that succeeds, register as a typed parameter (mode 1, literal HIT value).
  - If a factory is available, try factory.get_tensor(spec); if that succeeds, register as a typed parameter (mode 2, [Tensors] cross-ref).
  - If allow_nonlinear: parse the string as a variable specifier (model_name / model_name.var / var). If it matches a [Models] entry, pull the provider and record the input promotion + provider in _nl_params (mode 3). Otherwise treat the string as a bare variable name and add the input without a provider (mode 4).
The host model’s input_spec is extended in modes 3 + 4 with an entry keyed by the chosen input variable name (the provider’s output name in mode 3, or the bare variable name in mode 4), appended after the fixed structural inputs. Inside forward, declared as forward(self, <structural inputs...>, *nl_params), fetch the value with _get_param(), which resolves a static slot from self or a promoted slot from the *nl_params pack uniformly.
- Parameters:
name (str)
type_cls (type[TensorWrapper])
factory (_NativeInputFile | None)
allow_nonlinear (bool)
- Return type:
None
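The input-promotion bookkeeping for modes 3 and 4 can be sketched as follows. Everything here is an assumption made for illustration: `promote_parameter` is a hypothetical function, `models` is a plain dict mapping a model name to its output variable name, and dict records stand in for NLParam.

```python
def promote_parameter(spec, models, input_spec, nl_params):
    """Record a string spec as a promoted input and return its mode (3 or 4).

    `models` is a hypothetical dict: model name -> output variable name.
    """
    model_name, _, var = spec.partition(".")
    if model_name in models:
        mode, provider = 3, model_name
        input_name = var or models[model_name]  # provider's output name
    else:
        mode, provider, input_name = 4, None, spec  # bare variable name
    # tail_index counts the parameters promoted before this one.
    nl_params.append(
        {"input_name": input_name, "tail_index": len(nl_params), "provider": provider}
    )
    input_spec[input_name] = float  # appended after the structural inputs
    return mode


input_spec = {"strain": float}
nl_params = []
models = {"hardening": "flow_stress"}
mode_a = promote_parameter("hardening", models, input_spec, nl_params)
mode_b = promote_parameter("temperature", models, input_spec, nl_params)
```

After both calls the promoted inputs follow the structural input `strain` in declaration order, which is what makes a simple positional index into the `*nl_params` pack sufficient.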
- classmethod from_hit(node, factory)[source]¶
Construct this model from its declarative HitSchema.
Every schema field (options, dependencies, input/output renames, var_inputs lists, and parameters) flows through _store_schema_values during construction; a leaf whose only state is its schema needs no __init__ at all. Models with dynamic I/O or non-trivial construction logic may still override this method.
- Parameters:
node (nmhit.Node)
factory (_NativeInputFile)
- Return type:
Any
- input_spec: dict[str, type[TensorWrapper]]¶
- output_priorities: dict[str, str | None] = {}¶
HIT-bound output name → priority claim ("high" / "low" / None) sourced from each output() field’s priority= kwarg. The DependencyResolver reads this to lift the duplicate-provider error when sibling models provide the same name with disambiguating priorities, and to add low → default → high ordering edges so the highest-priority writer runs last. Names absent from the dict default to None.
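The low → default → high ordering can be pictured with a stable sort. This is only a sketch of the resulting order: the resolver is described as adding ordering edges to a dependency graph, whereas `order_writers` below is a hypothetical helper that ranks writers directly.

```python
# Rank each priority claim; None is the default priority.
PRIORITY_RANK = {"low": 0, None: 1, "high": 2}


def order_writers(claims):
    """Sort (model_name, priority) pairs so the highest-priority writer runs last."""
    ranked = sorted(claims, key=lambda claim: PRIORITY_RANK[claim[1]])
    return [name for name, _ in ranked]


claims = [("override", "high"), ("base", None), ("fallback", "low")]
run_order = order_writers(claims)
```

Because the last writer wins, placing the `"high"` claim at the end means its value is the one that survives.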
- output_spec: dict[str, type[TensorWrapper]]¶
- propagate_tangents(v, output_name, actions_1, *, output=None, v2=None, actions_2=None, vh=None)[source]¶
Dispatch v / v2 / vh through the local chain-rule actions.
Wraps the boilerplate every second-order-aware leaf otherwise has to spell out: call apply_chain_rule() for v (always), apply_chain_rule_2() for v2 / vh (when requested), and return the right-length tuple. The return shape mirrors what the leaf was asked for:
- v2 is None and vh is None → (v_out,)
- v2 is set, vh is None → (v_out, v2_out)
- vh is set (v2 may be None, treated as {}) → (v_out, v2_out, vh_out)
Linear leaves (LinearCombination, YieldFunction, …) call this with no actions_2; the second-order pass then collapses to applying actions_1 to v2 entries (g'' = 0). Non-linear leaves (SR2Invariant, …) pass an explicit actions_2 map.
Usage:
return out, *self.propagate_tangents(
    v, self._to, actions_1, output=out, v2=v2, vh=vh
)
- Parameters:
v (dict[str, dict[str, TensorWrapper]])
output_name (str)
actions_1 (Mapping[str, Callable[[...], TensorWrapper]])
output (TensorWrapper | None)
actions_2 (Mapping[tuple[str, str], Callable[[...], TensorWrapper]] | None)
vh (dict[str, dict[str, TensorWrapper]] | None)
- Return type:
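The return-length rule for this method can be sketched in isolation. `pack_tangents` is a hypothetical helper that only reproduces the tuple-shape dispatch described above; it does not compute any tangents.

```python
def pack_tangents(v_out, v2_out=None, vh_out=None, *, v2=None, vh=None):
    """Return a tuple whose length mirrors which tangent kwargs were requested."""
    if vh is not None:
        return (v_out, v2_out, vh_out)  # vh requested: always three entries
    if v2 is not None:
        return (v_out, v2_out)  # second-order only
    return (v_out,)  # first-order only


first_only = pack_tangents({"y": {}})
with_v2 = pack_tangents({"y": {}}, {"y": {}}, v2={})
with_vh = pack_tangents({"y": {}}, {"y": {}}, {"y": {}}, vh={})

# Star-unpacking keeps the leaf's return statement uniform across all cases.
out, *rest = 1.0, *first_only
```

Note that an empty dict counts as "set"; only `None` means the tangent was not requested.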
- register_typed_buffer(name, value, persistent=True)[source]¶
Register a typed tensor buffer (no autograd; baked as a constant by AOTI export).
- Parameters:
name (str)
value (TensorWrapper)
persistent (bool)
- Return type:
None
- register_typed_parameter(name, value)[source]¶
Register a typed tensor as a calibration-tracked nn.Parameter.
Mirrors register_typed_buffer() but stores via nn.Module.register_parameter(), so the value appears in model.parameters() and PyTorch autograd flows through it in eager mode. AOTI export converts these back to constants before tracing (see aoti_export._freeze_parameters_to_buffers); the forward-only AOTI graph is unchanged.
- Parameters:
name (str)
value (TensorWrapper)
- Return type:
None
- class neml2.models.model.NLParam(input_name, tail_index, provider=None, provider_output=None)[source]¶
Bases: object
Marker for a parameter resolved to a runtime input (modes 3 + 4).
Records the input variable name added to the host’s input_spec, the parameter’s position within forward’s *nl_params pack, and, for mode 3, the provider model + its output variable name so the parent ComposedModel can auto-pull the provider into the dependency graph (mirroring the C++ _nl_params bookkeeping in ParameterStore.cxx::resolve_tensor_name).
tail_index is the zero-based slot of this parameter inside the *nl_params pack passed to Model._get_param(). Promoted parameters are appended to input_spec in declaration order, immediately after the fixed structural inputs, so this index is simply the number of parameters already promoted when this one was declared.
For mode 4 (no provider, pure input promotion), provider is None.
- neml2.models.model.SecondOrderChainRuleAction¶
alias of
Callable[[…],TensorWrapper]
- neml2.models.model.SecondOrderChainRuleDict¶
- neml2.models.model.TangentAction¶
alias of
TensorWrapper
- neml2.models.model.register_submodule(parent, child, fallback, *, used=None)[source]¶
Add child to parent under its HIT block name if available.
The factory stamps _hit_name on every object it constructs; preferring that name over an opaque attribute slot keeps named_parameters() readable (elasticity.E instead of _residual_model.E). Falls back to fallback when the HIT name is missing (direct Python construction), is not a valid Python identifier, would collide with an existing attribute on parent, or is already in used (when a parent registers several children in one pass and must avoid collisions across siblings).
Returns the attribute name the child was registered under.
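The name-selection rules can be sketched without any NEML2 or PyTorch objects. This is a hypothetical re-implementation for illustration: `register_child` is not the library function, `SimpleNamespace` objects stand in for modules, and updating `used` in place is an assumption.

```python
from types import SimpleNamespace


def register_child(parent, child, fallback, *, used=None):
    """Attach child to parent and return the attribute name that was chosen."""
    name = getattr(child, "_hit_name", None)
    usable = (
        isinstance(name, str)
        and name.isidentifier()  # must be a valid Python identifier
        and not hasattr(parent, name)  # must not collide on the parent
        and (used is None or name not in used)  # must not collide across siblings
    )
    chosen = name if usable else fallback
    setattr(parent, chosen, child)
    if used is not None:
        used.add(chosen)  # assumption: the caller tracks sibling names in a set
    return chosen


parent = SimpleNamespace()
first = register_child(parent, SimpleNamespace(_hit_name="elasticity"), "_model_0")
clash = register_child(parent, SimpleNamespace(_hit_name="elasticity"), "_model_1")
invalid = register_child(parent, SimpleNamespace(_hit_name="my-model"), "_model_2")
unnamed = register_child(parent, SimpleNamespace(), "_model_3")
```

Only the first child keeps its HIT name; the other three fall back for the collision, invalid-identifier, and missing-name reasons respectively.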