Ways to evaluate a model
A NEML2 model is authored once, in Python — a Model is a plain
torch.nn.Module composed from smaller registered pieces, and Python is the
only authoring surface. Once a model exists, the same model can be evaluated
through several different runtimes, depending on whether you are iterating
interactively, training, or deploying into a host application.
Note
If you are an end user of an application built on NEML2, you evaluate models through whatever interface that application exposes — you do not choose a runtime and can stop reading here. This page is for developers integrating NEML2 into their own Python, C++, or command-line workflow. The deployment guides (Python integration, C++ integration, CLI utilities) cover getting set up; the reference pages linked below cover each route’s evaluation API.
All runtimes operate on the same starting point: a
HIT input file that
names one or more models. The minimal example referenced throughout lives at
tutorials/models/running_your_first_model/input.i:
# Minimal hello-world NEML2 input file.
# A single linear isotropic elastic model named "elasticity":
#   E  = 200 GPa
#   nu = 0.3
# Maps a symmetric strain tensor (SR2) to a symmetric stress tensor (SR2).
[Models]
  [elasticity]
    type = LinearIsotropicElasticity
    coefficients = '200e3 0.3'
    coefficient_types = 'YOUNGS_MODULUS POISSONS_RATIO'
  []
[]
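To make the strain-to-stress mapping concrete, the sketch below evaluates the same constitutive law by hand in plain Python, independent of NEML2. It assumes the standard small-strain Hooke's law for an isotropic solid, sigma = lambda tr(eps) I + 2 mu eps, with the Lamé parameters derived from E and nu. The function names and the 3x3 nested-list representation are illustrative choices, not NEML2's API or its SR2 storage layout.

```python
def lame_parameters(youngs_modulus, poissons_ratio):
    """Convert (E, nu) to the Lame parameters (lambda, mu)."""
    E, nu = youngs_modulus, poissons_ratio
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu


def linear_isotropic_stress(strain, youngs_modulus=200e3, poissons_ratio=0.3):
    """Hooke's law: sigma = lam * tr(eps) * I + 2 * mu * eps.

    `strain` is a symmetric 3x3 tensor given as nested lists.
    Defaults mirror the coefficients in the input file above.
    """
    lam, mu = lame_parameters(youngs_modulus, poissons_ratio)
    trace = strain[0][0] + strain[1][1] + strain[2][2]
    return [
        [
            lam * trace * (1.0 if i == j else 0.0) + 2.0 * mu * strain[i][j]
            for j in range(3)
        ]
        for i in range(3)
    ]


# Uniaxial strain of 0.1 % along x, all other components zero.
strain = [[1e-3, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
stress = linear_isotropic_stress(strain)
```

The stress comes out in the same units as E. Every runtime described on this page should agree with a hand calculation like this one for the hello-world model, which makes it a convenient sanity check when switching routes.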
At a glance
Each runtime has a short codename of the form host-mode (py / cpp
crossed with eager / jit / aoti). Use it in bug reports and discussion
threads to say which path you’re on without a paragraph of description — “this
reproduces on cpp-aoti but not py-eager” is unambiguous.
Every runtime supports forward. They differ on sensitivities and on whether
they accept sub-batch models (e.g. crystal plasticity, which carries a per-slip
inner batch dimension):
| Codename | Entry point | Compile | Host | jvp / jacobian | Sub-batch | Primarily for |
|---|---|---|---|---|---|---|
| py-eager | | none | Python | ✓ | ✓ | dev, testing, autograd training |
| py-jit | | in-process JIT | Python | ✓ (native) | ✓ | pyzag training loops |
| py-aoti | | offline | Python (pybind) | ✓ | ✓ | compiled model from Python |
| cpp-aoti | | offline | C++ | ✓ | ✓ | production C++ deployment |
| cpp-dispatch | | offline | C++ | ✓ | ✓ | multi-device throughput |
| cpp-eager | | none | C++ + embedded Python | ✓ | ✗ | compile-free C++ tests |
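The table can also be read as data. The sketch below transcribes the compile, host, sensitivity, and sub-batch columns into plain Python so they can be queried. The `Runtime` dataclass and the helper names are illustrative assumptions, and each row's codename is matched from the route descriptions on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Runtime:
    codename: str        # "<host>-<mode>", e.g. "cpp-aoti"
    compile_step: str    # "none", "in-process JIT", or "offline"
    host: str
    sensitivities: bool  # supports jvp / jacobian
    sub_batch: bool      # accepts sub-batch models


# One entry per table row, in table order.
RUNTIMES = [
    Runtime("py-eager", "none", "Python", True, True),
    Runtime("py-jit", "in-process JIT", "Python", True, True),
    Runtime("py-aoti", "offline", "Python (pybind)", True, True),
    Runtime("cpp-aoti", "offline", "C++", True, True),
    Runtime("cpp-dispatch", "offline", "C++", True, True),
    Runtime("cpp-eager", "none", "C++ + embedded Python", True, False),
]


def split_codename(codename):
    """Split "<host>-<mode>" into its two parts."""
    host, mode = codename.split("-", 1)
    return host, mode


def runtimes_for(host_prefix, needs_sub_batch=False):
    """List codenames on a host ("py" or "cpp"), optionally sub-batch only."""
    return [
        r.codename
        for r in RUNTIMES
        if split_codename(r.codename)[0] == host_prefix
        and (r.sub_batch or not needs_sub_batch)
    ]


cpp_sub_batch = runtimes_for("cpp", needs_sub_batch=True)
```

Here `cpp_sub_batch` excludes `cpp-eager`, the only row with ✗ in the sub-batch column.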
The routes
Each route has its own reference page with the loading-and-calling API; the deployment guides cover the setup (install / build / artifacts) that comes first.
Python (set up with Python integration):
- py-eager (eager Python): load and call the model directly; the default for development, interactive work, and autograd training.
- py-jit (in-process torch.compile): neml2.compile accelerates the in-process graph, mainly for pyzag.
- py-aoti (compiled model from Python): load and run a compiled .pt2 package from Python.
C++ (set up with C++ integration):
- cpp-aoti (compiled model from C++): load a compiled .pt2 package via libneml2.so.
- cpp-dispatch (dispatching across devices): the same artifact, chunked across CPU + GPU(s).
- cpp-eager (eager evaluation from C++): run a model from its .i input file with no compile step (for C++ tests).
The three compiled routes (py-aoti, cpp-aoti, cpp-dispatch) share one
artifact — see AOTI packages for its format and
Compilation pipeline for how neml2-compile produces it. The command
line is a fourth way to drive a model with no code at all: CLI utilities.
Choosing a runtime
- Iterating, debugging, or training in Python → py-eager. Reach for py-jit (neml2.compile) only inside a pyzag loop where the residual is the hot path.
- Deploying into a C++ application → cpp-aoti. Switch to cpp-dispatch when you need to saturate multiple GPUs (or CPU + GPU) with one batched call.
- Calling a compiled model from Python, or reproducing C++ numerics without a NEML2 source dependency → py-aoti.
- C++ tests that can't pay the compile cost → cpp-eager.
- Sub-batch models (crystal plasticity and friends) run on every runtime except cpp-eager. If you need a compiled sub-batch model in C++, use cpp-aoti.
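The rules above can be condensed into a small decision helper. This is only a sketch: `choose_runtime` and its keyword arguments are hypothetical names invented for illustration, and the function returns this page's codenames rather than calling anything in NEML2.

```python
def choose_runtime(host, *, compiled=False, multi_device=False,
                   pyzag_hot_loop=False, sub_batch=False):
    """Map a use case to a runtime codename, following the guidance above.

    host: "python" or "cpp".
    compiled: True if you will run a compiled .pt2 package.
    multi_device: True if one batched call must span several devices.
    pyzag_hot_loop: True inside a pyzag loop where the residual dominates.
    sub_batch: True for sub-batch models such as crystal plasticity.
    """
    if host == "python":
        if compiled:
            return "py-aoti"
        return "py-jit" if pyzag_hot_loop else "py-eager"
    if host == "cpp":
        if compiled:
            return "cpp-dispatch" if multi_device else "cpp-aoti"
        if sub_batch:
            # cpp-eager is the one runtime that rejects sub-batch models.
            raise ValueError(
                "cpp-eager does not accept sub-batch models; "
                "compile the model and use cpp-aoti"
            )
        return "cpp-eager"
    raise ValueError(f"unknown host: {host!r}")


default_choice = choose_runtime("python")
production_choice = choose_runtime("cpp", compiled=True)
```

Raising on the uncompiled C++ sub-batch case, instead of silently returning another codename, keeps the one hard restriction visible to the caller.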
Runtimes vs. consumers
Several entry points are not runtimes themselves — they are consumers layered on one of the runtimes above:
- neml2-run <input.i> <driver> and the Driver classes (TransientDriver, ModelUnitTest, TransientRegression, Verification) step a model through a load history on py-eager. If the input file names an AOTIModel, the same driver runs on py-aoti instead.
- The pyzag adapter (NEML2PyzagModel) drives calibration on py-eager, optionally accelerated to py-jit via neml2.compile.
- neml2-inspect <input.i> <model> resolves and prints a model's input/output graph but does not evaluate it.
The full tool reference is in CLI utilities.
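As a rough illustration of the driver rule above, the sketch below guesses which runtime a driver would land on by scanning the text of an input file. It rests on an assumption that is not confirmed on this page: that an AOTI-backed model is declared with a `type = AOTIModel` line, by analogy with `type = LinearIsotropicElasticity` in the hello-world file. The function name `driver_runtime` and the second sample input are hypothetical, and the scan is a naive regular expression, not a HIT parser.

```python
import re

# Assumption: AOTI-backed models appear as a `type = AOTIModel` line.
_AOTI_TYPE = re.compile(r"^\s*type\s*=\s*'?AOTIModel'?\s*(#.*)?$", re.MULTILINE)


def driver_runtime(input_text):
    """Return "py-aoti" if the input names an AOTIModel, else "py-eager"."""
    return "py-aoti" if _AOTI_TYPE.search(input_text) else "py-eager"


# The hello-world model from this page: a plain eager model.
HELLO_WORLD = """
[Models]
  [elasticity]
    type = LinearIsotropicElasticity
    coefficients = '200e3 0.3'
    coefficient_types = 'YOUNGS_MODULUS POISSONS_RATIO'
  []
[]
"""

# Hypothetical and incomplete: a real AOTIModel block would need more options.
HYPOTHETICAL_AOTI = """
[Models]
  [compiled]
    type = AOTIModel
  []
[]
"""

eager_route = driver_runtime(HELLO_WORLD)
aoti_route = driver_runtime(HYPOTHETICAL_AOTI)
```

A check like this is only useful for labelling bug reports with the right codename; the drivers themselves decide the route when they load the file.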
See also
- Python integration / C++ integration / CLI utilities: set up NEML2 in a Python app, a C++ build, or from the shell.
- AOTI packages: the compiled-package format shared by the AOTI routes.
- Compilation pipeline: what neml2-compile does, stage by stage.
- Tutorials: end-to-end walkthroughs.
- Migration guides: what changed across NEML2 versions.