Deterministic material model calibration

This is a complete example of using the pyzag bindings to NEML2 to calibrate a material model against experimental data. demo_model.i defines the constitutive model, which is a structural material model describing the evolution of strain and stress in the material under mechanical load. The particular demonstration is a fairly complex model where the material responds differently as a function of both temperature and strain rate. The material deformation is driven by an axial strain and all the other stress components are zero.

In this example we:

  1. Load the model in from an input file and wrap it for use in pyzag

  2. Setup a grid of “experimental” conditions spanning several strain rates and temperatures

  3. Replace the original model parameters with samples from a narrow normal distrubtion, centered on the orignial model mean, and run the model over the experimental conditions. This then becomes our synthetic input data.

  4. Replace the original model parameters with random initial guesses (taken from a very wide normal distribution around the true values).

  5. Setup the model calibration with a gradient-descent method by scaling the model parameters and resulting gradient values.

  6. Calibrate the model against the synthetic experimental data.

  7. Plot the results and print the calibrated parameter values, to see how close we can come to the true values.

The accuracy of the final model and the calibrated parameter values is heavily dependent on the choice of the normal distributions for the synthetic data and the initial parameter guesses. For narrow distributions for both the model can exactly recover the original parameter values. For wider distributions the calibrated model will not be exact, but will still accurately capture the mean of the synthetic tests.

import torch
import torch.distributions as dist
import neml2
from pyzag import nonlinear, reparametrization, chunktime
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

Setup the synthetic experimental conditions

Setup the loading conditions for the “experiments” we’re going to run. These will span several strain rates (nrate) and temperatures (ntemperature). Overall, we’ll run nbatch experiments. Also setup the maximum strain to pull the material through max_strain and the number of time steps we’re going to use for integration ntime.

nrate = 5
ntemperature = 5
nbatch = nrate * ntemperature
max_strain = 0.25
ntime = 100
rates = torch.logspace(-6, 0, nrate, device=device)
temperatures = torch.linspace(310.0, 1190.0, ntemperature, device=device)

Define the variability in the synthetic data and for our initial guess at the parameters

These control the variability in the synthetic data (actual_cov) and the variability of the initial guess at the parameter values (guess_cov)

actual_cov = 0.025
guess_cov = 0.2

Setup the actual model

This class is a thin wrapper around the underlying pyzag wrapper for NEML2. All it does is take the input conditions (time, temperature, and strain), combine them into a single tensor, call the pyzag wrapper, and return the stress.

class SolveStrain(torch.nn.Module):
    """Just integrate the model through some strain history

    Args:
        discrete_equations: the pyzag wrapped model
        nchunk (int): number of vectorized time steps
        rtol (float): relative tolerance to use for Newton's method during time integration
        atol (float): absolute tolerance to use for Newton's method during time integration
    """

    def __init__(self, discrete_equations, nchunk=1, rtol=1.0e-6, atol=1.0e-4):
        super().__init__()
        self.discrete_equations = discrete_equations
        self.nchunk = nchunk
        self.cached_solution = None
        self.rtol = rtol
        self.atol = atol

    def forward(self, time, temperature, loading, cache=False):
        """Integrate through some time/temperature/strain history and return stress
        Args:
            time (torch.tensor): batched times
            temperature (torch.tensor): batched temperatures
            loading (torch.tensor): loading conditions, which are the input strain in the first base index and then the stress (zero) in the remainder

        Keyword Args:
            cache (bool): if true, cache the solution and use it as a predictor for the next call.
                This heuristic can speed things up during inference where the model is called repeatedly with similar parameter values.
        """
        if cache and self.cached_solution is not None:
            solver = nonlinear.RecursiveNonlinearEquationSolver(
                self.discrete_equations,
                step_generator=nonlinear.StepGenerator(self.nchunk),
                predictor=nonlinear.FullTrajectoryPredictor(self.cached_solution),
                nonlinear_solver=chunktime.ChunkNewtonRaphson(rtol=self.rtol, atol=self.atol),
            )
        else:
            solver = nonlinear.RecursiveNonlinearEquationSolver(
                self.discrete_equations,
                step_generator=nonlinear.StepGenerator(self.nchunk),
                predictor=nonlinear.PreviousStepsPredictor(),
                nonlinear_solver=chunktime.ChunkNewtonRaphson(rtol=self.rtol, atol=self.atol),
            )

        # We could pass this in as input, but it's easy enough to do here
        control = torch.zeros_like(loading)
        control[..., 1:] = 1.0

        # Build the per-force typed-wrapper dict and pack into a SparseVector
        # whose layout matches the pyzag-wrapped model's force layout. v3
        # SparseVector takes a dict keyed by variable name (the v2 list-by-
        # fvars-order pattern was retired). ``.data`` on the assembled
        # typed Tensor surfaces the raw torch.Tensor that pyzag wants.
        forces = {
            "t": neml2.Scalar(time, 0),
            "temperature": neml2.Scalar(temperature, 0),
            "fixed_values": neml2.SR2(loading, 0),
            "control": neml2.SR2(control, 0),
        }
        forces = (
            neml2.SparseVector(self.discrete_equations.flayout, forces).assemble().tensors[0].data
        )
        state0 = torch.zeros(
            forces.shape[1:-1] + (self.discrete_equations.nstate,), device=forces.device
        )

        result = nonlinear.solve_adjoint(solver, state0, len(forces), forces)

        if cache:
            self.cached_solution = result.detach().clone()

        return result[..., 0:1]

Actually setup the model

Load the NEML model from disk, wrap it in both the pyzag wrapper and our thin wrapper class above. Exclude some of the model parameters we don’t want to calibrate.

We also call neml2.compile on the pyzag-wrapped model. This JIT-compiles the residual/Jacobian evaluation with torch.compile in-process (no AOT export or C++ round-trip), so each Newton and adjoint step in the calibration loop runs a fused compiled graph instead of eager Python. It is a transparent, in-place acceleration — the model is used exactly as before, and gradients still flow for adjoint training.

nmodel = neml2.load_nonlinear_system("demo_model.i", "eq_sys")
nmodel.to(device=device)
pmodel = neml2.pyzag.NEML2PyzagModel(
    nmodel,
    exclude_parameters=[
        # Elasticity coefficients are baked-in physical constants for
        # this demo; we don't calibrate them.
        "elasticity_E",
        "elasticity_nu",
        # Alternate yield surface used internally; its sy=0 literal
        # would also break Normal sampling below if not excluded.
        "yield_zero_sy",
        # Interpolation table abscissae are fixed (temperature knots);
        # only the ordinates are calibration targets.
        "R_abscissa",
        "d_abscissa",
        "mu_abscissa",
        # mu_ordinate is the known shear modulus, not calibrated.
        "mu_ordinate",
        # SR2LinearCombination's (+1, -1, 0) weights/offset are
        # definitional; the offset = 0 also breaks Normal sampling.
        "Eerate_weight_0",
        "Eerate_weight_1",
        "Eerate_offset",
    ],
)

# Accelerate the residual + Jacobian evaluation with in-process torch.compile
# (no AOTI artifact / C++ round-trip). pyzag owns the Newton solve and the
# adjoint; neml2.compile JIT-compiles the feed-forward residual model so every
# solve and adjoint step runs a fused compiled graph. Compilation happens on the
# first call and is reused across iterations, which speeds up the calibration
# loop below substantially (≈8x on GPU here). Autograd flows through the compiled
# graph, so adjoint training is unaffected.
neml2.compile(pmodel)

model = SolveStrain(pmodel)

Create the input tensors

Actually setup the full input tensors based on the parameters above

time = torch.zeros((ntime, nrate, ntemperature), device=device)
loading = torch.zeros((ntime, nrate, ntemperature, 6), device=device)
temperature = torch.zeros((ntime, nrate, ntemperature), device=device)
for i, rate in enumerate(rates):
    time[:, i] = torch.linspace(0, max_strain / rate, ntime, device=device)[:, None]
loading[..., 0] = torch.linspace(0, max_strain, ntime, device=device)[:, None, None]
for i, T in enumerate(temperatures):
    temperature[:, :, i] = T
time = time.reshape((ntime, -1))
temperature = temperature.reshape((ntime, -1))
loading = loading.reshape((ntime, -1, 6))

Replace the model parameters with random values

Sampled from a normal distribution controlled by the actual_cov parameter.

This controls the randomness in the input synthetic test data

# Replace with samples from normal
actual_parameter_values = {}
for n, p in model.named_parameters():
    actual_parameter_values[n] = p.data.detach().clone().cpu()
    ndist = dist.Normal(p.data, torch.abs(p.data) * actual_cov).expand((nbatch,) + p.shape)
    p.data = ndist.sample().to(device)

Run the model to generate the synthetic data

with torch.no_grad():
    data = model(time, temperature, loading)

Plot the synthetic data

plt.plot(loading.cpu()[..., 0], data[..., 0].cpu())
plt.xlabel("Strain (mm/mm)")
plt.ylabel("Stress (MPa)")
plt.title("Input data -- all conditions")
plt.show()
../../../../_images/aac03d844b7766a87bbdce542fab2b6b0f452ec25b6b2fc34f1419f6b6c633d5.png

Setup the model for training

Replace the parameter values with random initial guesses, with variability controlled by the guess_cov parameter.

# Now replace our original parameter with random values over a range
guess_parameter_values = {}
for n, p in model.named_parameters():
    p.data = torch.normal(
        actual_parameter_values[n], torch.abs(actual_parameter_values[n]) * guess_cov
    ).to(device)
    guess_parameter_values[n] = p.data.detach().clone()

Scale the model parameters

Our material model parameters have units. In general, the parameter values will have different magnitudes from each other, which affects the scale of the gradients. Unbalanced gradients in turn affect the convergence of gradient descent optimization methods.

Typically we’d scale the training data to fix this problem. However, again our data has units and a physical meaning we want to preserve.

As an alternative we can scale the parameter values themselves both to clip the values to a physical range and to scale the gradients and hopefully improve the convergence of the optimization step. We do that here, in a way that should be mostly invisible to the training algorithms.

# Scale to get better performance
A_scaler = reparametrization.RangeRescale(
    torch.tensor(-12.0, device=device), torch.tensor(-4.0, device=device)
)
B_scaler = reparametrization.RangeRescale(
    torch.tensor(-1.0, device=device), torch.tensor(-0.5, device=device)
)
C_scaler = reparametrization.RangeRescale(
    torch.tensor(-8.0, device=device), torch.tensor(-3.0, device=device)
)
R_scaler = reparametrization.RangeRescale(
    torch.tensor([0.0, 0.0, 0.0, 0.0], device=device),
    torch.tensor([500.0, 500.0, 500.0, 500.0], device=device),
)
d_scaler = reparametrization.RangeRescale(
    torch.tensor([0.01, 0.01, 0.01, 0.01], device=device),
    torch.tensor([50.0, 50.0, 50.0, 50.0], device=device),
)

# v3 names: VoceIsotropicHardening's saturated_hardening / saturation_rate
# come from ScalarLinearInterpolation, so the calibration knobs are the
# ordinates of those tables (R_ordinate / d_ordinate). The v2 fixture
# called them R_Y / d_Y.
model_reparameterizer = reparametrization.Reparameterizer(
    {
        "discrete_equations.A_value": A_scaler,
        "discrete_equations.B_value": B_scaler,
        "discrete_equations.C_value": C_scaler,
        "discrete_equations.R_ordinate": R_scaler,
        "discrete_equations.d_ordinate": d_scaler,
    },
    error_not_provided=True,
)
model_reparameterizer(model)

Run the model with the initial parameter values

Just to see how far away from the training data we starting from.

# Generate the initial results so we know where we are starting from
with torch.no_grad():
    initial_results = model(time, temperature, loading)
plt.plot(loading.cpu()[..., 0], data.cpu()[..., 0], "k-")
plt.plot(loading.cpu()[..., 0], initial_results.cpu()[..., 0], "k--")
plt.xlabel("Strain (mm/mm)")
plt.ylabel("Stress (MPa)")
plt.title("Initial comparison")
handles = [
    Line2D([], [], linestyle="-", color="k", label="Data"),
    Line2D([], [], linestyle="--", color="k", label="Initial results"),
]
plt.legend(handles=handles)
plt.show()
../../../../_images/93bc972f8cca7e724db270bc2a558998f7723bd902f09a0aacb232d5bcc1af9f.png

Calibrate the model against the synthetic data

Apply a fairly standard gradient-descent algorithm to adjust the model parameters to better match the synthetic data. Plot the loss versus iteration data from training.

niter = 200
lr = 5.0e-3
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
loss_fn = torch.nn.MSELoss()

# Print a status line every ``print_every`` iterations (plus the first and the
# last) instead of a live progress bar. A tqdm-style bar redraws itself in place
# with carriage returns, which a non-interactive execution (nbconvert / the docs
# build) captures as hundreds of repeated frames; periodic prints keep the
# committed notebook output clean and readable.
print_every = 20
loss_history = []
for i in range(niter):
    optimizer.zero_grad()
    res = model(time, temperature, loading, cache=True)
    loss = loss_fn(res, data)
    loss.backward()
    loss_history.append(loss.detach().clone().cpu())
    if i % print_every == 0 or i == niter - 1:
        print(f"Iteration {i + 1:4d}/{niter}    loss = {loss_history[-1]:.3e}")
    optimizer.step()

plt.loglog(loss_history, label="Training")
plt.xlabel("Iteration")
plt.ylabel("MSE")
plt.legend(loc="best")
plt.title("Loss history")
plt.show()
Iteration    1/200    loss = 2.835e+04
Iteration   21/200    loss = 1.242e+04
Iteration   41/200    loss = 1.047e+04
Iteration   61/200    loss = 9.665e+03
Iteration   81/200    loss = 9.441e+03
Iteration  101/200    loss = 9.372e+03
Iteration  121/200    loss = 9.347e+03
Iteration  141/200    loss = 9.333e+03
Iteration  161/200    loss = 9.323e+03
Iteration  181/200    loss = 9.315e+03
Iteration  200/200    loss = 9.309e+03
../../../../_images/db9498d9edc7fcc8368ec718ebff02c223c8e7605be2c0d92bc4e25d95132471.png

Plot the calibrated model predictions

See how accurately the calibrated model recovers the synthetic data.

plt.plot(loading.cpu()[..., 0], data.cpu()[..., 0], "k-")
plt.plot(loading.cpu()[..., 0], res.detach().cpu()[..., 0], "k--")
plt.xlabel("Strain (mm/mm)")
plt.ylabel("Stress (MPa)")
plt.title("Final comparison")
plt.show()
../../../../_images/90d4ff90a637895c707452740a1b8610bbd92dacf8256306cadaed0c26673ada.png