Related reading: NumPy manual on Broadcasting
NEML2's broadcasting rules
NEML2's broadcasting rules are built on top of NumPy's general broadcasting rules.
When operating on two tensors, NEML2 first aligns their shapes at the final batch dimension, then compares the batch sizes pairwise, starting with the trailing (i.e., rightmost) batch dimension and working its way left. Two batch dimensions are compatible (broadcastable) when
- Their sizes are equal,
- One of them has size one, or
- One of them does not exist.
If all dimensions are compatible, the two tensors are batch-broadcastable. Many operations in NEML2 require their operands to be batch-broadcastable; if they are not, an exception is thrown (typically only in Debug mode).
Per the third rule, the operands do not need to have the same number of batch dimensions in order to be batch-broadcastable. After broadcasting, each batch dimension is effectively expanded to the larger of the two sizes, as illustrated in the sketch below.
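Plain torch tensors obey the same right-to-left comparison, so a minimal sketch (with arbitrary shapes) can exercise each of the three rules:
Python @source:src3
import torch

a = torch.rand(2)        # shape (2,)
b = torch.rand(1000, 2)  # shape (1000, 2)

# Rightmost dimensions: 2 vs 2 -- equal (first rule).
# Next dimension: 1000 vs nothing -- one does not exist (third rule).
print((a + b).shape)  # torch.Size([1000, 2])

# A dimension of size one broadcasts against any size (second rule):
print((torch.rand(1000, 1) + torch.rand(2)).shape)  # torch.Size([1000, 2])
@endsource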
Note that we have not yet said anything about broadcasting rules for base dimensions, because the requirements on the operands' base shapes largely depend on the operation itself. Moreover, since the base shapes of all primitive tensor types are determined statically, no base shape check is needed at runtime.
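For instance, a symmetric second order tensor (SR2) always carries the static base shape (6,), so every leading dimension of the underlying storage is interpreted as a batch dimension. A small sketch (the batch shape (5, 3) is arbitrary):
Python @source:src4
import torch
from neml2.tensors import SR2

# The trailing dimension of size 6 matches SR2's static base shape;
# the leading (5, 3) dimensions are treated as batch dimensions.
a = SR2(torch.zeros(5, 3, 6, dtype=torch.float64))
print(a.batch.shape)  # batch shape: (5, 3)
print(a.base.shape)   # base shape:  (6,)
@endsource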
Example
The utility of broadcasting is best demonstrated with practical examples.
Suppose we are given two samples made of different materials, and for each sample we are given a number of strain measurements at different locations. Assuming both materials are homogeneous, we can use NEML2's broadcasting rules to perform the constitutive update for all measurements of both samples in a single batched operation.
C++ @source:src1
#include <iostream>

#include <torch/torch.h>

#include "neml2/tensors/SSR4.h"
#include "neml2/tensors/SR2.h"
#include "neml2/tensors/Scalar.h"

using namespace neml2;

int
main()
{
  set_default_dtype(kFloat64);

  // Each of the two samples has its own elastic constants
  auto youngs_modulus = Scalar::create({1e5, 2e5});
  auto poissons_ratio = Scalar::create({0.1, 0.2});
  auto C = SSR4::isotropic_E_nu(youngs_modulus, poissons_ratio);

  // 1000 strain measurements for each of the two samples; the trailing
  // dimension of size 6 is the static base shape of SR2
  auto strain = SR2(torch::rand({1000, 2, 6}, torch::kFloat64));

  auto stress = C * strain;

  std::cout << "     Shape of C: " << C.batch_sizes() << C.base_sizes() << std::endl;
  std::cout << "Shape of strain: " << strain.batch_sizes() << strain.base_sizes() << std::endl;
  std::cout << "Shape of stress: " << stress.batch_sizes() << stress.base_sizes() << std::endl;
}
@endsource
Output:
     Shape of C: (2; 6, 6)
Shape of strain: (1000, 2; 6)
Shape of stress: (1000, 2; 6)
Python @source:src2
import torch
from neml2.tensors import Scalar, SR2, SSR4

ns = 2     # number of samples (materials)
nm = 1000  # number of strain measurements per sample

# Each of the two samples has its own elastic constants
youngs_modulus = Scalar(torch.tensor([1e5, 2e5], dtype=torch.float64))
poissons_ratio = Scalar(torch.tensor([0.1, 0.2], dtype=torch.float64))
C = SSR4.isotropic_E_nu(youngs_modulus, poissons_ratio)

# nm strain measurements for each of the ns samples; the trailing
# dimension of size 6 is the static base shape of SR2
strain = SR2(torch.rand(nm, ns, 6, dtype=torch.float64))

stress = C * strain

print("     Shape of C:", C.batch.shape, C.base.shape)
print("Shape of strain:", strain.batch.shape, strain.base.shape)
print("Shape of stress:", stress.batch.shape, stress.base.shape)
@endsource
Output:
     Shape of C: (2; 6, 6)
Shape of strain: (1000, 2; 6)
Shape of stress: (1000, 2; 6)
To better understand broadcasting, let us try to apply the broadcasting rules manually:
- Align the shapes at the final batch dimension (i.e., align the semicolons):
     Shape of C:       (2; 6, 6)
Shape of strain: (1000, 2; 6)
- Examine batch sizes from right to left:
     Shape of C:       (2; 6, 6)
Shape of strain: (1000, 2; 6)
                        ▲
The sizes are equal – the two operands are compatible at the final batch dimension.
- Continue to the next batch dimension:
     Shape of C:       (2; 6, 6)
Shape of strain: (1000, 2; 6)
                  ▲
One of the sizes does not exist – the two operands are compatible at the second-to-last batch dimension.
- All batch dimensions are compatible, therefore the two operands are batch-broadcastable.
- The resulting batch shapes are effectively the "expand" of the two:
     Shape of C: (1000, 2; 6, 6)
                  ▲
                  expanded (without copying)
Shape of strain: (1000, 2; 6)
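The result of this manual exercise can be double-checked with torch's shape utility, which implements the same right-to-left rules. A small sketch, separate from the example above:
Python @source:src5
import torch

# Batch shapes of C and strain from the example
batch_C = (2,)
batch_strain = (1000, 2)

print(torch.broadcast_shapes(batch_C, batch_strain))  # torch.Size([1000, 2])
@endsource
This is exactly the batch shape of stress printed earlier: the constitutive update runs for all 1000 measurements of both samples at once, and since the expansion does not copy data, only a single (2; 6, 6) elasticity tensor is ever stored.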