NEML2 2.0.0
StaticHybridScheduler Class Reference

A scheduler for multiple devices with static priority management.

Detailed Description

A scheduler for multiple devices with static priority management.

The devices can have different priorities, batch sizes, and capacities. The priorities are determined at construction time and remain unchanged throughout the lifetime of the scheduler.

#include <StaticHybridScheduler.h>


Classes

struct  DeviceStatus
 

Public Member Functions

 StaticHybridScheduler (const OptionSet &options)
 Construct from options.
 
void setup () override
 
void set_availability_calculator (std::function< double(const DeviceStatus &)>)
 Set a custom availability calculator.
 
const std::vector< DeviceStatus > & status () const
 
std::vector< Device > devices () const override
 Device options.
 
- Public Member Functions inherited from WorkScheduler
 WorkScheduler (const OptionSet &options)
 Construct a new WorkScheduler object.
 
void schedule_work (Device &, std::size_t &)
 Determine the device and batch size for the next dispatch.
 
void dispatched_work (Device, std::size_t)
 Update the scheduler with the dispatch of the last batch.
 
void completed_work (Device, std::size_t)
 Update the scheduler with the completion of the last batch.
 
void wait_for_completion ()
 Wait for all work to complete.
 
- Public Member Functions inherited from NEML2Object
 NEML2Object ()=delete
 
 NEML2Object (NEML2Object &&)=delete
 
 NEML2Object (const NEML2Object &)=delete
 
NEML2Object & operator= (NEML2Object &&)=delete
 
NEML2Object & operator= (const NEML2Object &)=delete
 
virtual ~NEML2Object ()=default
 
 NEML2Object (const OptionSet &options)
 Construct a new NEML2Object object.
 
const OptionSet & input_options () const
 
const std::string & name () const
 A readonly reference to the object's name.
 
const std::string & type () const
 A readonly reference to the object's type.
 
const std::string & path () const
 A readonly reference to the object's path.
 
const std::string & doc () const
 A readonly reference to the object's docstring.
 
template<typename T = NEML2Object>
const T * host () const
 Get a readonly pointer to the host.
 
template<typename T = NEML2Object>
T * host ()
 Get a writable pointer to the host.
 

Static Public Member Functions

static OptionSet expected_options ()
 Options for the scheduler.
 
- Static Public Member Functions inherited from WorkScheduler
static OptionSet expected_options ()
 Options for the scheduler.
 
- Static Public Member Functions inherited from NEML2Object
static OptionSet expected_options ()
 

Protected Member Functions

bool schedule_work_impl (Device &, std::size_t &) const override
 Pick the next device to dispatch work to.
 
void dispatched_work_impl (Device, std::size_t) override
 Update the scheduler with the dispatch of the last batch.
 
void completed_work_impl (Device, std::size_t) override
 Update the scheduler with the completion of the last batch.
 
bool all_work_completed () const override
 Check if all work has been completed.
 

Additional Inherited Members

- Protected Attributes inherited from WorkScheduler
std::condition_variable _condition
 Condition variable for the scheduling thread.
 

Constructor & Destructor Documentation

◆ StaticHybridScheduler()

StaticHybridScheduler ( const OptionSet & options)

Construct from options.

Parameters
options

Member Function Documentation

◆ all_work_completed()

bool all_work_completed ( ) const
override protected virtual

Check if all work has been completed.

Implements WorkScheduler.

◆ completed_work_impl()

void completed_work_impl ( Device ,
std::size_t  )
override protected virtual

Update the scheduler with the completion of the last batch.

Implements WorkScheduler.

◆ devices()

std::vector< Device > devices ( ) const
inline override virtual

Device options.

Reimplemented from WorkScheduler.

◆ dispatched_work_impl()

void dispatched_work_impl ( Device ,
std::size_t  )
override protected virtual

Update the scheduler with the dispatch of the last batch.

Implements WorkScheduler.

◆ expected_options()

OptionSet expected_options ( )
static

Options for the scheduler.

◆ schedule_work_impl()

bool schedule_work_impl ( Device & device,
std::size_t & n ) const
override protected virtual

Pick the next device to dispatch work to.

The function returns, through its reference arguments, the device and the number of batches to dispatch. A device is said to be available if (load + batch_size) <= capacity. Among the available devices, the one with the highest availability is chosen.

By default, the availability is the device's priority. A custom function can be set using set_availability_calculator().
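
The selection rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: StatusSketch is a hypothetical stand-in for DeviceStatus (whose members are not listed on this page), the device is identified by a plain integer, n is assumed to be the chosen device's batch size, and ties are assumed to go to the first device found.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for StaticHybridScheduler::DeviceStatus.
// The real struct's members are not shown on this page.
struct StatusSketch
{
  int device_id;          // hypothetical integer identifier for the device
  std::size_t batch_size; // amount of work dispatched at once
  std::size_t capacity;   // maximum load the device may carry
  double priority;        // static priority, fixed at construction
  std::size_t load;       // work currently in flight on the device
};

// Default availability: the device's static priority.
double default_availability(const StatusSketch & s) { return s.priority; }

// Mirrors the shape of schedule_work_impl(Device &, std::size_t &):
// returns true and fills the outputs if some device is available.
bool pick_device(const std::vector<StatusSketch> & devices,
                 const std::function<double(const StatusSketch &)> & availability,
                 int & device_id,
                 std::size_t & n)
{
  bool found = false;
  double best = 0.0;
  for (const auto & s : devices)
  {
    // A device is available only if one more batch fits within its capacity.
    if (s.load + s.batch_size > s.capacity)
      continue;
    const double score = availability(s);
    // Keep the available device with the highest availability (first one wins ties).
    if (!found || score > best)
    {
      found = true;
      best = score;
      device_id = s.device_id;
      n = s.batch_size;
    }
  }
  return found;
}
```

With two idle devices of priority 1.0 and 2.0, the sketch picks the second. Once that device is loaded to capacity, the first is picked instead, and when both are full the function returns false.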

Implements WorkScheduler.

◆ set_availability_calculator()

void set_availability_calculator ( std::function< double(const DeviceStatus &)> f)

Set a custom availability calculator.
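
As an illustration of what a custom calculator might compute, the sketch below scales the static priority by the fraction of capacity that is still free. LoadStatusSketch and its members are hypothetical stand-ins for DeviceStatus, and the weighting itself is only an example, not something the library prescribes.

```cpp
#include <cstddef>

// Hypothetical stand-in for DeviceStatus; member names are assumptions.
struct LoadStatusSketch
{
  std::size_t capacity; // maximum load the device may carry
  std::size_t load;     // work currently in flight
  double priority;      // static priority
};

// Example calculator: priority weighted by the free fraction of capacity,
// so a busy high-priority device can lose to an idle low-priority one.
double load_aware_availability(const LoadStatusSketch & s)
{
  if (s.capacity == 0)
    return 0.0; // avoid dividing by zero for a device with no capacity
  const double free_fraction =
      1.0 - static_cast<double>(s.load) / static_cast<double>(s.capacity);
  return s.priority * free_fraction;
}
```

A callable with the signature double(const DeviceStatus &), written against the real struct, is what set_availability_calculator() accepts.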

◆ setup()

void setup ( )
override virtual

The setup method retrieves from input options a device list, along with the batch sizes, capacities, and priorities for each device.

The device list should be non-empty and free of duplicates. kCPU can appear at most once. When multiple CUDA devices are present, each of them must correspond to a specific device ID.

One or more batch sizes should be provided. If only one batch size is given, it is used for all devices. Otherwise, the number of batch sizes should match the number of devices.

Similarly, zero or more capacities should be provided. If the capacity list is empty, the default capacities are the same as those used for batch sizes; if the number of capacities is one, the same capacity is associated with all devices; otherwise, the number of capacities should match the number of devices.
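
The expansion rules for batch sizes and capacities can be sketched as below. The helper names expand_per_device and resolve_capacities are hypothetical, and the sketch assumes that "the same as those used for batch sizes" means each device's default capacity equals its batch size. It illustrates the rules only and is not the code used by setup().

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Expand a per-device list: a single value is shared by all devices,
// otherwise the list must contain exactly one value per device.
std::vector<std::size_t>
expand_per_device(const std::vector<std::size_t> & values, std::size_t n_devices)
{
  if (values.size() == 1)
    return std::vector<std::size_t>(n_devices, values[0]);
  if (values.size() != n_devices)
    throw std::invalid_argument("expected one value or one value per device");
  return values;
}

// Capacities: an empty list falls back to the (already expanded) batch sizes;
// otherwise the same expansion rule as for batch sizes applies.
std::vector<std::size_t>
resolve_capacities(const std::vector<std::size_t> & capacities,
                   const std::vector<std::size_t> & expanded_batch_sizes)
{
  if (capacities.empty())
    return expanded_batch_sizes;
  return expand_per_device(capacities, expanded_batch_sizes.size());
}
```

For three devices, a batch size list of {4} expands to {4, 4, 4}, and an empty capacity list then resolves to the same values.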

An optional list of priorities can be provided. The number of priorities should match the number of devices. If no priorities are provided, all devices have the same priority. Note that the scheduler chooses the device to dispatch to based not only on its priority but also on its availability. See schedule_work_impl() for more details.

Note
For developers, below is a summary of how Device is constructed. Device represents a compute device on which a tensor is located. A device is uniquely identified by a type, which specifies the type of machine it is (e.g. CPU or CUDA GPU), and a device index or ordinal, which identifies the specific compute device when there is more than one of a certain type. The device index is optional, and in its defaulted state represents (abstractly) "the current device". Further, there are two constraints on the value of the device index, if one is explicitly stored:
  1. A negative index represents the current device; a non-negative index represents a specific, concrete device.
  2. When the device type is CPU, the device index must be zero.
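
The two index constraints can be expressed as a small check. DeviceSketch and DeviceTypeSketch below are hypothetical stand-ins written for illustration; they are not the Device type used by NEML2, and the sketch treats any negative index as "the current device".

```cpp
// Hypothetical stand-ins for the device type and device described above.
enum class DeviceTypeSketch { CPU, CUDA };

struct DeviceSketch
{
  DeviceTypeSketch type;
  int index; // negative: "the current device"; non-negative: a concrete device
};

// Constraint 1: only a non-negative index names a specific, concrete device.
bool has_concrete_index(const DeviceSketch & d) { return d.index >= 0; }

// Constraint 2: an explicitly stored CPU index must be zero.
bool is_valid(const DeviceSketch & d)
{
  if (!has_concrete_index(d))
    return true; // the defaulted "current device" is always acceptable
  if (d.type == DeviceTypeSketch::CPU)
    return d.index == 0;
  return true;
}
```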

Reimplemented from NEML2Object.

◆ status()

const std::vector< DeviceStatus > & status ( ) const
inline