template<typename I, typename O, typename Of = typename std::vector<O>, typename Ip = typename type_identity<I>::type, typename Op = typename type_identity<O>::type>
class neml2::WorkDispatcher< I, O, Of, Ip, Op >
The work dispatcher dispatches work to a worker and reduces the results.
- Warning
- The dispatcher is designed to be thread safe, but we are currently seeing some issues when the dispatcher interacts with torch::jit::tracer. We do not recommend using the dispatcher with torch::jit::tracer until the issue is resolved.
The work dispatcher coordinates with WorkGenerator and WorkScheduler to dispatch work. The work is generated/loaded by the WorkGenerator; the dispatch is scheduled by a WorkScheduler; the dispatching loop is managed by the WorkDispatcher.
The dispatcher also takes care of preprocessing, postprocessing, and reducing the work. In general, each work dispatch involves four steps:
- Work generation: The work generator generates the next n batches of work.
- Preprocessing: The dispatcher preprocesses the work.
- Do work: The worker performs the work.
- Postprocessing: The dispatcher postprocesses the result.
Once all the work has been completed and results have been collected, the dispatcher reduces the results to obtain the final result.
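The four steps and the final reduction can be sketched as a plain synchronous loop. This is only an illustration of the data flow, not the NEML2 implementation: `ToyGenerator`, `preprocess`, `do_work`, `postprocess`, `reduce`, and `dispatch_sync` are hypothetical names, and the concrete types chosen for `Ip`, `I`, `O`, `Op`, and `Of` are assumptions made so the example compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Concrete stand-ins for the five template parameters (assumed for illustration).
using Ip = std::vector<int>;   // work before preprocessing
using I = std::vector<double>; // preprocessed work handed to the worker
using O = double;              // result returned by the worker
using Op = double;             // result after postprocessing
using Of = double;             // final result after reduction

// Hypothetical generator: hands out the next n items of a fixed list.
class ToyGenerator
{
public:
  explicit ToyGenerator(std::vector<int> items) : _items(std::move(items)) {}

  bool has_more() const { return _pos < _items.size(); }

  Ip next(std::size_t n)
  {
    assert(n > 0); // a zero-sized batch would never make progress
    Ip batch;
    for (std::size_t i = 0; i < n && _pos < _items.size(); ++i)
      batch.push_back(_items[_pos++]);
    return batch;
  }

private:
  std::vector<int> _items;
  std::size_t _pos = 0;
};

// Toy callbacks: convert to double, sum the batch, double the sum, add everything up.
I preprocess(Ip && raw) { return I(raw.begin(), raw.end()); }
O do_work(I && work) { return std::accumulate(work.begin(), work.end(), 0.0); }
Op postprocess(O && out) { return 2.0 * out; }
Of reduce(std::vector<Op> && results)
{
  return std::accumulate(results.begin(), results.end(), 0.0);
}

// Sequential dispatch loop: generate, preprocess, work, postprocess, then reduce once.
Of dispatch_sync(ToyGenerator & gen, std::size_t n)
{
  std::vector<Op> results;
  while (gen.has_more())
  {
    Ip raw = gen.next(n);                           // 1. work generation
    I work = preprocess(std::move(raw));            // 2. preprocessing
    O out = do_work(std::move(work));               // 3. do work
    results.push_back(postprocess(std::move(out))); // 4. postprocessing
  }
  return reduce(std::move(results)); // reduction after all work is collected
}
```

With the items 1 through 5 and a batch size of 2, the batches are {1, 2}, {3, 4}, and {5}, so the toy pipeline reduces 6 + 14 + 10.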
Notes on threading: The dispatcher can run in synchronous or asynchronous mode.
- In synchronous mode, the dispatcher runs in the main thread and dispatches work sequentially. No additional threads are created.
- In asynchronous mode, the dispatcher creates a thread pool in which each thread continuously monitors the task queue. The main thread adds work to the task queue, and the threads in the pool pick up tasks and execute them. The main thread waits for all the work to complete before reducing the results.
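The asynchronous mode described above follows the familiar task-queue pattern. The following is a minimal sketch of that pattern using only the C++ standard library; `ToyPool` and `sum_of_squares_async` are hypothetical names and do not reflect how the dispatcher is actually implemented.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <numeric>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical thread pool: worker threads wait on a shared task queue.
class ToyPool
{
public:
  explicit ToyPool(std::size_t nthreads)
  {
    assert(nthreads > 0);
    for (std::size_t i = 0; i < nthreads; ++i)
      _threads.emplace_back([this] { worker_loop(); });
  }

  ~ToyPool()
  {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      _stop = true;
    }
    _task_ready.notify_all();
    for (auto & t : _threads)
      t.join();
  }

  // Called by the main thread to add a task to the queue.
  void submit(std::function<void()> task)
  {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      _tasks.push(std::move(task));
      ++_pending;
    }
    _task_ready.notify_one();
  }

  // Block until every submitted task has finished.
  void wait_all()
  {
    std::unique_lock<std::mutex> lock(_mutex);
    _all_done.wait(lock, [this] { return _pending == 0; });
  }

private:
  void worker_loop()
  {
    for (;;)
    {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(_mutex);
        _task_ready.wait(lock, [this] { return _stop || !_tasks.empty(); });
        if (_tasks.empty())
          return; // stop was requested and the queue is drained
        task = std::move(_tasks.front());
        _tasks.pop();
      }
      task(); // run the task outside the lock
      {
        std::lock_guard<std::mutex> lock(_mutex);
        --_pending;
      }
      _all_done.notify_all();
    }
  }

  std::mutex _mutex;
  std::condition_variable _task_ready;
  std::condition_variable _all_done;
  std::queue<std::function<void()>> _tasks;
  std::vector<std::thread> _threads;
  std::size_t _pending = 0;
  bool _stop = false;
};

// Main thread enqueues one task per value, waits, then reduces.
long sum_of_squares_async(const std::vector<int> & values, std::size_t nthreads)
{
  std::vector<long> results(values.size(), 0);
  ToyPool pool(nthreads);
  for (std::size_t i = 0; i < values.size(); ++i)
    pool.submit([&values, &results, i]
                { results[i] = static_cast<long>(values[i]) * values[i]; });
  pool.wait_all(); // all results are written before the reduction starts
  return std::accumulate(results.begin(), results.end(), 0L);
}
```

Each task writes to its own slot in `results`, so the only shared state that needs locking is the queue and the pending-task counter.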
Notes on coordination with the scheduler: The dispatcher communicates with the scheduler to schedule work and to notify the scheduler when work has been dispatched and completed.
- In synchronous mode, the dispatcher does not notify the scheduler when work has been dispatched (since the work is dispatched sequentially).
- In asynchronous mode, the dispatcher notifies the scheduler when work has been dispatched (i.e., when a task is added to the task queue). When the task is completed, the worker notifies the scheduler about work completion.
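The asynchronous notification pattern can be illustrated with a scheduler that merely counts work in flight. This is a sketch under stated assumptions: `ToyScheduler`, `on_dispatched`, `on_completed`, and `run_async` are hypothetical names rather than the WorkScheduler interface, and one thread is spawned per task purely for brevity (the dispatcher itself uses a thread pool).

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical scheduler that only tracks how much work is in flight.
class ToyScheduler
{
public:
  // Called by the dispatching (main) thread when a task is queued.
  void on_dispatched(std::size_t n)
  {
    std::lock_guard<std::mutex> lock(_mutex);
    _in_flight += n;
    _dispatched += n;
  }

  // Called by the worker thread once the task has finished.
  void on_completed(std::size_t n)
  {
    std::lock_guard<std::mutex> lock(_mutex);
    assert(_in_flight >= n); // completion must follow a dispatch
    _in_flight -= n;
    _completed += n;
  }

  std::size_t in_flight() const
  {
    std::lock_guard<std::mutex> lock(_mutex);
    return _in_flight;
  }

  std::size_t dispatched() const
  {
    std::lock_guard<std::mutex> lock(_mutex);
    return _dispatched;
  }

  std::size_t completed() const
  {
    std::lock_guard<std::mutex> lock(_mutex);
    return _completed;
  }

private:
  mutable std::mutex _mutex;
  std::size_t _in_flight = 0;
  std::size_t _dispatched = 0;
  std::size_t _completed = 0;
};

// The main thread reports each dispatch; each worker reports its own completion.
void run_async(ToyScheduler & scheduler, std::size_t ntasks)
{
  std::vector<std::thread> workers;
  for (std::size_t i = 0; i < ntasks; ++i)
  {
    scheduler.on_dispatched(1); // notify before the worker can possibly finish
    workers.emplace_back(
        [&scheduler]
        {
          // ... the actual work would run here ...
          scheduler.on_completed(1);
        });
  }
  for (auto & w : workers)
    w.join();
}
```

Reporting the dispatch before the worker starts keeps the in-flight count from ever dropping below zero, which is why the two notifications come from different threads.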
Notes on thread-device binding: Currently, the implementation assumes that each thread in the thread pool is bound to one device. Based on this assumption, dispatching work to a device is equivalent to dispatching work to a thread, which greatly simplifies the communication between the threads and the task queue. This assumption could be relaxed in the future if profiling shows that having multiple threads dispatch work to the same device is advantageous.
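One way to picture the one-thread-per-device assumption is to give every device its own worker thread and queue, so that choosing a device and choosing a thread become the same decision. The sketch below is illustrative only: `DeviceWorker` and `run_on_devices` are hypothetical names, and a device is represented by a plain integer index rather than a real device handle.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// One worker thread bound to one (hypothetical) device index, with its own queue.
class DeviceWorker
{
public:
  explicit DeviceWorker(int device) : _device(device), _thread([this] { loop(); }) {}

  ~DeviceWorker()
  {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      _stop = true;
    }
    _cv.notify_one();
    _thread.join(); // the loop drains the queue before exiting
  }

  // Dispatching to this device is just pushing onto this thread's queue.
  void push(std::function<void(int)> task)
  {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      _tasks.push_back(std::move(task));
    }
    _cv.notify_one();
  }

private:
  void loop()
  {
    for (;;)
    {
      std::function<void(int)> task;
      {
        std::unique_lock<std::mutex> lock(_mutex);
        _cv.wait(lock, [this] { return _stop || !_tasks.empty(); });
        if (_tasks.empty())
          return; // stop was requested and nothing is left to run
        task = std::move(_tasks.front());
        _tasks.pop_front();
      }
      task(_device); // always executed by the thread bound to _device
    }
  }

  int _device;
  std::mutex _mutex;
  std::condition_variable _cv;
  std::deque<std::function<void(int)>> _tasks;
  bool _stop = false;
  std::thread _thread; // declared last so the other members exist before it starts
};

// Send task i to device targets[i] and record which device actually ran it.
std::vector<int> run_on_devices(int ndevices, const std::vector<int> & targets)
{
  std::vector<int> ran_on(targets.size(), -1);
  {
    std::vector<std::unique_ptr<DeviceWorker>> workers;
    for (int d = 0; d < ndevices; ++d)
      workers.push_back(std::make_unique<DeviceWorker>(d));
    for (std::size_t i = 0; i < targets.size(); ++i)
    {
      assert(targets[i] >= 0 && targets[i] < ndevices);
      workers[static_cast<std::size_t>(targets[i])]->push(
          [&ran_on, i](int device) { ran_on[i] = device; });
    }
  } // worker destructors finish all queued tasks and join their threads
  return ran_on;
}
```

Because each queue has exactly one consumer, no task can migrate between devices; relaxing the assumption would mean several threads sharing one device's queue.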
- Template Parameters

| Parameter | Description |
| --- | --- |
| I | Input type of the preprocessed work (generated by the generator) |
| O | Output type of the result returned by the worker |
| Of | Output type of the final result (after reduction) |
| Ip | Input type of the work before preprocessing |
| Op | Output type of the result after postprocessing |