Processors

Dagger contains a flexible mechanism to represent CPUs, GPUs, and other devices that the scheduler can place user work on. The indiviual devices that are capable of computing a user operation are called "processors", and are subtypes of Dagger.Processor. Processors are automatically detected by Dagger at scheduler initialization, and placed in a hierarchy reflecting the physical (network-, link-, or memory-based) boundaries between processors in the hierarchy. The scheduler uses the information in this hierarchy to efficiently schedule and partition user operations.

Hardware capabilities, topology, and data locality

The processor hierarchy is modeled as a multi-root tree, where each root is an OSProc, which represents a Julia OS process, and the "children" of the root or some other branch in the tree represent the processors which reside on the same logical server as the "parent" branch. All roots are connected to each other directly, in the common case. The processor hierarchy's topology is automatically detected and elaborated by callbacks in Dagger, which users may manipulate to add detection of extra processors.

Movement of data between any two processors is decomposable into a sequence of "moves" between a child and its parent, termed a "generic path move". Movement of data may also take "shortcuts" between nodes in the tree which are not directly connected if enabled by libraries or the user, which may make use of IPC mechanisms to transfer data more directly and efficiently (such as Infiniband, GPU RDMA, NVLINK, etc.). All data is considered local to some processor, and may only be operated on by another processor by first doing an explicit move operation to that processor.

A move between a given pair of processors is implemented as a Julia function dispatching on the types of each processor, as well as the type of the data being moved. Users are permitted to define custom move functions to improve data movement efficiency, perform automatic value conversions, or even make use of special IPC facilities. Custom processors may also be defined by the user to represent a processor type which is not automatically detected by Dagger, such as novel GPUs, special OS process abstractions, FPGAs, etc.

Future: Network Devices and Topology

In the future, users will be able to define network devices attached to a given processor, which provides a direct connection to a network device on another processor, and may be used to transfer data between said processors. Data movement rules will most likely be defined by a similar (or even identical) mechanism to the current processor move mechanism. The multi-root tree will be expanded to a graph to allow representing these network devices (as they may potentially span non-root nodes).

Redundancy

Fault Tolerance

Dagger has a single means for ensuring redundancy, which is currently called "fault tolerance". Said redundancy is only targeted at a specific failure mode, namely the unexpected exit or "killing" of a worker process in the cluster. This failure mode often presents itself when running on a Linux and generating large memory allocations, where the Out Of Memory (OOM) killer process can kill user processes to free their allocated memory for the Linux kernel to use. The fault tolerance system mitigates the damage caused by the OOM killer performing its duties on one or more worker processes by detecting the fault as a process exit exception (generated by Julia), and then moving any "lost" work to other worker processes for re-computation.

Future: Multi-master, Network Failure Correction, etc.

This single redundancy mechanism helps alleviate a common issue among HPC and scientific users, however it does little to help when, for example, the master node exits, or a network link goes down. Such failure modes require a more complicated detection and recovery process, including multiple master processes, a distributed and replicated database such as etcd, and checkpointing of the scheduler to ensure an efficient recovery. Such a system does not yet exist, but contributions for such a change are desired.