Scheduler Visualization with DaggerWebDash

When working with Dagger, and especially with its scheduler, it can be helpful to visualize what Dagger is doing internally. To assist with this, the DaggerWebDash.jl package provides a web dashboard. The dashboard uses a web server running within each Dagger worker, together with event logging information, to expose details about the scheduler. Worker and processor saturation, memory allocations, profiling traces, and more are presented as easy-to-interpret plots.

Using the dashboard is straightforward. If you run Dagger's benchmarking script with the BENCHMARK_RENDER environment variable set to webdash, the dashboard is enabled automatically. This is the easiest way for new users to get started.
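As a minimal sketch of setting that variable from Julia, the helper below builds a command that launches a script in a fresh Julia process with BENCHMARK_RENDER=webdash added to the inherited environment. The helper name `webdash_benchmark_cmd` and the script path are hypothetical and not part of Dagger. The benchmarking script may also expect other settings that are not shown here. `addenv` requires Julia 1.6 or later.

```julia
# Build a command that runs `script` in a new Julia process with
# BENCHMARK_RENDER=webdash added on top of the current environment.
# (Hypothetical helper, not part of Dagger or DaggerWebDash.)
function webdash_benchmark_cmd(script::AbstractString)
    return addenv(`$(Base.julia_cmd()) $script`, "BENCHMARK_RENDER" => "webdash")
end

# Assumed path; adjust to wherever the benchmarking script lives in your checkout.
cmd = webdash_benchmark_cmd("benchmarks/benchmark.jl")

# run(cmd)  # uncomment to actually launch the benchmark
```

Building the command separately from running it makes it easy to inspect the environment before launching anything.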

For manual usage, the following snippet of code will suffice:

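using Dagger, DaggerWebDash
# Assumption: the plot types used below (GanttPlot, LinePlot, GraphPlot) are in scope;
# if they are not exported, qualify them, e.g. `DaggerWebDash.GanttPlot`.

ctx = Context() # or `ctx = Dagger.Sch.eager_context()` for eager API usage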
ml = Dagger.MultiEventLog()

## Add some logging events of interest

ml[:core] = Dagger.Events.CoreMetrics()
ml[:id] = Dagger.Events.IDMetrics()
ml[:timeline] = Dagger.Events.TimelineMetrics()
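# ... (also register any other events that the plots below refer to,
# such as :esat, :psat, :wsat, :loadavg, :bytes, and :mem)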

# (Optional) Enable profile flamegraph generation with ProfileSVG
ml[:profile] = DaggerWebDash.ProfileMetrics()
ctx.profile = true

# Create a LogWindow; necessary for real-time event updates
# (20*10^9 is the window length, presumably in nanoseconds, i.e. 20 seconds)
lw = Dagger.Events.LogWindow(20*10^9, :core)
ml.aggregators[:logwindow] = lw

# Create the D3Renderer server on port 8080
d3r = DaggerWebDash.D3Renderer(8080)

## Add some plots! Rendered top-down in order

# Show an overview of all generated events as a Gantt chart
push!(d3r, GanttPlot(:core, :id, :timeline, :esat, :psat, "Overview"))

# Show various numerical events as line plots over time
push!(d3r, LinePlot(:core, :wsat, "Worker Saturation", "Running Tasks"))
push!(d3r, LinePlot(:core, :loadavg, "CPU Load Average", "Average Running Threads"))
push!(d3r, LinePlot(:core, :bytes, "Allocated Bytes", "Bytes"))
push!(d3r, LinePlot(:core, :mem, "Available Memory", "% Free"))

# Show a graph rendering of compute tasks and data movement between them
# Note: Profile events are ignored if absent from the log
push!(d3r, GraphPlot(:core, :id, :timeline, :profile, "DAG"))

# TODO: Not yet functional
#push!(d3r, ProfileViewer(:core, :profile, "Profile Viewer"))

# Add the D3Renderer as a consumer of special events generated by LogWindow
push!(lw.creation_handlers, d3r)
push!(lw.deletion_handlers, d3r)

# D3Renderer is also an aggregator
ml.aggregators[:d3r] = d3r

ctx.log_sink = ml
# ... use `ctx`

Once the server has started, you can browse to http://localhost:8080/ (if running on your local machine) to view the plots in real time. Controls at the top of the page let you adjust the drawing speed, enable or disable reading updates from the server (disabling freezes the display at the current instant), and select which worker to look at. If the connection to the server is lost for any reason, the dashboard attempts to reconnect at 5-second intervals. The dashboard usually survives server restarts, although refreshing the page afterwards is a good idea. Informational messages are also logged to the browser console for debugging.
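If the page does not load, it can help to confirm that something is actually listening on the port before opening the browser. The following is a minimal sketch using only Julia's Sockets standard library. The helper `wait_for_port` is hypothetical and not part of Dagger or DaggerWebDash. Its default 5-second retry interval only mirrors the dashboard's reconnect behaviour described above.

```julia
using Sockets

# Return true as soon as a TCP connection to host:port succeeds.
# Tries up to `attempts` times, sleeping `interval` seconds between tries,
# and returns false if every attempt fails.
function wait_for_port(host::AbstractString, port::Integer;
                       attempts::Integer=12, interval::Real=5)
    for attempt in 1:attempts
        try
            close(connect(host, port))  # open, then immediately close, a connection
            return true
        catch err
            # Let Ctrl-C through; treat any other failure as "not ready yet"
            err isa InterruptException && rethrow()
        end
        attempt < attempts && sleep(interval)
    end
    return false
end
```

For the setup above, `wait_for_port("localhost", 8080)` would wait up to roughly a minute for the D3Renderer server to accept connections. It only checks that the port is open, not that the dashboard itself is serving pages.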