JuliaParallel/Dagger.jl

Implement chaos testing framework

Open

#211 opened on Apr 20, 2021

Labels: ci, fault handling, help wanted

Description

As the scheduler gains more optimizations, options, and supported features, the number of configuration combinations it needs to handle correctly grows exponentially. We could do ourselves a great service by automatically testing random configurations as part of CI. If that works out well, we could also add fault injection.

Parameters that would vary:

  • DAG size and shape
  • Anonymous and named function thunks
  • Thunk/Scheduler options
  • Processor types (ThreadProc for now)
  • Checkpointing
  • Dynamic DAG extension and querying
  • SIGINT handling
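
To make the idea of "random configurations" concrete, here is a minimal sketch of a seeded configuration generator. It is written in Python purely as a language-neutral illustration and does not use Dagger's API. The names `random_dag` and `random_config`, and the keys `named_functions` and `checkpointing`, are hypothetical and only mirror the parameters listed above. Deriving everything from a single seed means a failing CI run can be reproduced exactly.

```python
import random


def random_dag(rng, max_nodes=12, max_fan_in=3):
    """Return a random DAG as {node: [parents]}.

    Parents always have a smaller id than the node, so the graph is acyclic
    by construction and id order is a valid topological order.
    """
    n = rng.randint(1, max_nodes)
    dag = {}
    for node in range(n):
        k = rng.randint(0, min(max_fan_in, node))
        dag[node] = sorted(rng.sample(range(node), k))
    return dag


def random_config(seed):
    """Sample one test configuration; all keys are hypothetical placeholders."""
    rng = random.Random(seed)
    return {
        "seed": seed,                              # logged so failures can be replayed
        "dag": random_dag(rng),                    # DAG size and shape
        "named_functions": rng.random() < 0.5,     # anonymous vs. named thunks
        "checkpointing": rng.random() < 0.5,       # checkpointing on or off
        "proc_type": "ThreadProc",                 # only processor type for now
    }


config = random_config(42)
```

A CI job could loop over a batch of seeds, build each configuration, and print the seed of any run that fails.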

Some metrics we'd want to test:

  • Correctness
  • Total runtime length
  • Per-process memory usage
  • Caching statistics
  • Network transfer statistics
  • Visualization output
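
As a rough illustration of the first two metrics, the sketch below checks correctness by comparing a concurrent run of a small DAG against a sequential reference run, and records the total runtime. It is again plain Python rather than Dagger code: a `ThreadPoolExecutor` stands in for the scheduler, and `node_value`, `run_sequential`, `run_threaded`, `check_run`, and the `max_seconds` budget are all hypothetical names chosen for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def node_value(parent_values):
    """Toy thunk: each node computes 1 plus the sum of its parents' values."""
    return 1 + sum(parent_values)


def run_sequential(dag):
    """Reference result: evaluate nodes one by one in id (topological) order."""
    results = {}
    for node in sorted(dag):
        results[node] = node_value([results[p] for p in dag[node]])
    return results


def run_threaded(dag, workers=4):
    """Stand-in for the scheduler under test: one pool task per node."""
    futures = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for node in sorted(dag):
            parents = [futures[p] for p in dag[node]]
            # Each task blocks on its parents' futures; parents are submitted
            # first, so the oldest running task can always make progress.
            futures[node] = pool.submit(
                lambda ps=parents: node_value([f.result() for f in ps])
            )
        return {node: f.result() for node, f in futures.items()}


def check_run(dag, max_seconds=5.0):
    """Compare against the reference and record the total runtime."""
    expected = run_sequential(dag)
    start = time.perf_counter()
    actual = run_threaded(dag)
    elapsed = time.perf_counter() - start
    return {
        "correct": actual == expected,
        "runtime_s": elapsed,
        "within_budget": elapsed <= max_seconds,
    }


# Diamond-shaped DAG: node 3 depends on 1 and 2, which both depend on 0.
report = check_run({0: [], 1: [0], 2: [0], 3: [1, 2]})
```

The other metrics (memory usage, caching and network statistics, visualization output) would need scheduler-specific instrumentation and are not covered by this sketch.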