How can you reliably schedule tasks in an unreliable, autoscaling cloud environment? In this presentation, we’ll talk about the design of our scheduler, which is built on top of Apache Mesos and serves as the core of Mantis, our stream-processing platform designed for real-time insights. We’ll focus on the following aspects of the scheduler:
- Coarse-grained vs. fine-grained resource scheduling
- Fault tolerance via a combination of task reconciliation and life cycle event processing
- Scheduling optimizations for bin packing, for stream locality to reduce network bandwidth usage, for task placement to achieve autoscaling of the cluster size, etc.
This talk will also cover, in detail, approaches to scheduling in a distributed, autoscaling environment.