Design and Evolution of Juice: A Mesos‑Based Asynchronous Task Scheduling System
The article describes how Hujiang built and iterated a three‑generation asynchronous task scheduling platform—Juice—using Mesos and Docker to achieve on‑demand resource allocation, horizontal scalability, and efficient handling of large multimedia processing workloads.
Introduction
Asynchronous task systems are widely used for long‑running, CPU‑ and memory‑intensive jobs such as video transcoding and scientific computing. Growing video quality and data volume increase task duration, making single‑server solutions insufficient.
Traditional scale-up approaches (e.g., supercomputers) are costly, so Hujiang adopted a fork-join model: a large job is split into many small ones that run in parallel on multiple small servers (scale-out), and their results are then merged. Coordinating that split, scheduling, and merge is where most of the engineering effort goes.
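The split-and-merge idea above can be sketched in a few lines. This is only an illustration on one machine using a thread pool; the function names (split, process_chunk, fork_join) and the doubling "work" are made up for the example and are not taken from Juice.

```python
from concurrent.futures import ThreadPoolExecutor


def split(frames, chunk_size):
    """Fork step: cut one large job into fixed-size chunks."""
    return [frames[i:i + chunk_size] for i in range(0, len(frames), chunk_size)]


def process_chunk(chunk):
    """Stand-in for real work such as transcoding one video segment."""
    return [frame * 2 for frame in chunk]


def fork_join(frames, chunk_size=3, workers=4):
    chunks = split(frames, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so the join is a plain concat.
        parts = list(pool.map(process_chunk, chunks))
    return [frame for part in parts for frame in part]


result = fork_join(list(range(10)))
```

In a real scale-out system each chunk would go to a different server rather than a thread, but the shape (split, run in parallel, merge in order) is the same.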
Closed Stage
The first generation (2010) consisted of a Task Center and several Workers, with a simple in-process ConcurrentLinkedQueue acting as the message queue. Because the queue lived inside a single process, this design could neither scale horizontally nor provide high availability, which made it a "closed" system.
Industrial Stage
To enable horizontal scaling, the architecture was refactored into a pipeline model: a Boss receives tasks and pushes them onto a queue; Workers pull tasks, execute them, and push the results back. However, this model still had two weaknesses: the number of Workers could not be adjusted flexibly, and CPU and memory could not be allocated per task.
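A minimal sketch of the Boss/Worker pipeline described above, using in-process queues and threads as stand-ins for a real message queue and separate worker hosts. The names (boss, worker, STOP, WORKER_COUNT) and the upper-casing "work" are illustrative assumptions, not Juice code.

```python
import queue
import threading

task_queue = queue.Queue()
result_queue = queue.Queue()
STOP = object()  # sentinel telling a worker to exit


def boss(tasks, worker_count):
    """Boss: accept tasks and push them onto the shared queue."""
    for task in tasks:
        task_queue.put(task)
    for _ in range(worker_count):
        task_queue.put(STOP)


def worker():
    """Worker: pull a task, execute it, push the result back."""
    while True:
        task = task_queue.get()
        if task is STOP:
            break
        result_queue.put((task, task.upper()))  # placeholder for real processing


WORKER_COUNT = 3  # fixed when the workers start
threads = [threading.Thread(target=worker) for _ in range(WORKER_COUNT)]
for t in threads:
    t.start()
boss(["a.mp4", "b.mp4", "c.mp4", "d.mp4"], WORKER_COUNT)
for t in threads:
    t.join()

results = sorted(result_queue.get() for _ in range(4))
```

The sketch also shows the limitation the text mentions: the worker count is decided up front, and every worker is the same size no matter how heavy or light a given task is.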
On‑Demand Allocation Stage
As the workload grew, Hujiang explored an on-demand model built on Apache Mesos, a distributed-systems kernel that can schedule resources across tens of thousands of machines. Through Mesos's HTTP API, the new system, named Juice, requests exactly the resources a task needs (e.g., 8 CPU cores, 4 GB RAM, 50 GB storage) and runs the task in a Docker container.
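Mesos works on an offer model: the master offers available resources to a framework, and the framework decides which offer to use for which task. The sketch below shows one simple way such a decision could be made for the example task in the text. The Resources class, pick_offer function, and offer ids are hypothetical; Juice's real matching logic is not described in the article. Memory and disk are expressed in MB, which is the unit Mesos uses for its "mem" and "disk" resources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpus: float
    mem_mb: float
    disk_mb: float

    def fits_in(self, offer: "Resources") -> bool:
        """True if every requested quantity is covered by the offer."""
        return (self.cpus <= offer.cpus
                and self.mem_mb <= offer.mem_mb
                and self.disk_mb <= offer.disk_mb)


def pick_offer(need, offers):
    """Return the id of the first offer that can hold the task, or None."""
    for offer_id, available in offers.items():
        if need.fits_in(available):
            return offer_id
    return None


# The example from the text: 8 CPU cores, 4 GB RAM, 50 GB storage.
need = Resources(cpus=8, mem_mb=4 * 1024, disk_mb=50 * 1024)
offers = {
    "offer-small": Resources(cpus=4, mem_mb=8192, disk_mb=102400),
    "offer-large": Resources(cpus=16, mem_mb=32768, disk_mb=204800),
}
chosen = pick_offer(need, offers)
```

Here the small offer is rejected because it has only 4 CPUs, so the task lands on the large one. A first-fit rule is the simplest choice; a production scheduler might prefer best-fit to reduce fragmentation.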
Advantages of this approach include a simple architecture, immediate resource reclamation after task completion, fine‑grained resource allocation per task, and the ability to expand across additional hosts when load spikes.
Juice Architecture
The system is divided into five layers:
API layer – receives external task requests and enqueues them.
Middleware layer – records data and schedules the queue.
Framework layer – calls Mesos‑Master APIs to perform cluster scheduling.
Mesos layer – the official Mesos scheduling framework.
Hardware layer – hosts Mesos‑Agent on each server to execute assigned tasks.
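To make the Framework layer's role more concrete, here is a sketch of the JSON body a framework could send to the Mesos master to launch one Docker task from an accepted offer. The field layout follows the Mesos v1 scheduler API (an ACCEPT call containing a LAUNCH operation) as far as I know it; the helper names and all ids and the image name are placeholders invented for the example, and this is not necessarily how Juice builds its calls.

```python
import json


def scalar(name, value):
    """One Mesos scalar resource entry (mem and disk are in MB)."""
    return {"name": name, "type": "SCALAR", "scalar": {"value": float(value)}}


def build_accept_call(framework_id, offer_id, agent_id, task_id, image,
                      cpus, mem_mb, disk_mb):
    """Build the body of an ACCEPT call that launches one Docker task."""
    task_info = {
        "name": task_id,
        "task_id": {"value": task_id},
        "agent_id": {"value": agent_id},
        "resources": [scalar("cpus", cpus), scalar("mem", mem_mb),
                      scalar("disk", disk_mb)],
        # shell=False lets the image's own entrypoint run.
        "command": {"shell": False},
        "container": {"type": "DOCKER", "docker": {"image": image}},
    }
    return {
        "framework_id": {"value": framework_id},
        "type": "ACCEPT",
        "accept": {
            "offer_ids": [{"value": offer_id}],
            "operations": [{"type": "LAUNCH",
                            "launch": {"task_infos": [task_info]}}],
        },
    }


# All identifiers below are placeholders.
call = build_accept_call("fw-1", "offer-large", "agent-7", "task-42",
                         "example/transcoder:latest", 8, 4096, 51200)
body = json.dumps(call)
```

In the v1 API this body would be POSTed to the master's /api/v1/scheduler endpoint, together with the Mesos-Stream-Id header obtained when the framework subscribed. The sketch stops at building the body so that it runs without a cluster.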
Using Juice
Juice runs only tasks packaged as Docker images. Users submit tasks via a RESTful API (similar to Marathon) specifying the Docker image, and Juice schedules the task onto the appropriate Mesos cluster.
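The article does not show the request format, so the following is purely a hypothetical sketch of what the API layer might do with a submission: parse the JSON, check that the essentials are present, and enqueue the task. The field names (taskName, dockerImage, cpus, memMb) are assumptions chosen for illustration and should not be read as Juice's actual API.

```python
import json
from collections import deque

pending = deque()  # stand-in for the queue the API layer writes to

# Hypothetical field names, not taken from Juice.
REQUIRED = ("taskName", "dockerImage", "cpus", "memMb")


def submit(raw_body):
    """Validate a JSON task request and enqueue it; return the task name."""
    request = json.loads(raw_body)
    missing = [field for field in REQUIRED if field not in request]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    pending.append(request)
    return request["taskName"]


name = submit(json.dumps({
    "taskName": "transcode-lesson-1",
    "dockerImage": "example/transcoder:latest",
    "cpus": 8,
    "memMb": 4096,
}))
```

Rejecting incomplete requests at the API layer keeps malformed tasks out of the queue, so the scheduling layers only ever see work they can actually place.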
Development Challenges
Early attempts used the Mesos native library, which required a >1 GB Docker image and conflicted with microservice principles. The introduction of Mesos 1.0’s HTTP API allowed a lightweight (~180 MB) implementation without native dependencies.
High availability is addressed by launching Juice through Marathon, which keeps a single scheduler instance active. For task recovery, task state is stored in a cache so that it can be restored after agent restarts.
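One way cached task state could be used for recovery is sketched below: after a restart, the scheduler looks up every task that had not reached a terminal state and asks Mesos for its current status. Using the v1 scheduler API's RECONCILE call for this step is my assumption, since the article does not say how Juice performs recovery. The TaskCache class is an in-memory stand-in for whatever cache Juice uses, and all ids are placeholders.

```python
TERMINAL_STATES = {"TASK_FINISHED", "TASK_FAILED", "TASK_KILLED",
                   "TASK_LOST", "TASK_ERROR"}


class TaskCache:
    """In-memory stand-in for the cache that holds task state."""

    def __init__(self):
        self._tasks = {}

    def record(self, task_id, agent_id, state):
        """Store or overwrite the latest known state of a task."""
        self._tasks[task_id] = {"agent_id": agent_id, "state": state}

    def unfinished(self):
        """Tasks whose last known state is not terminal."""
        return {tid: t for tid, t in self._tasks.items()
                if t["state"] not in TERMINAL_STATES}


def build_reconcile_call(framework_id, cache):
    """Build a RECONCILE body asking Mesos about every unfinished task."""
    tasks = [{"task_id": {"value": tid}, "agent_id": {"value": t["agent_id"]}}
             for tid, t in sorted(cache.unfinished().items())]
    return {"framework_id": {"value": framework_id},
            "type": "RECONCILE",
            "reconcile": {"tasks": tasks}}


cache = TaskCache()
cache.record("task-1", "agent-7", "TASK_RUNNING")
cache.record("task-2", "agent-7", "TASK_FINISHED")
cache.record("task-3", "agent-9", "TASK_STAGING")
reconcile = build_reconcile_call("fw-1", cache)
```

Only task-1 and task-3 appear in the reconcile request, because task-2 already finished and needs no follow-up.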
Conclusion
Juice enables on-demand, fine-grained resource allocation for asynchronous workloads, reducing waste and contention. It is still early in its open-source life and lacks both a management UI and support for non-Docker tasks; future work will extend its capabilities for both Hujiang and external users.
Hujiang Technology