Achieving High Availability and Priority Scheduling with Elastic Job Lite in Dual‑Data‑Center Deployments
This article explains how to transform single‑node Elastic Job Lite deployments into high‑availability, dual‑data‑center architectures and implement custom sharding strategies that prioritize execution in a primary site while maintaining failover and optional active‑active traffic distribution.
When using Elastic Job Lite for scheduled tasks, many teams deploy a single node, which is risky for critical jobs; Elastic Job Lite actually supports high availability.
Elastic Job relies on ZooKeeper to coordinate its instances: an elected leader assigns the shards, so each shard is executed by only one instance at a time.
Single‑node deployment can be replaced by a distributed setup where any instance can take over if another fails.
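As a rough illustration of what "taking over" means, the sketch below splits shard items evenly across the live instances and recomputes the split when one of them disappears. It is a stand-alone approximation of average allocation, not Elastic Job's own code: the class name `AverageAllocationSketch`, the `allocate` method, and the use of plain strings in place of job instances are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; instances are plain strings here.
final class AverageAllocationSketch {

    // Splits items 0..shardingTotalCount-1 evenly across instances.
    // Leftover items are handed out one by one, starting from the first instance.
    static Map<String, List<Integer>> allocate(List<String> instances, int shardingTotalCount) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        if (instances.isEmpty()) {
            return result;
        }
        int perInstance = shardingTotalCount / instances.size();
        int next = 0;
        for (String instance : instances) {
            List<Integer> items = new ArrayList<>();
            for (int i = 0; i < perInstance; i++) {
                items.add(next++);
            }
            result.put(instance, items);
        }
        for (String instance : instances) {
            if (next >= shardingTotalCount) {
                break;
            }
            result.get(instance).add(next++);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two live instances share four shards.
        System.out.println(allocate(Arrays.asList("10.10.10.1", "10.11.10.1"), 4));
        // prints {10.10.10.1=[0, 1], 10.11.10.1=[2, 3]}

        // The first instance is gone, so the survivor takes every shard.
        System.out.println(allocate(Arrays.asList("10.11.10.1"), 4));
        // prints {10.11.10.1=[0, 1, 2, 3]}
    }
}
```

The point of the example is only that the shard map is a function of the current instance list, so removing an instance and recomputing is enough to hand its shards to the survivors.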
Dual‑Data‑Center High Availability
To meet higher availability demands, a same‑city dual‑data‑center architecture can be used, allowing one data center to take over if the other becomes unavailable while keeping a single sharding view.
Priority Scheduling
When tasks depend on a primary‑secondary data source, a task running in the secondary data center has to write across sites to the primary, which adds latency; a strategy is needed to prioritize execution in the primary data center while still allowing the secondary to take over.
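The rule this calls for can be stated in a few lines: use only primary‑site instances while any of them is alive, and fall back to everything that is alive otherwise. The sketch below is a minimal stand-alone version of that rule. The class `PrimaryFirstSketch` and the method `eligible` are hypothetical names, instances are represented by their IP strings, and the fallback behaviour is an assumption about how failover would be preserved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of "primary first, secondary as fallback".
final class PrimaryFirstSketch {

    // Returns the instances that should receive shards:
    // the live primary-site instances if there are any, otherwise all live instances.
    static List<String> eligible(List<String> liveIps, Set<String> primaryIps) {
        List<String> primary = new ArrayList<>();
        for (String ip : liveIps) {
            if (primaryIps.contains(ip)) {
                primary.add(ip);
            }
        }
        return primary.isEmpty() ? new ArrayList<>(liveIps) : primary;
    }

    public static void main(String[] args) {
        Set<String> primaryIps = new HashSet<>(Arrays.asList("10.10.10.1", "10.10.10.2"));

        // One primary and one secondary instance alive: only the primary is used.
        System.out.println(eligible(Arrays.asList("10.10.10.1", "10.11.10.1"), primaryIps));
        // prints [10.10.10.1]

        // Primary site down: the secondary instances take over.
        System.out.println(eligible(Arrays.asList("10.11.10.1", "10.11.10.2"), primaryIps));
        // prints [10.11.10.1, 10.11.10.2]
    }
}
```

Without the fallback branch, losing the primary site would leave no instance to run the job, which would defeat the purpose of the second data center.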
Elastic Job Sharding Strategy
Elastic Job provides built‑in sharding strategies and allows custom implementations via the JobShardingStrategy interface and its sharding method.
    public Map<JobInstance, List<Integer>> sharding(List<JobInstance> jobInstances, String jobName, int shardingTotalCount)

A decorator can filter standby instances before delegating to the default average allocation strategy.
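    import java.util.List;
    import java.util.Map;
    import com.dangdang.ddframe.job.lite.api.strategy.JobInstance;
    import com.dangdang.ddframe.job.lite.api.strategy.JobShardingStrategy;
    import com.dangdang.ddframe.job.lite.api.strategy.impl.AverageAllocationJobShardingStrategy;

    public abstract class JobShardingStrategyActiveStandbyDecorator implements JobShardingStrategy {

        // Strategy that performs the actual allocation once standby instances are filtered out.
        private JobShardingStrategy inner = new AverageAllocationJobShardingStrategy();

        // Subclasses decide which instances count as standby for a given job.
        protected abstract boolean isStandby(JobInstance jobInstance, String jobName);

        @Override
        public Map<JobInstance, List<Integer>> sharding(List<JobInstance> jobInstances, String jobName, int shardingTotalCount) {
            // ... (code omitted for brevity)
        }
    }

Implementations can define active IPs and treat others as standby, achieving locality‑aware scheduling.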
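    import java.util.Arrays;
    import com.dangdang.ddframe.job.lite.api.strategy.JobInstance;

    public class ActiveStandbyESJobStrategy extends JobShardingStrategyActiveStandbyDecorator {

        @Override
        protected boolean isStandby(JobInstance jobInstance, String jobName) {
            // Default set of active IPs.
            String activeIps = "10.10.10.1,10.10.10.2";
            // TASK_B_FIRST uses a different set of active IPs.
            if ("TASK_B_FIRST".equals(jobName)) {
                activeIps = "10.11.10.1,10.11.10.2";
            }
            // Any instance whose IP is not in the active list is treated as standby.
            return !Arrays.asList(activeIps.split(",")).contains(jobInstance.getIp());
        }
    }

When configuring a job, the custom strategy is specified: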
    JobCoreConfiguration core = JobCoreConfiguration.newBuilder(jobClass.getName(), cron, shardingTotalCount)
            .shardingItemParameters(shardingItemParameters)
            .build();
    SimpleJobConfiguration jobConfig = new SimpleJobConfiguration(core, jobClass.getCanonicalName());
    LiteJobConfiguration liteConfig = LiteJobConfiguration.newBuilder(jobConfig)
            .jobShardingStrategyClass("com.xxx.yyy.job.ActiveStandbyESJobStrategy")
            .build();

This setup provides high availability across two data centers and allows priority execution in the designated primary data center, with the possibility to extend to active‑active traffic distribution.
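The article does not spell out how the active‑active extension would work. One reading, suggested by the TASK_B_FIRST branch in the strategy above, is that different jobs prefer different data centers, so both sites carry load. The sketch below illustrates that reading with a per‑job lookup table instead of hard‑coded branches. The class `PerJobPrimarySiteSketch` and its `prefer` method are hypothetical, and the check works on plain IP strings rather than Elastic Job's instance type.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: each job may name its own preferred (active) site.
final class PerJobPrimarySiteSketch {

    private final Set<String> defaultActiveIps;
    private final Map<String, Set<String>> activeIpsByJob = new HashMap<>();

    PerJobPrimarySiteSketch(Set<String> defaultActiveIps) {
        this.defaultActiveIps = defaultActiveIps;
    }

    // Registers the active IPs for one job, overriding the default site.
    void prefer(String jobName, String... ips) {
        activeIpsByJob.put(jobName, new HashSet<>(Arrays.asList(ips)));
    }

    // An instance is standby for a job when its IP is not in that job's active set.
    boolean isStandby(String ip, String jobName) {
        return !activeIpsByJob.getOrDefault(jobName, defaultActiveIps).contains(ip);
    }

    public static void main(String[] args) {
        PerJobPrimarySiteSketch sites = new PerJobPrimarySiteSketch(
                new HashSet<>(Arrays.asList("10.10.10.1", "10.10.10.2")));
        sites.prefer("TASK_B_FIRST", "10.11.10.1", "10.11.10.2");

        System.out.println(sites.isStandby("10.11.10.1", "TASK_A"));       // prints true
        System.out.println(sites.isStandby("10.11.10.1", "TASK_B_FIRST")); // prints false
    }
}
```

In a real deployment the table would more likely come from configuration than from code, so that moving a job between sites does not require a redeploy.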