Design and Evolution of an E-commerce Batch Processing System
The article traces the evolution of an e-commerce batch-processing system through three stages: an initial centralized workflow built from reusable components, a platform driven by configuration and SPI registration, and finally a localized, asynchronous task-reporting architecture that uses priority queues and isolated thread pools. Each stage trades off flexibility, scalability, and operational risk.
In e-commerce platforms, the 80/20 rule applies: a small number of high-value merchants generate most sales, creating strong demand for batch operations to manage large volumes of orders, products, and prices.
Batch operations enable merchants to modify many items quickly (e.g., price adjustments, promotion changes), improving efficiency, reducing errors, and allowing focus on strategic tasks.
The article follows a developer “Xiao Wang” through three evolutionary stages of a batch‑processing system: centralized workflow extension, platform‑based configuration registration, and finally a localized task‑reporting approach.
1. Centralized – workflow extension
Initially, a simple two‑day implementation added a batch‑price‑adjustment flow. Code reuse was introduced by extracting common steps (file download, parsing, result saving) into reusable components and exposing extension points for business‑specific logic.
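The article shows the factory side of this design but not the extension point itself. As a rough illustration only, a business-specific `ImportService` implementation for the price-adjustment template might look like the following (the interface shape, method names, and the plain-map registration are all assumptions; the real factory does the registration via Spring's `getBeansOfType`):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shape of the extension point; the article only shows the factory.
interface ImportService {
    String getTemplateCode();                    // which batch template this handles
    String processRow(Map<String, String> row);  // business logic for one parsed row
}

// Business-specific plug-in: batch price adjustment.
class PriceAdjustImportService implements ImportService {
    public String getTemplateCode() { return "PRICE_ADJUST"; }
    public String processRow(Map<String, String> row) {
        // e.g. validate the row, then update the item price
        return "item " + row.get("itemId") + " -> price " + row.get("newPrice");
    }
}

public class ExtensionPointDemo {
    public static void main(String[] args) {
        // Equivalent of the factory's init(): register each service by template code.
        Map<String, ImportService> templateMap = new HashMap<>();
        ImportService svc = new PriceAdjustImportService();
        templateMap.put(svc.getTemplateCode(), svc);

        Map<String, String> row = new HashMap<>();
        row.put("itemId", "1001");
        row.put("newPrice", "99.00");
        System.out.println(templateMap.get("PRICE_ADJUST").processRow(row));
    }
}
```

The key property is that the framework looks implementations up by template code, so adding a new batch flow means adding one new bean, not touching the shared pipeline.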
@Component
public class BpcProcessHandlerFactory {

    @Autowired
    private ApplicationContext applicationContext;

    // Value type inferred from usage below: registered services expose newProcessHandler().
    private static ConcurrentHashMap<String, ImportService> templateMap = new ConcurrentHashMap<>();

    @PostConstruct
    private void init() {
        Map<String, ImportService> importServiceMap = applicationContext.getBeansOfType(ImportService.class);
        for (ImportService importService : importServiceMap.values()) {
            initImportService(importService);
        }
    }

    private void initImportService(ImportService importService) { /* ... */ }

    public BpcProcessHandler getBpcProcessHandler(String templateCode) {
        if (StringUtils.isBlank(templateCode) || !templateMap.containsKey(templateCode)) {
            return null;
        }
        return templateMap.get(templateCode).newProcessHandler();
    }
}

A simplified service then orchestrates the process:
@Service
public class BpcProcessService {

    @Autowired
    private BpcProcessHandlerFactory bpcProcessHandlerFactory;

    public String doBpcProcess(BpcProcessReq req) throws BpcProcessException {
        BpcProcessHandler handler = bpcProcessHandlerFactory.getBpcProcessHandler(req.getTaskTemplateCode());
        if (handler == null) {
            throw new BpcProcessException("Template not found");
        }
        createTask();
        downloadFromOss();
        int loopCnt = 0;
        int maxLoopCnt = handler.getMaxLoopCnt();
        while (loopCnt++ < maxLoopCnt) {
            handler.process();
            updateTaskProcess();
        }
        updateTaskStatus();
        return taskId;
    }
}

2. Platform – configuration registration
To avoid hard‑coding SPI information, a configuration‑driven approach was adopted. Each batch template now includes Excel format, SPI details, and field mappings. An IDE plugin assists developers in uploading these configurations with a single annotation.
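The article does not show what the registration annotation looks like. As a purely illustrative sketch (annotation name, attributes, and the mapping format are all assumptions), the metadata a plugin would upload might be declared like this:

```java
import java.lang.annotation.*;
import java.util.Arrays;

// Hypothetical registration annotation; the article describes the idea, not the exact API.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface BpcTemplate {
    String code();            // template code used for factory lookup
    String spiInterface();    // SPI details: which interface the platform invokes
    String[] fieldMapping();  // Excel column -> request field mappings
}

@BpcTemplate(
        code = "PRICE_ADJUST",
        spiInterface = "com.example.spi.PriceAdjustService",
        fieldMapping = {"A:itemId", "B:newPrice"})
class PriceAdjustTemplate {}

public class TemplateConfigDemo {
    public static void main(String[] args) {
        // The IDE plugin would read this metadata and upload it to the batch platform.
        BpcTemplate meta = PriceAdjustTemplate.class.getAnnotation(BpcTemplate.class);
        System.out.println(meta.code() + " -> " + meta.spiInterface()
                + " " + Arrays.toString(meta.fieldMapping()));
    }
}
```

Because the template carries its Excel format, SPI target, and field mappings as data, the batch platform can drive the whole flow without compile-time knowledge of any business service.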
Invocation of SPI implementations uses Dubbo generic calls:
@Override
public String invoke(ServiceDefinition serviceDefinition, Object inputParam) {
    GenericService genericService = DubboConfig.buildService(serviceDefinition.getInterfaceName(), serviceDefinition.getTimeout());
    String[] parameterTypes = new String[]{serviceDefinition.getRequestType().getClassName()};
    Object[] args = new Object[]{inputParam};
    long startTime = System.currentTimeMillis();
    try {
        log.info("invoke service={}#{} with request={}", serviceDefinition.getInterfaceName(),
                serviceDefinition.getMethod(), JSON.toJSONString(args));
        Object result = genericService.$invoke(serviceDefinition.getMethod(), parameterTypes, args);
        long endTime = System.currentTimeMillis();
        digestLog(serviceDefinition, true, endTime - startTime);
        log.info("invoke service={}#{} with result={}", serviceDefinition.getInterfaceName(),
                serviceDefinition.getMethod(), JSON.toJSONString(result));
        Map resultMap = JSON.parseObject(JSON.toJSONString(result), Map.class);
        processError(resultMap);
        return JSON.toJSONString(resultMap.get("data"));
    } catch (Exception ex) {
        long endTime = System.currentTimeMillis();
        digestLog(serviceDefinition, false, endTime - startTime);
        log.error("failed to dubbo invoke: " + serviceDefinition.getInterfaceName() + "#" +
                serviceDefinition.getMethod() + " with error " + ex.getMessage());
        throw new DependencyException(ErrorCodeEnum.DEFAULT_DEPENDENCY_ERROR.getCode(), ex.getMessage(), ex);
    }
}

The execution flow mirrors the earlier version, but the business logic is now plugged in via configuration-defined SPI.
3. Localization – task reporting
When task volume grew, a single‑node batch system became a bottleneck, leading to OOM incidents. Simple rate‑limiting proved insufficient, so the design shifted to asynchronous scheduling with isolated resources.
Isolation strategies include physical (cluster or machine level) and logical (separate thread pools). The final solution combines multi‑level priority queues with thread‑pool isolation.
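The thread-pool side of logical isolation can be sketched as one bounded pool per priority, so that a flood of low-priority tasks cannot starve high-priority ones. The pool sizes and queue capacities below are illustrative choices, not values from the article:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Minimal sketch of logical isolation via separate bounded pools per priority.
public class IsolatedPools {
    private final ExecutorService highPool =
            new ThreadPoolExecutor(4, 4, 0L, TimeUnit.MILLISECONDS,
                    new LinkedBlockingQueue<>(100));   // small backlog: stay responsive
    private final ExecutorService lowPool =
            new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                    new LinkedBlockingQueue<>(1000));  // large backlog tolerated

    public void executeHigh(Runnable task) { highPool.submit(task); }
    public void executeLow(Runnable task)  { lowPool.submit(task); }

    public static void main(String[] args) throws Exception {
        IsolatedPools pools = new IsolatedPools();
        CountDownLatch latch = new CountDownLatch(2);
        // Even if the low pool is saturated, the high pool keeps its own threads.
        pools.executeHigh(() -> { System.out.println("high done"); latch.countDown(); });
        pools.executeLow(()  -> { System.out.println("low done");  latch.countDown(); });
        latch.await();
        pools.highPool.shutdown();
        pools.lowPool.shutdown();
    }
}
```

This is cheaper than physical isolation (separate clusters or machines) but still guarantees that each priority class has dedicated execution resources.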
Scheduling service example:
@Service
@Slf4j
public class TaskScheduleServiceImpl implements TaskScheduleService {

    // Collaborator fields inferred from usage; the excerpt omits their declarations.
    @Autowired
    private TaskInstanceRepository taskInstanceRepository;
    @Autowired
    private ArkConfig arkConfig;
    @Autowired
    private TaskThreadPool taskThreadPool;

    @Override
    @LogAnnotation
    public void schedule(int shared, int all) {
        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        List<Long> highTaskIds = taskInstanceRepository.queryUnstartedTaskIdByPriority(
                TaskPriorityEnum.HIGH, all * arkConfig.highSize);
        // Sharding: each of the `all` nodes handles only IDs where id mod all == shared.
        highTaskIds = highTaskIds.stream().filter(id -> id % all == shared).collect(Collectors.toList());
        log.info("High priority task IDs = {}", highTaskIds);
        process(highTaskIds, id -> taskThreadPool.executeHigh(() -> process(id)));
        // similar handling for medium and low priority tasks
        log.info("Scheduling completed, cost = {}", stopWatch.getTime());
    }

    private void process(List<Long> idList, Consumer<Long> consumer) {
        if (CollectionUtils.isEmpty(idList)) return;
        for (Long id : idList) {
            consumer.accept(id);
        }
    }

    private void process(Long id) {
        // task handling logic
    }
}

Further refinements added sharding parameters, environment-specific routing, and a lightweight SDK that moved most file-handling logic out of the batch center, reducing it to a simple monitoring UI.
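The article gives no code for this SDK. As a hypothetical sketch only (every name and signature here is illustrative), the idea is that the business application processes rows locally and merely reports progress back to the batch center:

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical reporting SDK: the batch center no longer runs business logic,
// it only receives progress reports and renders them in a monitoring UI.
class BatchReporter {
    private final String taskId;
    BatchReporter(String taskId) { this.taskId = taskId; }
    void reportProgress(int done, int total) {
        // In a real SDK this would be an RPC/HTTP call to the batch center.
        System.out.println(taskId + ": " + done + "/" + total);
    }
}

public class SdkReportingDemo {
    // Runs the batch entirely in the business application's own process.
    static <T> void runBatch(String taskId, List<T> rows, Function<T, Boolean> handler) {
        BatchReporter reporter = new BatchReporter(taskId);
        int done = 0;
        for (T row : rows) {
            handler.apply(row);  // local business logic, no round trip per row
            reporter.reportProgress(++done, rows.size());
        }
    }

    public static void main(String[] args) {
        runBatch("task-42", List.of("row1", "row2", "row3"), row -> true);
    }
}
```

Because execution is local to each business application, the central system's memory footprint no longer grows with task volume, which addresses the OOM incidents that motivated this stage.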
The article concludes that batch‑processing systems evolve from tightly coupled extensions to configurable platforms and finally to localized reporting, each step balancing flexibility, scalability, and operational risk.
DeWu Technology