Design and Evolution of an E-commerce Batch Processing System
The article traces the evolution of an e-commerce batch-processing system through three stages: an initial centralized workflow built from reusable components, a platform driven by configuration and SPI registration, and finally a localized, asynchronous task-reporting architecture that uses priority queues and isolated thread pools. Each stage trades off flexibility, scalability, and operational risk.
In e-commerce platforms, the 80/20 rule applies: a small number of high-value merchants generate most sales, creating strong demand for batch operations to manage large volumes of orders, products, and prices.
Batch operations enable merchants to modify many items quickly (e.g., price adjustments, promotion changes), improving efficiency, reducing errors, and allowing focus on strategic tasks.
The article follows a developer “Xiao Wang” through three evolutionary stages of a batch‑processing system: centralized workflow extension, platform‑based configuration registration, and finally a localized task‑reporting approach.
1. Centralized – workflow extension
Initially, a simple two‑day implementation added a batch‑price‑adjustment flow. Code reuse was introduced by extracting common steps (file download, parsing, result saving) into reusable components and exposing extension points for business‑specific logic.
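The article shows the factory side of this design but not the extension point itself. As a rough illustration only, a business-specific `ImportService` implementation for the price-adjustment template might look like the following (the interface shape, method names, and the plain-map registration are all assumptions; the real factory does the registration via Spring's `getBeansOfType`):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shape of the extension point; the article only shows the factory.
interface ImportService {
    String getTemplateCode();                    // which batch template this handles
    String processRow(Map<String, String> row);  // business logic for one parsed row
}

// Business-specific plug-in: batch price adjustment.
class PriceAdjustImportService implements ImportService {
    public String getTemplateCode() { return "PRICE_ADJUST"; }
    public String processRow(Map<String, String> row) {
        // e.g. validate the row, then update the item price
        return "item " + row.get("itemId") + " -> price " + row.get("newPrice");
    }
}

public class ExtensionPointDemo {
    public static void main(String[] args) {
        // Equivalent of the factory's init(): register each service by template code.
        Map<String, ImportService> templateMap = new HashMap<>();
        ImportService svc = new PriceAdjustImportService();
        templateMap.put(svc.getTemplateCode(), svc);

        Map<String, String> row = new HashMap<>();
        row.put("itemId", "1001");
        row.put("newPrice", "99.00");
        System.out.println(templateMap.get("PRICE_ADJUST").processRow(row));
    }
}
```

The key property is that the framework looks implementations up by template code, so adding a new batch flow means adding one new bean, not touching the shared pipeline.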
@Component
public class BpcProcessHandlerFactory {

    @Autowired
    private ApplicationContext applicationContext;

    // Value type inferred from usage below: registered services expose newProcessHandler().
    private static ConcurrentHashMap<String, ImportService> templateMap = new ConcurrentHashMap<>();

    @PostConstruct
    private void init() {
        Map<String, ImportService> importServiceMap = applicationContext.getBeansOfType(ImportService.class);
        for (ImportService importService : importServiceMap.values()) {
            initImportService(importService);
        }
    }

    private void initImportService(ImportService importService) { /* ... */ }

    public BpcProcessHandler getBpcProcessHandler(String templateCode) {
        if (StringUtils.isBlank(templateCode) || !templateMap.containsKey(templateCode)) {
            return null;
        }
        return templateMap.get(templateCode).newProcessHandler();
    }
}

A simplified service then orchestrates the process:
@Service
public class BpcProcessService {

    @Autowired
    private BpcProcessHandlerFactory bpcProcessHandlerFactory;

    public String doBpcProcess(BpcProcessReq req) throws BpcProcessException {
        BpcProcessHandler handler = bpcProcessHandlerFactory.getBpcProcessHandler(req.getTaskTemplateCode());
        if (handler == null) {
            throw new BpcProcessException("Template not found");
        }
        createTask();
        downloadFromOss();
        int loopCnt = 0;
        int maxLoopCnt = handler.getMaxLoopCnt();
        while (loopCnt++ < maxLoopCnt) {
            handler.process();
            updateTaskProcess();
        }
        updateTaskStatus();
        return taskId;
    }
}

2. Platform – configuration registration
To avoid hard‑coding SPI information, a configuration‑driven approach was adopted. Each batch template now includes Excel format, SPI details, and field mappings. An IDE plugin assists developers in uploading these configurations with a single annotation.
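The article does not show what the registration annotation looks like. As a purely illustrative sketch (annotation name, attributes, and the mapping format are all assumptions), the metadata a plugin would upload might be declared like this:

```java
import java.lang.annotation.*;
import java.util.Arrays;

// Hypothetical registration annotation; the article describes the idea, not the exact API.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface BpcTemplate {
    String code();            // template code used for factory lookup
    String spiInterface();    // SPI details: which interface the platform invokes
    String[] fieldMapping();  // Excel column -> request field mappings
}

@BpcTemplate(
        code = "PRICE_ADJUST",
        spiInterface = "com.example.spi.PriceAdjustService",
        fieldMapping = {"A:itemId", "B:newPrice"})
class PriceAdjustTemplate {}

public class TemplateConfigDemo {
    public static void main(String[] args) {
        // The IDE plugin would read this metadata and upload it to the batch platform.
        BpcTemplate meta = PriceAdjustTemplate.class.getAnnotation(BpcTemplate.class);
        System.out.println(meta.code() + " -> " + meta.spiInterface()
                + " " + Arrays.toString(meta.fieldMapping()));
    }
}
```

Because the template carries its Excel format, SPI target, and field mappings as data, the batch platform can drive the whole flow without compile-time knowledge of any business service.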
Invocation of SPI implementations uses Dubbo generic calls:
@Override
public String invoke(ServiceDefinition serviceDefinition, Object inputParam) {
    GenericService genericService = DubboConfig.buildService(serviceDefinition.getInterfaceName(), serviceDefinition.getTimeout());
    String[] parameterTypes = new String[]{serviceDefinition.getRequestType().getClassName()};
    Object[] args = new Object[]{inputParam};
    long startTime = System.currentTimeMillis();
    try {
        log.info("invoke service={}#{} with request={}", serviceDefinition.getInterfaceName(),
                serviceDefinition.getMethod(), JSON.toJSONString(args));
        Object result = genericService.$invoke(serviceDefinition.getMethod(), parameterTypes, args);
        long endTime = System.currentTimeMillis();
        digestLog(serviceDefinition, true, endTime - startTime);
        log.info("invoke service={}#{} with result={}", serviceDefinition.getInterfaceName(),
                serviceDefinition.getMethod(), JSON.toJSONString(result));
        Map resultMap = JSON.parseObject(JSON.toJSONString(result), Map.class);
        processError(resultMap);
        return JSON.toJSONString(resultMap.get("data"));
    } catch (Exception ex) {
        long endTime = System.currentTimeMillis();
        digestLog(serviceDefinition, false, endTime - startTime);
        log.error("failed to dubbo invoke: " + serviceDefinition.getInterfaceName() + "#" +
                serviceDefinition.getMethod() + " with error " + ex.getMessage());
        throw new DependencyException(ErrorCodeEnum.DEFAULT_DEPENDENCY_ERROR.getCode(), ex.getMessage(), ex);
    }
}

The execution flow mirrors the earlier version, but the business logic is now plugged in via configuration-defined SPI.
3. Localization – task reporting
When task volume grew, a single‑node batch system became a bottleneck, leading to OOM incidents. Simple rate‑limiting proved insufficient, so the design shifted to asynchronous scheduling with isolated resources.
Isolation strategies include physical (cluster or machine level) and logical (separate thread pools). The final solution combines multi‑level priority queues with thread‑pool isolation.
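The thread-pool side of logical isolation can be sketched as one bounded pool per priority, so that a flood of low-priority tasks cannot starve high-priority ones. The pool sizes and queue capacities below are illustrative choices, not values from the article:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Minimal sketch of logical isolation via separate bounded pools per priority.
public class IsolatedPools {
    private final ExecutorService highPool =
            new ThreadPoolExecutor(4, 4, 0L, TimeUnit.MILLISECONDS,
                    new LinkedBlockingQueue<>(100));   // small backlog: stay responsive
    private final ExecutorService lowPool =
            new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                    new LinkedBlockingQueue<>(1000));  // large backlog tolerated

    public void executeHigh(Runnable task) { highPool.submit(task); }
    public void executeLow(Runnable task)  { lowPool.submit(task); }

    public static void main(String[] args) throws Exception {
        IsolatedPools pools = new IsolatedPools();
        CountDownLatch latch = new CountDownLatch(2);
        // Even if the low pool is saturated, the high pool keeps its own threads.
        pools.executeHigh(() -> { System.out.println("high done"); latch.countDown(); });
        pools.executeLow(()  -> { System.out.println("low done");  latch.countDown(); });
        latch.await();
        pools.highPool.shutdown();
        pools.lowPool.shutdown();
    }
}
```

This is cheaper than physical isolation (separate clusters or machines) but still guarantees that each priority class has dedicated execution resources.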
Scheduling service example:
@Service
@Slf4j
public class TaskScheduleServiceImpl implements TaskScheduleService {

    // Collaborator fields inferred from usage; the excerpt omits their declarations.
    @Autowired
    private TaskInstanceRepository taskInstanceRepository;
    @Autowired
    private ArkConfig arkConfig;
    @Autowired
    private TaskThreadPool taskThreadPool;

    @Override
    @LogAnnotation
    public void schedule(int shared, int all) {
        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        List<Long> highTaskIds = taskInstanceRepository.queryUnstartedTaskIdByPriority(
                TaskPriorityEnum.HIGH, all * arkConfig.highSize);
        // Sharding: each of the `all` nodes handles only IDs where id mod all == shared.
        highTaskIds = highTaskIds.stream().filter(id -> id % all == shared).collect(Collectors.toList());
        log.info("High priority task IDs = {}", highTaskIds);
        process(highTaskIds, id -> taskThreadPool.executeHigh(() -> process(id)));
        // similar handling for medium and low priority tasks
        log.info("Scheduling completed, cost = {}", stopWatch.getTime());
    }

    private void process(List<Long> idList, Consumer<Long> consumer) {
        if (CollectionUtils.isEmpty(idList)) return;
        for (Long id : idList) {
            consumer.accept(id);
        }
    }

    private void process(Long id) {
        // task handling logic
    }
}

Further refinements added sharding parameters, environment-specific routing, and a lightweight SDK that moved most file-handling logic out of the batch center, reducing it to a simple monitoring UI.
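The article gives no code for this SDK. As a hypothetical sketch only (every name and signature here is illustrative), the idea is that the business application processes rows locally and merely reports progress back to the batch center:

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical reporting SDK: the batch center no longer runs business logic,
// it only receives progress reports and renders them in a monitoring UI.
class BatchReporter {
    private final String taskId;
    BatchReporter(String taskId) { this.taskId = taskId; }
    void reportProgress(int done, int total) {
        // In a real SDK this would be an RPC/HTTP call to the batch center.
        System.out.println(taskId + ": " + done + "/" + total);
    }
}

public class SdkReportingDemo {
    // Runs the batch entirely in the business application's own process.
    static <T> void runBatch(String taskId, List<T> rows, Function<T, Boolean> handler) {
        BatchReporter reporter = new BatchReporter(taskId);
        int done = 0;
        for (T row : rows) {
            handler.apply(row);  // local business logic, no round trip per row
            reporter.reportProgress(++done, rows.size());
        }
    }

    public static void main(String[] args) {
        runBatch("task-42", List.of("row1", "row2", "row3"), row -> true);
    }
}
```

Because execution is local to each business application, the central system's memory footprint no longer grows with task volume, which addresses the OOM incidents that motivated this stage.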
The article concludes that batch‑processing systems evolve from tightly coupled extensions to configurable platforms and finally to localized reporting, each step balancing flexibility, scalability, and operational risk.
DeWu Technology