Building an Algorithm Platform: Deployment Challenges, Feature Processing, and Serviceization
The article describes how Ctrip's algorithm platform was built in three stages to address deployment friction, reusable feature engineering, and model training, detailing the technical problems, Java/Python integration, code interfaces, transform configurations, and the eventual service‑oriented architecture.
The author, Zhang Zhongyuan, shares experience from building Ctrip's algorithm platform for machine learning services; the material was originally presented at a Ctrip tech salon.
The platform construction is divided into three stages: (1) solving algorithm deployment to reduce cross‑team friction, (2) building a reusable feature set, (3) creating a model training system for rapid evaluation. The article focuses on stage 1.
Deployment problems include a lack of standards, a language mismatch between Python feature conversion and Java production code, duplicated work, versioning, and difficulty monitoring models in production. These problems prompted an automated system that unifies interfaces and supports tracking and debugging.
An algorithm is defined by version, feature resolver, and evaluator. The core interfaces are:
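interface Algorithm<T> {
    String getApp();
    String getVersion();
    FeatureResolver getFeatureResolver();
    Evaluator getEvaluator();
    // Asynchronous evaluation. The type argument ResultValue<T> is inferred from the
    // Evaluator interface, since the original generics did not survive; T is an assumed name.
    ListenableFuture<ResultValue<T>> eval(Request request);
}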
interface AlgorithmFactory {
    Algorithm getOrCreate(String app, String version, String filter);
}
Model files were first tried with PMML, then switched to Vowpal Wabbit (VW) for performance; a Java implementation loads VW models. Legacy XML‑based DataProc models are also supported, and multiple models can be composed via a thin wrapper.
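One way to picture the thin composition wrapper mentioned above is an evaluator that delegates to several member evaluators and combines their outputs. The sketch below is only an illustration: the `ScoreEvaluator` and `WeightedEnsemble` names are hypothetical, the evaluators return a plain `double` instead of the platform's result type, and the weighted sum is an assumed combination rule that the article does not specify.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the platform's evaluator: scores a resolved feature object F.
interface ScoreEvaluator<F> {
    double eval(F resolved);
}

// Thin wrapper that combines several evaluators by a weighted sum of their scores.
// The weighted sum is an assumption made for this sketch, not the platform's rule.
final class WeightedEnsemble<F> implements ScoreEvaluator<F> {
    private final List<ScoreEvaluator<F>> members = new ArrayList<>();
    private final List<Double> weights = new ArrayList<>();

    // Registers a member evaluator with its weight; returns this for chaining.
    WeightedEnsemble<F> add(ScoreEvaluator<F> member, double weight) {
        members.add(member);
        weights.add(weight);
        return this;
    }

    @Override
    public double eval(F resolved) {
        if (members.isEmpty()) {
            throw new IllegalStateException("no member evaluators registered");
        }
        double sum = 0.0;
        for (int i = 0; i < members.size(); i++) {
            sum += weights.get(i) * members.get(i).eval(resolved);
        }
        return sum;
    }
}
```

Because the wrapper implements the same interface as its members, callers do not need to know whether they hold a single model or a composition.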
interface Evaluator<F, T> {
    // F is the resolved feature type; T (an assumed name) is the result payload type.
    ResultValue<T> eval(F resolved);
}
Feature processing is extracted into reusable transforms inspired by Airbnb’s Aerosolve. Features are represented by a FeatureVector, and each transform has a single responsibility. The example configuration below categorizes temperature: the transform converts the raw temperature into categorical buckets.
# temperature conversion example
category_temperature {
  transform: category
  keys:[temperature]
  output:all
  outputKey: $key
  outputValue: $category
  categories: {
    '' : 0
    '<10': 1
    '<30': 2
    '>=30': 3
  }
}
Common transform types include default values, category normalization, external store lookups, and feature crossing. High‑performance collections from Koloboke/Trove replace standard Java collections, and Hive UDFs expose the transforms to SQL workloads.
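To make the category transform concrete, here is a minimal Java sketch of what the `category_temperature` configuration might do at runtime. It is not the platform's or Aerosolve's actual API: `SimpleFeatureVector`, `Transform`, and `CategoryTransform` are hypothetical names, and the bucket semantics (a missing value maps to 0, then the first matching upper bound wins) are an assumed reading of the `categories` block.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified feature vector: numeric inputs in, category ids out.
final class SimpleFeatureVector {
    final Map<String, Double> floats = new HashMap<>();
    final Map<String, Integer> categories = new HashMap<>();
}

// Each transform has a single responsibility: one specific change to the vector.
interface Transform {
    void apply(SimpleFeatureVector fv);
}

// Buckets a numeric feature by ascending, exclusive upper bounds.
// With bounds {10, 30}: missing -> 0, <10 -> 1, <30 -> 2, >=30 -> 3 (assumed semantics).
final class CategoryTransform implements Transform {
    private final String key;
    private final double[] upperBounds;

    CategoryTransform(String key, double[] upperBounds) {
        this.key = key;
        this.upperBounds = upperBounds.clone();
    }

    @Override
    public void apply(SimpleFeatureVector fv) {
        Double value = fv.floats.get(key);
        int category;
        if (value == null || value.isNaN()) {
            category = 0; // corresponds to the '' entry in the config
        } else {
            category = upperBounds.length + 1; // default: at or above the last bound
            for (int i = 0; i < upperBounds.length; i++) {
                if (value < upperBounds[i]) {
                    category = i + 1;
                    break;
                }
            }
        }
        // outputKey: $key, outputValue: $category -> write the id under the input key
        fv.categories.put(key, category);
    }
}
```

Keeping the bounds in configuration rather than in code is what lets the same transform class be reused for any numeric feature, which appears to be the point of the reusable transform set.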
The platform was later service‑ified so that business systems can call algorithms like any other service. A debug flag allows request‑level tracing, and logs from all algorithm servers are aggregated for troubleshooting.
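The request-level debug flag can be pictured as a switch carried by each request that decides whether intermediate steps are recorded. The sketch below is an assumption about how such tracing could work, not the platform's implementation: `TracedRequest` and `TracingPipeline` are hypothetical names, and the arithmetic stands in for real feature resolution and model evaluation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical request carrying a per-request debug flag.
final class TracedRequest {
    final String app;
    final boolean debug;
    private final List<String> entries = new ArrayList<>();

    TracedRequest(String app, boolean debug) {
        this.app = app;
        this.debug = debug;
    }

    // Records a step only when debugging is on, so normal traffic keeps no trace.
    void trace(String stage, Object detail) {
        if (debug) {
            entries.add(stage + ": " + detail);
        }
    }

    List<String> getTrace() {
        return Collections.unmodifiableList(entries);
    }
}

final class TracingPipeline {
    // Placeholder for "resolve features, then evaluate"; the math is illustrative only.
    static double run(TracedRequest request, double rawFeature) {
        double resolved = rawFeature / 100.0;
        request.trace("resolve", resolved);
        double score = resolved * 2.0;
        request.trace("evaluate", score);
        return score;
    }
}
```

In a setup like this, the collected entries could be returned with the response or written to the aggregated logs, so a single problematic request can be followed through every stage.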
AlgorithmBuilder builder = AlgorithmBuilder.create()
    .setServer(...)
    .setStore(...)
    .setLazyInit(true)
    .setExecutor(...);
Algorithm algorithm = builder.getOrCreate(app, version, filter);
In summary, building an algorithm platform is an iterative engineering effort that integrates feature engineering, model management, and service delivery, aiming to improve productivity while keeping the business at the core.
Ctrip Technology