Artificial Intelligence · 9 min read

Online Learning with Alink Model Flow: From Fundamentals to Model Flow 1.0 and 2.0

This article introduces Alibaba's Alink platform and its online-learning capabilities. It covers common challenges in machine-learning pipelines, how Alink connects algorithms to applications, its computation modes and usage methods, and the evolution from Model Flow 1.0 to the more versatile Model Flow 2.0, including pipeline integration, incremental training, and embedded prediction services.


The presentation begins with an overview of the challenges faced in practical machine-learning projects, such as the gap between algorithm work and engineering work, transitioning experiments to production, managing feature-engineering and model versions, and the need for timely model updates in online-learning scenarios.

Alink is introduced as a unified platform that connects algorithms to applications. It offers over 700 algorithm components, is open-source on GitHub, and supports multiple invocation methods (Java library, PyAlink, WebUI). It also provides four computation modes: Flink batch (BatchOp), Flink streaming (StreamOp), local batch (LocalOp), and local prediction components.

Usage methods are described, including Java APIs, PyAlink, and a visual WebUI for building pipelines. The article then details Model Flow 1.0, which streams models via Kafka and updates predictions in real time using Flink, with examples of FTRL training and prediction.
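To make the FTRL step concrete, here is a minimal pure-Python sketch of the per-coordinate FTRL-proximal update for logistic regression, the optimizer family named above. The class name, hyperparameter defaults, and dense-list state are illustrative only; Alink's actual FTRL components run distributed on Flink and consume the model stream via Kafka.

```python
import math

class FTRLProximal:
    """Illustrative per-coordinate FTRL-proximal optimizer for
    logistic regression, updated one streaming example at a time."""

    def __init__(self, dim, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * dim  # per-coordinate adjusted gradient sums
        self.n = [0.0] * dim  # per-coordinate squared-gradient sums

    def _weight(self, i):
        # Lazy closed-form weight; L1 drives small coordinates to exactly 0.
        if abs(self.z[i]) <= self.l1:
            return 0.0
        sign = -1.0 if self.z[i] < 0 else 1.0
        return -(self.z[i] - sign * self.l1) / (
            (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2)

    def predict(self, x):
        s = sum(self._weight(i) * xi for i, xi in enumerate(x))
        return 1.0 / (1.0 + math.exp(-max(min(s, 35.0), -35.0)))

    def fit_one(self, x, y):
        # One streaming example: predict first, then update each coordinate.
        p = self.predict(x)
        for i, xi in enumerate(x):
            g = (p - y) * xi  # gradient of log-loss w.r.t. weight i
            sigma = (math.sqrt(self.n[i] + g * g)
                     - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * self._weight(i)
            self.n[i] += g * g
        return p
```

Because each example updates the state in O(non-zero features), the same loop works whether examples arrive from a batch file or from a Kafka stream, which is what makes FTRL a natural fit for the streaming model-update setting described above.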

Model Flow 2.0 extends the concept to incorporate offline batch and incremental training, allowing both batch and streaming components to feed models into the flow, supporting pipeline models for consistent feature‑engineering and model stages, and enabling automatic model updates in prediction services.

Incremental training is illustrated, showing how the latest model from the model flow is used as a starting point, stored in a filesystem, and updated incrementally. Pipeline models are demonstrated by combining standardization, one‑hot encoding, feature merging, and logistic regression into a single deployable model.

Embedded prediction services are explained: they require only the model path and schema information, and they automatically load updated models from the model flow during inference.

The Q&A section confirms that Alink 2.0 is open‑source and can be used within Alibaba Cloud PAI, providing links to the GitHub repository, documentation, tutorials, and community resources.

Tags: machine learning, Flink, pipeline, online learning, incremental training, Alink, model flow
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
