Alibaba Cloud Big Data AI Platform
Author

The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.

457 articles · 0 likes · 598 views · 0 comments
Recent Articles

Latest from Alibaba Cloud Big Data AI Platform

Showing up to 100 recent articles
Alibaba Cloud Big Data AI Platform
May 1, 2026 · Artificial Intelligence

Zero Deployment, Zero Ops: Alibaba Cloud Milvus Embedding Service Makes Vectorization Plug‑and‑Play

The article explains how Alibaba Cloud's Milvus Embedding Service eliminates the need for self‑hosted embedding models by integrating model inference, vector generation and Milvus indexing into a managed pipeline, dramatically reducing deployment complexity, operational overhead, and time‑to‑value for semantic search, RAG and multimodal retrieval use cases.

Alibaba Cloud · Embedding · Milvus
0 likes · 19 min read
Alibaba Cloud Big Data AI Platform
Apr 30, 2026 · Artificial Intelligence

Reinventing Search: Alibaba Cloud Elasticsearch Introduces Agent‑Native AI Memory Lake

Facing a projected 175 ZB of global data by 2025, 80% of it unstructured, Alibaba Cloud Elasticsearch re‑architects its engine to deliver Agent‑native search, offering structured JSON/Markdown results, high‑performance vector indexing, and a unified enterprise knowledge lake for AI agents.

AI Search · Agent · Cloud AI
0 likes · 9 min read
Alibaba Cloud Big Data AI Platform
Apr 28, 2026 · Artificial Intelligence

Zero‑Learning Video to Semantic Vector Pipeline with MaxFrame’s Distributed AI Engine

Faced with exploding video volumes and bottlenecks in frame extraction, labeling, and vector storage, MaxFrame offers a three‑step, end‑to‑end distributed pipeline that turns raw videos into searchable semantic vectors while providing zero‑threshold scaling, transparent OSS mounting, row‑level fault tolerance, and elastic concurrency control.

MaxCompute · MaxFrame · OSS
0 likes · 6 min read
Alibaba Cloud Big Data AI Platform
Apr 27, 2026 · Information Security

Real-Time Agentic Risk Detection with Flink, Fluss, and Large Language Models

The article presents a Flink‑Fluss‑LLM architecture that captures end‑to‑end agent events through a non‑intrusive hook, combines semantic AI inference with deterministic CEP rules, and delivers millisecond‑level alerts on malicious users, tool‑result poisoning, and chain attacks.

AI Function · Agent Security · Flink
0 likes · 41 min read
Alibaba Cloud Big Data AI Platform
Apr 22, 2026 · Artificial Intelligence

How to Build an End‑to‑End Hand‑Video to VLA Data Pipeline on Alibaba Cloud PAI with Data‑Juicer

This article details a step‑by‑step, distributed pipeline built on Alibaba Cloud PAI using Data‑Juicer and Ray that transforms raw egocentric hand videos into LeRobot v2.0‑compatible Vision‑Language‑Action (VLA) training data, covering video splitting, frame extraction, camera calibration, 3D hand reconstruction, pose estimation, action captioning, and export, with code snippets, performance numbers, and references.

Data-Juicer · Distributed computing · Embodied AI
0 likes · 29 min read
Alibaba Cloud Big Data AI Platform
Apr 20, 2026 · Cloud Computing

How Alibaba Cloud’s Agentic Search Redefines Enterprise AI Search

The article analyzes Alibaba Cloud Elasticsearch’s shift from keyword‑based to Agent‑native search, detailing the Agent Native architecture, hybrid retrieval 2.0, FalconSeek engine performance gains of up to 300%, cost reductions of 40‑70%, and the ecosystem of ES Skills, cloud‑native enhancements, and observability that together enable a scalable AI search platform for enterprises.

AI Search · Agentic Architecture · Cloud Computing
0 likes · 13 min read
Alibaba Cloud Big Data AI Platform
Apr 17, 2026 · Big Data

What Spark 4.0 Brings: VARIANT Type, Native SQL UDFs, and Serverless Enhancements

Apache Spark 4.0 introduces a high‑performance VARIANT data type for semi‑structured JSON, native SQL UDFs that eliminate Python UDF bottlenecks, a richer Python DataSource API, a new pipeline syntax, upgraded Structured Streaming state management, and Alibaba Cloud EMR Serverless optimizations that together deliver up to 30% speed gains and seamless migration from Spark 3.x.

Apache Spark · Python API · SQL UDF
0 likes · 12 min read
Alibaba Cloud Big Data AI Platform
Apr 16, 2026 · Artificial Intelligence

Build a Full End‑to‑End Embodied AI Workflow with Isaac Lab Arena

This notebook walks through a complete pipeline—from configuring Isaac Lab Arena environments and downloading datasets, to using Mimic for large‑scale data augmentation, fine‑tuning a GR00T‑N1.5 policy, and performing closed‑loop evaluation—demonstrating how to develop and validate embodied AI tasks on PAI‑DSW.

GR00T · Isaac Lab · Mimic
0 likes · 14 min read
Alibaba Cloud Big Data AI Platform
Apr 13, 2026 · Artificial Intelligence

How to Build a Scalable Multimodal Data Pipeline with Alibaba Cloud PAI and DataJuicer

This step‑by‑step guide shows how to construct a high‑performance multimodal data pipeline (video segmentation, duration filtering, frame extraction, safety and aesthetic scoring, and caption generation) using Alibaba Cloud PAI, Paimon, DataJuicer, and distributed frameworks such as Ray and Daft, with real‑world performance metrics.

AI · Alibaba Cloud · Daft
0 likes · 30 min read
Alibaba Cloud Big Data AI Platform
Apr 10, 2026 · Artificial Intelligence

How to Supercharge Small LLM Agents with ReAct Data Construction and EasyDistill

This guide explains how to build high‑quality agent training data using ReAct trajectories, synthesize difficult samples with a data‑flywheel, and distill the knowledge into small LLMs on Alibaba Cloud PAI, covering teacher model deployment, EasyDistill installation, data generation, task solving, rubric filtering, and final model deployment.

Agent · Data Generation · EasyDistill
0 likes · 14 min read