Author

Alibaba Cloud Big Data AI Platform

The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.

457

Articles

Likes

599

Views

Comments

Latest from Alibaba Cloud Big Data AI Platform

Showing up to the 100 most recent articles

Alibaba Cloud Big Data AI Platform

Dec 30, 2025 · Big Data

How StarRocks and Apache Paimon Unite to Build a True Lakehouse Native Engine

StarRocks and Apache Paimon have been progressively integrated across multiple releases, enabling a unified lakehouse architecture that supports multi-source federated analysis, time-travel queries, native readers/writers, distributed planning, and advanced profiling, while delivering performance gains that bring Paimon query speed on par with native StarRocks tables.

Apache Paimon · Lakehouse · Performance Optimization

0 likes · 9 min read


Alibaba Cloud Big Data AI Platform

Dec 29, 2025 · Cloud Native

How a Visual Platform Cut Search Costs by 60% with All‑in‑Elasticsearch

This case study details how a major internet visual platform consolidated its log, keyword, and vector search workloads onto Alibaba Cloud Elasticsearch, eliminating three separate pipelines, reducing write costs by 60%, cutting storage expenses by more than 60%, and achieving multi‑fold performance gains through serverless scaling, FalconSeek engine optimizations, and unified monitoring.

Elasticsearch · RAG · Search Architecture

0 likes · 10 min read


Alibaba Cloud Big Data AI Platform

Dec 24, 2025 · Big Data

How Paimon’s Column‑Separation Architecture Powers Real‑Time Multi‑Modal Lakehouse for AI

This article explains the challenges of frequent column changes in AI feature engineering, introduces Paimon’s column‑separation storage with a global continuous Row ID, details its Blob data type for efficient multi‑modal handling, and outlines production results and future roadmap for building an AI‑native data lakehouse.

Apache Paimon · Big Data · Blob

0 likes · 11 min read


Alibaba Cloud Big Data AI Platform

Dec 23, 2025 · Artificial Intelligence

How Skrull Boosts Long-Context Fine‑Tuning Speed Up to 7.5×

The Skrull system, accepted at NeurIPS 2025, dynamically schedules long and short sequences during each training iteration, overlapping communication and computation to achieve up to 7.54× speedup for long‑context fine‑tuning of large language models while maintaining stability through load‑balancing and rollback mechanisms.

Dynamic Data Scheduling · Long Context Fine-Tuning · Model Training Optimization

0 likes · 8 min read


Alibaba Cloud Big Data AI Platform

Dec 22, 2025 · Cloud Computing

Why Your Elasticsearch Client Doubles Bandwidth and How to Stop It

A hidden authentication step causes Elasticsearch clients to send each request twice—once without credentials and again after a 401 response—doubling bandwidth usage, but configuring pre‑emptive authentication in Java or Python eliminates the waste and cuts traffic costs.

Elasticsearch · Java · Preemptive Auth

0 likes · 10 min read


Alibaba Cloud Big Data AI Platform

Dec 18, 2025 · Databases

Why Hologres Dynamic Table Beats Traditional Full Refresh for Real‑Time Data Warehousing

The article explains how Hologres Dynamic Table uses a stateful incremental refresh model to efficiently handle massive historical data with tiny daily updates, dramatically reducing latency and resource consumption compared with conventional full‑refresh pipelines across several real‑world join and aggregation scenarios.

Dynamic Table · Hologres · Incremental Refresh

0 likes · 18 min read


Alibaba Cloud Big Data AI Platform

Dec 16, 2025 · Artificial Intelligence

How CosyVoice 2.0 Cuts First‑Chunk Latency for High‑Fidelity Voice Cloning

CosyVoice 2.0, Alibaba DAMO Academy's next‑gen high‑fidelity speech synthesis model, introduces architecture decoupling, streaming generation, reference‑audio caching and dynamic load balancing to dramatically reduce first‑packet latency and improve real‑time factor while supporting multi‑language voice cloning.

AI model optimization · Low latency · Streaming Inference

0 likes · 9 min read


Alibaba Cloud Big Data AI Platform

Dec 15, 2025 · Backend Development

Why a Hot‑Word Update Crashed Elasticsearch and How Serverless Index‑Level Dictionaries Fix It

This article dissects a real‑world incident in which adding a hot term to the IK analyzer caused a P0 outage in an e‑commerce search system. It traces the failure to a clash between dynamic dictionary updates and immutable inverted indexes, and shows how index‑level dictionary isolation in Alibaba Cloud Elasticsearch Serverless eliminates the problem while keeping services uninterrupted.

Hot Update · IK Analyzer · Index-level Dictionary

0 likes · 14 min read


Alibaba Cloud Big Data AI Platform

Dec 5, 2025 · Big Data

How EMR Serverless Spark Cut Batch Processing Time by Over 50% for a 600M‑User Platform

This case study details how Qimao leveraged Alibaba Cloud EMR Serverless Spark with Fusion and Celeborn to overcome multi‑business‑line data‑processing challenges, achieving more than 50% faster batch jobs, significant cost reductions, and improved operational flexibility across its 600 million‑user ecosystem.

Cloud Computing · Data Warehouse · Performance Optimization

0 likes · 9 min read


Alibaba Cloud Big Data AI Platform

Dec 1, 2025 · Artificial Intelligence

Build a noVNC‑Powered Isaac Sim Robot Demo with PAI‑DSW

This guide walks through setting up a PAI‑DSW environment, downloading Isaac Sim assets, configuring noVNC, launching a software‑in‑the‑loop robot simulation, and running a perception pipeline that combines FastSAM detection with FoundationPose pose estimation and ICP refinement.

Isaac Sim · PAI-DSW · Perception

0 likes · 10 min read
