Tagged articles
Baidu Waimai Technology Team
Apr 18, 2017 · Industry Insights

Baidu Waimai’s Cloud Migration, AI Logistics, and Architecture – QCon 2017

At QCon Beijing 2017, three senior Baidu Waimai engineers detailed the company’s year‑long migration from IDC to cloud using custom operation platforms, described the AI‑driven, data‑rich logistics scheduling system that outperforms manual dispatch, and shared architectural evolutions that enabled rapid, zero‑downtime scaling of the fast‑growing delivery business.

AI logistics · Big Data · Operations
Meituan Technology Team
Apr 14, 2017 · Big Data

Practical Experience of HDFS Federation at Meituan: Challenges, Improvements, and Automation

Meituan‑Dianping migrated its 2,000‑node HDFS cluster to Federation by fixing ViewFs compatibility, simplifying mount points, leveraging FastCopy for massive data moves, improving token handling, and automating split‑workflow steps, thereby overcoming single‑NameNode bottlenecks and providing a practical blueprint for large‑scale Hadoop deployments.

Big Data · FastCopy · Federation
MaGe Linux Operations
Apr 13, 2017 · Big Data

How to Choose the Right Language for Your Big Data Project

This article compares R, Python, Scala, and Java for big‑data projects, outlining each language’s strengths and weaknesses, and offers guidance on selecting the most suitable language based on project requirements, team expertise, and production needs.

Big Data · Java · Python
ITFLY8 Architecture Home
Apr 9, 2017 · Fundamentals

Understanding Bloom Filters: Fast, Space-Efficient Membership Tests

Bloom filters are highly space-efficient probabilistic data structures that quickly test set membership using multiple hash functions, guaranteeing no false negatives while allowing a small false positive rate, making them ideal for large-scale applications such as email blacklists and massive URL deduplication.
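
The behaviour summarized above can be illustrated with a minimal Python sketch. It is not taken from the article: the class name, the sizing formulas (the standard textbook ones), and the double-hashing trick over a single SHA-256 digest are all assumptions chosen for brevity.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter: k hash positions over an m-bit array."""

    def __init__(self, expected_items: int, false_positive_rate: float) -> None:
        # Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        n, p = expected_items, false_positive_rate
        self.num_bits = max(1, math.ceil(-n * math.log(p) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing (an assumption of this sketch): derive k positions
        # from two 64-bit halves of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True only if every position is set; a single clear bit proves absence.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


if __name__ == "__main__":
    blacklist = BloomFilter(expected_items=1000, false_positive_rate=0.01)
    blacklist.add("spam@example.com")
    print("spam@example.com" in blacklist)  # True: added items are always found
    print("friend@example.com" in blacklist)  # usually False; rarely a false positive
```

Because bits are only ever set and never cleared, an added item can never be reported as absent, which is where the no-false-negatives guarantee comes from.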

Big Data · bloom-filter · membership testing
21CTO
Apr 4, 2017 · Artificial Intelligence

How Vipshop Evolved Its Real-Time Personalized Recommendation Engine

This article recounts Wu Guanlin’s presentation on the evolution of Vipshop’s personalized recommendation system, detailing the technical challenges of real‑time predictions, the three generations of architecture, the four‑stage recommendation engine, and the VRE platform’s design for scalability and low latency.

Big Data · Machine Learning · System architecture
Meituan Technology Team
Mar 24, 2017 · Artificial Intelligence

Tourism Recommendation System: Strategy Iterations, Architecture, and Future Challenges

The article outlines Meituan‑Dianping’s tourism recommendation system, detailing its evolution from simple hot‑sale recall to sophisticated decay‑based, GPS‑aware, collaborative filtering and XGBoost reranking pipelines, the four‑layer architecture supporting dozens of travel scenarios, and future plans to broaden recall, adopt deep models, and expand multimodal travel recommendations.

Big Data · Machine Learning · Tourism
Tongcheng Travel Technology Center
Mar 24, 2017 · Operations

Evolution of Tongcheng Log System Architecture

The article chronicles the development of Tongcheng's centralized log system from early file‑based logging through a MongoDB‑based solution to the current multi‑layer architecture using Flume, Elasticsearch, and Hadoop, highlighting design decisions, challenges, and future improvement plans.

Big Data · Flume · log system
Baidu Waimai Technology Team
Mar 23, 2017 · Databases

Design and Implementation of the "Little Boy" Greenplum Optimization and Operations Platform

This article introduces the architecture, key modules, and implementation details of the Little Boy platform, a Greenplum optimization and operations system that parses SQL, applies index and distribution‑key tuning, manages resources, and outlines future enhancements for large‑scale data warehouses.

Big Data · Database Optimization · Greenplum
ITPUB
Mar 22, 2017 · Big Data

Why Spark Beats MapReduce: The RDD Story and Spark SQL Evolution

This article walks through Spark’s origins, its core RDD concept, how it improves on Hadoop’s MapReduce, the role of in‑memory processing, functional programming support, and the emergence of Spark SQL with DataFrames and the Catalyst optimizer.

Big Data · Distributed computing · MapReduce
StarRing Big Data Open Lab
Mar 21, 2017 · Big Data

How Real-Time Data Streaming Is Transforming Industries Today

This article explains how real‑time data streaming turns massive, continuously growing datasets into actionable insights across finance, energy, and e‑commerce, showcasing early adopters like ConocoPhillips and DHL while urging businesses to rethink models for the next wave of data management.

Big Data · Data Streaming · industry use cases
Qunar Tech Salon
Mar 12, 2017 · Big Data

Essential Skills and Career Paths for Data Professionals: From Big Data Platforms to AI

The article outlines the key competencies, responsibilities, and career development advice for data professionals across the entire data stack—from building big‑data platforms and data warehouses to visualization, analysis, algorithm engineering, and deep‑learning applications—emphasizing the importance of creating business value with data.

Big Data · Data Analyst · Data Engineering
21CTO
Mar 10, 2017 · Big Data

Inside Tencent Analytics: How TA Handles TB‑Scale Real‑Time Web Data

Tencent Analytics (TA) is a free web analytics platform that processes terabytes of daily data in real time, using a custom architecture featuring JavaScript collection, event streaming, in‑memory computation, and NoSQL storage with Redis and LevelDB, offering site owners instant insights and high availability.

Big Data · LevelDB · Real-time Processing
Efficient Ops
Mar 7, 2017 · Big Data

How Tencent Scaled Its TDW to 8,800 Nodes and Mastered Cross-City Data Migration

Tencent’s senior engineer explains how the TDW (Tencent Distributed Data Warehouse) grew from a few hundred to thousands of nodes, the challenges of cross‑city migration, and the modeling, relationship‑chain, dual‑write tables, and platform strategies they built to ensure seamless, low‑impact data and task migration.

Big Data · Data Migration · TDW
Alibaba Cloud Developer
Mar 7, 2017 · Big Data

Unified Data Platforms: How UMENG+ Redefines Big Data Strategy

The article explores the evolution of big‑data applications in China, from Oracle’s trend report and the concept of "omni‑domain data" to UMENG+’s technical architecture, unified tech stack, AI integration, and future directions for delivering real customer value.

Big Data · Data Analytics · Technology Architecture
Efficient Ops
Mar 6, 2017 · Operations

Tencent Game Ops: Turning Service Delivery into Smart, Automated Microservices

This article details how Tencent's game operations team redefined operational services, introduced micro‑service architecture, applied big‑data driven recommendations, and built intelligent, automated pipelines for server opening, merging, version releases, and download services, achieving significant efficiency and cost gains.

Automation · Big Data · Cloud Native
Meituan Technology Team
Mar 2, 2017 · Big Data

Meituan Waimai Feature Archive Platform: Architecture, Tag System, and Data Processing

Meituan Waimai’s Feature Archive platform processes billions of daily orders, managing about 200 user tags and 400 merchant tags in a three‑layer architecture built on Hive, Elasticsearch, HBase, and MySQL. It offers visual tag selection, instant self‑service queries, full data extraction, and a predicate‑logic query language, and is designed for future extensibility.

Big Data · Elasticsearch · HBase
AntTech
Feb 28, 2017 · Artificial Intelligence

Key Computing Capabilities Driving the Evolution of Digital Financial Services

The talk outlines nine essential computing capabilities (transaction processing, system robustness, connectivity, decision-making, data insight, intelligent services, biometric authentication, blockchain trust, and immersive integration) that have transformed Ant Financial over the past decade, and discusses the challenges and strategies for the next ten years.

Artificial Intelligence · Big Data · Blockchain
Architecture Digest
Feb 28, 2017 · Big Data

Architecture and Real‑Time Processing Design of Tencent Analytics (TA)

This article explains the architecture, real‑time computation framework, and storage solutions of Tencent Analytics, detailing how massive TB‑level web‑traffic data are collected via JavaScript, processed in memory‑centric streaming components, and stored using Redis and LevelDB to achieve second‑level updates.

Big Data · LevelDB · NoSQL
Nightwalker Tech
Feb 27, 2017 · Big Data

Community Discussion on Learning Paths, Tools, and Applications in Big Data

A diverse group of practitioners share recommendations for books, technologies, real‑world use cases, and practical challenges when learning and applying big‑data processing, covering Hadoop, Spark, data visualization, ETL, and the relationship between data, algorithms, and business value.

Big Data · Data Analysis · Hadoop
Qunar Tech Salon
Feb 26, 2017 · Big Data

Comparative Analysis of Big Data Storage and Query Solutions

This article reviews major big‑data storage and query architectures—including HBase, Dremel/Parquet, pre‑aggregation systems, Lucene, and the custom Tindex solution—evaluating their strengths, weaknesses, and suitability for real‑time, high‑volume analytical workloads.

Big Data · HBase · Parquet
Efficient Ops
Feb 26, 2017 · Operations

How Alibaba Scales Massive Data Platforms: Lessons in Automated Operations

This article explores the challenges of operating Alibaba's large‑scale data platforms, describes the automation platform built to address them, and shares data‑driven, fine‑grained operational practices that enable stable, efficient, and cost‑effective service delivery.

Automation · Big Data · Operations
Qunar Tech Salon
Feb 22, 2017 · Big Data

Understanding Ctrip Flight Ticket Tracking System (UBT) and Its Key Metrics

This article explains Ctrip's flight ticket tracking framework (UBT), detailing client‑side and server‑side event collection methods, the purpose and trade‑offs of each tracking type, metric definitions, data association challenges, common pitfalls, and best practices for reliable data‑driven analysis.

Analytics · Big Data · Ctrip
Alibaba Cloud Developer
Feb 22, 2017 · Artificial Intelligence

How Alibaba’s AI Powers Real‑Time Customer Segmentation and Personalized Shopping

This article explains how Alibaba leverages AI, big‑data analytics, and advanced recommendation algorithms to enable real‑time visitor clustering, personalized storefronts, and tailored content across its Customer Operation Platform, Double 11 promotion pages, QianNiu headlines, and service market, delivering significant conversion and engagement gains.

AI · Big Data · Recommendation Systems
Nightwalker Tech
Feb 20, 2017 · Backend Development

Career Development and Technology Trends for PHP Engineers

The discussion explores how PHP engineers can advance their careers by embracing new technologies such as Go, Python, big data, AI, and cloud computing, while also emphasizing soft‑skill growth, project management, and strategic decision‑making based on business trends and personal goals.

Artificial Intelligence · Big Data · backend
Meituan Technology Team
Feb 17, 2017 · Big Data

User Profiling and Machine Learning Practices for Food Delivery O2O Platforms

Meituan Delivery’s rapid expansion across multiple categories relies on detailed user profiling and machine‑learning models—such as high‑potential customer prediction, churn risk regression and Cox survival analysis—to personalize acquisition, retention, and scenario‑based cross‑selling, while addressing sparse behavior, unstructured data, and geographic context challenges.

Big Data · Machine Learning · O2O
21CTO
Feb 15, 2017 · Fundamentals

How Twitter Evolved Its Search Engine: From MySQL to Earlybird and Beyond

This article explains the fundamentals of search engine architecture, covering text collection, indexing, ranking and evaluation, and then traces Twitter's internal search evolution from MySQL full‑text search to the Earlybird index server, Blender aggregation, and smart memory‑SSD strategies.

Big Data · Twitter · indexing
Architecture Digest
Feb 11, 2017 · Big Data

LeKe Sports Big Data Platform Evolution: From Early ETL Reporting to 2.0 Streaming Architecture

The article describes how LeKe Sports built and continuously upgraded its Hadoop‑based big data platform—from a manual ETL‑to‑Elasticsearch reporting system to a 2.0 architecture featuring Spark Streaming, SQL‑based query layers, Elasticsearch indexing, and cloud‑native storage and backup solutions—to meet rapidly growing PB‑scale data demands.

Big Data · Data Platform · ETL
Huawei Cloud Developer Alliance
Feb 7, 2017 · Big Data

What’s New in Apache CarbonData 1.0.0? 80+ Features Boost Big Data Performance

Apache CarbonData 1.0.0, now an Apache incubating project, adds over 80 new features and bug fixes—including a new data loading solution, Spark 2.1 integration, update/delete SQL support, adaptive compression for numeric types, B‑Tree LRU cache, V2 format for faster first‑query performance, vectorized reader, bucket‑table joins, off‑heap memory, single‑pass loading, and pre‑generated dictionaries—aimed at delivering faster, more flexible, and efficient columnar storage for big‑data workloads.

Apache CarbonData · Big Data · Columnar Storage
Huawei Cloud Developer Alliance
Jan 24, 2017 · Big Data

Why Hadoop Remains the Backbone of Big Data: Core Modules, Tools, and Trends

This article provides a comprehensive overview of Hadoop as the leading open‑source platform for big‑data processing, detailing its core components HDFS and MapReduce, the evolution to Hadoop 2.0/YARN, and the extensive ecosystem of tools and commercial solutions that enable scalable storage, analysis, and machine‑learning on massive data sets.

Big Data · Distributed computing · HDFS
21CTO
Jan 18, 2017 · Big Data

Build a Lightweight, High‑Availability Real‑Time Stream Processing System

Learn how to construct a simple, high‑availability real‑time stream processing platform using lightweight components such as Kafka, Zookeeper, Thrift/Avro, and optional storage like MongoDB or Elasticsearch, offering a practical alternative to heavyweight frameworks like Storm and Spark Streaming for small‑to‑medium enterprises.

Big Data · Kafka · lightweight architecture
dbaplus Community
Jan 16, 2017 · Backend Development

Scaling a FinTech Platform to $100B Transactions with Four Overhauls

Over three years, a small fintech company transformed its platform from a single‑server PHP/Java stack to a micro‑service‑based Spring Cloud architecture, undergoing four major upgrades that introduced distributed systems, SOA governance, big‑data pipelines, MongoDB replication, Redis caching, and open‑source tools, enabling transaction volumes exceeding one hundred billion.

Big Data · FinTech · Open Source
Alibaba Cloud Developer
Jan 11, 2017 · R&D Management

How Taobao’s Beehive Platform Powers Content‑Driven Shopping During Double 11

The article explains how Taobao’s content‑centric strategy, embodied in the Beehive platform, builds an end‑to‑end content chain—from creator tools and health scoring to personalized distribution and commerce mechanisms—enabling massive, efficient content production and monetization during the Double 11 shopping festival.

Big Data · Taobao · content platform
Alibaba Cloud Developer
Jan 9, 2017 · Big Data

How Alibaba Scaled Real‑Time Data Processing for Double 11: Architecture & Lessons

This article details Alibaba's real‑time computing architecture for the 2016 Double 11 event, covering background, core components such as DRC, TT, Galaxy, OTS, XTool and OneService, and explains optimization techniques, fault‑tolerance strategies, stress‑testing practices, and future upgrade plans to handle massive streaming data workloads.

Big Data · Performance Optimization · Real‑Time Computing
dbaplus Community
Dec 26, 2016 · Big Data

Why Data Lakes Are Redefining Enterprise Data Architecture

This article explains the origins, core features, logical architecture, and advantages of data lakes, contrasts them with traditional data warehouses, outlines a modern data architecture that combines lakes and warehouses, and introduces the DCE intelligent data lake platform with practical Q&A.

Big Data · Cloud Computing · data lake
Tencent Cloud Developer
Dec 23, 2016 · Databases

Analysis of HBase Write-Ahead Log (WAL) Mechanism and Source Code Call Chain

The article explains HBase’s write‑ahead‑log architecture, detailing how client put/delete requests travel through RPC to the RegionServer, are processed by MultiRowMutationService, written to the WAL via FSHLog.append and sync, and finally stored in MemStore, while describing durability options and the underlying source‑code call chain.
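
The log-before-memory ordering described above can be sketched generically in Python. This is not HBase code and does not reproduce `FSHLog`; the class `TinyWalStore`, its JSON-lines log format, and the `memstore` dictionary are hypothetical names used only to show the idea of appending and syncing a record before applying it in memory, then replaying the log on restart.

```python
import json
import os
from typing import Dict


class TinyWalStore:
    """Toy key-value store that logs every mutation before applying it."""

    def __init__(self, wal_path: str) -> None:
        self.wal_path = wal_path
        self.memstore: Dict[str, str] = {}
        self._replay()  # rebuild in-memory state from any existing log
        self._wal = open(wal_path, "a", encoding="utf-8")

    def _replay(self) -> None:
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, "r", encoding="utf-8") as wal:
            for line in wal:
                self._apply(json.loads(line))

    def _apply(self, record: Dict[str, str]) -> None:
        if record["op"] == "put":
            self.memstore[record["key"]] = record["value"]
        else:  # "delete"
            self.memstore.pop(record["key"], None)

    def _log_then_apply(self, record: Dict[str, str]) -> None:
        # 1) append to the log, 2) force it to disk, 3) only then mutate memory.
        self._wal.write(json.dumps(record) + "\n")
        self._wal.flush()
        os.fsync(self._wal.fileno())
        self._apply(record)

    def put(self, key: str, value: str) -> None:
        self._log_then_apply({"op": "put", "key": key, "value": value})

    def delete(self, key: str) -> None:
        self._log_then_apply({"op": "delete", "key": key})

    def close(self) -> None:
        self._wal.close()
```

Syncing on every mutation is the most conservative choice; the durability options mentioned in the summary correspond to relaxing that step in exchange for throughput.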

Big Data · HBase · Java
Hulu Beijing
Dec 20, 2016 · Big Data

How Hulu Supercharges OLAP Queries with CarbonData: Real‑World Optimizations

This article describes Hulu’s real‑world OLAP query optimization, covering the fundamentals of OLAP, comparisons of row‑ and column‑based storage formats, detailed indexing mechanisms of Parquet, ORC and CarbonData, and the specific schema, shuffle, block size, speculation and GC tuning techniques that enabled CarbonData to dramatically accelerate wide‑table queries on SparkSQL.

Big Data · CarbonData · Columnar Storage
Meituan Technology Team
Dec 9, 2016 · Big Data

Memory Usage Analysis of HDFS NameNode Core Data Structures

The article quantitatively breaks down HDFS NameNode memory consumption, showing that the Namespace tree and BlocksMap together dominate heap usage (≈53 GB in large clusters), provides detailed per‑object size estimates for NetworkTopology, INode and block structures, and proposes a simple formula to predict total heap requirements and tuning recommendations.

Big Data · HDFS · NameNode
Alibaba Cloud Developer
Dec 8, 2016 · Artificial Intelligence

How AI Powers Data‑Driven Merchant Success

In this Alibaba Tech Forum talk, senior expert Wei Hu explains how machine learning and big‑data technologies are used to empower merchants with personalized storefronts, intelligent posters, and AI‑driven headlines, boosting their efficiency and sales performance.

AI · Alibaba · Big Data
Ctrip Technology
Dec 2, 2016 · Big Data

Design and Architecture of Ctrip's Aegis Risk Control System

This article presents a comprehensive overview of Ctrip's Aegis risk control system, detailing its modular architecture, rule engine, data service layer, Chloro analytics platform, and future directions, while highlighting the use of streaming, big‑data processing, and machine‑learning models for real‑time fraud detection.

Big Data · Machine Learning · Real-time Processing
Meitu Technology
Dec 1, 2016 · Big Data

Multi-dimensional Analysis Platform Based on User Portrait Data

Tencent's Glacier multi‑dimensional analysis platform combines massive user‑portrait tags with routine analytical reports, delivering fast, accurate real‑time queries across countless dimensional combinations, enabling analysts and operators to perform targeted operations and insights as product data continuously evolves.

Big Data · Data Platform · Glacier
Meitu Technology
Dec 1, 2016 · Big Data

Meitu Internet Technology Salon: Big Data Architecture Evolution and Practice, and Tencent Multi‑Dimensional Analysis Platform

At Meitu’s third Internet Technology Salon in Xiamen on November 26, 2016, over 150 senior engineers heard Meitu’s Lu Rongbin detail the company’s progression from simple rsync scripts to a scalable mobile data and open statistical platform, while Tencent’s Zhao Shiyuan showcased the Glacier multi‑dimensional analysis system for fast, tag‑driven queries, underscoring collaborative technical exchange in South China.

Analytics · Big Data · Data Platform
Alibaba Cloud Developer
Nov 30, 2016 · Big Data

How Alibaba’s Double 11 Turned Big Data into a Global E‑Commerce Game‑Changer

MIT Technology Review reports that Alibaba’s 2016 Double 11 shopping festival set new e‑commerce records while showcasing the company’s advanced big‑data, AI, and cloud‑computing technologies, highlighting massive transaction volumes, high‑quality data processing, robust security measures, and the strategic push toward global digital infrastructure.

Big Data · Cloud Computing · Data Security
Architects' Tech Alliance
Nov 28, 2016 · Big Data

User Profiling: Concepts, Stages, and Data Modeling Methods

This article explains the concept of user profiling, outlines its four-stage construction process, discusses the significance of tagging users, and details practical data modeling techniques—including static and dynamic data sources, weight calculations, and real‑world examples—aimed at improving precision marketing and recommendation systems.

Big Data · Tagging · behavior analysis
Alibaba Cloud Infrastructure
Nov 20, 2016 · Big Data

Alibaba’s Big Data Applications in Urban Governance and Social Risk Prevention

The article describes how Alibaba leverages big data, cloud computing and AI through its “City Brain” project and security platforms to improve urban traffic management, public safety, anti‑fraud measures and e‑commerce risk control, illustrating the transformative impact of data‑driven technologies on modern social governance.

AI · Big Data · Cloud Computing
StarRing Big Data Open Lab
Nov 18, 2016 · Big Data

Unveiling Modern Big Data Architecture: Key Technologies and Trends

This article reviews a comprehensive big‑data lecture covering traditional databases, Hadoop ecosystems, commercial big‑data platforms, computing models, analysis techniques, visualization, and leading vendors, highlighting how these technologies shape today’s data‑driven enterprises.

Big Data · Data Analysis · Data Architecture
Architecture Digest
Nov 11, 2016 · Backend Development

High‑Availability Architecture Sessions at the China Software Developers Conference (Nov 18‑20)

The conference featured a series of high‑availability architecture talks covering performance‑driven design, RPC framework resilience, big‑data platform evolution, MySQL cluster consistency, and cloud infrastructure best practices, presented by experts from 58.com, Alibaba, Tencent, Baidu, and others.

Backend Architecture · Big Data · Cloud Computing
StarRing Big Data Open Lab
Nov 11, 2016 · Big Data

Why SQL Still Rules Big Data—and How NoSQL & NewSQL Fit In

The article explores the evolution of data processing from Hadoop and Spark to modern SQL, NoSQL, and NewSQL solutions, comparing their architectures, performance trade‑offs, and use‑cases, while illustrating concepts with examples like MapReduce, Hive, Impala, and streaming platforms such as Storm.

Big Data · Hadoop · NewSQL
Architects' Tech Alliance
Nov 8, 2016 · Cloud Computing

12 Notable Data Storage Startups to Watch in 2016

Amid rising data‑storage complexity, twelve innovative startups emerged in 2015‑2016, leveraging flash, disk, and cloud technologies to improve data mobility and management across hierarchical storage tiers, offering solutions ranging from cloud‑native storage networks to SAN arrays and virtualization platforms.

Big Data · SAN · Startup
MaGe Linux Operations
Nov 7, 2016 · Big Data

How HDFS Achieves Low Cost, High Reliability, and Fault Tolerance

This article explains how HDFS, inspired by Google’s GFS, provides a low‑cost, highly reliable, fault‑tolerant, and high‑performance distributed file system for big‑data workloads by using replication, standby NameNodes, block storage, rack awareness, and compute‑close‑to‑data strategies.

Big Data · Distributed File System · HDFS
Architecture Digest
Nov 6, 2016 · Big Data

Evolution of Taobao’s Big Data Platform: From RAC to MaxCompute

The article chronicles Taobao’s 13‑year evolution of its big data platform, detailing three phases—from a single‑node Oracle setup and the Tianwang scheduler, through a Hadoop‑based “Cloud Ladder 1” architecture with real‑time analytics, to the current MaxCompute/ODPS era with cross‑region projects and advanced data services.

Big Data · Data Platform · Data Warehouse
Architects' Tech Alliance
Nov 4, 2016 · Big Data

The Seven Camps of the Global Big Data Ecosystem

The article outlines how mobile Internet merges the data‑driven society with the physical world to create a new big‑data architecture and describes the seven distinct camps—Infrastructure, Analytics, Applications, Cross‑Domain Architecture, Open‑Source, Data Sources & APIs, and Incubator & Training—that together form a comprehensive end‑to‑end big‑data solution ecosystem.

API · Analytics · Applications
Meituan Technology Team
Nov 4, 2016 · Big Data

Design and Implementation of a Low-Latency App Exception Monitoring Platform Using Spark Streaming, Kafka, and Elasticsearch

The paper presents a production‑grade, low‑cost mobile‑app exception monitoring platform built on Spark Streaming, Kafka, and Elasticsearch that achieves high availability through exactly‑once processing and checkpointing, minute‑level latency by decoupling raw and symbolized logs, high throughput via reservoir sampling, and dynamic scalability without code changes.
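
The summary does not say how the platform applies reservoir sampling, so the following is only a sketch of the classic single-pass technique (often called Algorithm R) in Python. The function name and parameters are hypothetical; the point is that a fixed-size uniform sample can be kept from a stream of unknown length, which bounds downstream work regardless of input volume.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Return up to k items chosen uniformly at random from a stream, in one pass."""
    rng = rng or random.Random()
    reservoir: List[T] = []
    for index, item in enumerate(stream):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item number index+1 replaces a random slot with probability k/(index+1).
            j = rng.randint(0, index)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Keep 5 log lines out of 100,000 without holding the stream in memory.
    logs = (f"exception-{i}" for i in range(100_000))
    print(reservoir_sample(logs, 5))
```

Memory use is proportional to `k` rather than to the stream length, so a burst of exception logs cannot grow the sample.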

Big Data · Elasticsearch · Exception Monitoring
Architects' Tech Alliance
Nov 3, 2016 · Industry Insights

Scaling Billion‑Level Ads: Architecture Lessons from Sogou’s Senior Engineer

In this interview, Sogou architect Liu Jian shares how his team built a highly available, scalable commercial advertising platform, discusses the evolution of its infrastructure, offers practical advice for engineers aspiring to become architects, and reflects on emerging technologies and time‑management strategies.

Big Data · Distributed Systems · Platform Engineering
ITFLY8 Architecture Home
Oct 31, 2016 · Cloud Computing

How Taobao Scaled from LAMP to Cloud: Lessons in Cloud Migration Architecture

This article examines the evolution of Taobao's technical architecture—from a LAMP stack through Oracle‑based mainframes to a cloud‑native platform—highlighting the performance, scalability, and cost challenges of traditional IT and offering best‑practice strategies for migrating enterprise systems to the cloud.

Big Data · Cloud Computing · Operations
0 likes · 15 min read
ITFLY8 Architecture Home
Oct 27, 2016 · Big Data

Inside Taobao’s Massive Data Architecture: How 1.5 PB Daily Is Processed and Served

The article explains Taobao’s five‑layer data product architecture—covering data sources, compute, storage, query, and product layers—and describes how massive volumes of data are ingested, processed in batch and streaming, stored in MySQL and HBase clusters, and served efficiently through a unified middle‑layer and sophisticated caching mechanisms.

Big Data · Caching · Distributed Systems
0 likes · 15 min read
21CTO
Oct 21, 2016 · Artificial Intelligence

How Toutiao Dominated Chinese News with AI‑Powered Personalization

This article examines Toutiao’s evolution from a simple news aggregator to an AI‑driven recommendation platform valued at 600 billion RMB, detailing its market growth, data‑driven personalization, product features, business model, talent philosophy, and future outlook.

AI · Big Data · Recommendation Engine
0 likes · 10 min read
Efficient Ops
Oct 20, 2016 · Operations

Transforming Business Operations with Cloud, Big Data, and Integrated IT Management

The article explains how modern business operation management integrates cloud computing, big data analytics, and proactive IT monitoring to shift from traditional infrastructure‑centric maintenance to a user‑experience‑driven, data‑powered approach that boosts performance, accelerates growth, and supports digital transformation.

Big Data · Cloud Computing · IT monitoring
0 likes · 8 min read
Alibaba Cloud Infrastructure
Oct 17, 2016 · Artificial Intelligence

Wang Jian’s Keynote at the 2016 Hangzhou Yunqi Conference: Data Brain, AI, and Cloud Computing

In his 2016 Yunqi Conference keynote, Wang Jian highlighted how Alibaba’s cloud and AI technologies transform city traffic by linking surveillance cameras to traffic lights, discussed the evolution from Deep Blue to AlphaGo, and reflected on the broader impact of data-driven innovation on society.

AI · Big Data · Cloud Computing
0 likes · 13 min read
Qunar Tech Salon
Oct 17, 2016 · Information Security

Design and Implementation of a Cloud‑Based Web Application Firewall at Ctrip

This article describes Ctrip's challenges with web security, evaluates hardware and commercial cloud WAF shortcomings, and presents a low‑cost, low‑risk cloud‑based WAF solution that leverages DNS redirection, closed‑loop rule management, Lua/Tengine deployment, supervised machine‑learning log analysis, and big‑data streaming for real‑time threat detection and mitigation.

Big Data · Machine Learning · WAF
0 likes · 9 min read
ITFLY8 Architecture Home
Oct 16, 2016 · Big Data

Mastering Data Sync, Real-Time Analytics, and Scalable Storage for Modern Systems

This article explains how to design and implement heterogeneous data synchronization, leverage batch and stream processing frameworks like Hadoop and Storm for large‑scale analysis, and choose appropriate storage solutions—from in‑memory databases to distributed column‑family stores—while addressing performance, reliability, and monitoring in complex distributed environments.

Big Data · Distributed Systems · data synchronization
0 likes · 26 min read
Alibaba Cloud Infrastructure
Oct 15, 2016 · Artificial Intelligence

Computing Power as the Engine of Digitalization and Intelligent Manufacturing – Insights from Alibaba Cloud Conference

In his keynote at the Alibaba Cloud Conference, CTO Zhang Jianfeng explains how advances in computing power, AI, big data, and IoT are driving the digital transformation of retail, manufacturing, and services, enabling smarter products, personalized experiences, and a fully connected intelligent world.

Big Data · Cloud Computing · IoT
0 likes · 19 min read
Alibaba Cloud Developer
Oct 14, 2016 · Artificial Intelligence

How Alibaba’s CTO Envisions AI‑Driven Smart Manufacturing and the Future of Digitalized Worlds

In his Yunqi Conference keynote, Alibaba CTO Zhang Jianfeng explains how soaring computing power, digitalization, AI, IoT and immersive technologies will transform retail, manufacturing and services into intelligent, personalized ecosystems, illustrating the vision with examples like a smart golf club and a city‑wide data brain.

AI · Big Data · IoT
0 likes · 21 min read
StarRing Big Data Open Lab
Oct 8, 2016 · Big Data

Evolving Data Warehouses with Hadoop & Spark: Core Technologies

Data warehouses centralize and transform enterprise data for multidimensional analysis, and modern demands have spawned four types—traditional, real‑time, associative discovery, and data marts—each with distinct technical requirements, while Hadoop‑based solutions like Transwarp Data Hub address challenges of scale, variety, latency, and security.

Big Data · Distributed computing · Hadoop
0 likes · 21 min read
Java High-Performance Architecture
Sep 27, 2016 · Big Data

Build a Hadoop Cluster with Docker: Step‑by‑Step Guide

Learn how to quickly set up a multi‑node Hadoop cluster on a single machine using Docker containers, covering image preparation, SSH configuration, fixed IP assignment with pipework, and building custom Hadoop images, enabling a lightweight, cost‑effective big‑data environment for development and testing.

Big Data · CentOS · Cluster
0 likes · 9 min read
StarRing Big Data Open Lab
Sep 26, 2016 · Operations

Automate Cluster Health Checks with Koalas: Cutting Big Data Downtime

The article introduces Koalas, an automated distributed diagnostic tool for TDH clusters that identifies and resolves computing environment issues—such as network, platform, and system problems—through one‑click checks, detailed reporting, and both preventive and diagnostic use cases.

Big Data · Cluster Monitoring · Performance Optimization
0 likes · 8 min read
dbaplus Community
Sep 12, 2016 · Big Data

Apache Flume Quickstart: Log Collection and Kafka Integration

This article introduces Apache Flume, explains its design goals of reliability, scalability, manageability and extensibility, outlines core concepts and architecture, provides step‑by‑step configuration using the first mode, demonstrates integration with Zookeeper, Kafka and a shell script, and shows how to launch and verify the agent.

Apache Flume · Big Data · Kafka Integration
0 likes · 7 min read
Ctrip Technology
Sep 10, 2016 · Artificial Intelligence

Deep Learning Anti‑Scam Guide: An Informal Introduction to Neural Networks, Training, and Practical Applications

This article provides a light‑hearted yet thorough overview of deep learning, covering neural network fundamentals, layer construction, back‑propagation, ResNet shortcuts, encoder‑decoder structures, PU‑learning for unlabeled data, GPU acceleration, and practical advice on data size, frameworks, and deployment in financial scenarios.

Backpropagation · Big Data · GPU
0 likes · 27 min read