Tagged articles

19 articles

May 9, 2026 · Artificial Intelligence

How to Choose Between LangChain and LlamaIndex: Core Use‑Case Comparison for Agent Development

The article analyzes the design philosophies, key components, strengths, and weaknesses of LangChain and LlamaIndex, explains their distinct core scenarios—complex multi‑step agent orchestration versus private‑data RAG—and shows how they can be combined in real projects while outlining emerging ecosystem trends.

Agent · LLM · LangChain

0 likes · 13 min read

Mingyi World Elasticsearch

Feb 25, 2026 · Databases

How to Accurately Track Document Write Time in Elasticsearch – 3 Practical Methods

Elasticsearch does not record a built‑in write timestamp, so tracing when a document was indexed requires adding a timestamp field at ingest time. The article covers three ways to do this (an Ingest Pipeline, Logstash/Beats configuration, or application‑side code) and discusses the advantages and caveats of each, along with how to handle historical data.

Beats · Elasticsearch · Ingest Pipeline

0 likes · 5 min read

Test Development Learning Exchange

Nov 9, 2024 · Fundamentals

Comprehensive Guide to Pandas Indexing Methods: loc, iloc, Boolean Indexing, Set/Reset Index, Multi‑Index, Alignment, Sorting, Dropping, and Advanced Techniques

This article provides a comprehensive guide to Pandas indexing in Python, covering basic loc and iloc selection, Boolean indexing, setting and resetting indices, multi‑level indexing, index alignment, sorting, dropping, and advanced methods such as at, iat, and query, with complete code examples.

boolean indexing · data indexing · data-analysis

0 likes · 9 min read

Architect

Nov 8, 2024 · Backend Development

How Ctrip Scaled Its Travel Product Log System to Billions of Records

This article traces the evolution of Ctrip’s travel product log platform—from a single‑table DB approach to a platform‑wide ES + HBase solution—detailing the challenges of massive data volume, the architectural decisions, RowKey design, write and query flows, and the subsequent extensions that enabled billion‑scale log storage and fast retrieval.

Backend Architecture · Big Data · Ctrip

0 likes · 17 min read

DataFunTalk

Aug 21, 2024 · Big Data

Apache Paimon: Real‑Time Lakehouse Architecture, Core Technologies, Application Scenarios, and Frontier Features

This article presents a comprehensive overview of Apache Paimon, covering the concept of real‑time lakehouses, the underlying technologies such as LSM and merge‑on‑write, practical application cases across enterprises, and the latest frontier features like tags, branches, and advanced indexing, illustrating how Paimon bridges batch and streaming workloads in modern big‑data ecosystems.

Apache Paimon · LSM · data indexing

0 likes · 16 min read

DataFunSummit

Oct 16, 2023 · Big Data

Bilibili's Iceberg‑Based Lakehouse Platform: Technical Practices for Sub‑Second Query Response

This article details Bilibili's implementation of an Iceberg‑based lakehouse platform that unifies storage and analytics, addressing Hive’s performance and latency issues through multidimensional sorting, various file‑level indexes, cube pre‑aggregation, star‑tree structures, and an automated Magnus service for intelligent optimization, achieving near‑second query responses.

Big Data · Iceberg · Lakehouse

0 likes · 14 min read

Weimob Technology Center

Aug 4, 2023 · Backend Development

How a Scalable Business Search Platform Powers Billions of Queries in WOS

The article outlines the background, design, challenges, and future roadmap of a business search platform within the Weimob Operating System, detailing its architecture, event ingestion, index building, and retrieval services that enable low‑cost, high‑performance search across multiple business domains.

Backend Architecture · Microservices · Scalability

0 likes · 9 min read

dbaplus Community

Nov 29, 2022 · Backend Development

How a Mistaken Delete in ElasticSearch Nearly Erased 17 Million Products – Key Lessons

A senior engineer accidentally issued a DELETE request against an Elasticsearch index holding 17 million product records, triggering a major data-loss incident. The article details the team's recovery strategies, the scaling challenges they faced, and the process improvements that followed, as guidance for backend developers.

Incident Response · Microservices · Scaling

0 likes · 14 min read

JD Retail Technology

Jul 19, 2022 · Backend Development

Design and Architecture of JD Retail Product Selection Platform

This article details the design and implementation of JD Retail’s product selection platform, covering its business background, core data retrieval capabilities, domain model, and system architecture, including frontend configurability, the backend query engine, ClickHouse indexing, and both offline and real-time data processing pipelines.

Big Data · System Architecture · data indexing

0 likes · 14 min read

Python Crawling & Data Mining

May 29, 2020 · Big Data

How to Connect Python to Elasticsearch for Efficient Data Crawling and Search

This guide explains how to install the Elasticsearch Python client, build a wrapper class for index management and CRUD operations, import data from MongoDB, use a Celery‑based crawler to harvest Baidu Baike content, and expose search functionality through Flask or other Python web frameworks.

Elasticsearch · Flask · MongoDB

0 likes · 7 min read

DevOps Cloud Academy

Jan 2, 2020 · Big Data

Introduction, Use Cases, Installation, and Basic Operations of Elasticsearch

This article introduces Elasticsearch as a distributed search and analytics engine, outlines its common application scenarios, provides step‑by‑step installation commands, explains core concepts such as documents and indices, and demonstrates basic indexing, retrieval, bulk processing, and aggregation operations.

Distributed · Elasticsearch · Log Analytics

0 likes · 4 min read

ITPUB

Sep 1, 2019 · Databases

How Elasticsearch Stores and Retrieves Data: Inside Lucene’s Write‑Refresh‑Flush‑Merge Cycle

This article explains the fundamental architecture of Elasticsearch and its underlying Lucene engine, detailing the data model, index hierarchy, and the step‑by‑step write, refresh, flush, and merge processes that enable near‑real‑time search and data durability.

data indexing · lucene · search engine

0 likes · 8 min read

21CTO

Jun 10, 2019 · Databases

Master Elasticsearch in Python: From Installation to Advanced Queries

This tutorial introduces Elasticsearch, explains its architecture and use cases, walks through installation, index creation, mapping, CRUD operations, and demonstrates how to integrate and query Elasticsearch from Python using both the REST API and the official client library.

Elasticsearch · NoSQL · Python

0 likes · 13 min read

MaGe Linux Operations

Jul 19, 2018 · Databases

Master Elasticsearch with Python: From Installation to Advanced Queries

This tutorial walks you through installing Elasticsearch, creating indices, inserting and updating documents, performing searches via REST API, and integrating Elasticsearch into Python applications using both raw HTTP requests and the official Python client, illustrated with practical examples and screenshots.

Elasticsearch · NoSQL · REST API

0 likes · 11 min read

21CTO

Oct 19, 2017 · Backend Development

Build Your Own Full-Text Search Engine with Elasticsearch: Step‑by‑Step Guide

This tutorial walks you through installing Elasticsearch, configuring Java and network settings, understanding core concepts like nodes, clusters, indices and documents, setting up Chinese analyzers, performing CRUD operations, and executing powerful full‑text queries using the Elasticsearch REST API.

Backend Development · Chinese Analyzer · Elasticsearch

0 likes · 13 min read

Weidian Tech Team

Feb 24, 2017 · Big Data

How We Built a Scalable Dump Index Architecture for 60M Users and 1.3B Products

Facing the challenges of searching across 60 million users and 1.3 billion products, Weidian’s engineering team designed a dump‑based indexing pipeline—Ergate—that consolidates, transforms, version‑controls, and monitors data from MySQL to HBase, enabling fast, flexible, and reliable search across massive datasets.

HBase · Platformization · data indexing

0 likes · 7 min read

Ctrip Technology

Sep 2, 2016 · Big Data

Why Druid? Architecture, Indexing, Use Cases, and Lessons Learned

This article introduces Druid as an open‑source, distributed column‑store OLAP engine, explains its architecture and indexing mechanisms, discusses real‑time and batch data ingestion for order analytics at Qunar, compares it with other engines, and shares practical tips and pitfalls.

Caravel · Druid · OLAP

0 likes · 8 min read

Qunar Tech Salon

Aug 17, 2016 · Big Data

Why Druid? Architecture, Indexing Methods, Use Cases, Pros & Cons, and Integration with Caravel

This article explains Druid’s purpose as a real‑time, distributed, column‑store OLAP engine, details its architecture and indexing techniques, discusses practical use cases and limitations, and shows how Caravel can complement Druid for visual analytics and detailed data access.

Caravel · Druid · OLAP

0 likes · 8 min read

Qunar Tech Salon

Feb 14, 2016 · Big Data

Accelerating Real‑Time Data Queries with Solr in Alibaba's Jushita Platform

This article explains how Alibaba's Jushita platform leverages Apache Solr with a wide‑table data model and a custom QParser plugin to achieve real‑time, multi‑dimensional buyer filtering that traditional relational databases cannot handle efficiently in big‑data scenarios.

Big Data · Real-time Query · Solr

0 likes · 10 min read
