
Mogujie's Search System Architecture and Online Request Flow

This article introduces Mogujie's end‑to‑end search system architecture, detailing its online and offline components such as Topn, ABTest, QR, fine‑ranking, search engine, UPS, and feature platforms, and then walks through a real‑world online request example to illustrate how queries are processed, rewritten, personalized, and finally ranked.

Architecture Digest

Mogujie's vision is to make half of humanity happier, and the goal of its search system is to let every female user easily find the products she wants. As a critical traffic entry point, the search system shapes how traffic is distributed among merchants and improves the user experience by ranking the most relevant, high‑quality items first.

Overall Architecture

The architecture is divided into online and offline parts. The online side handles real‑time requests and includes the business layer, placement layer, fine‑ranking layer, and engine layer, while the offline side covers algorithm training and data pipelines such as ACM data collection and dump processes.

Core Online Components

Topn serves as the unified entry point, abstracting data sources and routing requests to various search engines and ranking systems, while also providing ABTest routing and algorithm configuration.

ABTest implements multiple routing rules (UUID, hash, user tags) and supports layered experiments, offering real‑time effect statistics via a unified console.
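Hash-based routing with layered experiments can be sketched as follows. This is a minimal illustration, not Mogujie's implementation: the layer names, variant names, bucket count, and the `bucket` and `assign` helpers are all assumptions made for the example. Salting the hash with the layer name means a user's bucket in one layer does not determine their bucket in another, which is what lets layered experiments run side by side.

```python
import hashlib
from typing import Dict, List, Tuple

NUM_BUCKETS = 100  # assumed bucket count

def bucket(user_id: str, layer: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a user to a stable bucket inside one experiment layer."""
    # The layer name acts as a salt so each layer gets its own hash space.
    digest = hashlib.sha256(f"{layer}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

# Hypothetical layered configuration: each layer splits the buckets among variants.
LAYERS: Dict[str, List[Tuple[str, range]]] = {
    "qr": [("control", range(0, 50)), ("synonym_v2", range(50, 100))],
    "fine_rank": [("control", range(0, 80)), ("new_model", range(80, 100))],
}

def assign(user_id: str) -> Dict[str, str]:
    """Return the variant chosen for this user in every layer."""
    assignment: Dict[str, str] = {}
    for layer, variants in LAYERS.items():
        b = bucket(user_id, layer)
        for name, buckets in variants:
            if b in buckets:
                assignment[layer] = name
                break
    return assignment
```

Calling `assign("user-123")` repeatedly returns the same variants, so a user stays in the same experiment arm across requests.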

QR (Query Rewrite) expands user queries through tokenization, synonym expansion, category relevance prediction, brand weighting, and other plugins, allowing flexible algorithmic extensions.
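One way to picture the plugin design is a chain of functions that each enrich a shared query object. The sketch below is illustrative only: `RewrittenQuery`, the plugin functions, the toy synonym and brand dictionaries, and the boost value of 2.0 are assumptions, not QR's actual interfaces. Category relevance prediction is omitted because it would need a trained model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class RewrittenQuery:
    raw: str
    tokens: List[str] = field(default_factory=list)
    synonyms: Dict[str, List[str]] = field(default_factory=dict)
    brand_boosts: Dict[str, float] = field(default_factory=dict)

Plugin = Callable[[RewrittenQuery], RewrittenQuery]

# Toy dictionaries standing in for real synonym and brand data.
SYNONYMS = {"sneakers": ["trainers"]}
BRANDS = {"nike"}

def tokenize(q: RewrittenQuery) -> RewrittenQuery:
    # Whitespace split as a stand-in for a real tokenizer.
    q.tokens = q.raw.lower().split()
    return q

def expand_synonyms(q: RewrittenQuery) -> RewrittenQuery:
    for token in q.tokens:
        if token in SYNONYMS:
            q.synonyms[token] = SYNONYMS[token]
    return q

def weight_brands(q: RewrittenQuery) -> RewrittenQuery:
    for token in q.tokens:
        if token in BRANDS:
            q.brand_boosts[token] = 2.0  # assumed boost factor
    return q

def rewrite(raw: str, plugins: List[Plugin]) -> RewrittenQuery:
    """Run the raw query through each plugin in order."""
    q = RewrittenQuery(raw=raw)
    for plugin in plugins:
        q = plugin(q)
    return q

result = rewrite("Nike sneakers", [tokenize, expand_synonyms, weight_brands])
```

Because every plugin shares one signature, adding a new rewrite step only means appending another function to the list.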

Fine‑Ranking System performs personalized ranking using richer features and complex models, supporting frequent AB testing and dynamic configuration.

Search Engine is a high‑performance C++ engine built on the proprietary ZIndex framework, offering retrieval, filtering, statistics, and multi‑stage ranking with a plugin‑based architecture.

UPS (User Profile System) stores offline and real‑time user behavior data (clicks, adds to cart, purchases) to provide personalized signals for ranking.
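As a rough illustration of how offline and real-time behavior might be merged into a single personalization signal, the sketch below computes a per-category affinity score. The action weights, the real-time boost, and the `category_affinity` function are assumptions chosen for the example; the article does not describe how UPS actually scores behavior.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Assumed weights: stronger purchase intent counts for more.
ACTION_WEIGHTS = {"click": 1.0, "add_to_cart": 3.0, "purchase": 5.0}
REALTIME_BOOST = 2.0  # assumed multiplier so recent behavior counts more

Event = Tuple[str, str]  # (action, category)

def category_affinity(offline: Iterable[Event],
                      realtime: Iterable[Event]) -> Dict[str, float]:
    """Combine offline and real-time events into per-category scores."""
    scores: Counter = Counter()
    for action, category in offline:
        scores[category] += ACTION_WEIGHTS.get(action, 0.0)
    for action, category in realtime:
        scores[category] += REALTIME_BOOST * ACTION_WEIGHTS.get(action, 0.0)
    return dict(scores)

profile = category_affinity(
    offline=[("purchase", "shoes"), ("click", "dresses")],
    realtime=[("click", "shoes"), ("add_to_cart", "bags")],
)
```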

Engine Operations Platform manages engine instance lifecycle, index building, deployment, monitoring, and alerts, leveraging Docker for containerized deployments.

Algorithm Sorting Platform provides a visual backend for creating algorithm scenarios, models, ranking strategies, and evaluation before online deployment.

Dump System standardizes data flow from upstream sources to downstream storage, supporting incremental, full, and mini‑full data syncs.

Feature Management Platform centralizes feature definition, generation, storage, publishing, validation, and monitoring for algorithm developers.

ACM Data Collection System captures, cleans, and aggregates user behavior logs to supply reliable data for model training and real‑time reporting.

Online Search Flow Example

When a user types "nike" in the Mogujie app, the request follows these steps:

Topn receives the query, resolves the ABTest and routing configurations, and decides whether to invoke UPS, which engine to use, and whether fine‑ranking is needed.

QR rewrites the query, adding brand weighting, category relevance, synonym expansion, and tokenization as applicable.

UPS retrieves the user's historical and real‑time behavior data to supply personalization signals.

Search Engine combines the rewritten query with the personalization data, then executes recall and coarse ranking using the configured plugins (e.g., LTRRanker, brand weighting, anti‑spam).

Fine‑Ranking System applies personalized re‑ranking with advanced models, then returns the top‑K results to the frontend.
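The steps above can be sketched as a small orchestration function. Everything here is a toy stand-in under stated assumptions: the in-memory `CATALOG`, the stubbed `fetch_profile`, the scoring formulas, and the `use_ups` and `use_fine_rank` flags (which play the role of the decisions Topn makes) are hypothetical. For simplicity the profile is used only in fine-ranking, whereas the real engine also receives personalization data during recall and coarse ranking.

```python
from typing import Any, Dict, List

Item = Dict[str, Any]

# Toy catalog standing in for the search engine's index.
CATALOG: List[Item] = [
    {"id": 1, "title": "nike running shoes", "category": "shoes", "quality": 0.9},
    {"id": 2, "title": "nike sports bag", "category": "bags", "quality": 0.6},
    {"id": 3, "title": "floral dress", "category": "dresses", "quality": 0.8},
]

def rewrite_query(query: str) -> List[str]:
    # Stand-in for QR: lowercase and tokenize only.
    return query.lower().split()

def fetch_profile(user_id: str) -> Dict[str, float]:
    # Stand-in for UPS: a fixed category affinity for every user.
    return {"bags": 1.0}

def recall_and_coarse_rank(tokens: List[str], catalog: List[Item]) -> List[Item]:
    # Recall items whose title contains any query token, then sort by quality.
    hits = [it for it in catalog if any(t in it["title"].split() for t in tokens)]
    return sorted(hits, key=lambda it: it["quality"], reverse=True)

def fine_rank(candidates: List[Item], profile: Dict[str, float], top_k: int) -> List[Item]:
    # Re-rank by quality plus the user's affinity for the item's category.
    def score(it: Item) -> float:
        return it["quality"] + profile.get(it["category"], 0.0)
    return sorted(candidates, key=score, reverse=True)[:top_k]

def search(query: str, user_id: str, use_ups: bool = True,
           use_fine_rank: bool = True, top_k: int = 10) -> List[Item]:
    tokens = rewrite_query(query)                            # QR
    profile = fetch_profile(user_id) if use_ups else {}      # UPS
    candidates = recall_and_coarse_rank(tokens, CATALOG)     # engine
    if use_fine_rank:
        return fine_rank(candidates, profile, top_k)         # fine-ranking
    return candidates[:top_k]

personalized = search("nike", user_id="u42")
baseline = search("nike", user_id="u42", use_fine_rank=False)
```

With this toy data, the baseline order follows item quality alone, while the personalized order moves the bag ahead of the shoes because the stubbed profile favors the bags category.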

Conclusion

This article presented Mogujie's search system architecture and walked through an online request, showing how each component contributes to efficient, personalized product search. The architecture will continue to evolve to better support business and algorithmic needs.
