Building a Hotel Information Graph with Nebula Graph at Ctrip: Architecture, Challenges, and Solutions
This article details how Ctrip’s hotel tech team built an information‑graph platform using Nebula Graph to link user scenarios with hotel data, addressing challenges of scene‑information mapping, data richness, and operational efficiency through a unified backend architecture, deployment strategies, and schema design.
The Ctrip hotel technology team aims to provide a seamless experience by matching each user query with the appropriate scene and product, but faces three major problems: missing scene‑information mapping, insufficient data richness, and low operational efficiency.
In the current system, hotel listings and detail pages are customized for different user groups and scenarios (e.g., ski vacations, family trips), but the front‑end displays are fragmented and lack a unified way to associate scene tags, rankings, and images with the underlying data.
The Information‑Graph project solves these issues through five modules: unified intent recognition, display‑logic sinking, relation matching, information mining, and data‑source enrichment. By abstracting scenes and data as vertices and edges in a graph database, the project makes relationship maintenance intuitive and efficient.
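The vertex-and-edge abstraction can be sketched with a toy in-memory graph. This is only an illustration, not the team's Nebula Graph schema: the class `InfoGraph`, the edge names (`has_tag`, `has_ranking`, `applies_to`), and all ids are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class InfoGraph:
    """Toy in-memory graph: vertices are string ids, edges are (type, dst) pairs."""
    edges: dict = field(default_factory=lambda: defaultdict(set))

    def add_edge(self, src: str, edge_type: str, dst: str) -> None:
        self.edges[src].add((edge_type, dst))

    def neighbors(self, src: str, edge_type: str) -> set:
        return {dst for etype, dst in self.edges[src] if etype == edge_type}


graph = InfoGraph()
# Hypothetical ids and edge names, for illustration only.
graph.add_edge("scene:ski", "has_tag", "tag:near_ski_resort")
graph.add_edge("scene:ski", "has_ranking", "ranking:top_ski_hotels")
graph.add_edge("tag:near_ski_resort", "applies_to", "hotel:1001")
graph.add_edge("tag:near_ski_resort", "applies_to", "hotel:1002")


def hotels_for_scene(g: InfoGraph, scene: str) -> set:
    """Two-hop traversal: scene -> tags -> hotels."""
    hotels = set()
    for tag in g.neighbors(scene, "has_tag"):
        hotels |= g.neighbors(tag, "applies_to")
    return hotels
```

Attaching a new tag or ranking to a scene is then a single edge insert, which is the sense in which relationship maintenance becomes simpler than editing per-page display logic.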
Front-end display-logic sinking, that is, moving display logic out of the clients and into the back end, consolidates the logic of more than ten display positions (quick filters, short tags, rankings, etc.) into a single back-end service. Each position can then be linked to a scene with a defined coupling level (S, A, or B). Relation matching then establishes the connections between scenes and the corresponding data sources.
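One way to picture the position-to-scene linkage is a small lookup table keyed by coupling level. The sketch below is hypothetical throughout: the source only names the levels S, A, and B, so the ordering (S strongest, B weakest), the `Binding` type, and the sample bindings are assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum


class Coupling(IntEnum):
    """Coupling level between a display position and a scene.

    Assumption: S is the strongest and B the weakest.
    """
    B = 1
    A = 2
    S = 3


@dataclass(frozen=True)
class Binding:
    position: str
    scene: str
    level: Coupling


# Hypothetical bindings, for illustration only.
BINDINGS = [
    Binding("quick_filter", "ski", Coupling.S),
    Binding("short_tag", "ski", Coupling.A),
    Binding("ranking", "ski", Coupling.B),
    Binding("short_tag", "family", Coupling.S),
]


def positions_for(scene: str, min_level: Coupling = Coupling.B) -> list:
    """Return positions bound to `scene` at `min_level` or stronger, strongest first."""
    matches = [b for b in BINDINGS if b.scene == scene and b.level >= min_level]
    matches.sort(key=lambda b: b.level, reverse=True)
    return [b.position for b in matches]
```

With a table like this in one back-end service, a front-end position asks which scenes it should render instead of carrying its own display rules.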
The technical backbone is Nebula Graph, which consists of three services: graph (query processing), meta (schema and cluster metadata), and storage (data). Data consistency is ensured by the Raft protocol, with each storage shard forming its own Raft group. The cluster is deployed across three data centers in a 1n:2n:2n split for high availability. Future plans include an independent cluster per site to support blue-green deployments and reduce latency.
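Because Raft needs a majority of replicas to make progress, replica placement across data centers decides whether a shard survives a site outage. The sketch below checks that arithmetic for a single Raft group. The replica counts are assumptions chosen for illustration; the source gives only the 1n:2n:2n split, not the replica factor or placement policy.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a Raft quorum."""
    return replicas // 2 + 1


def survives_single_dc_loss(placement: dict) -> bool:
    """True if the group keeps a quorum after losing any one data center.

    `placement` maps a data-center name to the number of replicas it hosts.
    """
    total = sum(placement.values())
    quorum = majority(total)
    return all(total - lost >= quorum for lost in placement.values())


# Assumed placements for one Raft group (not taken from the source).
five_replicas = {"dc1": 1, "dc2": 2, "dc3": 2}
three_replicas_skewed = {"dc1": 0, "dc2": 1, "dc3": 2}

print(survives_single_dc_loss(five_replicas))          # True: 3 of 5 always remain
print(survives_single_dc_loss(three_replicas_skewed))  # False: losing dc3 leaves 1 of 3
```

The second case shows why placement matters as much as the number of sites: three data centers alone do not guarantee availability if two of a shard's three replicas share a site.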
Schema design models four major data blocks (tag information, basic information, UGC information, and GEO information) as vertices, with edges linking them to hotels and semantic tags. Hotspots such as high-degree vertices are mitigated by increasing the number of shards, using prefix-match Bloom filters, and enlarging block caches.
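The effect of the shard count on load spread can be illustrated with a simple hash-partitioning sketch. It assumes vertices are assigned to shards by hashing the vertex id modulo the shard count; `zlib.crc32` is a stand-in here, not Nebula Graph's actual hash function, and the ids are made up.

```python
import zlib
from collections import Counter


def partition_of(vertex_id: str, num_partitions: int) -> int:
    """Map a vertex id to a partition with a stable hash (crc32 as a stand-in)."""
    return zlib.crc32(vertex_id.encode("utf-8")) % num_partitions


def load_per_partition(vertex_ids, num_partitions: int) -> Counter:
    """Count how many vertices land in each partition."""
    return Counter(partition_of(v, num_partitions) for v in vertex_ids)


# Hypothetical vertex ids.
hotel_ids = [f"hotel:{i}" for i in range(1000)]
for parts in (4, 64):
    load = load_per_partition(hotel_ids, parts)
    print(parts, "partitions -> busiest holds", max(load.values()), "vertices")
```

More shards lower the chance that several hot vertices share one shard, but a single high-degree vertex still lives in one place, which is presumably why the Bloom-filter and block-cache tuning is applied alongside.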
Performance tests on a cluster of 6 graph, 5 meta, and 10 storage nodes (over 2 million vertices and 200 million edges) show 99th-percentile query latency of around 20 ms at 10k QPS and write latency of around 40 ms for 100k writes, meeting business requirements.
In summary, the project successfully delivers scene‑aware front‑end content, improves data richness, and enhances operational efficiency. Ongoing work focuses on optimizing Nebula deployment, enriching data sources, and expanding scenario coverage.
The team is hiring for backend, front‑end, mobile, data, and algorithm positions. Interested candidates can send their resumes to [email protected] with the subject format "[Name]‑[Ctrip Hotel R&D]‑[Position]".
Ctrip Technology
Official Ctrip Technology account, sharing and discussing growth.