
Puck: Baidu’s Open‑Source High‑Performance ANN Retrieval Engine

Puck is Baidu's open-source Approximate Nearest Neighbor (ANN) engine, built on two Baidu-developed algorithms, Puck and Tinker. It targets high recall, accuracy, and throughput on datasets from small to trillion-scale, and it outperforms competing solutions in benchmarks, including first place in BIGANN 2021. It offers a simple, extensible API, has proven reliable in dozens of Baidu services, and is released under the Apache 2.0 license to encourage community contributions.

Baidu Geek Talk

Puck is Baidu’s self‑developed open‑source Approximate Nearest Neighbor (ANN) retrieval engine, named after the agile DOTA hero. It is designed to achieve high recall, high accuracy, and high throughput across small, medium, and large data sets.

ANN (Approximate Nearest Neighbor) search aims to find the top-K vectors closest to a query in a massive vector space, trading a small loss in retrieval quality for a large reduction in computational cost. Since the breakthrough of AlexNet in 2012 and the introduction of the Transformer in 2017, ANN has become a foundational technology for search, recommendation, and many other AI-driven applications.
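To make the top-K problem concrete, here is a minimal sketch of the exact (brute-force) search that ANN engines approximate. It is not Puck's algorithm or API; the function name `exact_top_k`, the Euclidean metric, and the random data are assumptions chosen for illustration.

```python
import heapq
import math
import random


def exact_top_k(query, vectors, k):
    """Return indices of the k vectors closest to query (Euclidean distance)."""
    # Brute force scores every vector, so cost grows linearly with dataset size.
    # ANN engines avoid this full scan at the price of occasionally missing a neighbour.
    scored = ((math.dist(query, v), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nsmallest(k, scored)]


# Hypothetical toy data: 1,000 random 8-dimensional vectors.
random.seed(0)
dim = 8
vectors = [[random.random() for _ in range(dim)] for _ in range(1000)]
query = [random.random() for _ in range(dim)]

top5 = exact_top_k(query, vectors, 5)
print(top5)
```

The exact result doubles as ground truth: an approximate index is judged by how many of these true neighbours it returns and how much faster it does so.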

The Puck project comprises two Baidu-invented algorithms, Puck and Tinker. Open-sourced within Baidu in 2019, it now supports dozens of Baidu product lines, serving trillion-scale indexes and massive query volumes.

Benchmark tests on datasets ranging from ten million to one billion vectors show clear performance advantages over competing solutions. In the 2021 BIGANN competition, Puck took first place in all four tracks it entered.
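ANN benchmarks of this kind are usually reported as recall against throughput (queries per second, QPS). The sketch below shows one common way to compute both; the helper names `recall_at_k` and `measure_qps` are hypothetical and are not taken from Puck or from the BIGANN tooling.

```python
import time


def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbours present in the approximate top-k."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k


def measure_qps(search_fn, queries):
    """Run search_fn once per query and return queries per second."""
    start = time.perf_counter()
    for q in queries:
        search_fn(q)
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed if elapsed > 0 else float("inf")


# Illustrative ids only: the approximate result shares 3 of the 5 true neighbours.
exact = [3, 7, 1, 9, 4]
approx = [3, 1, 8, 9, 2]
r = recall_at_k(approx, exact, 5)
print(r)  # 0.6
```

Comparing engines fairly means holding one quantity fixed, for example reporting QPS at a given recall level, since either number can be improved at the expense of the other.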

Key advantages include:

Ease of use – a simple API with minimal required parameters, most of which have sensible defaults.

Extensibility – a fully self‑designed index structure that supports a variety of functional extensions and modular redesign.

High performance – consistently superior QPS and recall on benchmark datasets.

Reliability – proven stability in large‑scale production across more than thirty Baidu services.

Additional functional extensions provide real‑time lock‑free insertion, conditional query filtering during index traversal, distributed index construction via map‑reduce, and adaptive parameter tuning that works well out‑of‑the‑box.
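The idea behind filtering during traversal can be sketched with a plain linear scan: candidates that fail the condition are rejected as they are visited, so the top-K slots are filled only with matching items. This is a generic illustration under that assumption, not Puck's index structure or API; `filtered_top_k`, the `(vector, attrs)` layout, and the `lang` attribute are hypothetical.

```python
import heapq
import math


def filtered_top_k(query, items, k, predicate):
    """Scan (vector, attrs) pairs and keep the k nearest that satisfy predicate."""
    heap = []  # max-heap via negated distance; holds the best k matches so far
    for idx, (vec, attrs) in enumerate(items):
        if not predicate(attrs):
            continue  # rejected during traversal, never competes for a slot
        d = math.dist(query, vec)
        if len(heap) < k:
            heapq.heappush(heap, (-d, idx))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, idx))
    # Sort so the nearest match comes first.
    return [idx for _, idx in sorted(heap, reverse=True)]


items = [
    ([0.0, 0.0], {"lang": "en"}),
    ([0.1, 0.0], {"lang": "zh"}),
    ([0.2, 0.0], {"lang": "en"}),
    ([5.0, 5.0], {"lang": "en"}),
]
result = filtered_top_k([0.0, 0.0], items, 2, lambda a: a["lang"] == "en")
print(result)  # [0, 2]
```

Filtering after an unfiltered top-K search would instead return only item 0 here, because the non-matching item 1 would have taken one of the two slots.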

Puck is released under the Apache 2.0 license, encouraging community collaboration and knowledge sharing. More details, benchmark results, and the source code are available at the GitHub repository (https://github.com/baidu/puck) and the BIGANN benchmark page.

The community is invited to join the QQ group for support, contribute to the project, and help shape the future of open‑source ANN retrieval.
