Mastering Approximate Top‑K: Scalable Hotspot Detection for Go Backends

When a small fraction of requests overwhelms a system, it is crucial to know which endpoints, keys, or users are causing the bottleneck. This article explains why full‑count sorting fails at scale, introduces efficient approximate Top‑K techniques such as a fixed‑size min‑heap and Count‑Min Sketch, and provides production‑ready Go implementations with practical usage patterns and performance benchmarks.

Data Structures · Golang · Monitoring

0 likes · 15 min read

NiuNiu MaTe

Dec 31, 2021 · Fundamentals

Master TopK: From Simple Sorts to Heap and QuickSelect Solutions

This article explains the TopK problem, compares sorting‑based O(nk) approaches with heap‑based O(n + k log n) and quick‑select (average‑case O(n)) methods, and provides complete Java implementations of each technique to help readers handle interview questions about finding the K largest elements.

Heap · Sorting · quickselect

0 likes · 13 min read

Alimama Tech

Sep 8, 2021 · Artificial Intelligence

Engineering Optimizations for Large‑Scale Advertising Recall Models: Full‑Cache Scoring and Index Flattening

Alimama's advertising platform modernized its Tree‑based Deep Model along two lines: a dual‑tower, full‑library DNN with aggressive pre‑filtering and custom GPU TopK kernels, and a flattened‑tree model that retains beam search with multi‑head attention. Memory‑aware tricks such as attention swapping, softmax approximation, tiled‑matmul splitting, TensorCore batching, INT8 quantization, and cache‑resident ad vectors enable multi‑fold latency reductions with minimal recall loss.

Beam Search · GPU Acceleration · Recommendation Systems

0 likes · 15 min read
