Mobile Wi‑Fi Identification for Enhanced Network Positioning Using Machine Learning
By replacing rule‑based pipelines with a random‑forest model trained through active learning on clustering, signal, association, IP, and temporal features, Gaode accurately identifies mobile, cloned, and moved Wi‑Fi, cutting large‑error network‑positioning cases by about 18% and improving overall positioning precision.
With the rapid development of the location industry over the past decade, positioning capabilities have evolved from low‑precision to high‑precision and from specific scenarios to ubiquitous positioning. Positioning technologies now include GNSS, dead‑reckoning (DR), map‑matching (MM), visual positioning, and network positioning.
Network positioning determines a device’s location by scanning surrounding Wi‑Fi and cellular base‑station signals. It is a strong complement to GNSS, providing fast location estimates when GNSS signals are unavailable or weak. For this reason, Gaode’s network positioning is embedded in many smartphones and apps across travel, social, O2O, P2P, tourism, news, weather, and other domains.
To build a reliable network positioning system, billions of Wi‑Fi and base‑station records must be mined for type, location, and fingerprint information. Historically, this mining relied on handcrafted rule‑based pipelines, which suffered from low precision and recall. A methodological shift is required to improve the system.
Network positioning consists of two stages: offline training and online positioning. Offline training collects GPS‑tagged Wi‑Fi and base‑station data (referred to as APs) and clusters them to produce two data products: an AP database (containing identifiers, coordinates, and other metadata) and a fingerprint database (containing signal‑strength distributions, scan frequencies, and optionally neural‑network‑derived features). Online positioning uses the scanned Wi‑Fi/base‑station list, matches it against the AP and fingerprint databases, and computes a real‑time location.
A typical AP entry records little beyond the AP’s identifier and estimated coordinates, while a fingerprint entry stores detailed signal characteristics such as RSSI distributions and scan frequencies. The AP database often models each AP as a point‑circle, an idealized representation that ignores real‑world signal blockage and reflection and is therefore suitable only for coarse positioning. In contrast, fingerprint data captures fine‑grained spatial distributions, enabling precise positioning.
Errors arise when Wi‑Fi devices are mobile, cloned, or moved. Mobile Wi‑Fi (e.g., phone hotspots, 4G routers, Wi‑Fi on buses and trains) changes position frequently, leading to large positioning errors if treated as a fixed anchor. Cloned Wi‑Fi (different devices sharing the same MAC address) and moved Wi‑Fi (devices that have changed location but whose database entries are stale) also cause inaccuracies. These “bad cases” degrade user experience and must be identified and flagged in the AP database.
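To make the effect of a stale or mobile anchor concrete, the sketch below computes a coarse position as an RSSI-weighted centroid of known AP coordinates, with and without flagged APs. The weighted centroid, the `ApDb` schema, the `mobile` flag name, and the weighting formula are all illustrative assumptions, not Gaode's positioning algorithm.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical AP-database schema: MAC -> {"xy": (x, y) in metres, "mobile": bool}.
ApDb = Dict[str, dict]


def coarse_position(
    scan: List[Tuple[str, float]],  # (MAC, RSSI in dBm) pairs from one scan
    ap_db: ApDb,
    skip_flagged: bool = True,
) -> Optional[Tuple[float, float]]:
    """RSSI-weighted centroid of the known APs in a scan (illustrative only)."""
    wx = wy = wsum = 0.0
    for mac, rssi in scan:
        ap = ap_db.get(mac)
        if ap is None or (skip_flagged and ap["mobile"]):
            continue  # unknown AP, or flagged as mobile and therefore not an anchor
        weight = 10 ** (rssi / 20.0)  # stronger signal gets a larger weight (assumed form)
        wx += weight * ap["xy"][0]
        wy += weight * ap["xy"][1]
        wsum += weight
    if wsum == 0.0:
        return None
    return (wx / wsum, wy / wsum)


ap_db = {
    "a": {"xy": (0.0, 0.0), "mobile": False},
    "b": {"xy": (100.0, 0.0), "mobile": False},
    "hot": {"xy": (50000.0, 0.0), "mobile": True},  # hotspot last seen 50 km away
}
scan = [("a", -60.0), ("b", -60.0), ("hot", -40.0)]
with_flag = coarse_position(scan, ap_db, skip_flagged=True)
without_flag = coarse_position(scan, ap_db, skip_flagged=False)
```

In this toy scan the strong hotspot signal dominates the unflagged estimate and drags it tens of kilometres away, while the flagged variant stays between the two fixed APs.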
Manual rule‑based classification of such Wi‑Fi types yields limited precision and recall, prompting the adoption of supervised learning for mobile Wi‑Fi detection.
Sample Extraction (4.1) – Because the AP database contains an enormous number of Wi‑Fi records, random sampling for manual labeling would produce many easy‑to‑label examples and few informative ones. To obtain high‑quality labeled data at low cost, an active‑learning loop is employed: ambiguous samples are iteratively selected, labeled, and used to retrain the model, gradually stabilizing performance.
Ambiguous samples are defined as those where (1) the current model’s prediction differs from the previous model’s prediction, (2) the predicted probability is close to 0.5, or (3) the prediction flips repeatedly across training cycles.
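The three criteria can be sketched as a simple filter over each Wi‑Fi's prediction history. The function name, the `margin` of 0.1 around the boundary, and the "at least two flips" rule for fluctuation are assumptions chosen for illustration; the source does not give concrete thresholds.

```python
from typing import Dict, List


def select_ambiguous(
    history: Dict[str, List[float]],  # Wi-Fi id -> P(mobile) per training cycle, oldest first
    margin: float = 0.1,              # assumed width of the "around 0.5" band
    threshold: float = 0.5,
    min_flips: int = 2,               # assumed definition of "fluctuates"
) -> List[str]:
    """Return ids of samples worth sending to manual labelling."""
    picked = []
    for ap_id, probs in history.items():
        if not probs:
            continue
        labels = [p >= threshold for p in probs]
        # (1) the latest model disagrees with the previous one
        flipped_last = len(labels) >= 2 and labels[-1] != labels[-2]
        # (2) the latest probability sits near the decision boundary
        near_boundary = abs(probs[-1] - threshold) <= margin
        # (3) the predicted label changed several times over the cycles
        flips = sum(a != b for a, b in zip(labels, labels[1:]))
        if flipped_last or near_boundary or flips >= min_flips:
            picked.append(ap_id)
    return picked


history = {
    "a": [0.90, 0.95, 0.97],        # stable and confident: skipped
    "b": [0.20, 0.80],              # flipped in the last cycle
    "c": [0.10, 0.45],              # close to 0.5
    "d": [0.90, 0.20, 0.90, 0.95],  # flipped twice earlier on
}
to_label = select_ambiguous(history)
```

Samples selected this way are labelled, added to the training set, and the model is retrained; the loop stops once the selected set becomes small or stable.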
Feature Extraction (4.2)
Initial features include clustering‑related metrics such as ratioX (the fraction of location points within X meters of the cluster center) and areaSquare (the area of the bounding rectangle that encloses the points). A table of feature names and descriptions is provided in the source.
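A minimal sketch of these two features follows. It assumes the location points have already been projected to a local planar frame in metres and that the "cluster center" is the plain centroid; the source specifies neither, and the function names are illustrative.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in metres, local planar frame (assumption)


def centroid(points: List[Point]) -> Point:
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def ratio_x(points: List[Point], radius_m: float) -> float:
    """Fraction of points lying within radius_m of the centroid (ratioX)."""
    center = centroid(points)
    inside = sum(1 for p in points if math.dist(p, center) <= radius_m)
    return inside / len(points)


def area_square(points: List[Point]) -> float:
    """Area in square metres of the axis-aligned bounding rectangle (areaSquare)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (105.0, 105.0)]
r50 = ratio_x(points, 50.0)   # four of the five points are within 50 m of the centroid
area = area_square(points)    # 105 m x 105 m rectangle
```

A fixed AP would typically show ratioX close to 1 for a modest X and a small areaSquare, while a single outlier such as the last point above inflates the rectangle quickly.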
During model iteration, confusion between mobile Wi‑Fi and clone/moved Wi‑Fi was observed because both exhibit dispersed location points. To address this, a clustering‑based approach groups locally dense points into clusters, computes dispersion metrics for each cluster, and aggregates these metrics to obtain a global dispersion score.
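One way to picture the clustering-based score: a moved or cloned Wi‑Fi would be expected to form a few tight clusters far apart, whereas a mobile one leaves points that rarely form dense clusters at all. The sketch below uses a simple distance-threshold grouping and an RMS radius per cluster; the clustering method, `eps_m`, `min_points`, and the point-weighted aggregation are assumptions, since the source does not name the algorithm or the metrics.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) in metres, local planar frame (assumption)


def dense_clusters(points: List[Point], eps_m: float = 50.0,
                   min_points: int = 3) -> List[List[Point]]:
    """Connected components of the 'within eps_m' graph with at least min_points members."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        stack, members = [seed], [seed]
        while stack:
            i = stack.pop()
            near = [j for j in unvisited if math.dist(points[i], points[j]) <= eps_m]
            for j in near:
                unvisited.remove(j)
            stack.extend(near)
            members.extend(near)
        if len(members) >= min_points:
            clusters.append([points[k] for k in members])
    return clusters


def rms_radius(cluster: List[Point]) -> float:
    """Root-mean-square distance of cluster points from their centroid."""
    cx = sum(x for x, _ in cluster) / len(cluster)
    cy = sum(y for _, y in cluster) / len(cluster)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in cluster) / len(cluster))


def global_dispersion(points: List[Point], eps_m: float = 50.0,
                      min_points: int = 3) -> Dict[str, Optional[float]]:
    clusters = dense_clusters(points, eps_m, min_points)
    clustered = sum(len(c) for c in clusters)
    if clustered == 0:
        return {"n_clusters": 0, "clustered_ratio": 0.0, "dispersion_m": None}
    # Point-weighted mean of the per-cluster dispersion.
    weighted = sum(len(c) * rms_radius(c) for c in clusters) / clustered
    return {"n_clusters": len(clusters),
            "clustered_ratio": clustered / len(points),
            "dispersion_m": weighted}


moved = [(0, 0), (3, 0), (0, 4), (1000, 1000), (1003, 1000), (1000, 1004)]
mobile = [(i * 200.0, 0.0) for i in range(6)]
moved_score = global_dispersion(moved)
mobile_score = global_dispersion(mobile)
```

In this toy data the moved Wi‑Fi yields two clusters with a dispersion of a few metres, while the mobile trace yields no dense cluster, even though both look widely spread when measured globally.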
Additional feature groups were engineered:
Signal‑strength features: after normalizing for device differences, signal‑strength distributions differ between mobile and static Wi‑Fi.
Association features: each Wi‑Fi scan creates pairwise “neighbor” relationships among observed APs; the density of such associations reflects mobility.
IP features: mobile Wi‑Fi typically routes through cellular networks, while static Wi‑Fi uses fixed‑line broadband.
Time features: static Wi‑Fi is usually powered continuously, whereas mobile hotspots appear only intermittently.
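The association and time feature groups can be sketched from raw scan logs as follows. The record layout `(day, [MACs])`, the feature names, and the specific statistics (distinct neighbours, mean co-occurrence per neighbour, fraction of active days) are hypothetical choices for illustration, not the features listed in the source.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Scan = Tuple[int, List[str]]  # (day index, MACs seen together in one scan), assumed layout


def association_and_time_features(scans: List[Scan], n_days: int) -> Dict[str, dict]:
    neighbours = defaultdict(Counter)  # mac -> Counter of co-observed MACs
    active_days = defaultdict(set)     # mac -> set of days on which it was seen
    for day, macs in scans:
        seen = set(macs)
        for mac in seen:
            active_days[mac].add(day)
            for other in seen:
                if other != mac:
                    neighbours[mac][other] += 1  # one pairwise "neighbor" relation
    features = {}
    for mac, days in active_days.items():
        n_neigh = len(neighbours[mac])
        mean_co = sum(neighbours[mac].values()) / n_neigh if n_neigh else 0.0
        features[mac] = {
            "distinct_neighbours": n_neigh,        # association feature
            "mean_cooccurrence": mean_co,          # how often each neighbour recurs
            "active_day_ratio": len(days) / n_days,  # time feature
        }
    return features


scans = [
    (0, ["home", "nb1", "nb2"]), (1, ["home", "nb1", "nb2"]),
    (2, ["home", "nb1"]), (3, ["home", "nb2"]),
    (0, ["hot", "a", "b"]), (2, ["hot", "c", "d"]),
]
feats = association_and_time_features(scans, n_days=4)
```

Here the fixed router keeps meeting the same two neighbours every day, while the hotspot meets four different neighbours once each and is visible on only half of the days.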
For Wi‑Fi entries with insufficient positioning data (“weak‑information Wi‑Fi”), auxiliary signals such as SSID keywords (e.g., “iPhone”, “personal hotspot”, “oppo”) and MAC‑address prefixes (indicating manufacturer) are aggregated. Median and standard‑deviation statistics of top‑N features are computed per SSID/MAC‑prefix group, providing collective cues to infer individual Wi‑Fi attributes.
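The group-level aggregation might look like the sketch below, which keys on the first three octets of the MAC address and checks SSIDs against the keywords quoted above. The record fields, the `mobile_score` feature name, and the use of the population standard deviation are assumptions for illustration.

```python
import statistics
from collections import defaultdict
from typing import Dict, List

HOTSPOT_KEYWORDS = ("iphone", "personal hotspot", "oppo")  # examples quoted in the text


def mac_prefix(mac: str) -> str:
    """First three octets of the MAC address, which identify the manufacturer."""
    return mac.upper().replace("-", ":")[:8]


def ssid_has_hotspot_keyword(ssid: str) -> bool:
    lowered = ssid.lower()
    return any(keyword in lowered for keyword in HOTSPOT_KEYWORDS)


def group_stats(records: List[dict], feature: str) -> Dict[str, dict]:
    """Median and standard deviation of one feature per MAC-prefix group."""
    groups = defaultdict(list)
    for record in records:
        groups[mac_prefix(record["mac"])].append(record[feature])
    return {
        prefix: {
            "median": statistics.median(values),
            "std": statistics.pstdev(values),
            "n": len(values),
        }
        for prefix, values in groups.items()
    }


# "mobile_score" is a hypothetical per-Wi-Fi feature from well-observed entries.
records = [
    {"mac": "aa:bb:cc:00:00:01", "mobile_score": 0.9},
    {"mac": "aa:bb:cc:00:00:02", "mobile_score": 0.7},
    {"mac": "AA-BB-CC-00-00-03", "mobile_score": 0.8},
    {"mac": "11:22:33:00:00:01", "mobile_score": 0.1},
]
stats = group_stats(records, "mobile_score")
```

A weak‑information Wi‑Fi whose MAC starts with `AA:BB:CC` could then take the group median and spread as extra input features, alongside a flag from `ssid_has_hotspot_keyword`.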
Application Scenarios (5) – Beyond improving overall network positioning accuracy, mobile Wi‑Fi detection enables hotspot identification, indoor/outdoor discrimination, building‑level and POI‑level positioning, and can be used by video‑streaming apps to decide whether to auto‑play or cache content based on the current network type.
Conclusion (6) – A random‑forest classifier, after feature selection and hyper‑parameter tuning, achieved >99.8% precision for mobile Wi‑Fi detection. Consequently, Gaode’s network positioning accuracy improved significantly, reducing large‑error bad cases by approximately 18%. Network positioning remains a low‑power complementary positioning method, especially valuable in GNSS‑denied environments such as subways and indoor spaces, and its capabilities are expected to further improve with the rollout of 5G technologies.
Amap Tech
Official Amap technology account showcasing all of Amap's technical innovations.