
Noise Reduction Techniques for Gaming Voice Chat in Internet Cafés Using Tencent Cloud GME

Tencent Cloud Gaming Multimedia Engine (GME) tackles the intense, non‑stationary background noise of internet cafés by employing a specialized voice‑activity detection algorithm combined with mean removal, low‑pass and numerical filtering, enabling it to isolate a primary speaker’s voice and suppress surrounding chatter, clicks, and music for clear in‑game communication.

Tencent Cloud Developer

Regular visitors to gaming forums will have seen players complain about teaming up with someone who is sitting in an internet café: the background noise is overwhelming. During voice chat, if one teammate is in a café, the other players' headphones fill with disruptive sounds, which degrades the gaming experience and can even hurt the whole team's performance. In this scenario, noise reduction is an essential step toward a usable voice channel.

Noise reduction in an internet café is usually harder than in ordinary noisy environments. The noise sources are diverse: conversations and shouting from many people, loud mouse clicks and keystrokes, chairs and tables being moved, and sometimes background music or public announcements. Seats are closely spaced, so every player has close neighbors, which makes the mutual interference especially intrusive.

Removing such a mix of noise is not simple. Café noise is almost always non‑stationary, so traditional noise‑suppression methods, which generally assume a noise floor that changes slowly, are not effective. Tencent Cloud Gaming Multimedia Engine (GME) offers a complete noise‑reduction solution for the café scenario that minimizes the impact of noise on voice communication even in such a complex environment.

How to achieve noise reduction in a complex internet‑café environment?

The requirement for noise reduction in a noisy café is threefold: while a teammate is silent, the listener should hear nothing; while the teammate speaks, the listener should hear only that teammate's clear voice; and as soon as the teammate stops speaking, the channel should go silent again.

This problem can be abstracted as handling the speech of a single primary speaker in a noisy environment. To meet this requirement, a voice activity detection (VAD) algorithm is needed that rejects every sound except the primary speaker. Such a VAD differs from conventional speech detection because it must also reject speech from non‑primary speakers; otherwise, nearby voices or distant, noisy speech would still be transmitted to the listener.
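The requirement amounts to a hard gate driven by the VAD decision. The sketch below is only an illustration under that assumption: `gate_frames` and its per-frame boolean input are hypothetical names, not part of the GME API, and the decisions themselves would come from the primary-speaker VAD.

```python
def gate_frames(frames, is_primary_speech):
    """Pass frames flagged as primary-speaker speech; replace all others with silence.

    frames: list of audio frames (each a list of samples).
    is_primary_speech: one boolean per frame (hypothetical VAD output).
    """
    gated = []
    for frame, keep in zip(frames, is_primary_speech):
        # No hangover is applied here: the channel goes silent as soon as
        # the VAD stops reporting the primary speaker.
        gated.append(list(frame) if keep else [0] * len(frame))
    return gated
```

For example, `gate_frames([[3, -2], [5, 1]], [True, False])` keeps the first frame and mutes the second.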

GME provides such a VAD algorithm. The process flow is shown below:

When judging the nature of a sound, the algorithm computes the correlation of the speech, defined as follows:

where β is the gain factor and N is the analysis frame length. Setting the derivative of the correlation with respect to τ and β to zero yields the following equations:

Consequently, we obtain:

The relative error energy is:

where
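For reference, the steps above match the standard least-squares pitch (long-term) prediction formulation. The reconstruction below is an assumption built from the symbols named in the text (β, τ, N, ρ(τ)). The notation s(n) for the speech samples and E for the squared prediction error is introduced here, and GME's exact definitions may differ:

```latex
\begin{align}
E(\tau,\beta) &= \sum_{n=0}^{N-1} \bigl[ s(n) - \beta\, s(n-\tau) \bigr]^2 \\
\frac{\partial E}{\partial \beta} = 0
  \;\Rightarrow\;
  \beta &= \frac{\sum_{n=0}^{N-1} s(n)\, s(n-\tau)}{\sum_{n=0}^{N-1} s^2(n-\tau)} \\
E_{\min}(\tau) &= \sum_{n=0}^{N-1} s^2(n)
  - \frac{\Bigl[ \sum_{n=0}^{N-1} s(n)\, s(n-\tau) \Bigr]^2}{\sum_{n=0}^{N-1} s^2(n-\tau)} \\
\frac{E_{\min}(\tau)}{\sum_{n=0}^{N-1} s^2(n)} &= 1 - \rho^2(\tau),
\qquad
\rho(\tau) = \frac{\sum_{n=0}^{N-1} s(n)\, s(n-\tau)}
  {\sqrt{\sum_{n=0}^{N-1} s^2(n) \, \sum_{n=0}^{N-1} s^2(n-\tau)}}
\end{align}
```

Under this form, minimizing the relative error energy over τ is the same as maximizing ρ²(τ), so a strongly periodic frame shows ρ(τ) close to 1 at its pitch lag.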

Pre‑processing steps required to obtain the above result:

1. Mean removal: When the analysis window contains a non‑zero mean (a DC offset) or very low‑frequency noise, ρ(τ) becomes large for all τ, which interferes with the quiet‑segment classification that relies on it. The solution is to subtract the mean:

2. Low‑pass filtering: To reduce the influence of high‑frequency formants (vocal‑tract resonance peaks) and noise, an 800 Hz low‑pass filter is applied. It removes most formants while preserving the first and second harmonics when the fundamental frequency is up to 500 Hz. The design specifications are:

Using the bilinear transformation method, a fifth‑order filter is designed, and its magnitude response is shown below:

3. Numerical filtering: The low‑pass filter effectively removes the third and fourth formants, but the first two still affect the signal and blur the voiced segments. Numerical filtering is applied to remove this influence and expose the trend of the signal, for example its rising edges:

Because this is a non‑causal digital system, it is rewritten for causality as follows:

Note that this step introduces algorithmic delay. Some speech codecs instead estimate the pitch period from the LPC residual, because the residual has been “whitened” and no longer contains the formants. LPC analysis is not used here because of its complexity.
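The first two pre-processing steps can be sketched in plain Python as follows. This is a minimal illustration, not GME's implementation: the text only says the filter is fifth order and designed with the bilinear transform, so the Butterworth family, the 8 kHz sampling rate used as default, and every function name here are assumptions.

```python
import cmath
import math


def remove_mean(frame):
    """Step 1: subtract the frame average so a DC offset does not inflate rho(tau)."""
    mean = sum(frame) / len(frame)
    return [x - mean for x in frame]


def _poly_from_roots(roots):
    """Real coefficients of prod(1 - r * z^-1), in ascending powers of z^-1."""
    coeffs = [1.0 + 0j]
    for r in roots:
        shifted = [0j] + [-r * c for c in coeffs]
        coeffs = [c + s for c, s in zip(coeffs + [0j], shifted)]
    return [c.real for c in coeffs]


def design_lowpass(order=5, cutoff_hz=800.0, fs_hz=8000.0):
    """Step 2: Butterworth low-pass via the bilinear transform (family is an assumption)."""
    # Pre-warp the cutoff so the digital -3 dB point lands exactly on cutoff_hz.
    warped = 2.0 * fs_hz * math.tan(math.pi * cutoff_hz / fs_hz)
    # Analog Butterworth poles, evenly spaced on the left half of a circle.
    analog_poles = [
        warped * cmath.exp(1j * math.pi * (2 * k + order + 1) / (2 * order))
        for k in range(order)
    ]
    # Bilinear map s -> z; the analog zeros at infinity map to z = -1.
    digital_poles = [(2.0 * fs_hz + p) / (2.0 * fs_hz - p) for p in analog_poles]
    a = _poly_from_roots(digital_poles)
    b = _poly_from_roots([-1.0] * order)
    gain = sum(a) / sum(b)  # normalize to unity gain at DC (z = 1)
    return [gain * c for c in b], a


def apply_filter(b, a, x):
    """Causal difference equation: a[0] y[n] = sum b[k] x[n-k] - sum a[k] y[n-k]."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc / a[0])
    return y


def magnitude(b, a, freq_hz, fs_hz=8000.0):
    """|H| at one frequency, useful for checking the design against its specification."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / fs_hz)
    num = sum(c * z_inv ** k for k, c in enumerate(b))
    den = sum(c * z_inv ** k for k, c in enumerate(a))
    return abs(num / den)
```

A frame would be processed as `apply_filter(b, a, remove_mean(frame))` with `b, a = design_lowpass()`. With pre-warping, `magnitude(b, a, 800.0)` sits at the -3 dB point by construction. A different filter family or a different passband specification would change the coefficients but not the overall structure.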

Ultimately, we are interested in measuring the periodicity level, defined as:

When the periodicity satisfies the required condition, the algorithm also checks whether the period falls within the typical pitch range of speech (60 Hz – 500 Hz). At an 8 kHz sampling rate, the corresponding pitch‑period intervals, in samples, are [80,147], [40,79], and [20,39]. If both the periodicity and the pitch‑range criteria are met, the sound is classified as speech.
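The periodicity and pitch-range checks can be sketched as below. This assumes the periodicity measure is the peak of a normalized correlation over the quoted lag intervals. The 0.7 threshold, the 160-sample frame, and the function names are hypothetical choices for illustration, not values taken from GME.

```python
import math

FS_HZ = 8000
LAG_RANGES = [(20, 39), (40, 79), (80, 147)]  # lag intervals quoted in the text


def lag_to_hz(lag, fs_hz=FS_HZ):
    """Pitch frequency that corresponds to a period of `lag` samples."""
    return fs_hz / lag


def normalized_correlation(x, frame_len, lag):
    """Correlation between the last frame_len samples of x and the same span delayed by lag."""
    start = len(x) - frame_len
    num = e_now = e_past = 0.0
    for n in range(start, len(x)):
        num += x[n] * x[n - lag]
        e_now += x[n] * x[n]
        e_past += x[n - lag] * x[n - lag]
    if e_now == 0.0 or e_past == 0.0:
        return 0.0
    return num / math.sqrt(e_now * e_past)


def classify_frame(x, frame_len=160, threshold=0.7):
    """Return (is_speech_like, best_lag, best_rho). The threshold is a hypothetical value."""
    max_lag = max(hi for _, hi in LAG_RANGES)
    if len(x) < frame_len + max_lag:
        raise ValueError("need frame_len + max_lag samples of history")
    best_lag, best_rho = None, -1.0
    for lo, hi in LAG_RANGES:
        for lag in range(lo, hi + 1):
            rho = normalized_correlation(x, frame_len, lag)
            if rho > best_rho + 1e-9:  # small margin keeps the shortest lag among ties
                best_lag, best_rho = lag, rho
    # Searching only the listed lags is what enforces the pitch-range criterion.
    return best_rho >= threshold, best_lag, best_rho
```

A 200 Hz tone sampled at 8 kHz has a period of 40 samples, so `classify_frame` reports lag 40 for it, while white noise stays well below the threshold. Note that `lag_to_hz(147)` is about 54.4 Hz and `lag_to_hz(20)` is 400 Hz, so the quoted lag intervals cover roughly 54 Hz to 400 Hz. A 500 Hz pitch would correspond to a lag of 16 samples.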

Other components such as background‑noise envelope tracking and primary‑speaker energy tracking are omitted for brevity.

With this solution, most of the noise is removed from the transmitted signal, as illustrated below:

Top: original audio; Bottom: audio after noise removal.

The comparison shows that noise is greatly suppressed without affecting normal voice communication between players, which satisfies the café requirement stated earlier.

With this in‑house technology, GME can accurately detect the target speech in complex café environments and effectively remove background noise and the sounds of other players, delivering an excellent “team‑play” voice experience.

Tags: Tencent Cloud, gaming audio, GME, noise reduction, signal processing, voice activity detection