
Understanding RTG (Related Thread Group) in the Linux Kernel: Core Selection, Frequency Aggregation, and Busy Hysteresis

The article explains Linux's Related Thread Group (RTG) mechanism: its struct definition, the default Android grouping, the core-selection logic that boosts whole groups to big-core clusters, load aggregation for DCVS frequency decisions, and the newer busy-hysteresis feature that delays low-power entry to improve performance and power efficiency.

OPPO Kernel Craftsman

The article introduces the concept of RTG (Related Thread Group) in the Linux kernel, mainly based on the author’s analysis of the CodeAurora MSM‑4.14 and MSM‑5.4 source trees. It starts with a brief background, noting that online information about RTG is scarce.

To illustrate why RTG is needed, a simple scenario is described: two benchmark threads (thread0 and thread1) alternately wake each other and sleep. When the threads run on separate CPUs, each CPU's utilization stays around 50 %, which is insufficient to trigger a DCVS (Dynamic Clock and Voltage Scaling) frequency boost. If both threads run on the same CPU, that CPU's utilization reaches 100 % and the frequency is raised to the maximum. This demonstrates that threads are not isolated: the right scheduling and power-management decisions depend on the relationships between them.
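The arithmetic behind this scenario can be sketched in a few lines of C. This is an illustrative model only: the function names and the 80 % boost threshold are assumptions made for the example, not values taken from the kernel.

```c
#include <stddef.h>

/* Hypothetical threshold (percent) above which a governor would raise the frequency. */
#define BOOST_THRESHOLD_PCT 80

/* Utilization of one CPU: sum of the busy percentages of the threads
 * placed on it, capped at 100. */
unsigned int cpu_util_pct(const unsigned int *thread_busy_pct, size_t n)
{
    unsigned int sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += thread_busy_pct[i];
    return sum > 100 ? 100 : sum;
}

/* Non-zero when the (assumed) governor would boost the frequency. */
int would_boost(unsigned int util_pct)
{
    return util_pct > BOOST_THRESHOLD_PCT;
}
```

With one 50 % thread per CPU, neither CPU crosses the threshold; with both threads on one CPU, the summed utilization does, which is the effect RTG aims to reproduce for threads that end up on different CPUs.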

The article then examines the struct related_thread_group definition from the kernel source. Key fields include:

id: unique identifier of the group (the default group has ID 1).

tasks: linked list of tasks belonging to the group, managed by add_task_to_group and remove_task_from_group.

list: entry in the global list of all groups (active_related_thread_groups).

preferred_cluster: pointer to a struct sched_cluster that indicates the CPU cluster the group prefers.

last_update: timestamp of the last _set_preferred_cluster call, used to avoid frequent updates.
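The fields listed above can be pictured with a simplified user-space model. This is a sketch, not the kernel definition: the kernel uses struct list_head and locking, while the version below uses plain pointers, and the helper names rtg_add_task, rtg_remove_task and rtg_task_count are hypothetical stand-ins for the kernel's add_task_to_group and remove_task_from_group.

```c
#include <stddef.h>
#include <stdint.h>

struct sched_cluster_sketch { int id; };       /* stand-in for struct sched_cluster */

struct related_thread_group_sketch;

struct task_sketch {                           /* stand-in for task_struct */
    int pid;
    struct related_thread_group_sketch *grp;   /* group the task belongs to, or NULL */
    struct task_sketch *next;                  /* link in grp->tasks */
};

struct related_thread_group_sketch {
    int id;                                         /* group identifier */
    struct task_sketch *tasks;                      /* member tasks */
    struct related_thread_group_sketch *list;       /* link in the global group list */
    struct sched_cluster_sketch *preferred_cluster; /* cluster the group prefers */
    uint64_t last_update;                           /* time of the last preference update */
};

/* Push the task onto the front of the group's task list. */
void rtg_add_task(struct task_sketch *p, struct related_thread_group_sketch *grp)
{
    p->grp = grp;
    p->next = grp->tasks;
    grp->tasks = p;
}

/* Unlink the task from its group, if it has one. */
void rtg_remove_task(struct task_sketch *p)
{
    struct related_thread_group_sketch *grp = p->grp;

    if (!grp)
        return;
    for (struct task_sketch **pp = &grp->tasks; *pp; pp = &(*pp)->next) {
        if (*pp == p) {
            *pp = p->next;
            break;
        }
    }
    p->grp = NULL;
    p->next = NULL;
}

size_t rtg_task_count(const struct related_thread_group_sketch *grp)
{
    size_t n = 0;

    for (const struct task_sketch *p = grp->tasks; p; p = p->next)
        n++;
    return n;
}
```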

In the default Android configuration, a group with id = DEFAULT_CGROUP_COLOC_ID is created and attached to the top‑app cgroup. All threads of the foreground application inherit this group, although the current implementation does not support dynamic updates – only newly created tasks are added to the group.

The function _set_preferred_cluster is the core of the "core selection" feature. It first checks last_update to limit the call frequency, then looks for any task in the group marked with SCHED_BOOST_ON_BIG. If such a task exists, the whole group is boosted to a big-core cluster. Otherwise, it aggregates the load of all tasks in the group and selects a suitable cluster via best_cluster, which internally uses sched_group_upmigrate and sched_group_downmigrate. The purpose is to keep related threads on the same CPU cluster to exploit shared caches and reduce power consumption.
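The decision flow described above can be sketched as follows. This is a minimal model under stated assumptions: two clusters only, an invented rate limit, and invented up/down-migrate thresholds that form a hysteresis band. None of the names or constants below are taken from the kernel source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum cluster_sketch { CLUSTER_LITTLE, CLUSTER_BIG };

struct rtg_member {
    uint64_t demand;      /* task load, arbitrary units */
    bool boost_on_big;    /* models a task marked SCHED_BOOST_ON_BIG */
};

struct rtg_pref {
    enum cluster_sketch preferred;
    uint64_t last_update_ns;
};

/* Illustrative values, not kernel defaults. */
#define MIN_UPDATE_INTERVAL_NS 1000000ULL
#define GROUP_UPMIGRATE        800
#define GROUP_DOWNMIGRATE      600

/* Returns true if the preference was re-evaluated, false if rate-limited. */
bool rtg_update_preferred(struct rtg_pref *grp, const struct rtg_member *tasks,
                          size_t n, uint64_t now_ns)
{
    uint64_t load = 0;

    /* Step 1: limit how often the preference is recomputed. */
    if (now_ns - grp->last_update_ns < MIN_UPDATE_INTERVAL_NS)
        return false;
    grp->last_update_ns = now_ns;

    /* Step 2: any boosted task pulls the whole group to the big cluster. */
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].boost_on_big) {
            grp->preferred = CLUSTER_BIG;
            return true;
        }
        load += tasks[i].demand;
    }

    /* Step 3: otherwise decide from the aggregated load. Between the two
     * thresholds the current preference is kept. */
    if (load > GROUP_UPMIGRATE)
        grp->preferred = CLUSTER_BIG;
    else if (load < GROUP_DOWNMIGRATE)
        grp->preferred = CLUSTER_LITTLE;
    return true;
}
```

Using two thresholds rather than one keeps a group whose load hovers near the boundary from bouncing between clusters on every update.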

Beyond core selection, RTG provides "frequency aggregation". The kernel's walt_irq_work path accumulates per-group load into aggr_grp_load using rq->grp_time.prev_runnable_sum. The grp_time structure holds four counters (curr_runnable_sum, prev_runnable_sum, nt_curr_runnable_sum, nt_prev_runnable_sum) that separately track the current and previous windows and "new" tasks (tasks whose p->ravg.active_windows is less than SCHED_NEW_TASK_WINDOWS). Each run-queue (rq) carries the same four counters for its own accounting, and rq->grp_time holds a second set for tasks that belong to an RTG, allowing fast load aggregation for DCVS decisions.
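A small user-space model may help picture the four counters and the aggregation step. It is a sketch under assumptions: the type and function names are hypothetical, the new-task window count of 5 is a placeholder rather than the kernel's value, and the choice to account an RTG task's busy time only in grp_time is a modelling simplification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for SCHED_NEW_TASK_WINDOWS; the real value is not asserted here. */
#define NEW_TASK_WINDOWS_SKETCH 5

struct time_sums {
    uint64_t curr_runnable_sum;     /* busy time in the current window */
    uint64_t prev_runnable_sum;     /* busy time in the previous window */
    uint64_t nt_curr_runnable_sum;  /* same, new tasks only */
    uint64_t nt_prev_runnable_sum;
};

struct rq_sketch {
    struct time_sums own;       /* the run-queue's own counters */
    struct time_sums grp_time;  /* counters for tasks that belong to an RTG */
};

bool is_new_task(uint32_t active_windows)
{
    return active_windows < NEW_TASK_WINDOWS_SKETCH;
}

/* Charge delta units of busy time to the matching counter set. */
void account_busy_time(struct rq_sketch *rq, bool in_rtg,
                       uint32_t active_windows, uint64_t delta)
{
    struct time_sums *t = in_rtg ? &rq->grp_time : &rq->own;

    t->curr_runnable_sum += delta;
    if (is_new_task(active_windows))
        t->nt_curr_runnable_sum += delta;
}

static void roll(struct time_sums *t)
{
    t->prev_runnable_sum = t->curr_runnable_sum;
    t->nt_prev_runnable_sum = t->nt_curr_runnable_sum;
    t->curr_runnable_sum = 0;
    t->nt_curr_runnable_sum = 0;
}

/* Window boundary: current sums become previous sums. */
void rollover_window(struct rq_sketch *rq)
{
    roll(&rq->own);
    roll(&rq->grp_time);
}

/* Sum the group counters of all run-queues, as aggr_grp_load does in the text. */
uint64_t aggregate_group_load(const struct rq_sketch *rqs, size_t n)
{
    uint64_t aggr_grp_load = 0;

    for (size_t i = 0; i < n; i++)
        aggr_grp_load += rqs[i].grp_time.prev_runnable_sum;
    return aggr_grp_load;
}
```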

The aggregated load is later used in freq_policy_load, which supplies the load that the schedutil governor acts on. When the global flag sched_freq_aggr_en is true, the load reported for a run-queue includes the group's coloc_boost_load. This mechanism is only enabled when the boost mode is set to FULL_THROTTLE_BOOST or RESTRAINED_BOOST; otherwise the scheduler falls back to the traditional per-CPU runnable sum.
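The switch between aggregated and per-CPU load can be sketched as below. The function names and the exact fallback (adding only this CPU's own group counter) are assumptions for illustration, and the enum is a local stand-in whose numeric values are not claimed to match the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the scheduler boost modes named in the text. */
enum boost_mode_sketch {
    SKETCH_NO_BOOST,
    SKETCH_FULL_THROTTLE_BOOST,
    SKETCH_RESTRAINED_BOOST
};

/* Frequency aggregation is on only for the two boost modes named in the text. */
bool freq_aggr_enabled(enum boost_mode_sketch mode)
{
    return mode == SKETCH_FULL_THROTTLE_BOOST || mode == SKETCH_RESTRAINED_BOOST;
}

/* Load reported for one run-queue.
 *   rq_prev       - the run-queue's own previous-window sum
 *   rq_grp_prev   - this run-queue's group counter (grp_time.prev_runnable_sum)
 *   aggr_grp_load - group load summed over all run-queues
 */
uint64_t policy_load_sketch(uint64_t rq_prev, uint64_t rq_grp_prev,
                            uint64_t aggr_grp_load, bool aggr_en)
{
    return rq_prev + (aggr_en ? aggr_grp_load : rq_grp_prev);
}
```

With aggregation enabled, every CPU that hosts part of the group reports the whole group's load, so the frequency request reflects the group rather than the fraction that happens to run locally.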

In newer kernel versions, RTG introduces a third feature called "Busy Hysteresis". It adds a hold-off period during which a CPU that has been busy will not immediately enter low-power mode (LPM) states. Key variables include:

sysctl_sched_busy_hyst_enable_cpus: bitmap enabling the feature per CPU.

sched_busy_hyst_ns: hysteresis duration in nanoseconds.

coloc_hyst_busy: a threshold derived from CPU capacity and a configurable percentage.

The function sched_update_hyst_times (called from _set_preferred_cluster) updates these parameters when the group's skip_min changes. When a CPU's load drops below the threshold and the dequeue flag is true, busy_hyst_end_time is set, preventing the CPU from entering LPM until the hysteresis expires. This design avoids the power waste of frequent C-state transitions, which is especially important for high-refresh-rate displays and low-latency audio paths.
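The end-time mechanism can be sketched independently of the exact trigger. In this hedged model the caller passes in whether the load condition from the text was met; the struct, the function names and the percentage-of-capacity formula are assumptions for illustration, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

struct hyst_cpu {
    bool hyst_enabled;            /* this CPU's bit in the enable bitmap */
    uint64_t busy_hyst_ns;        /* hysteresis duration */
    uint64_t busy_hyst_end_time;  /* LPM is held off until this time */
};

/* Threshold as a percentage of CPU capacity (assumed form of coloc_hyst_busy). */
uint64_t hyst_threshold(uint64_t capacity, unsigned int pct)
{
    return capacity * pct / 100;
}

/* Arm the hysteresis window when the load condition holds on a dequeue. */
void hyst_arm(struct hyst_cpu *c, bool load_trigger, bool dequeue, uint64_t now_ns)
{
    if (!c->hyst_enabled || !dequeue || !load_trigger)
        return;
    c->busy_hyst_end_time = now_ns + c->busy_hyst_ns;
}

/* The idle path may enter LPM only once the window has expired. */
bool hyst_allows_lpm(const struct hyst_cpu *c, uint64_t now_ns)
{
    return now_ns >= c->busy_hyst_end_time;
}
```

Storing an absolute end time rather than a countdown keeps the idle-path check to a single comparison against the current clock.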

Finally, the article notes that the current implementation of attaching all top‑app threads to a single RTG is simplistic. A more refined approach would dynamically include system‑server threads that handle binder calls from the foreground app, and remove them when the interaction ends.

Overall, the piece provides a comprehensive overview of RTG’s background, data structures, core‑selection logic, load‑aggregation for DCVS, and the busy‑hysteresis mechanism that together aim to improve performance and power efficiency on Android devices.
