
fix: prevent cgroup isolation churn during WLT polling #116

Open
Karabiner07 wants to merge 1 commit into intel:main from Karabiner07:fix-cgroup-churn

Conversation

@Karabiner07

Fix: Prevent cgroup isolation churn during WLT state flapping

Platform

Intel Core Ultra 7 255H (Arrow Lake-H, Family 6 Model 197)
Kernel: 6.17.0-22 — Ubuntu 24.04
CPU layout: 0–5 = P-cores, 6–13 = E-cores, 14–15 = L-cores


Problem

In Balanced (AUTO) mode with Mode=1 (cgroup v2 isolate), P-cores were
not being parked despite the cgroup isolation appearing to apply correctly
at startup. CPUs 0–5 remained active and scheduled work even though the
config specified ActiveCPUs=6-15 for all WLT states.

Power Saver mode (PowersaverDef=1, LPMD_ON) worked correctly — P-cores
parked at 400 MHz as expected. Only Balanced AUTO mode was affected.


Root Cause

process_cpu_isolate() in src/lpmd_cgroup.c unconditionally executes
this sequence every time it is called:

cpuset.cpus.partition = member    ← un-isolates, briefly exposes P-cores
cpuset.cpus.exclusive = 0-5
cpuset.cpus.partition = isolated  ← re-isolates
cpuset.cpus             = 0-5

The partition=member write, even for milliseconds, returns CPUs 0–5 to
the root cgroup's effective CPU set. The kernel scheduler immediately
places waking tasks on them.
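To make the window concrete, here is a minimal C sketch of the unguarded
sequence, assuming a hypothetical write_sysfs() helper; it is illustrative,
not the actual process_cpu_isolate() body.

    #include <stdio.h>

    /* Hypothetical helper: write a string to a cgroupfs file. */
    static int write_sysfs(const char *path, const char *val)
    {
        FILE *f = fopen(path, "w");
        if (!f)
            return -1;
        int ret = (fputs(val, f) < 0) ? -1 : 0;
        fclose(f);
        return ret;
    }

    /* Unguarded sequence: between the first and third writes, CPUs 0-5
     * rejoin the root cgroup's effective set and the scheduler can place
     * waking tasks on them. */
    static int isolate_pcores_unguarded(void)
    {
        if (write_sysfs("/sys/fs/cgroup/lpm/cpuset.cpus.partition", "member"))  /* window opens */
            return -1;
        write_sysfs("/sys/fs/cgroup/lpm/cpuset.cpus.exclusive", "0-5");
        write_sysfs("/sys/fs/cgroup/lpm/cpuset.cpus.partition", "isolated");    /* window closes */
        write_sysfs("/sys/fs/cgroup/lpm/cpuset.cpus", "0-5");
        return 0;
    }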

In AUTO mode, WLT (Workload Type) hints from the hardware fluctuate rapidly —
multiple transitions per second between UTIL_IDLE, UTIL_IDLE_BURSTY,
and UTIL_IDLE_GFX_BUSY. Each transition causes need_enter() to return
true (it compares state indices, not cpumask content), which calls
enter_state(), which calls process_cgroup(), which calls
process_cpu_isolate() — re-running the destructive write sequence every
time.
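A simplified sketch of why the chain re-fires on every hint flip; the
function names match those above, but the bodies are assumptions, not the
actual lpmd source:

    /* Simplified sketch, not the actual lpmd source. */
    static int cur_state_idx = -1;

    static void process_cgroup(void);   /* runs process_cpu_isolate()'s writes */

    /* Compares state indices only: two WLT states carrying identical
     * ActiveCPUs masks still look "different" and re-trigger the writes. */
    static int need_enter(int next_state_idx)
    {
        return next_state_idx != cur_state_idx;
    }

    static void enter_state(int next_state_idx)
    {
        cur_state_idx = next_state_idx;
        process_cgroup();
    }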

In the intel_lpmd_config_F6_M197.xml config (part of in-progress Arrow Lake
support) and the Panther Lake M204 config, all four WLT states declare
identical ActiveCPUs masks. So every WLT flip is pure churn:
the cgroup is torn down and rebuilt with the same content, many times per second.
The P-cores are never stably parked — every partition=member write opens a
scheduling window before the next partition=isolated closes it.

Power Saver was unaffected because LPMD_ON short-circuits to
DEFAULT_ON in choose_next_state(), bypassing WLT iteration entirely.
One stable cgroup entry, no flapping.
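For contrast, a hedged sketch of that short-circuit: LPMD_ON and DEFAULT_ON
are the identifiers cited above, while the mode variable, the enum values,
and the WLT helper are hypothetical stand-ins.

    enum mode { LPMD_OFF, LPMD_ON, LPMD_AUTO };   /* values assumed */
    enum { DEFAULT_ON = 0 };                      /* value assumed */

    static enum mode lpmd_mode;        /* hypothetical mode variable */
    static int next_wlt_state(void);   /* hypothetical: maps the WLT hint to a state */

    static int choose_next_state(void)
    {
        if (lpmd_mode == LPMD_ON)      /* forced Power Saver */
            return DEFAULT_ON;         /* one stable cgroup entry, no WLT iteration */
        return next_wlt_state();       /* AUTO: follows the flapping WLT hints */
    }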


Fix

Added a file-static last_applied_cpumask cache to process_cgroup().
Before executing the cgroup write sequence, the function now compares the
requested cpumask against the last successfully applied one using the
existing cpumask_equal(). If the content is
identical, the write sequence is skipped entirely.

The cache is reset in cgroup_cleanup() so a subsequent daemon start
is never confused by stale state from a previous run.
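A minimal sketch of the guard, under the assumption of a simple bitmask
cpumask type: cpumask_equal() is the helper named above, while the copy and
the cleanup hook are simplified stand-ins for what the patch actually does.

    #include <stdint.h>
    #include <string.h>

    typedef struct { uint64_t bits[4]; } cpumask_t;   /* simplified cpumask */

    static cpumask_t last_applied_cpumask;   /* file-static cache */
    static int have_last_applied;

    static int cpumask_equal(const cpumask_t *a, const cpumask_t *b)
    {
        return memcmp(a->bits, b->bits, sizeof(a->bits)) == 0;
    }

    static int process_cpu_isolate(const cpumask_t *m);   /* destructive write sequence */

    static int process_cgroup(const cpumask_t *requested)
    {
        /* Skip the destructive writes when nothing changed. */
        if (have_last_applied && cpumask_equal(requested, &last_applied_cpumask))
            return 0;   /* logged as "Skip cgroup: cpumask unchanged" */

        if (process_cpu_isolate(requested))
            return 1;   /* failure: leave the cache untouched */

        last_applied_cpumask = *requested;
        have_last_applied = 1;
        return 0;
    }

    /* Simplified: the real reset lives inside cgroup_cleanup(), so a
     * restarted daemon never trusts stale state from a previous run. */
    static void cgroup_cleanup(void)
    {
        have_last_applied = 0;
    }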


Verification

After the fix, in Balanced mode on the 255H:

  • CPUs 0–5 (P-cores) hold at ~400 MHz under idle and light load
  • cat /sys/fs/cgroup/lpm/cpuset.cpus.partition reports isolated (stable)
  • Debug log shows frequent "Skip cgroup: cpumask unchanged" messages;
    "Process Cgroup" fires only on genuine mask transitions
  • Profile switches (Performance ↔ Balanced ↔ Power Saver) continue
    to apply correct isolation at each transition
  • intel_lpmd_control ON|AUTO|OFF commands likewise apply the correct
    isolation at each transition

Signed-off-by: Joy Philip Pilli <joyphilip.p2001@gmail.com>
@psynyde

psynyde commented May 5, 2026

@Karabiner07 not related to the PR itself, but can I ask how you managed to make lpmd work under family 6 model 197? For me, when I compile from source and run it, it says Platform not supported yet!

[screenshot: "Platform not supported yet!" error output]

@Karabiner07
Author

@Karabiner07 not related to the PR itself, but can I ask how you managed to make lpmd work under family 6 model 197? For me, when I compile from source and run it, it says Platform not supported yet!

I'll commit the required changes for the ARL 255H this weekend in my fork repo and I'll create a new PR

