News: 0001541423


Intel Xe Driver Adds Fan Speed Reporting For Linux 6.16, BMG Instability Being Debugged

([Intel] 3 Hours Ago Intel Xe Driver)


Back in the Linux 6.12 kernel cycle [1]the Intel i915 kernel graphics driver added fan speed reporting support. For the upcoming Linux 6.16 cycle, that fan speed reporting will finally also work with the modern Intel Xe kernel graphics driver, which is used by default with Intel's latest integrated and discrete graphics processors.

As with the i915 driver implementation, the Intel Xe driver exposes its fan speed reporting to user-space via the standard hardware monitoring "HWMON" interfaces. With the initial driver code prepped for Linux 6.16, up to three fan speeds can be reported via the fan1_input, fan2_input and fan3_input sysfs attributes.
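
For anyone wanting to poll those attributes from a script, here is a minimal Python sketch. It assumes the usual HWMON layout under /sys/class/hwmon, that the Xe driver's HWMON device identifies itself as "xe" in its name file, and that the values are integer RPM readings as is conventional for fanN_input; the helper names read_fan_speeds and find_hwmon_dirs are made up for this illustration and are not part of any driver or library.

```python
from pathlib import Path


def read_fan_speeds(hwmon_dir, max_fans=3):
    """Return {attribute_name: value} for each fanN_input file found in hwmon_dir."""
    speeds = {}
    for index in range(1, max_fans + 1):
        attr = Path(hwmon_dir) / f"fan{index}_input"
        try:
            speeds[attr.name] = int(attr.read_text().strip())
        except (OSError, ValueError):
            # Attribute missing, unreadable, or not an integer: skip it.
            continue
    return speeds


def find_hwmon_dirs(driver_name="xe", root="/sys/class/hwmon"):
    """Yield hwmon directories whose 'name' file matches driver_name.

    The default name "xe" is an assumption about how the driver registers itself.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return
    for entry in sorted(root_path.iterdir()):
        try:
            if (entry / "name").read_text().strip() == driver_name:
                yield entry
        except OSError:
            # No readable 'name' file here; not a device we can identify.
            continue


if __name__ == "__main__":
    for hwmon in find_hwmon_dirs():
        print(hwmon, read_fan_speeds(hwmon))
```

On a system without a matching device, or on a kernel without this support, the script simply prints nothing. Cards with fewer than three fans would be expected to expose fewer attributes, which is why missing files are skipped rather than treated as errors.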

That Xe driver fan speed reporting support is the most notable change of this week's [2]drm-xe-next pull request for DRM-Next. Other Xe changes submitted this week include a new SR-IOV workaround, temporarily disabling D3cold on Battlemage graphics cards, initial work on Shared Virtual Memory (SVM) multi-device support, and various other workarounds and improvements.

The D3cold power state support is temporarily disabled for Battlemage while Intel developers work out "many instability cases" attributed to the D3cold to D0 power state transitions:

"Currently, many instability cases related to D3Cold -> D0 transition on BMG are under investigation. Among them some bad cases where the device is lost after 1 to 3 transitions from D3Cold to D0 on the runtime pm, with pcieport upstream bridge port link retrain failure.

In other cases, it works fine, but with some sudden random memory corruptions after D3cold, that could be 0xffff missed ack on GT forcewake or GuC reload related failures.

In some other cases though, D3Cold -> D0 works pretty reliably. It looks like it is a combination of GPU cards and Host boards at this point. So, there is no possible/available quirk at this time.

This patch disables the D3Cold by default on BMG by reducing the vram_d3cold_threshold to 0. Users and developers who wants to enable it are still able to via

$ echo 300 > /sys/bus/pci/devices/ /vram_d3cold_threshold"
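
The same sysfs knob can be read or set from a script. The Python sketch below is only an illustration: the quoted command leaves the device directory unspecified, so pci_address is a placeholder the caller must supply for their own card, and the helper names are hypothetical. The value 300 comes from the quoted commit message and 0 is the new Battlemage default that disables D3cold. Writing the attribute normally requires root.

```python
from pathlib import Path


def d3cold_threshold_path(pci_address, root="/sys/bus/pci/devices"):
    """Build the sysfs path for a device; pci_address is supplied by the caller."""
    return Path(root) / pci_address / "vram_d3cold_threshold"


def read_threshold(pci_address, root="/sys/bus/pci/devices"):
    """Return the current vram_d3cold_threshold value as an integer."""
    return int(d3cold_threshold_path(pci_address, root).read_text().strip())


def write_threshold(pci_address, value, root="/sys/bus/pci/devices"):
    """Write a new threshold; 0 keeps D3cold disabled, per the commit message."""
    if value < 0:
        raise ValueError("threshold must be non-negative")
    d3cold_threshold_path(pci_address, root).write_text(f"{value}\n")
```

For example, write_threshold(addr, 300) mirrors the echo command from the commit message, where addr is whatever PCI address the Battlemage card has on the system in question.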

We'll see what more Intel Xe kernel driver improvements get queued up in the coming weeks, leading up to the Linux 6.16 merge window opening in late May or early June.



[1] https://www.phoronix.com/news/Intel-GPU-Fan-Speed-Linux

[2] https://lore.kernel.org/dri-devel/aADWaEFKVmxSnDLo@fedora/


