

Intel SGX With Linux 6.16 Less Likely To Cause Fatal Machine Checks



Intel's Software Guard Extensions ([1]SGX) updates for the in-development [2]Linux 6.16 contain a fix so SGX is now less likely to cause a fatal machine check.

There aren't any shiny new SGX features in Linux 6.16, just one important fix. In fact, there are only two changes in total: a switch to the SHA-256 library API instead of crypto_shash, and the fix that is the focus of this article: preventing attempts to reclaim poisoned pages within SGX.

The patch from Intel engineer Andrew Zaborowski explains the situation:

"Pages used by an enclave only get epc_page->poison set in arch_memory_failure() but they currently stay on sgx_active_page_list until sgx_encl_release(), with the SGX_EPC_PAGE_RECLAIMER_TRACKED flag untouched.

epc_page->poison is not checked in the reclaimer logic meaning that, if other conditions are met, an attempt will be made to reclaim an EPC page that was poisoned. This is bad because 1. we don't want that page to end up added to another enclave and 2. it is likely to cause one core to shut down and the kernel to panic.

Specifically, reclaiming uses microcode operations including "EWB" which accesses the EPC page contents to encrypt and write them out to non-SGX memory. Those operations cannot handle MCEs in their accesses other than by putting the executing core into a special shutdown state (affecting both threads with HT.) The kernel will subsequently panic on the remaining cores seeing the core didn't enter MCE handler(s) in time.

...

This also doesn't completely close the time window when a memory error notification will be fatal (for a not previously poisoned EPC page) -- the MCE can happen after sgx_reclaim_pages() has selected its candidates or even *inside* a microcode operation (actually easy to trigger due to the amount of time spent in them.)"

At least for Linux 6.16, this effort to prevent attempts to reclaim poisoned pages within SGX enclaves should make "SGX less likely to induce fatal machine checks."
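
To make the idea in the patch description concrete, here is a minimal user-space sketch in C of a reclaimer scan that refuses to pick poisoned pages. It is a toy model, not the kernel patch: the struct toy_epc_page type, its fields, and toy_select_reclaim_candidates() are hypothetical names invented for this illustration, and the real sgx_reclaim_pages() deals with locking, reference counts and list handling that are left out here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's EPC page descriptor; only the
 * two pieces of state mentioned in the patch description are modelled. */
struct toy_epc_page {
    bool poison;            /* a memory error was reported for this page */
    bool reclaimer_tracked; /* models SGX_EPC_PAGE_RECLAIMER_TRACKED */
};

/*
 * Scan the tracked pages and collect up to max_out reclaim candidates.
 * A poisoned page is never selected, so no EWB-style operation would
 * ever touch its contents. Returns the number of candidates in out[].
 */
size_t toy_select_reclaim_candidates(struct toy_epc_page *pages, size_t n,
                                     struct toy_epc_page **out, size_t max_out)
{
    size_t picked = 0;

    for (size_t i = 0; i < n && picked < max_out; i++) {
        struct toy_epc_page *page = &pages[i];

        if (!page->reclaimer_tracked)
            continue;   /* not under the reclaimer's control */

        if (page->poison)
            continue;   /* the added check: skip poisoned pages */

        out[picked++] = page;
    }

    return picked;
}
```

As the patch description itself notes, a check like this only narrows the window: a page can still be hit by a memory error after it has been selected as a candidate.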

All the details for those interested are within [3]this Git pull, which has now been merged into Linux 6.16 Git.



[1] https://www.phoronix.com/search/SGX

[2] https://www.phoronix.com/search/Linux+6.16

[3] https://lore.kernel.org/lkml/20250529180352.1935517-1-dave.hansen@linux.intel.com/



phoronix
