Intel SGX With Linux 6.16 Less Likely To Cause Fatal Machine Checks
- Reference: 0001550124
- News link: https://www.phoronix.com/news/Linux-6.16-SGX-Fatal-MCE
- Source link:
Linux 6.16 brings no shiny new features for Intel SGX, just one important fix. In fact, there are only two SGX changes in total: a switch to the SHA-256 library API instead of crypto_shash, and the fix that is the focus of this article, which prevents attempts to reclaim poisoned pages within SGX.
The patch from Intel engineer Andrew Zaborowski explains the situation:
"Pages used by an enclave only get epc_page->poison set in arch_memory_failure() but they currently stay on sgx_active_page_list until sgx_encl_release(), with the SGX_EPC_PAGE_RECLAIMER_TRACKED flag untouched.
epc_page->poison is not checked in the reclaimer logic meaning that, if other conditions are met, an attempt will be made to reclaim an EPC page that was poisoned. This is bad because 1. we don't want that page to end up added to another enclave and 2. it is likely to cause one core to shut down and the kernel to panic.
Specifically, reclaiming uses microcode operations including "EWB" which accesses the EPC page contents to encrypt and write them out to non-SGX memory. Those operations cannot handle MCEs in their accesses other than by putting the executing core into a special shutdown state (affecting both threads with HT.) The kernel will subsequently panic on the remaining cores seeing the core didn't enter MCE handler(s) in time.
...
This also doesn't completely close the time window when a memory error notification will be fatal (for a not previously poisoned EPC page) -- the MCE can happen after sgx_reclaim_pages() has selected its candidates or even *inside* a microcode operation (actually easy to trigger due to the amount of time spent in them.)"
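To make the idea in the patch description concrete, here is a minimal userspace sketch in C of a reclaimer-style candidate scan that skips pages already marked as poisoned. This is not the kernel patch itself: the struct model_epc_page type, its fields, and the model_select_candidates() function are hypothetical names invented for illustration, and the real SGX code uses kernel lists, locking, and flags that are not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an EPC page descriptor. */
struct model_epc_page {
    int id;
    bool poison;            /* set once a memory error was reported for the page */
    bool reclaimer_tracked; /* loosely models SGX_EPC_PAGE_RECLAIMER_TRACKED */
};

/*
 * Scan pages[0..n) and collect up to max_out reclaim candidates in out[].
 * Pages that are not tracked by the reclaimer, or that are already marked
 * poisoned, are skipped. Returns the number of candidates stored in out[].
 */
size_t model_select_candidates(struct model_epc_page *pages, size_t n,
                               struct model_epc_page **out, size_t max_out)
{
    size_t picked = 0;

    assert(pages != NULL && out != NULL);

    for (size_t i = 0; i < n && picked < max_out; i++) {
        if (!pages[i].reclaimer_tracked)
            continue;
        if (pages[i].poison)
            continue; /* never hand a known-bad page to a write-back step */
        out[picked++] = &pages[i];
    }
    return picked;
}
```

In this model, a page whose poison flag is set only after the scan has returned would still be sitting in out[], which mirrors the remaining time window the patch description mentions.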
With Linux 6.16, this effort to prevent attempts to reclaim poisoned pages within SGX enclaves should at least make "SGX less likely to induce fatal machine checks."
All the details for those interested are in [3]this Git pull, which is now in Linux 6.16 Git.
[1] https://www.phoronix.com/search/SGX
[2] https://www.phoronix.com/search/Linux+6.16
[3] https://lore.kernel.org/lkml/20250529180352.1935517-1-dave.hansen@linux.intel.com/