Linus Torvalds Adds User-Access Fast Validation Via Address Masking To Linux 6.12
([Linux Kernel] 106 Minutes Ago
Torvalds Coding)
- Reference: 0001493288
- News link: https://www.phoronix.com/news/User-Access-Fast-Linux-6.12
Amid a busy week in Vienna for the Linux Kernel Maintainer Summit and related Linux Foundation events, and while managing the Linux 6.12 merge window that has landed new features like [1]sched_ext and [2]real-time PREEMPT_RT, Linus Torvalds also finished up some of his own code for this next kernel version. Merged today is his work on a new user-access fast validation path using address masking.
While it's rare these days for Linus Torvalds himself to write big, shiny new kernel features, in Linux 6.11 [3]he worked out some new ARM64 (AArch64) optimizations, and now for Linux 6.12 he's landing this user-access fast validation via address masking. Initially, though, the new code in Linux 6.12 only benefits x86_64 CPUs.
In short, the new code makes it possible to skip some Spectre V1 speculation barriers, along with the access_ok() calls that those Spectre Variant One mitigations made expensive. Torvalds explained in today's [5]Git merge:
"Merge user access fast validation using address masking.
This allows architectures to optionally use a data dependent address masking model instead of a conditional branch for validating user accesses. That avoids the Spectre-v1 speculation barriers.
Right now only x86-64 takes advantage of this, and not all architectures will be able to do it. It requires a guard region between the user and kernel address spaces (so that you can't overflow from one to the other), and an easy way to generate a guaranteed-to-fault address for invalid user pointers.
Also note that this currently assumes that there is no difference between user read and write accesses. If extended to architectures like powerpc, we'll also need to separate out the user read-vs-write cases."
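To make the contrast between a conditional branch and a data-dependent mask concrete, here is a minimal user-space sketch in C. It is not kernel code: the helper names checked_ok(), mask_user_addr() and mask_examples() are hypothetical, and treating "top bit set" as "invalid user pointer" is an assumption modelled on the x86-64 behaviour described in the patch notes (an invalid address becomes all ones).

```c
#include <assert.h>
#include <stdint.h>

/* Conditional model (hypothetical helper): the caller must branch on
 * this result before touching the address. */
static int checked_ok(uint64_t addr)
{
    return (addr >> 63) == 0;
}

/* Data-dependent model (hypothetical helper): no branch at all.
 * The mask is 0 when the top bit is clear and all ones when it is set,
 * so a valid address passes through unchanged and an invalid one is
 * turned into the all-ones address. */
static uint64_t mask_user_addr(uint64_t addr)
{
    uint64_t mask = 0 - (addr >> 63);
    return addr | mask;
}

/* Small self-check using illustrative address values. */
static void mask_examples(void)
{
    uint64_t user_like   = 0x00007ffd12345678ULL; /* top bit clear */
    uint64_t kernel_like = 0xffff888000001000ULL; /* top bit set */

    assert(checked_ok(user_like));
    assert(!checked_ok(kernel_like));

    assert(mask_user_addr(user_like) == user_like);
    assert(mask_user_addr(kernel_like) == UINT64_MAX);
}
```

In this sketch the masked variant never decides anything. It only rewrites the address, and the scheme then relies on the all-ones address being one that is guaranteed to fault, as the merge message requires.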
Torvalds [6]explained in more detail as part of one of the patches:
"The Spectre-v1 mitigations made "access_ok()" much more expensive, since it has to serialize execution with the test for a valid user address.
All the normal user copy routines avoid this by just masking the user address with a data-dependent mask instead, but the fast "unsafe_user_read()" kind of patterns that were supposed to be a fast case got slowed down.
This introduces a notion of using
    src = masked_user_access_begin(src);
to do the user address sanity using a data-dependent mask instead of the more traditional conditional
    if (user_read_access_begin(src, len)) {
model.
This model only works for dense accesses that start at 'src' and on architectures that have a guard region that is guaranteed to fault in between the user space and the kernel space area.
With this, the user access doesn't need to be manually checked, because a bad address is guaranteed to fault (by some architecture masking trick: on x86-64 this involves just turning an invalid user address into all ones, since we don't map the top of address space).
This only converts a couple of examples for now. Example x86-64 code generation for loading two words from user space:
    stac
    mov    %rax,%rcx
    sar    $0x3f,%rcx
    or     %rax,%rcx
    mov    (%rcx),%r13
    mov    0x8(%rcx),%r14
    clac
where all the error handling and -EFAULT is now purely handled out of line by the exception path.
Of course, if the micro-architecture does badly at 'clac' and 'stac', the above is still pitifully slow. But at least we did as well as we could."
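The two calling patterns from the patch description can be compared in a toy user-space model. The sketch below is only an illustration under stated assumptions. toy_load() stands in for a memory access that either succeeds or faults, TOY_USER_BASE and the four-word "user" array are invented values, and read_two_checked() and read_two_masked() are hypothetical names. In the real kernel the fault is taken by the exception path, not returned from a function.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy address space (assumption): four words of "user" memory mapped
 * at an invented base; every other address faults. */
#define TOY_USER_BASE  0x0000100000000000ULL
#define TOY_USER_WORDS 4

static uint64_t toy_user_mem[TOY_USER_WORDS] = { 11, 22, 33, 44 };

/* Models one 8-byte load: returns 0 on success, -EFAULT on a "fault". */
static int toy_load(uint64_t addr, uint64_t *out)
{
    uint64_t end = TOY_USER_BASE + sizeof(toy_user_mem);

    if (addr < TOY_USER_BASE || addr > end - sizeof(uint64_t) || (addr & 7))
        return -EFAULT;
    *out = toy_user_mem[(addr - TOY_USER_BASE) / sizeof(uint64_t)];
    return 0;
}

/* C equivalent of the quoted "sar $0x3f" / "or" pair: an address with
 * the top bit set becomes all ones, anything else is unchanged. */
static uint64_t toy_mask(uint64_t addr)
{
    return addr | (0 - (addr >> 63));
}

/* Traditional model: test the address first, then access. */
static int read_two_checked(uint64_t src, uint64_t *a, uint64_t *b)
{
    if (src >> 63)               /* stands in for the conditional check */
        return -EFAULT;
    if (toy_load(src, a) || toy_load(src + 8, b))
        return -EFAULT;
    return 0;
}

/* Masked model: no up-front test; a bad address simply faults. */
static int read_two_masked(uint64_t src, uint64_t *a, uint64_t *b)
{
    src = toy_mask(src);         /* stands in for masked_user_access_begin() */
    if (toy_load(src, a) || toy_load(src + 8, b))
        return -EFAULT;          /* models the out-of-line fault handling */
    return 0;
}
```

Both functions report the same outcome for valid, kernel-looking, and unmapped addresses in this model. The difference is that read_two_masked() has no validity branch ahead of the accesses, which is the property that lets the real code avoid the Spectre-v1 speculation barrier.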
This address-masking-based user-access fast validation is [7]merged now for Linux 6.12 on x86_64, alongside a lot of other interesting new code landing this week. The Linux 6.12 merge window ends in one week, culminating with the Linux 6.12-rc1 release.
[1] https://www.phoronix.com/news/Linux-6.12-Lands-sched-ext
[2] https://www.phoronix.com/news/Linux-6.12-Does-Real-Time
[3] https://www.phoronix.com/news/Torvalds-ARM64-Compress-Kernel
[4] https://www.phoronix.com/image-viewer.php?id=2024&image=fast_access_lrg
[5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=de5cb0dcb74c294ec527eddfe5094acfdb21ff21
[6] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2865baf54077aa98fcdb478cefe6a42c417b9374
[7] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=de5cb0dcb74c294ec527eddfe5094acfdb21ff21