From: Andy Lutomirski <[email protected]>
To: Ammar Faizi <[email protected]>,
Thomas Gleixner <[email protected]>,
Ingo Molnar <[email protected]>, Borislav Petkov <[email protected]>,
Dave Hansen <[email protected]>,
"H. Peter Anvin" <[email protected]>
Cc: "H.J. Lu" <[email protected]>, Michael Matz <[email protected]>,
GNU/Weeb Mailing List <[email protected]>, x86-ml <[email protected]>,
lkml <[email protected]>, Willy Tarreau <[email protected]>
Subject: Re: [PATCH v1 2/3] x86/entry/64: Add info about registers on exit
Date: Fri, 7 Jan 2022 16:03:46 -0800
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>

On 1/7/22 15:52, Ammar Faizi wrote:
> There was a controversial discussion about the wording in the System
> V ABI document regarding what registers the kernel is allowed to
> clobber when the userspace executes syscall.
>
> The resolution of the discussion was reviewing the clobber list in
> the glibc source. For a historical reason in the glibc source, the
> kernel must restore all registers before returning to the userspace
> (except for rax, rcx and r11).
>
> Link: https://lore.kernel.org/lkml/[email protected]/
> Link: https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/25
>
> This adds info about registers on exit.
>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: "H. Peter Anvin" <[email protected]>
> Cc: Michael Matz <[email protected]>
> Cc: "H.J. Lu" <[email protected]>
> Cc: Willy Tarreau <[email protected]>
> Cc: x86-ml <[email protected]>
> Cc: lkml <[email protected]>
> Cc: GNU/Weeb Mailing List <[email protected]>
> Signed-off-by: Ammar Faizi <[email protected]>
> ---
>
> Quoted the full comment in that file after patched, so it's easier to
> review:
> /*
> * 64-bit SYSCALL instruction entry. Up to 6 arguments in registers.
> *
> * This is the only entry point used for 64-bit system calls. The
> * hardware interface is reasonably well designed and the register to
> * argument mapping Linux uses fits well with the registers that are
> * available when SYSCALL is used.
> *
> * SYSCALL instructions can be found inlined in libc implementations as
> * well as some other programs and libraries. There are also a handful
> * of SYSCALL instructions in the vDSO used, for example, as a
> * clock_gettimeofday fallback.
> *
> * 64-bit SYSCALL saves rip to rcx, clears rflags.RF, then saves rflags to r11,
> * then loads new ss, cs, and rip from previously programmed MSRs.
> * rflags gets masked by a value from another MSR (so CLD and CLAC
> * are not needed). SYSCALL does not save anything on the stack
> * and does not change rsp.
> *
> * Registers on entry:
> * rax system call number
> * rcx return address
> * r11 saved rflags (note: r11 is callee-clobbered register in C ABI)
> * rdi arg0
> * rsi arg1
> * rdx arg2
> * r10 arg3 (needs to be moved to rcx to conform to C ABI)
> * r8 arg4
> * r9 arg5
> * (note: r12-r15, rbp, rbx are callee-preserved in C ABI)
> *
> * Only called from user space.
> *
> * Registers on exit:
> * rax syscall return value
> * rcx return address
> * r11 rflags
> *
> * For a historical reason in the glibc source, the kernel must restore all
> * registers except the rax (syscall return value) before returning to the
> * userspace.
> *
> * In other words, with respect to the userspace, when the kernel returns
> * to the userspace, only 3 registers are clobbered, they are rax, rcx,
> * and r11.
> *
> * When user can change pt_regs->foo always force IRET. That is because
> * it deals with uncanonical addresses better. SYSRET has trouble
> * with them due to bugs in both AMD and Intel CPUs.
> */
>
> ---
>
> arch/x86/entry/entry_64.S | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index e432dd075291..1111fff2e05f 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -79,6 +79,19 @@
> *
> * Only called from user space.
> *
> + * Registers on exit:
> + * rax syscall return value
> + * rcx return address
> + * r11 rflags
> + *
> + * For a historical reason in the glibc source, the kernel must restore all
> + * registers except the rax (syscall return value) before returning to the
> + * userspace.
> + *
> + * In other words, with respect to the userspace, when the kernel returns
> + * to the userspace, only 3 registers are clobbered, they are rax, rcx,
> + * and r11.
> + *

I would say this much more concisely:

The Linux kernel preserves all registers (even C callee-clobbered
registers) except for rax, rcx and r11 across system calls, and existing
user code relies on this behavior.

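As an illustration only (not part of the patch), here is a minimal sketch
of userspace code that depends on this guarantee. The helper names
raw_syscall3 and raw_getpid are hypothetical. The clobber list names only
rcx and r11 (rax is the output operand), so the compiler is free to keep
live values in every other register across the SYSCALL instruction. It
assumes x86-64 Linux and a GNU-compatible compiler.

```c
#include <sys/syscall.h>	/* SYS_getpid */

/*
 * Hypothetical helper, for illustration: issue a raw 3-argument
 * x86-64 system call with the SYSCALL instruction.
 */
static inline long raw_syscall3(long nr, long a0, long a1, long a2)
{
	long ret;

	__asm__ volatile (
		"syscall"
		: "=a" (ret)			/* rax: syscall return value */
		: "0" (nr),			/* rax: system call number */
		  "D" (a0),			/* rdi: arg0 */
		  "S" (a1),			/* rsi: arg1 */
		  "d" (a2)			/* rdx: arg2 */
		: "rcx", "r11", "memory"	/* overwritten by SYSCALL itself */
	);
	return ret;
}

/* Hypothetical usage: getpid(2) takes no arguments, so pass zeros. */
static inline long raw_getpid(void)
{
	return raw_syscall3(SYS_getpid, 0, 0, 0);
}
```

If the kernel clobbered, say, r8 or rdi on return, wrappers written this
way would silently miscompile, which is why the behavior has to be
treated as ABI.
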
> * When user can change pt_regs->foo always force IRET. That is because
> * it deals with uncanonical addresses better. SYSRET has trouble
> * with them due to bugs in both AMD and Intel CPUs.
>