12747 sigsetjmp should allow for 8 byte aligned buffer on amd64

Review Request #2565 — Created May 25, 2020 and updated


This solution fixes libc. No need recompilation of existing binaries.

Before patch:

$ /usr/bin/amd64/mdb
> Segmentation Fault

After patch:

$ LD_PRELOAD=/tmp/libc.so.1 /usr/bin/amd64/mdb 
> ::version
mdb 1.1 (DEBUG)
  • 0
  • 0
  • 6
  • 1
  • 7
Description From Last Updated
  1. This doesn't seem like it is sufficient by itself. While we perhaps consider that mdb will be rebuilt along with a new libc, we cannot force other existing binaries to be recompiled in order to work with a newer libc. It seems like we'll need to (at a minimum) perform some surgery in __csigsetjmp() to deal with an insufficiently aligned pointer?

    1. Yep, I have another fix with changing __csigsetjmp. I am going to update review shortly, when we discuss cons and pros of the current (v1) fix. I proposed this fix because it is small and simple.

      Question is how many applications are shipped without recompilation? If we look at OI or other distros they all do build for whole repo. Or you were asking about another applications?

    2. The setjmp() and sigsetjmp() routines that end up using __csigsetjmp() are a part of our (stable) ABI. We can't change the way those work (as visible from outside), or we will break existing binaries. However it is achieved, those symbols in libc need to continue to work the same way as visible from applications.

      With respect to how many applications are shipped without recompilation, I would venture that it's a lot. Some folks, myself included, have a variety of binaries compiled quite some time ago that we have no need or desire to recompile. On a somewhat more industrial scale, Joyent produces pkgsrc binary packages that work on essentially all illumos distributions. Indeed, even in OpenIndiana, there is a mixture of short and long term usage of binaries that are not or cannot be recompiled; e.g., the closed binaries leftover from OpenSolaris, and some other bits and pieces.

      I'd be inclined to put up the fix for the bug in __csigsetjmp(). Once that fix is in, it's not clear we would need to change the header here -- though depending on how we fix the alignment issue it may still be advantageous for new binaries.

    3. If we're doing an alignment thing here, that suggests that the compiler is emitting some SSE/AVX instructions assuming alignment of a structure that isn't actually aligned. Suggesting that new things potentially align themselves to improve their future performance is fine, but it feels like it's working around the problem, mainly that some function is assuming stricter it's receiving structures with stricter alignment than is actually true.
      One has to assume it'll always be easy for something in another language to cons this up as well, whether that's rust, golang, etc. It seems like fixing it one piece of software to handle this case, libc, is better than trying to fix anything that uses a sigjmp_buf.
    4. @Joshua That makes sense. I am updating review with fix in libc. It does not require re-compilation of exisiting binaries.

      @Robert That is not compiler's issue and problem is almost in the code: it doesn't follow strict aliasing rules
      (cast pointer aligned to long to type aligned to double long) and by C99 standard it has UB: Pointers
      7. A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the
      resulting pointer is not correctly aligned57) for the pointed-to type, the behavior is undefined.

    5. Vitaliy, my point wasn't that it was a compiler bug, but rather that we should fix it at the source of the issue, which was in libc. I suspect this isn't going to be the last of these types of bugs in the system that are exposed as one changes optimization or tuning levels.
    6. @Robert, so v2 fix (in libc) is in the right way as you proposed ?

    7. Any comments?

  2. usr/src/lib/libc/amd64/threads/asm_subr.s (Diff revision 2)
    Is the reason that we're not doing any adjustments here an assumption that we're coming in with suitable alignment on the stack already and therefore we have _LONG_ALIGNMENT and don't need to further adjust the stack? It's probably worth a comment stating that we're relying on the stack's natural alignment and calling conventions for this unlike the other places.
    1. Particular this "subq %rsp" doesn't need something else. %rsp must be at least 8-byte aligned here. If we look at compiler option it doesn't even support less that 3 for x86_64 arch:



      Attempt to keep the stack boundary aligned to a 2 raised to num byte boundary. If -mpreferred-stack-boundary is not specified, the default is 4 (16 bytes or 128 bits).
      Warning: When generating code for the x86-64 architecture with SSE extensions disabled, -mpreferred-stack-boundary=3 can be used to keep the stack boundary aligned to 8 byte boundary. Since x86-64 ABI require 16 byte stack alignment, this is ABI incompatible and intended to be used in controlled environment where stack space is important limitation. This option leads to wrong code when functions compiled with 16 byte stack alignment (such as functions from a standard library) are called with misaligned stack. In this case, SSE instructions may lead to misaligned memory access traps. In addition, variable arguments are handled incorrectly for 16 byte aligned objects (including x87 long double and __int128), leading to wrong results. You must build all modules with -mpreferred-stack-boundary=3, including any libraries. This includes the system libraries and startup modules.

      Do you still think that it should be mentioned in the code?

    2. Yes, I think it's worth calling out the differences in expectations. I understand the alignment of how the stack works. The compiler options don't matter here it's the ABI that we'll always have 8 byte alignment no matter what (it calls for %rsp + 8 % 16 == 0 at entry to the function so we can do the pushq %rbp and be 16-byte aligned.

      However, a reader of the code who's just coming into this for the first time, might not know what the alignment is and at what point it has that alignment, e.g. before or after the call instruction. The point is, this particular case is different from all the other callers and that's OK because of the circumstances of howe we're constructing and declaring it on the stack, but that's non-obvious.

    3. Current issue is not about aligment of the stack, but about aligment of argument value (pointer). What's the reason to specify aligment of the stack here if it doesn't matter ? Especially as you said is is always at least 8-byte aligned by ABI.

    4. So, if this had been written in C code, you would have decalared the sigjmp_buf on the stack. Used the SIGJMP2UCONTEXT macro to get a properly aligned ucontext_t and filled it out. However, in this code we're not doing all those steps. We're saying that we know that the P2ROUNDUP will not modify the value which is why we can start filling in the ucontext_t data right at the offset that we have. This property is a side effect of the stack alignment requirements and the pushs that we have done. The subl highlighted just assumes it'll be 16-byte aligned.
      I just thought it was worth calling this out in the comment because it's a non-obvious property. If someone inserted a pushq between the prologue and the subq, this would no longer function correctly.
    5. No, I would not put sigjum_buf on the stack, I would put gregset_t on the stack and that is different thing.

      I did't change gregset_t behaviour and others in the current fix. So please look at description why condition "SIZEOF_SIGJMP_BUF - _LONG_ALIGNMENT < SIZEOF_UCONTEXT_T" is required (above of that condition).

  3. usr/src/lib/libc/inc/sigjmp_struct.h (Diff revision 2)
    Why are both of these here? Shouldn't all we need is __amd64?
    1. I got this from
      uts/common/sys/isa_defs.h:#if defined(__x86_64) || defined(__amd64)

      And illumos-gate has some the same invocaions:

       usr/src/tools/aw/aw.c:                  newae(cpp, "-D__x86_64");

      But it seems you are right and with passed "-D__amd64" to compiler using defined(__amd64) is enough.

    2. sys/isa_defs.h is being conservative for different compilers which may define different things. The more important thing is that right after that in isa_defs.h we actually define __amd64 if it's not to normalize and standardize on one thing in our internal headers.
  4. usr/src/lib/libc/inc/sigjmp_struct.h (Diff revision 2)
    I would add a comment that explains why we need this. It'd be worth understanding if SPARCv9 requires the same thing or not. Is the reason that we're not doing something similar for 32-bit systems because we're assuming that the structure natural alignment of four bytes will be the same as the pointer?
    1. Yes, I will add.

      And as answer is i386 and sparc do not require additional thing because:

      1. SPARC implementation does not put entire ucontext_t into sigjmp_buf and uses sigjmp_struct_t as shorted variant of ucontext_t, i.e. it does not contain float-point registers and other non-related thing. So sigjmp_struct_t uses only pointer and long/int. Therefore alignment of sigjmp_struct_t on SPARCv8 is 4 and for SPARCv9 is 8 (long) and is the same as alignment of sigjmp_buf.

      2. i386 has alignment 4 for all trivial types and as result ucontext_t also has alignment 4 and is the same as alignment of sigjmp_buf.

  1. Apologies for the delay in getting back to you, Vitaliy. The general code in this looks fine. Though I think it's worth improving the comments (and maybe it's worth updating the bug id with a clearer synopsis of what went wrong).

    1. @Rober, Sorry, I am not able to change subject of the bug 12747. Could you change Subject to whetever you want and I will update review ?

      What comments did you mean ? I believe, comment https://www.illumos.org/issues/12747#note-1 describes the issue quite fine.

    2. I've just granted you the Developer role in the illumos-gate project, so you should be able to change things like that now! Sorry about that.

  2. usr/src/lib/libc/inc/sigjmp_struct.h (Diff revision 3)
    Can you please call it amd64 and not x86_64 to match the rest of the system please?
  1. Thanks for the follow ups here. I still believe we should add a comment where we've been discussing.
  1. I added a more detailed explanation for the symptoms to the bug. I've also reworked the comments below, and there's a whitespace nit. Once you fix that up, I think we're good to go! Thanks for the work on this so far.

    1. Thanks, I applied your comments.

  2. I think this comment could be clearer; e.g.,

     * Ensure that a "ucontext_t" will fit within a "sigjmp_buf", including the
     * extra 8 bytes we may need for correct alignment on AMD64.
  3. usr/src/lib/libc/inc/sigjmp_struct.h (Diff revision 7)

    I think this comment could be clearer; e.g.,

     * The "sigjmp_buf" type is an array of long and thus can have 8-byte alignment
     * on AMD64 systems.  The "ucontext_t" type has a stricter 16-byte alignment
     * requirement, so we must round the pointer up when casting.
     * This is not required on other architectures:
     *  - SPARC does not store the ucontext_t in the sigjmp_buf
     *  - i386 only requires 4-byte alignment for ucontext_t
  4. usr/src/lib/libc/inc/sigjmp_struct.h (Diff revision 7)

    Extra space before "sizeof" on this line.

    1. Interesting, neither cstyle nor "git nits/bpchk" haven't mentioned about this extra-space.

    2. I suspect it gets a bit confused in macros, at times.

Review request changed

Change Summary:


  • Correct commenst according to Joshua's notes.

  • Fix csctyle complaint about old code
    "C style: usr/src/lib/libc/amd64/threads/machdep.c: 70:
    continuation line should be indented by 4 spaces"

  • Fix extra space before sizeof




Revision 8 (+26 -7)

Show changes

  1. Thank you for the changes! This looks good to me.

  1. Ship It!