Skip to content

Stack misalignment on AVX-512 with asan + detect_stack_use_after_return=0 #91565

@glandium

Description

@glandium

Self-sufficient reproducer:

void qux(int& y);
struct alignas(64) Foo {
    int a[16] = {};
};
void hoge(Foo x);
void foo() {
    int x, y, z;
    qux(x);
    qux(y);
    qux(z);
    hoge(Foo());
}

Call the above foo.cc, and add the following as main.cc for an executable reproducer:

extern "C" const char *__asan_default_options() {
    return "detect_stack_use_after_return=0";
}

void qux(int& y) {}

struct alignas(64) Foo {
    int a[16] = {};
};

void hoge(Foo x) {}

void foo();

int main() {
    foo();
    return 0;
}

Compile with clang++ -o foo foo.cc main.cc -fsanitize=address -O2 -mavx512f -g. Running the program crashes:

Program received signal SIGSEGV, Segmentation fault.
0x0000559c458fd95c in foo () at foo.cc:14
14          hoge(Foo());
(gdb) disassemble $rip,$rip+1
Dump of assembler code from 0x559c458fd95c to 0x559c458fd95d:
=> 0x0000559c458fd95c <_Z3foov+236>:    vmovaps %zmm0,(%rsp)
End of assembler dump.
(gdb) p/x $rsp
$1 = 0x7ffc7c389ce0

vmovaps with zmm registers requires 64-bytes alignment but rsp does not have enough alignment.

The relevant parts of the assembly code path:

        and     rsp, -64    // function prologue aligns the stack to 64-bytes and
        sub     rsp, 192    // resizes it in a 64-bytes-aligned way
(...)
        mov     r14, rsp
        add     r14, -96    // this is an alloca in the function prologue, triggered
                            // when __asan_stack_malloc_1 returns 0, which it does when
                            // detect_stack_use_after_return=0 or when it's out of fake
                            // stack.
        and     r14, -32    // align the stack to... 32-bytes.
        mov     rsp, r14
(...)
        sub     rsp, 64     // move the stack for argument passing. The stack is still
                            // 32-bytes aligned at this point.
(...)
        vmovaps zmmword ptr [rsp], zmm0  // copy argument to the stack, aka CRASH.

This started with 51fbab1 (#77210) and -mllvm -asan-use-stack-safety=0 fixes it.

This all comes down to the asan pass setting the alloca to 32-bytes alignment:

  %MyAlloca = alloca i8, i64 96, align 32

With -mllvm -asan-use-stack-safety=0, the alloca becomes:

  %MyAlloca = alloca i8, i64 224, align 64

It's worth noting that it's only 32 because that's the default value of asan-realign-stack. With -mllvm -asan-realign-stack=1 it gets down to 16.

Essentially, this kind of worked by chance before because the stack poisoner was doing the argument passing alloca too, so MyAlloca was sized and aligned for it. But now that's not the case anymore with stack-safety, MyAlloca doesn't contain and is not aligned for that alloca.

The function call to hoge has a proper align 64 annotation on the byval. I guess one way to look at the problem is that the function call should take that into account and realign the stack pointer. In theory, I guess this could happen in other scenarios than ASan, but I couldn't get it to happen using an actual alloca() call in C/C++. The alloca for the call to hoge always ends up before the manual alloca() in the IR in my attempts.

Another way to look at it is that MyAlloca should have extra alignment based on all the allocas, including the "non interesting" ones, but that sounds hackish.

Cc: @MaskRay

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions