Description
@BethGriggs picked this up in nodejs/build#1945 (comment).
We've switched the hosts for our Alpine 3.9 containers from Ubuntu 16.04 to Ubuntu 18.04. That takes the kernel from ~4.4.0 to ~4.15.0, and these two tests are now reliably failing with segfaults: test-process-uid-gid.js and test-process-euid-egid.js.
With some effort I've managed to generate a core dump but I don't think it's very helpful:
#0 0x00007f9b32eb1568 in __clone () from /lib/ld-musl-x86_64.so.1
#1 0x00007f9b32eaed56 in ?? () from /lib/ld-musl-x86_64.so.1
#2 0x00007f9b304e4b54 in ?? ()
#3 0x0000000000000000 in ?? ()
The latest Alpine is 3.10 and it's working fine for 10.x (and above), so this is specific to the last-gen Alpine (3.9) with last-gen Node. How much does this matter, and do we have someone with the expertise and time to dive into it? I've taken an Alpine 3.9 container out of CI to experiment with this and am happy to give someone access and instructions if you want to toy with it.
If I manage to get a core dump from a debug build, I'll paste it in; maybe that will be more interesting.
For my own reference, to help clean up:
- https://ci.nodejs.org/computer/test-digitalocean-alpine39_container-x64-2/ is offline for this
- have forced host core dumps to /var/crash and mounted that in the container
- started container with --ulimit core=99999999999:99999999999
- installed gdb in the container to inspect
- saved the offending (release build) node as ~iojs/node-10-segfault; the uid-gid.js core dump is /var/crash/core.node.78 and the euid-egid.js core dump is /var/crash/core.node.86.
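
For anyone wanting to reproduce the setup in the list above, here is a rough dry-run sketch of the core dump plumbing. It assumes the container is started with Docker, that the host core pattern is something like core.%e.%p (inferred from the core.node.78 file name, not confirmed), and it uses alpine:3.9 as a placeholder image rather than the actual CI image. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: the image name and core pattern are assumptions, not taken from the CI config.

CORE_DIR=/var/crash
CORE_LIMIT=99999999999

# Host side (needs root): a core pattern that writes into $CORE_DIR.
# %e is the executable name and %p the PID, giving names like core.node.78.
host_core_pattern() {
  printf '%s/core.%%e.%%p\n' "$CORE_DIR"
}
# To apply it on the host:  host_core_pattern > /proc/sys/kernel/core_pattern

# Build the container start command: raise the core size limit and
# bind-mount the host crash directory at the same path inside the container.
container_run_cmd() {
  image=$1
  printf 'docker run --rm -it --ulimit core=%s:%s -v %s:%s %s sh\n' \
    "$CORE_LIMIT" "$CORE_LIMIT" "$CORE_DIR" "$CORE_DIR" "$image"
}

# Dry run: print the pattern and the command instead of executing them.
host_core_pattern
container_run_cmd "alpine:3.9"

# Inside the container, roughly:
#   apk add gdb
#   gdb /path/to/node /var/crash/core.node.78   # then `bt` at the gdb prompt
```

The script only prints what it would do, so it is safe to run anywhere; swap in the real image and the saved node binary path before using the printed commands.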