*multiThreadedpow2AlignedAlloc/disjoint_w_params* tests fail sporadically #1488

@ldorau

Description

*multiThreadedpow2AlignedAlloc/disjoint_w_params* tests:

  • mallocPoolTest/umfPoolTest.multiThreadedpow2AlignedAlloc/disjoint_w_params_2_umf_ba_global (test_memoryPool) and
  • disjointPoolTests/umfPoolTest.multiThreadedpow2AlignedAlloc/disjoint_w_params_0_umf_ba_global (test_disjoint_pool)

fail sporadically in the following way:
https://github.com/oneapi-src/unified-memory-framework/actions/runs/16843892970/job/47720079405

[ RUN      ] mallocPoolTest/umfPoolTest.multiThreadedpow2AlignedAlloc/disjoint_w_params_2_umf_ba_global
/home/runner/work/unified-memory-framework/unified-memory-framework/test/poolFixtures.hpp:221: Failure
Expected: (ptr) != (nullptr), actual: NULL vs (nullptr)

or:
https://github.com/ldorau/unified-memory-framework/actions/runs/16845396177/job/47724161570

[ RUN      ] disjointPoolTests/umfPoolTest.multiThreadedpow2AlignedAlloc/disjoint_w_params_0_umf_ba_global
/home/testuser/test/poolFixtures.hpp:221: Failure
Expected: (ptr) != (nullptr), actual: NULL vs (nullptr)

Environment Information

  • UMF version (hash commit or a tag): cc0565d
  • OS(es) version(s): Linux

Please provide a reproduction of the bug:

$ while ./test/test_memoryPool --gtest_filter="*multiThreadedpow2AlignedAlloc/disjoint_w_params*" > ./log.txt 2>&1 && ./test/test_disjoint_pool --gtest_filter="*multiThreadedpow2AlignedAlloc/disjoint_w_params*" > ./log.txt 2>&1 ; do date ; done

How often bug is revealed:

rare

Details

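The allocation returns NULL because pool_register_slab fails with "register failed because the address is already registered!". In both logs below, umfMemoryTrackerRemove for the region is logged before pool_unregister_slab for the slab that occupied it, and in between another thread obtains the same address and tries to register a new slab there: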

[PID:1835396 TID:1835401 ERROR UMF] pool_register_slab: register failed because the address is already registered!
[PID:1835396 TID:1835401 ERROR UMF] bucket_create_slab: slab_reg failed!

More logs:

$ grep -e "ERROR UMF" -e Failure -e 0x7fd07f81e008 ./log.txt
[PID:1835396 TID:1835401 DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835401 DEBUG UMF] pool_register_slab: slab: 0x7fd07fe493e8, start: 0x7fd07f81e008
[PID:1835396 TID:1835400 DEBUG UMF] umfMemoryTrackerRemove: memory region removed: tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835401 DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835401 DEBUG UMF] pool_register_slab: slab: 0x7fd07fe496e8, start: 0x7fd07f81e008
[PID:1835396 TID:1835400 DEBUG UMF] pool_unregister_slab: slab: 0x7fd07fe493e8, start: 0x7fd07f81e008
[PID:1835396 TID:1835401 ERROR UMF] pool_register_slab: register failed because the address is already registered! (slab: 0x7fd07fe496e8, start: 0x7fd07f81e008)
[PID:1835396 TID:1835401 ERROR UMF] bucket_create_slab: slab_reg failed!
[PID:1835396 TID:1835401 DEBUG UMF] umfMemoryTrackerRemove: memory region removed: tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835399 DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7fd07fe40068, level=0, pool=0x7fd07fe40268, ptr=0x7fd07f81e008, size=4096
[PID:1835396 TID:1835399 DEBUG UMF] pool_register_slab: slab: 0x7fd07fe495e8, start: 0x7fd07f81e008
/home/ldorau/work/unified-memory-framework/test/poolFixtures.hpp:221: Failure

and:

$ grep -e "ERROR UMF" -e Failure -e 0x7f15c647f008 ./log.txt
[PID:772    TID:776    DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772    TID:776    DEBUG UMF] pool_register_slab: slab: 0x7f15c64c1468, start: 0x7f15c647f008
[PID:772    TID:776    DEBUG UMF] umfMemoryTrackerRemove: memory region removed: tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772    TID:773    DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772    TID:773    DEBUG UMF] pool_register_slab: slab: 0x7f15c64c17e8, start: 0x7f15c647f008
[PID:772    TID:773    ERROR UMF] pool_register_slab: register failed because the address is already registered! (slab: 0x7f15c64c17e8, start: 0x7f15c647f008)
[PID:772    TID:773    ERROR UMF] bucket_create_slab: slab_reg failed!
[PID:772    TID:773    DEBUG UMF] umfMemoryTrackerRemove: memory region removed: tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772    TID:776    DEBUG UMF] pool_unregister_slab: slab: 0x7f15c64c1468, start: 0x7f15c647f008
[PID:772    TID:775    DEBUG UMF] umfMemoryTrackerAddAtLevel: memory region is added, tracker=0x7f15c64b8068, level=0, pool=0x7f15c64b8268, ptr=0x7f15c647f008, size=4096
[PID:772    TID:775    DEBUG UMF] pool_register_slab: slab: 0x7f15c64c1668, start: 0x7f15c647f008
/home/ldorau/work/unified-memory-framework/test/poolFixtures.hpp:221: Failure
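
One possible reading of the two logs above is an ordering problem: the slab's memory is returned (umfMemoryTrackerRemove) before the slab is unregistered (pool_unregister_slab), so another thread can get the same address back and fail to register its new slab. The sketch below is a simplified, single-threaded model of that interleaving, not UMF code. `Provider`, `SlabRegistry`, and `create_succeeds` are hypothetical names invented for illustration, and the thread steps are interleaved by hand so that the outcome is deterministic.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical stand-in for the memory provider: it owns a single
// address and hands it out again as soon as it has been released.
struct Provider {
    std::uintptr_t addr = 0x1000;
    bool in_use = false;

    std::uintptr_t acquire() {
        assert(!in_use);
        in_use = true;
        return addr;
    }
    void release() { in_use = false; }
};

// Hypothetical stand-in for the slab registration: an address can be
// registered only once until it is unregistered.
struct SlabRegistry {
    std::set<std::uintptr_t> registered;

    bool register_slab(std::uintptr_t start) {
        return registered.insert(start).second;  // false if already present
    }
    void unregister_slab(std::uintptr_t start) { registered.erase(start); }
};

// Thread A destroys its slab while thread B creates a new one.
// Returns whether B's registration succeeds.
bool create_succeeds(bool unregister_before_release) {
    Provider provider;
    SlabRegistry registry;

    // Thread A: slab created and registered earlier.
    std::uintptr_t a = provider.acquire();
    registry.register_slab(a);

    // Thread A: starts destroying the slab.
    if (unregister_before_release) {
        registry.unregister_slab(a);
    }
    provider.release();  // from here on the address can be reused

    // Thread B: gets the same address and registers a new slab.
    std::uintptr_t b = provider.acquire();
    bool ok = registry.register_slab(b);

    // Thread A: the late unregister, as in the order seen in the logs.
    if (!unregister_before_release) {
        registry.unregister_slab(a);
    }
    return ok;
}
```

In this model `create_succeeds(false)` returns false, which corresponds to the "already registered" error, while `create_succeeds(true)` returns true. Whether reordering the two steps is the right fix for UMF itself is an assumption that would need to be checked against the disjoint pool sources.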
