Optimization Advice for MPI+OpenMP Hybrid Parallelism in AMReX #4497
paradoxknight1 asked this question in Q&A (unanswered) · 1 comment, 7 replies
-
Is your simulation 3D? Also, how big are your boxes? In general, using tiling + OpenMP benefits from large boxes (increase …
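For reference, "tiling + OpenMP" here refers to the tiled `MFIter` loop pattern described in the AMReX documentation. Below is a minimal sketch of that pattern; the `advance` function name, the `mf` argument, and the kernel body are placeholders for illustration only, not code from this discussion.

```cpp
#include <AMReX.H>
#include <AMReX_MultiFab.H>

// Minimal sketch of the tiled MFIter pattern with OpenMP enabled.
// `advance`, `mf`, and the kernel body are placeholders for illustration.
void advance (amrex::MultiFab& mf)
{
#ifdef AMREX_USE_OMP
#pragma omp parallel if (amrex::Gpu::notInLaunchRegion())
#endif
    for (amrex::MFIter mfi(mf, amrex::TilingIfNotGPU()); mfi.isValid(); ++mfi)
    {
        // Each OpenMP thread works on tiles carved out of the boxes owned by
        // this MPI rank; small boxes yield few tiles and leave threads idle.
        const amrex::Box& bx = mfi.tilebox();
        amrex::Array4<amrex::Real> const& a = mf.array(mfi);
        amrex::ParallelFor(bx, [=] AMREX_GPU_DEVICE (int i, int j, int k)
        {
            a(i,j,k) += amrex::Real(1.0); // placeholder update
        });
    }
}
```

With 2 ranks instead of 32, each rank owns more boxes, but if the boxes themselves are small there is little tile work per thread; `amr.max_grid_size` and, if needed, `fabarray.mfiter_tile_size` are the usual knobs to experiment with.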
-
I'm currently evaluating the performance trade-offs between hybrid (MPI+OpenMP) and pure MPI parallelism in AMReX, and the results are quite perplexing.
As I begin working with the AmrLevel interface to develop my solver, initial benchmarks on a dual-socket Xeon Gold 6130 system (32 cores total) reveal that pure MPI consistently outperforms hybrid parallelism. In our Sod shock tube test cases, pure MPI demonstrates a 30% speed advantage over the hybrid approach.
The hybrid configuration was set up as follows:
```bash
export OMP_NUM_THREADS=16
export OMP_PLACES=cores
export OMP_PROC_BIND=close
mpirun -np 2 --bind-to socket --map-by socket ./main.ex inputs
```
To rule out implementation errors, I benchmarked AMReX's official Advection example and observed even more pronounced differences:
- Pure MPI (32 processes): 5.434 s per 2000 steps
- Hybrid (2 MPI ranks × 16 threads): 10.01 s per 2000 steps
This has been quite a frustrating issue for me. I would really appreciate any advice.
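For completeness, here is a small diagnostic I am considering to verify that each rank really ends up with 16 threads pinned to its own socket. This is a sketch only: `report_affinity` is not part of my solver, and `sched_getcpu` is Linux-specific.

```cpp
#include <AMReX_Print.H>
#include <AMReX_ParallelDescriptor.H>
#ifdef AMREX_USE_OMP
#include <omp.h>
#endif
#include <sched.h> // sched_getcpu, Linux-specific

// Print, for every MPI rank, how many OpenMP threads it actually runs and
// which core each thread lands on. Call once after amrex::Initialize().
void report_affinity ()
{
    const int rank = amrex::ParallelDescriptor::MyProc();
#ifdef AMREX_USE_OMP
#pragma omp parallel
    {
#pragma omp critical
        amrex::AllPrint() << "rank " << rank
                          << " thread " << omp_get_thread_num()
                          << "/" << omp_get_num_threads()
                          << " on cpu " << sched_getcpu() << "\n";
    }
#else
    amrex::AllPrint() << "rank " << rank << " was built without OpenMP\n";
#endif
}
```

If the reported thread counts or core placements do not match what the OMP_* settings above are supposed to enforce (for example, several threads of one rank landing on the same core), that alone could explain part of the slowdown.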