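`_mm256_bslli_epi128` compiles to the same instruction (`vpslldq`) as `_mm256_slli_si256`, but its name is more descriptive: the `b` and `epi128` spell out that the shift is by bytes and is applied within each 128-bit lane, not across the whole 256-bit register. It could be used here: https://github.com/algorithmica-org/algorithmica/blob/ed1945c3d2de8548b1c744d3161eb668703db808/content/english/hpc/algorithms/prefix.md?plain=1#L103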
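
To illustrate the equivalence, here is a minimal sketch (not taken from the linked article) that applies both intrinsics to the same input and compares the results. The helper name `shifts_agree` is hypothetical, and the `target("avx2")` attribute assumes GCC or Clang on an AVX2-capable CPU.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: shifts each 128-bit lane of `in` left by 4 bytes
// (one 32-bit element) with both spellings of the intrinsic.
// Writes the _mm256_slli_si256 result to `out` and returns 1 if the
// _mm256_bslli_epi128 result is bit-identical, 0 otherwise.
__attribute__((target("avx2")))
static int shifts_agree(const int32_t in[8], int32_t out[8]) {
    assert(in != NULL && out != NULL);

    __m256i x = _mm256_loadu_si256((const __m256i *)in);

    __m256i a = _mm256_slli_si256(x, 4);   // older name
    __m256i b = _mm256_bslli_epi128(x, 4); // more descriptive name

    int32_t tmp[8];
    _mm256_storeu_si256((__m256i *)out, a);
    _mm256_storeu_si256((__m256i *)tmp, b);

    return memcmp(out, tmp, sizeof tmp) == 0;
}
```

For the input `{1, 2, 3, 4, 5, 6, 7, 8}`, both versions should give `{0, 1, 2, 3, 0, 5, 6, 7}`: the zero in position 4 shows that nothing is carried from the low lane into the high one, which is exactly what the `epi128` suffix hints at.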