C++ Compilers Generating Rotate Instructions

I just did a quick test in Compiler Explorer, and it was nice to see both GCC and Clang generating a single rotate instruction from each of these equivalent shift-and-OR sequences.

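#include <cstdint>

// Both helpers assume 0 < _sa < 32: with _sa == 0 the second shift is by 32,
// which is undefined behaviour for a 32-bit operand.
inline uint32_t uint32_rol(uint32_t _a, int _sa)
{
    return (_a << _sa) | (_a >> (32 - _sa));
}

inline uint32_t uint32_ror(uint32_t _a, int _sa)
{
    return (_a >> _sa) | (_a << (32 - _sa));
}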
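
One caveat with the shift-and-OR form: a shift amount of 0 makes the second shift a shift by 32, which is undefined behaviour in C++. A minimal sketch of a variant that masks the shift amounts so every input is well defined is below. The names `rotl32` and `rotr32` are my own and are not from the original header.

```cpp
#include <cstdint>

// Rotate left by `shift` bits. Masking keeps both shift amounts in 0..31,
// so shift == 0 and shift >= 32 are well defined (hypothetical helper name).
inline std::uint32_t rotl32(std::uint32_t value, unsigned shift)
{
    shift &= 31u;
    return (value << shift) | (value >> ((32u - shift) & 31u));
}

// Rotate right by `shift` bits, using the same masking.
inline std::uint32_t rotr32(std::uint32_t value, unsigned shift)
{
    shift &= 31u;
    return (value >> shift) | (value << ((32u - shift) & 31u));
}
```

I would expect GCC and Clang to recognise this masked form as a rotate too, but I have not checked it here, so it is worth confirming in Compiler Explorer for your target. If C++20 is available, `std::rotl` and `std::rotr` from `<bit>` do the same job without hand-written shifts.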

E.g., for Clang on x64:

uint32_do_ror(unsigned int, int):
    mov     ecx, esi
    ror     edi, cl
    mov     eax, edi
    ret

uint32_do_rol(unsigned int, int):
    mov     ecx, esi
    rol     edi, cl
    mov     eax, edi
    ret

When inlined into a loop, the rotate was not generated by Clang, although it was by GCC. I presume that’s because some other optimisation pass had moved code around enough to stop the rotate pattern being matched. The full example is here on Godbolt. Code courtesy of Mike Acton’s uint32_t header.
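
For readers without the Godbolt link to hand, here is a rough sketch of the two situations being compared: a non-inline wrapper (which would account for the `uint32_do_rol` name in the listing above) and a loop with the rotate inlined into it. The wrapper body and the `rotate_all` function are assumptions for illustration, not copied from the actual example.

```cpp
#include <cstddef>
#include <cstdint>

// Shift-and-OR rotate as in the post; assumes 0 < _sa < 32.
inline std::uint32_t uint32_rol(std::uint32_t _a, int _sa)
{
    return (_a << _sa) | (_a >> (32 - _sa));
}

// Non-inline wrapper so the compiler emits a standalone symbol to inspect
// (assumed shape of the wrapper in the Godbolt example).
std::uint32_t uint32_do_rol(std::uint32_t a, int sa)
{
    return uint32_rol(a, sa);
}

// Hypothetical loop: the rotate is inlined into the loop body, which is the
// case where the generated code may differ between compilers.
void rotate_all(std::uint32_t* data, std::size_t count, int sa)
{
    for (std::size_t i = 0; i < count; ++i)
        data[i] = uint32_rol(data[i], sa);
}
```

Comparing the assembly for `uint32_do_rol` and `rotate_all` side by side at the same optimisation level is a quick way to see whether the rotate pattern survives inlining.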
