20 Feb 2018
C++ Compilers Generating Rotate Instructions
I just did a quick test in Compiler Explorer, and it was nice to see both GCC and Clang generating a single rotate instruction from each of these equivalent shift-and-OR sequences.
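#include <stdint.h>

// Rotate left. _sa must be in 1..31: for _sa == 0 the shift by 32 is undefined behaviour.
inline uint32_t uint32_rol(uint32_t _a, int _sa)
{
    return (_a << _sa) | (_a >> (32 - _sa));
}

// Rotate right. Same restriction on _sa.
inline uint32_t uint32_ror(uint32_t _a, int _sa)
{
    return (_a >> _sa) | (_a << (32 - _sa));
}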
E.g., for Clang on x64:
uint32_do_ror(unsigned int, int):
    mov ecx, esi
    ror edi, cl
    mov eax, edi
    ret
uint32_do_rol(unsigned int, int):
    mov ecx, esi
    rol edi, cl
    mov eax, edi
    ret
When inlined into a loop, Clang did not generate the rotate, although GCC did. I presume that’s because some other optimisation pass had moved code around enough to stop the rotate pattern being matched. The full example is here on Godbolt. Code courtesy of Mike Acton’s uint32_t header.
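One caveat with the shift-and-OR form above: a shift amount of 0 makes the second shift a shift by 32, which is undefined behaviour for a 32-bit value. Here is a minimal sketch of a variant that masks the count so both shifts stay in the range 0..31. The names rotl32_safe, rotr32_safe and check_rotate_sketch are my own, not from the original header, and I haven't confirmed here that every compiler version turns this form into a single rotate, so it's worth checking in Compiler Explorer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not part of the original uint32_t header.
// Masking the count with 31 keeps both shifts in 0..31, so a count of
// 0 (or any multiple of 32) never produces an undefined shift by 32.
constexpr std::uint32_t rotl32_safe(std::uint32_t a, unsigned sa)
{
    return (a << (sa & 31u)) | (a >> ((0u - sa) & 31u));
}

constexpr std::uint32_t rotr32_safe(std::uint32_t a, unsigned sa)
{
    return (a >> (sa & 31u)) | (a << ((0u - sa) & 31u));
}

// Small self-check: the top bit wraps round to the bottom and back again,
// and a count of 0 leaves the value unchanged.
inline void check_rotate_sketch()
{
    assert(rotl32_safe(0x80000001u, 1) == 0x00000003u);
    assert(rotr32_safe(0x00000003u, 1) == 0x80000001u);
    assert(rotl32_safe(0x12345678u, 0) == 0x12345678u);
}
```

If C++20 is available, std::rotl and std::rotr from the <bit> header express the same operation directly and are defined for any shift count.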