By using a recognized idiom, gcc can compile the unaligned little-endian
load to a single instruction (actually less than an instruction, as it
combines the load with the arithmetic operation that follows).
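
The message does not say which idiom the patch uses. As a minimal sketch,
assuming the common byte-wise shift-and-or form (a fixed-size memcpy is
another form compilers recognize), the load could look like this. The
name read_le64 is hypothetical and not taken from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, for illustration only.
// Reads 8 bytes starting at p as a little-endian 64-bit value.
// p may be unaligned: each byte is read individually, so no misaligned
// access is ever written in the source. Optimizing compilers such as gcc
// can typically recognize this pattern and emit one plain 64-bit load
// on little-endian targets.
inline uint64_t read_le64(const unsigned char* p) {
    assert(p != nullptr);
    return  static_cast<uint64_t>(p[0])
         | (static_cast<uint64_t>(p[1]) << 8)
         | (static_cast<uint64_t>(p[2]) << 16)
         | (static_cast<uint64_t>(p[3]) << 24)
         | (static_cast<uint64_t>(p[4]) << 32)
         | (static_cast<uint64_t>(p[5]) << 40)
         | (static_cast<uint64_t>(p[6]) << 48)
         | (static_cast<uint64_t>(p[7]) << 56);
}
```

Because the result is defined purely in terms of byte values, it is the
same on any host byte order; only the generated code differs.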
Signed integer overflow is undefined behavior in C++, so use unsigned types instead.
Luckily all right shifts were unsigned anyway.
Some sign extension was happening (when handling the remainder after
processing the 8-byte chunks), and it should still be there.
Caught by debug build.
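
The points above can be sketched as follows. This is an illustration
under stated assumptions, not the code from the patch: all three helper
names are hypothetical, and widen_signed_byte assumes the remainder bytes
were widened Java-style, where a byte is signed and a cast to long
sign-extends it.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only.

// Unsigned multiplication wraps modulo 2^64, which is well defined in C++.
// The same operation on int64_t would be undefined behavior on overflow.
inline uint64_t mul_wrap(uint64_t a, uint64_t b) {
    return a * b;
}

// Rotate left using only unsigned shifts. The masks keep both shift
// counts in [0, 63], so n == 0 does not produce a shift by 64.
inline uint64_t rotl64(uint64_t v, unsigned n) {
    return (v << (n & 63)) | (v >> ((64 - n) & 63));
}

// Widen one remainder byte with sign extension, mimicking Java's
// (long) cast of a signed byte, then continue in unsigned arithmetic.
// Note: converting an out-of-range value to int8_t is
// implementation-defined before C++20; gcc wraps modulo 2^8.
inline uint64_t widen_signed_byte(uint8_t b) {
    return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int8_t>(b)));
}
```

With these, widen_signed_byte(0x80) yields 0xFFFFFFFFFFFFFF80 rather than
0x80, which is the difference that matters for matching hashes computed
by the Java code.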
In the original Java code, MurmurHash was in the "utils" package, not
"util", so move it to a new "utils" directory (and namespace), not
"util".
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>