On Tue, Jan 13, 2026 at 7:34 PM Aleksander Alekseev
<aleksander@tigerdata.com> wrote:
> Now when all Datums are 64-bit values we can simplify the code by
> using murmurhash64(). This refactoring was previously suggested by
> John Naylor [1].
There's more we can do here. Above the stanzas changed in the patch
there is this, at least for varlena/bytea:
    hash = DatumGetUInt32(hash_any((unsigned char *) authoritative_data,
                                   Min(len, PG_CACHE_LINE_SIZE)));
This makes no sense to me: hash_any() calls hash_bytes() and turns the
result into a Datum, and then we just get it right back out of the
Datum again. addHyperLogLog says "typically generated using
hash_any()", but that function takes a uint32, not a Datum, so that
comment should probably be changed. hash_bytes() is global, so we can
use it directly.
    if (len > PG_CACHE_LINE_SIZE)
        hash ^= DatumGetUInt32(hash_uint32((uint32) len));
Similar here, but instead of hash_bytes_uint32(), we may as well use
murmurhash32().
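
To make the shape of that concrete, here is a rough standalone sketch, not a
patch. Everything with a _stub or _sketch suffix is a hypothetical stand-in:
hash_bytes_stub() is FNV-1a rather than the real hash_bytes(),
CACHE_LINE_SIZE is assumed to be 128 in place of PG_CACHE_LINE_SIZE, and
murmurhash32_sketch() is the 32-bit murmur3 finalizer, which is what I
believe murmurhash32() in common/hashfn.h computes.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed value, standing in for PG_CACHE_LINE_SIZE. */
#define CACHE_LINE_SIZE 128

/*
 * Hypothetical stand-in for hash_bytes(): FNV-1a over the buffer.
 * This is NOT the algorithm PostgreSQL uses; it only has the same shape
 * (bytes in, uint32 out, no Datum involved).
 */
static uint32_t
hash_bytes_stub(const unsigned char *k, size_t keylen)
{
    uint32_t    h = 2166136261u;

    for (size_t i = 0; i < keylen; i++)
    {
        h ^= k[i];
        h *= 16777619u;
    }
    return h;
}

/* 32-bit murmur3 finalizer, sketched from memory of murmurhash32(). */
static uint32_t
murmurhash32_sketch(uint32_t data)
{
    uint32_t    h = data;

    h ^= h >> 16;
    h *= 0x85ebca6b;
    h ^= h >> 13;
    h *= 0xc2b2ae35;
    h ^= h >> 16;
    return h;
}

/*
 * Hash at most one cache line of the data directly to a uint32, then mix
 * in the full length when the data was truncated. No Datum round trip.
 */
static uint32_t
abbrev_hash_sketch(const unsigned char *data, size_t len)
{
    size_t      n = (len < CACHE_LINE_SIZE) ? len : CACHE_LINE_SIZE;
    uint32_t    hash = hash_bytes_stub(data, n);

    if (len > CACHE_LINE_SIZE)
        hash ^= murmurhash32_sketch((uint32_t) len);

    return hash;
}
```

The point is only that both steps stay in uint32 throughout, which is what
addHyperLogLog wants anyway.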
--
John Naylor
Amazon Web Services