Re: [PATCH] Refactor *_abbrev_convert() functions - Mailing list pgsql-hackers

From John Naylor
Subject Re: [PATCH] Refactor *_abbrev_convert() functions
Date
Msg-id CANWCAZYHK4F1MPBytEKSS8qhi9kiUXhJTZq-rWcyzk6BCOyfYg@mail.gmail.com
In response to [PATCH] Refactor *_abbrev_convert() functions  (Aleksander Alekseev <aleksander@tigerdata.com>)
Responses Re: [PATCH] Refactor *_abbrev_convert() functions
List pgsql-hackers
On Tue, Jan 13, 2026 at 7:34 PM Aleksander Alekseev
<aleksander@tigerdata.com> wrote:
> Now when all Datums are 64-bit values we can simplify the code by
> using murmurhash64(). This refactoring was previously suggested by
> John Naylor [1].

There's more we can do here. Above the stanzas changed in the patch
there is this, at least for varlena/bytea:

hash = DatumGetUInt32(hash_any((unsigned char *) authoritative_data,
                      Min(len, PG_CACHE_LINE_SIZE)));

This makes no sense to me: hash_any() calls hash_bytes() and turns the
result into a Datum, and then we just get it right back out of the
Datum again. addHyperLogLog says "typically generated using
hash_any()", but that function takes a uint32, not a Datum, so that
comment should probably be changed. hash_bytes() is global, so we can
use it directly.
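
To illustrate the round trip, here is a minimal standalone sketch. It
does not use the PostgreSQL headers: DemoDatum, demo_hash_bytes() and
demo_hash_any() are hypothetical stand-ins for Datum, hash_bytes() and
hash_any(), and the byte hash is FNV-1a rather than the algorithm
PostgreSQL actually uses. It only shows that wrapping the result in a
Datum and unwrapping it again yields the same uint32 as calling the
byte-hashing function directly.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Datum (assumed to be a 64-bit value, as in the thread). */
typedef uint64_t DemoDatum;

/* Stand-ins for UInt32GetDatum() / DatumGetUInt32(): plain casts. */
static inline DemoDatum
demo_uint32_get_datum(uint32_t x)
{
	return (DemoDatum) x;
}

static inline uint32_t
demo_datum_get_uint32(DemoDatum d)
{
	return (uint32_t) d;
}

/* Stand-in for hash_bytes(): FNV-1a, NOT PostgreSQL's hash function. */
static uint32_t
demo_hash_bytes(const unsigned char *k, size_t keylen)
{
	uint32_t	h = 2166136261u;

	for (size_t i = 0; i < keylen; i++)
	{
		h ^= k[i];
		h *= 16777619u;
	}
	return h;
}

/* Stand-in for hash_any(): hashes the bytes, then wraps the result. */
static DemoDatum
demo_hash_any(const unsigned char *k, size_t keylen)
{
	return demo_uint32_get_datum(demo_hash_bytes(k, keylen));
}

/* Existing pattern: wrap into a Datum, then immediately unwrap. */
static uint32_t
hash_via_datum(const unsigned char *k, size_t keylen)
{
	return demo_datum_get_uint32(demo_hash_any(k, keylen));
}

/* Suggested pattern: call the byte-hashing function directly. */
static uint32_t
hash_direct(const unsigned char *k, size_t keylen)
{
	return demo_hash_bytes(k, keylen);
}
```

Both helpers return the same value for any input, so the Datum
conversion adds nothing.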

if (len > PG_CACHE_LINE_SIZE)
  hash ^= DatumGetUInt32(hash_uint32((uint32) len));

Similar here, but instead of hash_bytes_uint32(), we may as well use
murmurhash32().
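
As a rough standalone sketch of that second stanza: demo_murmurhash32()
below is the usual 32-bit murmur finalizer, which I believe matches
murmurhash32() in common/hashfn.h, and DEMO_CACHE_LINE_SIZE is a
stand-in for PG_CACHE_LINE_SIZE (assumed to be 128 here).
demo_mix_length() is a hypothetical helper name, not something in the
tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PG_CACHE_LINE_SIZE; the value 128 is an assumption. */
#define DEMO_CACHE_LINE_SIZE 128

/* 32-bit murmur finalizer, written out so the sketch is self-contained. */
static inline uint32_t
demo_murmurhash32(uint32_t data)
{
	uint32_t	h = data;

	h ^= h >> 16;
	h *= 0x85ebca6b;
	h ^= h >> 13;
	h *= 0xc2b2ae35;
	h ^= h >> 16;
	return h;
}

/*
 * Hypothetical helper: only a prefix of the string was hashed, so fold
 * the full length into the hash when the string exceeds that prefix.
 */
static uint32_t
demo_mix_length(uint32_t hash, size_t len)
{
	if (len > DEMO_CACHE_LINE_SIZE)
		hash ^= demo_murmurhash32((uint32_t) len);
	return hash;
}
```

Strings no longer than the prefix keep their hash unchanged; longer
ones get the mixed length XORed in, with no Datum involved at any
point.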

--
John Naylor
Amazon Web Services


