Re: PG14: Avoid checking output-buffer-length for every encoded byte during pg_hex_encode - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: PG14: Avoid checking output-buffer-length for every encoded byte during pg_hex_encode
Date
Msg-id YRsYT3Fb82IYr5+F@paquier.xyz
Whole thread Raw
In response to PG14: Avoid checking output-buffer-length for every encoded byte during pg_hex_encode  (Hans Buschmann <buschmann@nidsa.net>)
Responses Re: PG14: Avoid checking output-buffer-length for every encoded byte during pg_hex_encode  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, Aug 15, 2021 at 02:58:59PM +0000, Hans Buschmann wrote:
> If it seems useful somebody could enter it as an open item /
> resolved item for pg14 after beta 3.

Hmm.  Using SQLs like the following to stress pg_hex_encode(), I can
see a measurable difference easily, so no need of pg_dump or an
instance with many LOs:
set work_mem = '1GB';
with hex_values as materialized
  (select repeat(i::text, N)::bytea as i
     from generate_series(1, M) t(i))
  SELECT count(encode(i, 'hex')) from hex_values;

With N = 1000, M = 100000, perf shows a 34.88% use of pg_hex_encode().
On REL_13_STABLE, I get down ~27% for hex_encode(), the same backend
routine.

The runtime numbers are more interesting, HEAD gets up to 1600ms.
With the latest version of the patch, we get down to 1550ms.  Now, a
simple revert of ccf4e277 and aef8948 brings me down to the same
runtimes as ~13, closer to 1450ms.  There seem to be some noise in the
tests, but the difference is clear.

Considering that commit aef8948 also involved a cleanup of
src/common/hex_decode.c, I think that it would be saner to also revert
c3826f83, so as we have a clean state of the code to work on should
the work on the data encryption patch set move on in the future.  It
is not decided that this would actually use any of the hex
decode/encode code, either.

Honestly, I don't like much messing with this stuff after beta3, and
one of the motives of moving the hex handling code to src/common/ was
for the data encryption code whose state is not known as of late.
This does not prevent working on such refactorings in the future, of
course, keeping the performance impact in mind.

In short, I am planning to address this regression with the attached,
for a combined revert of 0d70d30, 5c33ba5 and 92436a7.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Added schema level support for publication.
Next
From: Wenjing
Date:
Subject: Re: Is it worth pushing conditions to sublink/subplan?