Re: More speedups for tuple deformation - Mailing list pgsql-hackers

From David Rowley
Subject Re: More speedups for tuple deformation
Date
Msg-id CAApHDvrF6DG7=xD8JGo2HoQKN0LRFNF0ysVt6cKSNPiqbdQOSA@mail.gmail.com
In response to More speedups for tuple deformation  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
On Sun, 28 Dec 2025 at 22:04, David Rowley <dgrowleyml@gmail.com> wrote:
> Things still to do:
>
> * More benchmarking is needed. I've not yet completed the benchmarks
> on my Zen4 machine.  No Intel hardware has been tested at all. I don't
> really have any good Intel hardware to test with. Maybe someone else
> would like to help? Script is attached.

Please find attached an updated set of patches. A rebase was needed,
plus 0003 had a problem with an Assert not handling the bitmap being a
NULL pointer.

I've done some more performance tests after upgrading my Zen2 machine
to use newer versions of gcc and clang. I've also tested on an Intel
machine now. All the results are attached in spreadsheet form in the
bzip file. There's also a pg_dump of the results and
analysis_schema.sql, which has an SQL function to extract the data in
a form that matches the spreadsheet's layout.

I'd say things are looking generally good for 0001 without the
OPTIMIZE_BYVAL stuff, but the results I got from clang on the
AMD7945hx don't look good at all. I'll run the tests on that again
tonight. The machine is a laptop and I ran the benchmarks on master
first to establish the baseline, so I want to ensure there's no
thermal throttling going on. Aside from clang on the 7945hx, there are
a few cases where there's a slight regression in the 0 extra column
tests when a NULL is present. I wonder how much we should care about
this, as 1) the regression is small; and 2) IMO, there's less chance
of there being a NULL in a table with very few columns (in this case,
the table has 3 columns).

The "AMD3990x clang 20.1.8" results in the spreadsheet also look
strange for 0001. It looks good up to 20 columns, then the performance
trend breaks for 30 and 40 columns. I don't have an explanation for
this yet.

I've also attached an updated script to run the tests and output the
results in csv format so that it can be easily imported into Postgres
for analysis or processing.

> * I've not looked at the JIT deforming code. At the moment the code
> won't even compile with LLVM enabled because I've removed the
> TTS_FLAG_SLOW flag. It's possible I'll have to adjust the JIT
> deforming code or consider keeping TTS_FLAG_SLOW.

This part turned out to be easy. The JIT deformer does not pay
attention to the TTS_FLAG_SLOW flag; it just unconditionally turns it
on to force the non-JIT deformer into using slow mode. I've deleted
the code that was setting it since slow mode no longer exists in the
patched code.
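
To illustrate the interaction described above, here is a toy model in
C. It is only a sketch and not the PostgreSQL code: ToySlot,
TOY_FLAG_SLOW, toy_jit_deform, toy_interp_uses_slow_path and toy_demo
are hypothetical names made up for this example, and the flag's bit
value is chosen arbitrarily. It shows a JIT-style deformer that never
reads the flag but always sets it, which is what pushes a later
non-JIT deform of the same slot onto the slow path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a "slow mode" slot flag (value is arbitrary). */
#define TOY_FLAG_SLOW (1 << 3)

/* Hypothetical, heavily simplified tuple slot. */
typedef struct ToySlot
{
    uint16_t flags;   /* bitmask of slot flags */
    int      nvalid;  /* number of attributes deformed so far */
} ToySlot;

/*
 * Toy JIT-style deformer: it does not look at TOY_FLAG_SLOW to decide
 * how to deform, but it unconditionally sets the flag when it is done.
 */
static void
toy_jit_deform(ToySlot *slot, int natts)
{
    slot->nvalid = natts;
    slot->flags |= TOY_FLAG_SLOW;
}

/*
 * Toy non-JIT deformer decision: use the slow path whenever the flag
 * is set, otherwise the fast path would be allowed.
 */
static bool
toy_interp_uses_slow_path(const ToySlot *slot)
{
    return (slot->flags & TOY_FLAG_SLOW) != 0;
}

/* Small self-check of the behaviour described above; returns 0 on success. */
static int
toy_demo(void)
{
    ToySlot slot = {0, 0};

    /* A fresh slot has no slow flag, so the fast path is allowed. */
    assert(!toy_interp_uses_slow_path(&slot));

    /* After the JIT-style deformer runs, the slow path is forced. */
    toy_jit_deform(&slot, 3);
    assert(toy_interp_uses_slow_path(&slot));
    assert(slot.nvalid == 3);

    return 0;
}
```

In this model, once there is no slow path left for the flag to select,
the `slot->flags |= TOY_FLAG_SLOW` line has no remaining purpose and
can simply be dropped, which mirrors the deletion described above.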

David
