Re: autovectorize page checksum code included elsewhere - Mailing list pgsql-hackers

From	Nathan Bossart
Subject	Re: autovectorize page checksum code included elsewhere
Date	November 11, 2023 21:49:43
Msg-id	20231111214943.GA1563304@nathanxps13
In response to	Re: autovectorize page checksum code included elsewhere (John Naylor <johncnaylorls@gmail.com>)
List	pgsql-hackers

On Sat, Nov 11, 2023 at 07:38:59PM +0700, John Naylor wrote:
> On Tue, Nov 7, 2023 at 9:47 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> Separately, I'm wondering whether we should consider using CFLAGS_VECTORIZE
>> on the whole tree.  Commit fdea253 seems to be responsible for introducing
>> this targeted autovectorization strategy, and AFAICT this was just done to
>> minimize the impact elsewhere while optimizing page checksums.  Are there
>> fundamental problems with adding CFLAGS_VECTORIZE everywhere?  Or is it
>> just waiting on someone to do the analysis/benchmarking?
> 
> It's already the default for gcc 12 with -O2 (looking further in the
> docs, it uses the "very-cheap" vectorization cost model), so it may be
> worth investigating what the effect of that was. I can't quickly find
> the equivalent info for clang.

My x86 machine is using gcc 9.4.0, which isn't even aware of "very-cheap".
I don't see any difference with any of the cost models, though.  It isn't
until I add -O3 that I see things like inlining pg_checksum_block into
pg_checksum_page.  -O3 is generating far more SSE2 instructions, too.

I'll have to check whether gcc 12 is finding anything else within Postgres
to autovectorize with its "very-cheap" cost model...

> That being the case, if the difference you found was real, it must
> have been due to unrolling loops. What changed in the binary?

For gcc 9.4.0 on x86, the autovectorization flag alone indeed makes no
difference, while the loop unrolling one does.  For Apple clang 14.0.0 on
an M2, both flags seem to generate very different machine code.
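
For anyone who wants to reproduce this kind of comparison on a small test
case, the shape of loop being discussed can be sketched roughly as below.
This is a simplified illustration and not the actual pg_checksum_block
code: the names sketch_checksum, mix, and N_LANES are hypothetical, and the
seed values are arbitrary.  The point is only that each lane is updated
independently, which is the pattern a vectorizer can turn into SIMD
instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_LANES 32              /* hypothetical number of independent sums */
#define MIX_PRIME 16777619u     /* the 32-bit FNV prime */

/* One mixing step for a single lane (an FNV-1a-like variant). */
static inline uint32_t
mix(uint32_t sum, uint32_t value)
{
    uint32_t tmp = sum ^ value;

    return (tmp * MIX_PRIME) ^ (tmp >> 17);
}

/*
 * Fold nwords 32-bit words into one value using N_LANES parallel sums.
 * nwords must be a multiple of N_LANES.
 */
uint32_t
sketch_checksum(const uint32_t *words, size_t nwords)
{
    uint32_t sums[N_LANES];
    uint32_t result = 0;
    size_t   i, j;

    assert(nwords % N_LANES == 0);

    /* Arbitrary, distinct seeds for this sketch. */
    for (j = 0; j < N_LANES; j++)
        sums[j] = (uint32_t) j + 1;

    /* The inner loop has no cross-lane dependency, so it can be vectorized. */
    for (i = 0; i < nwords; i += N_LANES)
        for (j = 0; j < N_LANES; j++)
            sums[j] = mix(sums[j], words[i + j]);

    /* Combine the lanes into a single result. */
    for (j = 0; j < N_LANES; j++)
        result ^= sums[j];

    return result;
}
```

With gcc, compiling this file with -S under -O2, -O2 -ftree-vectorize,
-O2 -funroll-loops, and -O3, and then diffing the assembly, is one way to
see which flag actually changes the generated code on a given compiler
version.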

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
