On Tue, Nov 28, 2023 at 7:51 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-11-25 14:09:14 +0700, John Naylor wrote:
> > * Note: I have seen the threads with the idea of compiling multiple
> > entire binaries, and switching at postmaster start. I think it's a
> > good idea, but I also suspect detecting flags from the packager is an
> > easier intermediate step.
>
> It's certainly an easier incremental step - but will it get us all that far?
Granted, not much.
(TBH, I'm not sure how to design the multiple-binaries idea, but
surely we're not the first project to think of this...)
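
For illustration only, the "switch at postmaster start" part could look
roughly like the sketch below. It assumes GCC or Clang on x86-64 and
uses their __builtin_cpu_supports() builtin. The function names and the
"postgres-v2" / "postgres-v3" binary names are made up, and the checks
only test a few representative features per level rather than the full
psABI list, so treat it as a sketch and not a design:

```c
/*
 * Hypothetical helper: approximate the x86-64 microarchitecture level
 * of the running CPU. Returns 1, 2 or 3, or 0 if we can't tell
 * (non-x86-64 target or a compiler without the builtin).
 */
static int
detect_isa_level(void)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	__builtin_cpu_init();

	/* representative subset of the v2 feature list */
	if (!__builtin_cpu_supports("ssse3") ||
		!__builtin_cpu_supports("sse4.2") ||
		!__builtin_cpu_supports("popcnt"))
		return 1;

	/* representative subset of the v3 feature list */
	if (!__builtin_cpu_supports("avx2") ||
		!__builtin_cpu_supports("bmi2") ||
		!__builtin_cpu_supports("fma"))
		return 2;

	return 3;
#else
	return 0;
#endif
}

/*
 * Hypothetical mapping from detected level to the binary to exec.
 * Anything unknown falls back to the baseline build.
 */
static const char *
binary_for_level(int level)
{
	if (level >= 3)
		return "postgres-v3";
	if (level == 2)
		return "postgres-v2";
	return "postgres";
}
```

A launcher would call binary_for_level(detect_isa_level()) once at
startup and exec the result. The baseline stays the default, so a wrong
or missing detection only costs performance.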
On Tue, Nov 28, 2023 at 4:21 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> soon, but if we'd also like to start adding AVX2 enhancements (and I think
> we will), I'm assuming we'll want to provide an easy way for users to
> declare that they are building for v3+ CPUs.
Yeah, I remember now that I saw instruction selection change for a single
shift with -march=x86-64-v3, which made the UTF-8 DFA significantly faster:
https://www.postgresql.org/message-id/CAFBsxsHR08mHEf06PvrMRstfcyPJLwF69g0r1pvRrxWD4GEVoQ%40mail.gmail.com
I imagine a number of places would get automatic improvements, and I
think others have said the same thing.
On Tue, Nov 28, 2023 at 5:26 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> > I'm not sure why I thought checking each feature might be necessary.
> > --with-isa-level could essentially just be an alias for adding all the
> > CFLAGS for the extensions provided at that level, and --with-isa-level=auto
> > would just mean -march=native. With those flags set, the ./configure
> > checks would succeed with the base set of compiler flags passed in, which
> > could be used as a heuristic for inlining (like CRC32C does today).
>
> Or, perhaps you mean removing those ./configure checks completely and
> assuming that the compiler knows about the intrinsics required for the
> specified ISA level...
With the multiple-binaries approach, we might be able to assume that,
since it'll be opt-in (the default is to just use the baseline), but I'm
not sure. And to avoid needing a newish compiler, we could probably just
spell out the equivalent flags, e.g. for x86-64-v2:
-march=x86-64 -mmmx -msse -msse2 -mfxsr -msahf -mcx16 -mpopcnt -msse3
-msse4.1 -msse4.2 -mssse3
...and it seems we'll want to make something up for Arm.
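
To make the "heuristic for inlining" idea concrete: GCC and Clang
predefine macros such as __POPCNT__ and __SSE4_2__ when the matching
-m flags (or an -march that implies them) are in effect, so the code
can tell at compile time whether a runtime check is needed. A minimal
sketch, with made-up function names, assuming GCC or Clang:

```c
#include <stdint.h>

/*
 * Hypothetical check: returns 1 if this translation unit was compiled
 * with a representative subset of the x86-64-v2 features enabled.
 */
static int
built_for_v2(void)
{
#if defined(__SSE3__) && defined(__SSSE3__) && defined(__SSE4_1__) && \
	defined(__SSE4_2__) && defined(__POPCNT__)
	return 1;
#else
	return 0;
#endif
}

/*
 * Hypothetical popcount wrapper. If the build flags guarantee POPCNT,
 * use the builtin directly so it can be inlined as one instruction.
 * Otherwise use a portable loop (a real build might do a runtime check
 * here instead).
 */
static inline int
popcount64(uint64_t x)
{
#if defined(__POPCNT__) && (defined(__GNUC__) || defined(__clang__))
	return __builtin_popcountll(x);
#else
	int			n = 0;

	while (x)
	{
		x &= x - 1;			/* clear the lowest set bit */
		n++;
	}
	return n;
#endif
}
```

Built with the -v2 flag list above, built_for_v2() should return 1 and
popcount64() should take the first branch. With plain -march=x86-64 it
takes the portable one.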