Re: [PATCH] Enable using llvm jitlink as an alternative llvm jit linker of old Rtdyld. - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: [PATCH] Enable using llvm jitlink as an alternative llvm jit linker of old Rtdyld.
Date
Msg-id CA+hUKG+RznmqzGibTqy-raZ_k4uHnap49=Ojj=U8sBh5_M6EoQ@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Enable using llvm jitlink as an alternative llvm jit linker of old Rtdyld.  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: [PATCH] Enable using llvm jitlink as an alternative llvm jit linker of old Rtdyld.
List pgsql-hackers
On Thu, Nov 24, 2022 at 12:08 AM David Rowley <dgrowleyml@gmail.com> wrote:
> On Wed, 23 Nov 2022 at 23:13, Alex Fan <alex.fan.q@gmail.com> wrote:
> > I am new to the postgres community and apologise for resending this as the previous one didn't include patch
properlyand didn't cc reviewers (maybe the reason it has been buried in mailing list for months)
 
>
> Welcome to the community!

+1

I don't know enough about LLVM or RISCV to have any strong opinions
here, but I have a couple of questions...  It looks like we have two
different things in this patch:

1.  Optionally use JITLink instead of RuntimeDyld for relocation.
From what I can tell from some quick googling, that is necessary for
RISCV because they haven't got around to doing this yet:

https://reviews.llvm.org/D127842

Independently of that, it seems that
https://llvm.org/docs/JITLink.html is the future and RuntimeDyld will
eventually be obsolete, so one question I have is: why should we do
this only for riscv?

You mentioned that this change might be necessary to support COFF and
thus Windows.  I'm not a Windows user and I think it would be beyond
my pain threshold to try to get this working there by using CI alone,
but I'm just curious... wouldn't
https://github.com/llvm/llvm-project/blob/main/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp
work for that already?  (I haven't heard about anyone successfully
using PostgreSQL/LLVM on Windows; it would certainly be cool to hear
some more about what would be needed for that.)

2.  Manually adjust the CPU features and ABI/subtarget.

+#if defined(__riscv)
+    /* getHostCPUName returns "generic-rv[32|64]", which lacks all features */
+    Features.AddFeature("m", true);
+    Features.AddFeature("a", true);
+    Features.AddFeature("c", true);
+# if defined(__riscv_float_abi_single)
+    Features.AddFeature("f", true);
+# endif
+# if defined(__riscv_float_abi_double)
+    Features.AddFeature("d", true);
+# endif
+#endif

I'm trying to understand this, and the ABI name selection logic.
Maybe there are two categories of features here?

The ABI bits, "f" and "d" are not just "which instructions can I
used", but they also affect the ABI (I guess something like: where
floats go in the calling convention), and they have to match the ABI
of the main executable to allow linking to succeed, right?  Probably a
stupid question: wouldn't the subtarget/ABI be the same as the one
that the LLVM library itself was compiled for (which must also match
the postgres executable), and doesn't it know that somewhere?  I guess
I'm confused about why we don't need to deal with this kind of manual
subtarget selection on any other architecture: for PPC it
automatically knows whether to be big endian/little endian, 32 or 64
bit, etc.

Then for "m", "a", "c", I guess these are code generation options -- I
think "c" is compressed instructions for example?  Can we get a
comment to say what they are?  Why do you think that all RISCV chips
have these features?  Perhaps these are features that are part of some
kind of server chip profile (ie features not present in a tiny
microcontroller chip found in a toaster, but expected in any system
that would actually run PostgreSQL) -- in which case can we get a
reference to explain that?

I remembered the specific reason why we have that
LLVMGethostCPUFeatures() call: it's because the list of default
features that would apply otherwise based on CPU "name" alone turned
out to assume that all x86 chips had AVX, but some low end parts
don't, so we have to check for AVX etc presence that way.  But your
patch seems to imply that LLVM is not able to get features reliably
for RISCV -- why not, immaturity or technical reason why it can't?

+    assert(ES && "ES must not be null");

We use our own Assert() macro (capital A).



pgsql-hackers by date:

Previous
From: David Christensen
Date:
Subject: Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL
Next
From: Nathan Bossart
Date:
Subject: Re: wake up logical workers after ALTER SUBSCRIPTION