Thread: Query JITing with LLVM ORC

Query JITing with LLVM ORC

From
João Paulo Labegalini de Carvalho
Date:
Hi all,

I am working on a project with LLVM ORC that led us to PostgreSQL as a target application. We were surprised to learn that PGSQL already uses LLVM ORC to JIT certain queries.

I would love to know what motivated this feature and what it is currently used for, as it is not enabled by default.

Thanks.

--
João Paulo L. de Carvalho
Ph.D Computer Science |  IC-UNICAMP | Campinas , SP - Brazil
Postdoctoral Research Fellow | University of Alberta | Edmonton, AB - Canada

Re: Query JITing with LLVM ORC

From
Thomas Munro
Date:
On Thu, Sep 22, 2022 at 4:17 AM João Paulo Labegalini de Carvalho
<jaopaulolc@gmail.com> wrote:
> I am working on a project with LLVM ORC that led us to PostgreSQL as a target application. We were surprised by learning that PGSQL already uses LLVM ORC to JIT certain queries.

It JITs expressions but not whole queries.  Query execution at the
tuple-flow level is still done using a C call stack the same shape as
the query plan, but it *could* be transformed to a different control
flow that could be run more efficiently and perhaps JITed.  CCing
Andres who developed all this and had some ideas about that...

> I would love to know what motivated this feature and for what it is being currently used for,

https://www.postgresql.org/docs/current/jit-reason.html

> as it is not enabled by default.

It's enabled by default in v12 and higher (if you built with
--with-llvm, as packagers do), but not always used:

https://www.postgresql.org/docs/current/jit-decision.html
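
The decision described on that page can be sketched roughly as follows. This is only an illustration in Python, not PostgreSQL's actual code: the names `JitSettings` and `jit_decision` are made up for the example, the fields mirror the documented settings `jit`, `jit_above_cost`, `jit_inline_above_cost` and `jit_optimize_above_cost` with their documented defaults, and the assumption is that the planner compares the plan's estimated total cost against each threshold.

```python
from dataclasses import dataclass


@dataclass
class JitSettings:
    # Hypothetical container; values mirror the documented defaults.
    jit: bool = True
    jit_above_cost: float = 100_000
    jit_inline_above_cost: float = 500_000
    jit_optimize_above_cost: float = 500_000


def jit_decision(total_cost: float, s: JitSettings = JitSettings()) -> dict:
    """Return which JIT steps would be chosen for a plan with this estimated cost."""

    def above(threshold: float) -> bool:
        # A negative threshold disables the corresponding step.
        return threshold >= 0 and total_cost > threshold

    perform = s.jit and above(s.jit_above_cost)
    return {
        "jit": perform,
        "inline": perform and above(s.jit_inline_above_cost),
        "optimize": perform and above(s.jit_optimize_above_cost),
    }
```

Under these assumptions a cheap plan such as `jit_decision(50)` gets no JIT at all, while `jit_decision(1_000_000)` gets JIT with inlining and optimization.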



Re: Query JITing with LLVM ORC

From
João Paulo Labegalini de Carvalho
Date:
Hi Thomas,

> It JITs expressions but not whole queries.

Thanks for the clarification.
 
> Query execution at the
> tuple-flow level is still done using a C call stack the same shape as
> the query plan, but it *could* be transformed to a different control
> flow that could be run more efficiently and perhaps JITed.

I see, so there is room for extending the use of Orc JIT in PGSQL.
 
> CCing
> Andres who developed all this and had some ideas about that...

Thanks for CCing Andres, it will be great to hear from him.
 
> > I would love to know what motivated this feature and for what it is being currently used for,
>
> https://www.postgresql.org/docs/current/jit-reason.html

In that link I found the README under src/backend/jit, which was very helpful. 

> > as it is not enabled by default.
>
> It's enabled by default in v12 and higher (if you built with
> --with-llvm, as packagers do), but not always used:
>
> https://www.postgresql.org/docs/current/jit-decision.html

Good to know. I compiled from the REL_14_5 tag and did a simple experiment to contrast building with and w/o passing --with-llvm.
I ran the TPC-C benchmark with 1 warehouse, 10 terminals, 20min of ramp-up, and 120 of measurement time.
The number of transactions per minute was about the same with & w/o JITing.
Is this expected? Should I use a different benchmark to observe a performance difference?

Regards,

--
João Paulo L. de Carvalho
Ph.D Computer Science |  IC-UNICAMP | Campinas , SP - Brazil
Postdoctoral Research Fellow | University of Alberta | Edmonton, AB - Canada

Re: Query JITing with LLVM ORC

From
Tom Lane
Date:
João Paulo Labegalini de Carvalho <jaopaulolc@gmail.com> writes:
> Good to know. I compiled from the REL_14_5 tag and did a simple experiment
> to contrast building with and w/o passing --with-llvm.
> I ran the TPC-C benchmark with 1 warehouse, 10 terminals, 20min of ramp-up,
> and 120 of measurement time.
> The number of transactions per minute was about the same with & w/o JITing.
> Is this expected? Should I use a different benchmark to observe a
> performance difference?

TPC-C is mostly short queries, so we aren't likely to choose to use JIT
(and if we did, it'd likely be slower).  You need a long query that will
execute the same expressions over and over for it to make sense to
compile them.  Did you check whether any JIT was happening there?

There are a bunch of issues in this area concerning whether our cost
models are good enough to accurately predict whether JIT is a good
idea.  But single-row fetches and updates are basically never going
to use it, nor should they.
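
A back-of-the-envelope sketch of the amortization argument above, in Python. The function name `break_even_rows` and all the numbers are hypothetical and chosen only for illustration; none of them are measurements from PostgreSQL.

```python
import math


def break_even_rows(compile_ms: float, interp_us_per_row: float, jit_us_per_row: float):
    """Rows needed before compiled evaluation repays its one-time compile cost.

    Returns None when the compiled code is not faster per row, in which case
    compiling never pays off.
    """
    saving_us = interp_us_per_row - jit_us_per_row
    if saving_us <= 0:
        return None
    # Convert the compile time to microseconds and divide by the per-row saving.
    return math.ceil(compile_ms * 1000 / saving_us)
```

With made-up figures of 10 ms to compile and a saving of 0.5 microseconds per row, `break_even_rows(10, 1.0, 0.5)` says that about 20000 rows must be processed before JIT breaks even, which a single-row fetch or update never reaches.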

            regards, tom lane



Re: Query JITing with LLVM ORC

From
Thomas Munro
Date:
On Thu, Sep 22, 2022 at 10:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> João Paulo Labegalini de Carvalho <jaopaulolc@gmail.com> writes:
> > Good to know. I compiled from the REL_14_5 tag and did a simple experiment
> > to contrast building with and w/o passing --with-llvm.
> > I ran the TPC-C benchmark with 1 warehouse, 10 terminals, 20min of ramp-up,
> > and 120 of measurement time.
> > The number of transactions per minute was about the same with & w/o JITing.
> > Is this expected? Should I use a different benchmark to observe a
> > performance difference?
>
> TPC-C is mostly short queries, so we aren't likely to choose to use JIT
> (and if we did, it'd likely be slower).  You need a long query that will
> execute the same expressions over and over for it to make sense to
> compile them.  Did you check whether any JIT was happening there?

See also the proposal thread which has some earlier numbers from TPC-H.

https://www.postgresql.org/message-id/flat/20170901064131.tazjxwus3k2w3ybh%40alap3.anarazel.de



Re: Query JITing with LLVM ORC

From
Thomas Munro
Date:
On Thu, Sep 22, 2022 at 10:04 AM João Paulo Labegalini de Carvalho
<jaopaulolc@gmail.com> wrote:
>building with and w/o passing --with-llvm

BTW you can also just turn it off with runtime settings, no need to rebuild.



Re: Query JITing with LLVM ORC

From
João Paulo Labegalini de Carvalho
Date:
Tom & Thomas:

Thank you so much, those are very useful comments.

I noticed that I didn't make my intentions very clear. My team's goal is to evaluate whether there are any gains in JITing PostgreSQL itself, or at least parts of it, rather than the expressions or parts of a query.

The rationale for using PostgreSQL is that DBs are long-running applications, so the cost of JITing can be amortized.

We have a prototype LLVM IR pass that outlines functions in a program to JIT and an ORC-based runtime to re-compile functions. Our goal is to see improvements due to target/sub-target specialization.

The reason I was looking at benchmarks is to have a workload to profile PostgreSQL and find its bottlenecks. The hot functions would then be outlined for JITing.



On Wed., Sep. 21, 2022, 4:54 p.m. Thomas Munro, <thomas.munro@gmail.com> wrote:
> On Thu, Sep 22, 2022 at 10:04 AM João Paulo Labegalini de Carvalho
> <jaopaulolc@gmail.com> wrote:
> > building with and w/o passing --with-llvm
>
> BTW you can also just turn it off with runtime settings, no need to rebuild.