Re: Lazy JIT IR code generation to increase JIT speed with partitions - Mailing list pgsql-hackers

From: David Geier
Subject: Re: Lazy JIT IR code generation to increase JIT speed with partitions
Date:
Msg-id: CAPsAnr=eP=w7+WSWG5KXBPWgtrfjW0ZR9YGnf9oEJBET-ejPuw@mail.gmail.com
In response to: Re: Lazy JIT IR code generation to increase JIT speed with partitions (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses: Re: Lazy JIT IR code generation to increase JIT speed with partitions
List: pgsql-hackers
Hi Alvaro,

That's a very interesting case, and it might indeed be fixed, or at least improved, by this patch. I tried to reproduce it, but at least when running a simple, serial query with an increasing number of functions, the time spent per function was linear or even slightly sub-linear (the same as Tom observed in [1]).
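
Roughly, the reproduction I have in mind looks like the following. This is only a sketch: the cost GUCs force JIT for every plan regardless of cost, and the query itself is just a stand-in whose expression list one would grow step by step:

  SET jit = on;
  SET jit_above_cost = 0;
  SET jit_optimize_above_cost = 0;
  SET jit_inline_above_cost = 0;
  SET max_parallel_workers_per_gather = 0;  -- keep the plan serial

  -- Each additional aggregate expression adds JITed functions; compare
  -- the "Timing: Generation ..." line as the expression count grows.
  EXPLAIN (ANALYZE)
  SELECT sum(i + 1), sum(i + 2), sum(i + 3)
  FROM generate_series(1, 1000000) AS t(i);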

I also couldn't reproduce the JIT runtimes you shared when running the attached catalog query. For me, the catalog query ran serially, with the following JIT stats:

 JIT:
   Functions: 169
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 12.223 ms, Inlining 17.323 ms, Optimization 388.491 ms, Emission 283.464 ms, Total 701.501 ms

Is it possible that the query ran in parallel for you? For parallel queries, every worker JITs all of the functions it uses. Even though the workers might JIT the functions in parallel, the time reported in the EXPLAIN ANALYZE output is the sum of the time spent by all workers. With this patch applied, the JIT time drops significantly, as many of the generated functions remain unused:

 JIT:
   Modules: 15
   Functions: 26
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 1.931 ms, Inlining 0.722 ms, Optimization 67.195 ms, Emission 70.347 ms, Total 140.195 ms
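
If the query did run in parallel for you, two quick checks could confirm it (a sketch; the SELECT is just a placeholder for the catalog query):

  -- Force a serial plan so all JIT work happens in one backend:
  SET max_parallel_workers_per_gather = 0;
  EXPLAIN (ANALYZE) SELECT ...;

  -- Or keep the parallel plan; if I'm not mistaken, VERBOSE also
  -- prints the per-worker JIT counters:
  EXPLAIN (ANALYZE, VERBOSE) SELECT ...;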

Of course, this does not prove that the nonlinearity you observed went away. Could you share how you ran the query, so that I can reproduce your numbers on master and then compare them against the patched version? Also, which LLVM version did you run with? I'm currently running with LLVM 13.

Thanks!

--
David Geier
(ServiceNow)

On Mon, Jun 27, 2022 at 5:37 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Jan-18, Luc Vlaming wrote:

> I would like this topic to somehow progress and was wondering what other
> benchmarks / tests would be needed to have some progress? I've so far
> provided benchmarks for small(ish) queries and some tpch numbers, assuming
> those would be enough.

Hi, some time ago I reported a case [1] where our JIT implementation does
a very poor job, and perhaps the changes that you're making could explain
what is going on, and maybe even fix it:

[1] https://postgr.es/m/202111141706.wqq7xoyigwa2@alvherre.pgsql

The query for which I investigated the problem involved some pg_logical
metadata tables, so I didn't post it anywhere public; but the blog post
I found later contains a link to a query that shows the same symptoms,
and which is luckily still available online:
https://gist.github.com/saicitus/251ba20b211e9e73285af35e61b19580
I attach it here in case it goes missing sometime.

--
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
