Re: bad JIT decision - Mailing list pgsql-general

From David Rowley
Subject Re: bad JIT decision
Date
Msg-id CAApHDvqXc4ZkoCSxxLO_Tf-zNr1yyb_C-d=ncj-Q_pbxVOioFQ@mail.gmail.com
In response to Re: bad JIT decision  (Andres Freund <andres@anarazel.de>)
List pgsql-general
On Wed, 29 Jul 2020 at 09:07, Andres Freund <andres@anarazel.de> wrote:
> On 2020-07-28 11:54:53 +1200, David Rowley wrote:
> > Is there some reason that we can't consider jitting on a more granular
> > basis?
>
> There's a substantial "constant" overhead of doing JIT. And that it's
> nontrivial to determine which parts of the query should be JITed in one
> part, and which not.
>
>
> > To me, it seems wrong to have a jit cost per expression and
> > demand that the plan cost > #nexprs * jit_expr_cost before we do jit
> > on anything.  It'll make it pretty hard to predict when jit will occur
> > and doing things like adding new partitions could suddenly cause jit
> > to not enable for some query any more.
>
> I think that's the right answer though:

I'm not quite sure why it would be so hard to do more granularly.

Take this case, for example:

create table listp (a int, b int) partition by list(a);
create table listp1 partition of listp for values in(1);
create table listp2 partition of listp for values in(2);
insert into listp select 1,x from generate_series(1,1000000) x;

The EXPLAIN looks like:

postgres=# explain select * from listp where b < 100;
                                QUERY PLAN
--------------------------------------------------------------------------
 Append  (cost=0.00..16967.51 rows=853 width=8)
   ->  Seq Scan on listp1 listp_1  (cost=0.00..16925.00 rows=100 width=8)
         Filter: (b < 100)
   ->  Seq Scan on listp2 listp_2  (cost=0.00..38.25 rows=753 width=8)
         Filter: (b < 100)
(5 rows)

As things stand, if the total cost of the plan exceeds the JIT threshold
(jit_above_cost), we JIT all the expressions in it. If it doesn't, we
compile none of them.
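
To make the all-or-nothing behaviour concrete, here is a minimal sketch
of the whole-plan decision. It is a toy model, not PostgreSQL source:
ToyPlan and toy_jit_whole_plan() are hypothetical names, and only the
single cost comparison against the top node is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a plan node; NOT PostgreSQL's real struct Plan. */
typedef struct ToyPlan
{
    const char *label;
    double      total_cost;
} ToyPlan;

/*
 * Whole-plan decision (hypothetical helper): a single comparison against
 * the top node's total cost decides for every expression in the plan.
 * A negative threshold is treated as "JIT disabled".
 */
static bool
toy_jit_whole_plan(const ToyPlan *top, double jit_above_cost)
{
    assert(top != NULL);
    return jit_above_cost >= 0 && top->total_cost > jit_above_cost;
}
```

With the Append cost of 16967.51 from the EXPLAIN above, a threshold of
10000 would turn JIT on for both filters, including the cheap one on
listp2, while a threshold of 100000 would turn it off for both.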

What we could do instead is add a jitFlags field to struct Plan to
indicate the JIT flags at the per-plan-node level, and set it using the
same rules as today, but based on the total_cost of that plan node
rather than on the cost of the top-level plan in standard_planner().
The code that sets jitFlags would move to the end of
create_plan_recurse().

In this case, if we had the threshold set to 10000, then we'd JIT for
listp1 but not for listp2. I don't think this would even require a
signature change in the jit_compile_expr() function as we can get
access to the plan node from state->parent->plan to see which jitFlags
are set, if any.
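
Roughly, the per-node idea could look like the sketch below. This is a
self-contained toy model and not a patch: every Toy* type, the TOY_JIT_*
flags and both functions are hypothetical stand-ins, and only the
"perform JIT" bit is modelled. The point is just that the flag is set
from each node's own total_cost and then looked up through
state->parent->plan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_JIT_NONE    0
#define TOY_JIT_PERFORM (1 << 0)

/* Toy stand-ins; NOT the real Plan, PlanState or ExprState structs. */
typedef struct ToyPlan
{
    double           total_cost;
    int              jitFlags;   /* the proposed per-node field */
    struct ToyPlan **children;
    int              nchildren;
} ToyPlan;

typedef struct ToyPlanState { ToyPlan *plan; } ToyPlanState;
typedef struct ToyExprState { ToyPlanState *parent; } ToyExprState;

/* Set jitFlags on each node from that node's own total_cost. */
static void
toy_set_jit_flags(ToyPlan *plan, double jit_above_cost)
{
    assert(plan != NULL);
    plan->jitFlags = TOY_JIT_NONE;
    if (jit_above_cost >= 0 && plan->total_cost > jit_above_cost)
        plan->jitFlags |= TOY_JIT_PERFORM;
    for (int i = 0; i < plan->nchildren; i++)
        toy_set_jit_flags(plan->children[i], jit_above_cost);
}

/* Same signature shape as before: the flags are reached via the parent. */
static bool
toy_jit_compile_expr(const ToyExprState *state)
{
    assert(state != NULL);
    if (state->parent == NULL)
        return false;           /* no plan node to consult */
    return (state->parent->plan->jitFlags & TOY_JIT_PERFORM) != 0;
}
```

Feeding in the costs from the EXPLAIN above with a threshold of 10000,
the listp1 scan (16925.00) gets the flag and the listp2 scan (38.25)
does not. The Append node (16967.51) gets it too, which only matters if
that node has expressions of its own to compile.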

David


