On Tue, 28 Jul 2020 at 11:00, Andres Freund <andres@anarazel.de> wrote:
>
> On 2020-07-25 10:54:18 -0400, Tom Lane wrote:
> > David Rowley <dgrowleyml@gmail.com> writes:
> > > ... nested at the bottom level join, about 6 joins deep. The lack of
> > > any row being found results in upper level joins not having to do
> > > anything, and the majority of the plan is (never executed).
> >
> > On re-reading this, that last point struck me forcibly. If most of
> > the plan never gets executed, could we avoid compiling it? That is,
> > maybe JIT isn't JIT enough, and we should make compilation happen
> > at first use of an expression not during executor startup.
>
> That unfortunately has its own downsides, in that there's significant
> overhead of emitting code multiple times. I suspect that taking the
> cost of all the JIT emissions together into account is the more
> promising approach.
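
Just to make sure I understand the compile-at-first-use idea: I'd assume
it means something like a trampoline, where the expression's entry point
starts out as a stub that compiles and replaces itself on the first
call. A rough C sketch (all names here are invented, nothing like the
actual executor structures):

#include <stdio.h>

struct ExprState;
typedef long (*ExprEvalFn)(struct ExprState *state);

typedef struct ExprState
{
    ExprEvalFn  evalfunc;   /* current entry point for this expression */
    long        dummy;      /* stand-in for the real expression tree */
} ExprState;

/* What the JIT compiler would produce for this expression. */
static long
eval_jitted(struct ExprState *state)
{
    return state->dummy * 2;
}

/*
 * Stub installed at executor startup.  On the first evaluation it runs
 * the (here imaginary) JIT compiler, swaps itself out, and calls the
 * compiled code; later calls never see the stub.
 */
static long
eval_compile_on_first_call(struct ExprState *state)
{
    state->evalfunc = eval_jitted;      /* "compile" happens here */
    return state->evalfunc(state);
}

int
main(void)
{
    ExprState e = { eval_compile_on_first_call, 21 };
    printf("%ld\n", e.evalfunc(&e));    /* compiles, then evaluates */
    printf("%ld\n", e.evalfunc(&e));    /* goes straight to compiled code */
    return 0;
}

The downside you mention would then show up as many separate "compile"
steps like this instead of one batched emission.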

Is there some reason we can't consider JITting on a more granular
basis? To me, it seems wrong to have a JIT cost per expression and
demand that the plan cost > #nexprs * jit_expr_cost before we JIT
anything. That makes it pretty hard to predict when JIT will kick in,
and doing something like adding a new partition could suddenly cause
JIT to stop being used for a query altogether.
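
To spell out the all-or-nothing policy I mean, it looks roughly like
this (the names and parameters are made up for illustration, not the
real GUCs or planner fields):

#include <stdbool.h>

/*
 * All-or-nothing: JIT every expression in the plan, or none of them.
 * Note that adding one more expression (say, via a new partition)
 * raises the bar for the whole plan, which is what makes the outcome
 * hard to predict.
 */
static bool
jit_whole_plan(double plan_cost, int nexprs, double jit_expr_cost)
{
    return plan_cost > nexprs * jit_expr_cost;
}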

ISTM a more granular approach would be better. For example, for an
expression we expect to evaluate only once, there's likely little point
in JITting it; but for one on some other relation with far more rows,
where we expect to evaluate it 1 billion times, there's likely good
reason to JIT it. Wouldn't it be better to consider this at the
RangeTblEntry level?
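
Concretely, the per-expression version could weigh the expected number
of evaluations (from the RangeTblEntry's row estimate) against the
one-off emission cost, something like this (again, all hypothetical
names and parameters):

#include <stdbool.h>

/*
 * Per-expression policy sketch: JIT only if the cycles saved over the
 * expected number of evaluations beat the one-off emission cost.
 */
static bool
jit_one_expr(double est_evals, double interp_cost_per_eval,
             double jit_cost_per_eval, double emit_cost)
{
    return est_evals * (interp_cost_per_eval - jit_cost_per_eval)
        > emit_cost;
}

With emit_cost dwarfing the per-evaluation saving, this says "no" for
the evaluate-once expression and "yes" for the billion-row one, even
within the same plan.
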
David