Re: JIT compiling with LLVM v12 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: JIT compiling with LLVM v12
Date 2018-09-05 18:35:52
Msg-id 20180905183552.442a2vdeerhosmui@alap3.anarazel.de
In response to Re: JIT compiling with LLVM v12  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On 2018-08-25 21:34:22 -0400, Robert Haas wrote:
> On Wed, Aug 22, 2018 at 6:43 PM, Andres Freund <andres@anarazel.de> wrote:
> > Now you can say that'd be solved by bumping the cost up, sure. But
> > obviously the row / cost model is pretty much out of whack here, I don't
> > see how we can make reasonable decisions in a trivial query that has a
> > misestimation by five orders of magnitude.
> 
> Before JIT, it didn't matter whether the costing was wrong, provided
> that the path with the lowest cost was the cheapest path (or at least
> close enough to the cheapest path not to bother anyone).

I don't think that's really true. Due to the cost fuzzing, absurdly high
costs very commonly lead to the actual differences between planning
choices not having a large enough influence to matter.
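
To illustrate what I mean (a standalone paraphrase of the add_path()
comparison, not the exact source): path costs within a ~1% factor of
each other are treated as equal, so once a misestimate inflates all
costs by orders of magnitude, absolute differences that would normally
decide the plan disappear inside the fuzz band:

#include <stdbool.h>
#include <stdio.h>

/* pathnode.c uses STD_FUZZ_FACTOR 1.01; this is a sketch of the idea */
#define STD_FUZZ_FACTOR 1.01

static bool
costs_fuzzily_equal(double c1, double c2)
{
    return c1 <= c2 * STD_FUZZ_FACTOR && c2 <= c1 * STD_FUZZ_FACTOR;
}

int
main(void)
{
    /* at sane magnitudes a 10000-unit gap decides the plan... */
    printf("%d\n", costs_fuzzily_equal(20000.0, 30000.0));   /* 0: differ */
    /* ...after a five-orders-of-magnitude misestimate the same
     * absolute gap is swallowed by the fuzz */
    printf("%d\n", costs_fuzzily_equal(2e9, 2e9 + 10000.0)); /* 1: "equal" */
    return 0;
}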


> I'd guess that, as you read this, you're thinking, well, but if I'd
> added JIT and non-JIT paths for every option, it would have doubled
> the number of paths, and that would have slowed the planner down way
> too much.  That's certainly true, but my point is just that the
> problem is probably not as simple as "the defaults are too low".  I
> think the problem is more fundamentally that the model you've chosen
> is kinda broken.

Right. And that's why I repeatedly brought up this part in
discussions...  I still think it's a reasonable compromise, but it
certainly has costs.

I'm also doubtful that just adding a separate path for JIT (with a
significantly smaller cpu_*_cost or such) would really have helped in
the cases with borked estimates - we'd *still* end up choosing to JIT
when the estimated loop count is absurd, just because the estimated
cost is high.
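
For reference, the current decision is just a threshold test against
the total cost of the already-chosen plan - here's a toy standalone
model (paraphrasing the standard_planner() logic from memory, not the
exact source) showing how a misestimate alone flips it:

#include <stdbool.h>
#include <stdio.h>

/* defaults of the jit_*_cost GUCs */
static bool   jit_enabled = true;
static double jit_above_cost = 100000.0;
static double jit_optimize_above_cost = 500000.0;
static double jit_inline_above_cost = 500000.0;

/* toy model: one threshold test, no JIT-vs-non-JIT path comparison */
static void
decide_jit(double total_cost)
{
    bool perform = jit_enabled && total_cost > jit_above_cost;

    printf("cost %.0f: jit=%s opt=%s inline=%s\n", total_cost,
           perform ? "yes" : "no",
           (perform && total_cost > jit_optimize_above_cost) ? "yes" : "no",
           (perform && total_cost > jit_inline_above_cost) ? "yes" : "no");
}

int
main(void)
{
    decide_jit(50.0);      /* what the query actually costs */
    decide_jit(5000000.0); /* what the planner believed */
    return 0;
}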

There *are* cases where that'd help - if all the cost is incurred, say,
due to random page fetches, then JITing isn't going to help that much,
and separate costing would correctly show little benefit from it.


> I'm not saying I know how you could have done any better, but I do
> think we're going to have to try to figure out something to do about
> it, because saying, "check-pg_upgrade is 4x slower, but that's just
> because of all those bad estimates" is not going to fly.

That I'm unconvinced by, however. That was on some quite slow machine
and/or with LLVM assertions enabled - the performance difference on a
normal machine is smaller:

$ PGOPTIONS='-cjit=0' time make -s check
...
5.21user 2.11system 0:24.95elapsed 29%CPU (0avgtext+0avgdata 54212maxresident)k
20976inputs+340848outputs (14major+342228minor)pagefaults 0swaps

$ PGOPTIONS='-cjit=1' time make -s check
...
5.33user 2.01system 0:30.49elapsed 24%CPU (0avgtext+0avgdata 54236maxresident)k
0inputs+340856outputs (0major+342616minor)pagefaults 0swaps


But also importantly, I think there are actual advantages to triggering
JIT in some places in the regression tests. There are buildfarm animals
exercising the path where everything is JITed, but that's not really
helpful during development.
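
(For development, forcing JIT on everything is just a matter of the
cost GUCs - something like

$ PGOPTIONS='-cjit=1 -cjit_above_cost=0 -cjit_optimize_above_cost=0 -cjit_inline_above_cost=0' make -s check

should do, since all the thresholds are settable per-session.)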


> Those bad estimates were harmlessly bad before,

I think that's not true.


Greetings,

Andres Freund

