Re: JIT compiling with LLVM v11 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: JIT compiling with LLVM v11
Msg-id 20180306201601.3o7dgpjyodkqjhk2@alap3.anarazel.de
In response to Re: JIT compiling with LLVM v11  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: JIT compiling with LLVM v11  (Andres Freund <andres@anarazel.de>)
Re: JIT compiling with LLVM v11  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: JIT compiling with LLVM v11  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
Hi,

On 2018-03-06 10:29:47 -0500, Peter Eisentraut wrote:
> I think taking the total cost as the triggering threshold is probably
> good enough for a start.  The cost modeling can be refined over time.

Cool.


> We should document that both jit_optimize_above_cost and
> jit_inline_above_cost require jit_above_cost to be set, or otherwise
> nothing happens.

Yea, that's a good plan. We could also change it so that those two
settings take effect on their own, but I don't think there's much point?
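
To make the dependency concrete, here is a minimal sketch in Python (not the patch's C code) of how the three thresholds could gate each other. The function name `jit_decisions`, the result keys, and the convention that a negative threshold means "not set" are assumptions for illustration only.

```python
def jit_decisions(total_cost, jit_above_cost,
                  jit_optimize_above_cost, jit_inline_above_cost):
    """Return which JIT steps would run for a plan with this total cost.

    Assumption: a negative threshold means the setting is "not set".
    """
    def exceeded(threshold):
        return threshold >= 0 and total_cost > threshold

    perform = exceeded(jit_above_cost)
    return {
        "jit": perform,
        # The finer-grained thresholds only matter once JIT itself triggers.
        "optimize": perform and exceeded(jit_optimize_above_cost),
        "inline": perform and exceeded(jit_inline_above_cost),
    }
```

Under these assumptions, `jit_decisions(1000.0, -1, 0, 0)` reports nothing enabled: with jit_above_cost unset, setting the other two thresholds to 0 has no effect, which is the behaviour that would need documenting.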


> One problem I see is that if someone sets things like
> enable_seqscan=off, the artificial cost increase created by those
> settings would quite likely bump the query over the jit threshold, which
> would alter the query performance characteristics in a way that the user
> would not have intended.  I don't have an idea how to address this right
> now.

I'm not too worried about that scenario. If, for a cheap plan, the
planner ends up with a seqscan despite it being disabled, you're pretty
close to randomly choosing plans already, as the pruning doesn't work
well anymore (as the 1% fuzz factor in
compare_path_costs_fuzzily() swamps the actual plan costs).
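
A rough numeric sketch of that effect, in Python rather than planner C. The comparison below is a simplification that looks only at total cost; the function name, the 1.0e10 penalty added for a disabled scan type, and the example costs of 100 and 500 are assumptions for illustration.

```python
DISABLE_COST = 1.0e10  # assumption: penalty added to a disabled plan type
FUZZ_FACTOR = 1.01     # the 1% fuzz factor


def compare_total_cost_fuzzily(cost1, cost2, fuzz=FUZZ_FACTOR):
    """Simplified fuzzy comparison on total cost only.

    Returns "better" if path 1 is clearly cheaper, "worse" if it is
    clearly more expensive, and "equal" if the costs are within the fuzz.
    """
    if cost1 > cost2 * fuzz:
        return "worse"
    if cost2 > cost1 * fuzz:
        return "better"
    return "equal"


# Two cheap paths whose real costs differ by a factor of five.
cheap, expensive = 100.0, 500.0

# Without the penalty, the cheaper path wins clearly.
without_penalty = compare_total_cost_fuzzily(cheap, expensive)

# With the penalty on both, the difference of 400 is far below 1% of 1e10.
with_penalty = compare_total_cost_fuzzily(cheap + DISABLE_COST,
                                          expensive + DISABLE_COST)
```

Here `without_penalty` is "better" while `with_penalty` is "equal": once every candidate carries the penalty, a fivefold difference in real cost no longer distinguishes the paths.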


> I ran some performance assessments:
>
> merge base (0b1d1a038babff4aadf0862c28e7b667f1b12a30)
>
> make installcheck  3.14s user 3.34s system 17% cpu 37.954 total
>
> jit branch default settings
>
> make installcheck  3.17s user 3.30s system 13% cpu 46.596 total
>
> jit_above_cost=0
>
> make installcheck  3.30s user 3.53s system 5% cpu 1:59.89 total
>
> jit_optimize_above_cost=0 (and jit_above_cost=0)
>
> make installcheck  3.44s user 3.76s system 1% cpu 8:12.42 total
>
> jit_inline_above_cost=0 (and jit_above_cost=0)
>
> make installcheck  3.32s user 3.62s system 2% cpu 5:35.58 total
>
> One can see the CPU savings quite nicely.

I'm not quite sure what you mean by that.


> One obvious problem is that with the default settings, the test suite
> run gets about 15% slower.  (These figures are reproducible over several
> runs.)  Is there some debugging stuff turned on that would explain this?
>  Or would just loading the jit module in each session cause this?

I suspect it's loading the module.  There are two pretty easy avenues to
improve this:

1) Attempt to load the JIT provider in postmaster, thereby avoiding a
   lot of redundant dynamic linker work if already installed. That's
   ~5-10 lines or such.  I basically refrained from that because it's
   convenient to not have to restart the server during development (one
   can just reconnect and get a newer jit plugin).

2) Don't load the JIT provider until fully needed. Right now
   jit_compile_expr() will load the jit provider even if not really
   needed. We should probably move the first two return blocks in
   llvm_compile_expr() into jit_compile_expr(), to avoid that.
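
The idea in 2) can be sketched as follows, in Python rather than the patch's C. Everything here is hypothetical: the `LazyJitProvider` class, the two boolean arguments standing in for the early-return checks, and the load counter exist only to show the ordering of cheap checks before the expensive load.

```python
class LazyJitProvider:
    """Hypothetical stand-in for a JIT provider that is expensive to load."""

    def __init__(self):
        self.loaded = False
        self.load_count = 0  # how often the expensive load actually ran

    def _ensure_loaded(self):
        if not self.loaded:
            # Stands in for the dynamic linker work of loading the module.
            self.load_count += 1
            self.loaded = True

    def compile_expr(self, expr):
        self._ensure_loaded()
        return f"compiled({expr})"


def jit_compile_expr(provider, expr, jit_requested, expr_jit_requested):
    """Run the cheap checks first; touch the provider only if they pass."""
    if not jit_requested:
        return None  # early return: no JIT wanted at all
    if not expr_jit_requested:
        return None  # early return: expressions are not to be JITed
    return provider.compile_expr(expr)
```

With this ordering, a session that never passes the checks never pays for loading the provider, and a session that does pass them pays only once.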


> From the other results, we can see that one clearly needs quite a big
> database to see a solid benefit from this.

Right, until we've got caching this'll only be beneficial for ~1s+
analytics queries. Unfortunately caching requires some larger planner &
executor surgery, so I don't want to go there at the same time (I'm
already insane enough).


> Do you have any information gathered about this so far?  Any scripts
> to create test databases and test queries?

Yes. I've used TPC-H. Not because it's the greatest, but because it's
semi-conveniently available and a lot of others have experience with it
already.  Do you mean whether I've run a couple benchmarks? If so, yes.
I'll schedule some more later - am on battery power right now.

Greetings,

Andres Freund

