Re: Performance difference in accessing different columns in a Postgres Table - Mailing list pgsql-performance

From Jeff Janes
Subject Re: Performance difference in accessing different columns in a Postgres Table
Msg-id CAMkU=1x41SSMBkGnC5+qfP+Q7n__kVb+4Gq57P_ommpqkibkzQ@mail.gmail.com
In response to Re: Performance difference in accessing different columns in a Postgres Table  (Andres Freund <andres@anarazel.de>)
Responses Re: Performance difference in accessing different columns in a Postgres Table  (Andres Freund <andres@anarazel.de>)
List pgsql-performance
On Mon, Jul 30, 2018 at 1:23 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2018-07-30 07:19:07 -0400, Jeff Janes wrote:
>
> > And indeed, in my hands JIT makes it almost 3 times worse.
>
> Not in my measurement. Your example won't use JIT at all, because it's
> below the cost threshold. So I think you might just be seeing cache +
> hint bit effects?

No, it is definitely JIT. The EXPLAIN plans show it, and the cost of the query is 230,000 while the default setting of jit_above_cost is 100,000. It is fully reproducible by repeatedly toggling the JIT setting. The slowdown doesn't seem to come from the cost of compiling the code (I'm assuming the code is compiled once per tuple descriptor, not once per tuple), but rather from the inefficiency of the compiled code.

> > Run against ab87b8fedce3fa77ca0d6, I get 12669.619 ms for the 2nd JIT
> > execution and 4594.994 ms for the JIT=off.
>
> Even with a debug LLVM build, which greatly increases compilation
> overhead, I actually see quite the benefit when I force JIT to be used:

I don't see a change when I compile without --enable-debug. jit_debugging_support is off (or, in 11beta2, nonexistent). How can I tell whether I have a debug LLVM build, and turn it off if I do?

> postgres[26832][1]=# ;SET jit_above_cost = -1; set jit_optimize_above_cost = 0; set jit_inline_above_cost = 0;
> postgres[26832][1]=# explain (analyze, buffers, timing off) select pk, int200 from i200c200;

Lowering jit_optimize_above_cost does redeem this for me: it brings the query back to a tie with JIT=off. I don't see any further improvement from lowering jit_inline_above_cost. Overall it is just a statistical tie with JIT=off, not the improvement you get, but at least it isn't a substantial loss.
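
To make the threshold interaction concrete, here is a minimal sketch in C of how the three cost settings appear to combine. It is an assumption-laden model, not PostgreSQL's planner code: the flag names, the JitSettings struct, and jit_flags_for_cost are hypothetical, and the strict "greater than" comparison is my reading of the behaviour. The threshold values are the documented defaults (jit_above_cost = 100000, jit_optimize_above_cost = 500000, jit_inline_above_cost = 500000).

```c
#include <stdbool.h>

/* Hypothetical flag names for this sketch; not PostgreSQL's identifiers. */
enum {
    SKETCH_JIT_PERFORM  = 1,   /* compile expressions / tuple deforming */
    SKETCH_JIT_OPTIMIZE = 2,   /* run the expensive optimization passes */
    SKETCH_JIT_INLINE   = 4    /* inline small functions and operators  */
};

typedef struct {
    bool   jit;                      /* master on/off switch */
    double jit_above_cost;           /* negative: never JIT */
    double jit_optimize_above_cost;  /* negative: never optimize */
    double jit_inline_above_cost;    /* negative: never inline */
} JitSettings;

/* Documented default thresholds; the "jit" switch is assumed to be on. */
static const JitSettings sketch_defaults = { true, 100000.0, 500000.0, 500000.0 };

/* Assumed rule: each feature is enabled only when the plan's total cost
 * exceeds its own threshold, and nothing is enabled unless JIT itself is. */
static int jit_flags_for_cost(const JitSettings *s, double total_cost)
{
    int flags = 0;

    if (!s->jit || s->jit_above_cost < 0 || total_cost <= s->jit_above_cost)
        return 0;
    flags |= SKETCH_JIT_PERFORM;

    if (s->jit_optimize_above_cost >= 0 && total_cost > s->jit_optimize_above_cost)
        flags |= SKETCH_JIT_OPTIMIZE;
    if (s->jit_inline_above_cost >= 0 && total_cost > s->jit_inline_above_cost)
        flags |= SKETCH_JIT_INLINE;

    return flags;
}
```

Under this model, calling jit_flags_for_cost(&sketch_defaults, 230000.0) yields only SKETCH_JIT_PERFORM, i.e. a plan costing 230,000 is compiled but neither optimized nor inlined. Setting jit_optimize_above_cost to 0 adds SKETCH_JIT_OPTIMIZE for the same plan, which would match the observation that lowering that one setting removes the slowdown.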

Under what conditions would I want to use JIT without optimizing the generated code? Is there a rule of thumb that could be documented, or do we just determine it experimentally for each query?

I don't know how sensitive JIT is to hardware.  I'm using Ubuntu 16.04 on VirtualBox (running on Windows 10) on an i5-7200U, which might be important.

I had previously done a poor-man's JIT in which I created 4 versions of the main 'for' loop in slot_deform_tuple. I branched on "if (hasnulls)", and each branch had two loops: one for while 'slow' is false, and one for after 'slow' becomes true, so that the tight loop doesn't keep setting it to true once it already is. I didn't see a noticeable improvement there (although perhaps I would have on different hardware), so I didn't see how JIT could help with this almost-entirely-null case. I'm not trying to address JIT in general, just as it applies to this particular case.
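
For readers who have not looked at slot_deform_tuple, the four-loop idea can be sketched on a toy tuple format. This is not the PostgreSQL code or the actual patch: Attr, Tuple, att_size, and the deform_* functions are hypothetical, alignment is ignored, and a variable-length value is assumed to be a 1-byte length followed by its payload. As in the description above, 'slow' stands for "cached offsets can no longer be used", which happens after the first NULL or variable-length attribute.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy attribute: len > 0 is fixed width; len == -1 is a 1-byte length + payload. */
typedef struct { int len; int cached_off; } Attr;
typedef struct { const unsigned char *data; const bool *isnull; bool hasnulls; } Tuple;

/* Cached offsets are valid only while every preceding attribute is fixed width. */
static void fill_cached_offsets(Attr *atts, int natts)
{
    int off = 0;
    bool valid = true;
    for (int i = 0; i < natts; i++) {
        atts[i].cached_off = valid ? off : -1;
        if (atts[i].len > 0) off += atts[i].len; else valid = false;
    }
}

static size_t att_size(const Attr *a, const unsigned char *p)
{
    return a->len > 0 ? (size_t) a->len : (size_t) 1 + p[0];
}

/* One generic loop: re-tests hasnulls and 'slow' on every attribute. */
static void deform_generic(const Tuple *t, const Attr *atts, int natts,
                           const unsigned char **values)
{
    size_t off = 0;
    bool slow = false;
    for (int i = 0; i < natts; i++) {
        if (t->hasnulls && t->isnull[i]) { values[i] = NULL; slow = true; continue; }
        if (!slow) off = (size_t) atts[i].cached_off;
        values[i] = t->data + off;
        off += att_size(&atts[i], values[i]);
        if (atts[i].len < 0) slow = true;
    }
}

/* Four specialized loops: {hasnulls, no nulls} x {before slow, after slow}. */
static void deform_specialized(const Tuple *t, const Attr *atts, int natts,
                               const unsigned char **values)
{
    size_t off = 0;
    int i = 0;
    if (t->hasnulls) {
        for (; i < natts; i++) {            /* nulls possible, offsets cached */
            if (t->isnull[i]) { values[i++] = NULL; break; }
            off = (size_t) atts[i].cached_off;
            values[i] = t->data + off;
            off += att_size(&atts[i], values[i]);
            if (atts[i].len < 0) { i++; break; }
        }
        for (; i < natts; i++) {            /* nulls possible, offsets accumulated */
            if (t->isnull[i]) { values[i] = NULL; continue; }
            values[i] = t->data + off;
            off += att_size(&atts[i], values[i]);
        }
    } else {
        for (; i < natts; i++) {            /* no nulls, offsets cached */
            off = (size_t) atts[i].cached_off;
            values[i] = t->data + off;
            off += att_size(&atts[i], values[i]);
            if (atts[i].len < 0) { i++; break; }
        }
        for (; i < natts; i++) {            /* no nulls, offsets accumulated */
            values[i] = t->data + off;
            off += att_size(&atts[i], values[i]);
        }
    }
}
```

Both functions should produce the same values array for any tuple. The specialized version only moves the hasnulls and 'slow' tests out of the per-attribute path, which is why any gain depends on how well the hardware already predicts those branches.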

Unrelated to JIT, and relevant to the 'select pk, int199' case but not the 'select pk, int200' case: it seems we have gone to some length to make slot deforming efficient for incremental use, but then we just deform in bulk anyway, up to the maximum attnum used in the query, at least in this case. Is that because incremental deforming is not cache efficient?

Cheers,

Jeff
