Re: JIT compiling with LLVM v9.1 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: JIT compiling with LLVM v9.1
Date
Msg-id 20180203091321.in4zqz3dlc2kkt42@alap3.anarazel.de
Whole thread Raw
In response to Re: JIT compiling with LLVM v9.1  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: JIT compiling with LLVM v9.1
List pgsql-hackers
Hi,

On 2018-02-02 18:21:12 -0800, Jeff Davis wrote:
> On Mon, Jan 29, 2018 at 1:53 AM, Andres Freund <andres@anarazel.de> wrote:
> >>   https://git.postgresql.org/git/users/andresfreund/postgres.git
> 
> There's a patch in there to change the scan order.

Yes - note it's "deactivated" at the moment in the series. I primarily
have it in there because I found profiles to be a lot more useful if
it's enabled, as otherwise the number of cache misses and related stalls
from heap accesses completely swamp everything else.

FWIW, there's http://archives.postgresql.org/message-id/20161030073655.rfa6nvbyk4w2kkpk%40alap3.anarazel.de


> I suggest that you rename the GUC "synchronize_seqscans" to something
> more generic like "optimize_scan_order", and use it to control your
> feature as well (after all, it's the same trade-off: weird scan order
> vs.  performance). Then, go ahead and commit it. FWIW I see about a 7%
> boost on my laptop[1] from that patch on master, without JIT or
> anything else.

Yea, that's roughly the same magnitude of what I'm seeing, some queries
even bigger.

I'm not sure I want to commit this right now - ISTM we couldn't default
this to on without annoying a lot of people, and letting the performance
wins on the table by default seems like a shame.  I think we should
probably either change the order we store things on the page by default
or only use the faster order if the scan above doesn't care about order
- the planner could figure that out easily.

I personally don't think it is necessary to get this committed at the
same time as the JIT stuff, so I'm not planning to push very hard on
that front. Should you be interested in taking it up, please feel
entirely free.


> I also see you dropped "7ae518bf Centralize slot deforming logic a
> bit.". Was that intentional? Do we want it?

The problem is that there's probably some controversial things in
there. I think the checks I dropped largely make no sense, but I don't
really want to push for that hard.  I suspect we probably still want it,
but I do not want to put into the critical path right now.


> I think I saw about a 2% gain here over master, but when I applied it
> on top of the fast scans it did not seem to add anything on top of
> fast scans. Seems reproducible, but I don't have an explanation.

Yea, that makes sense. The primary reason the patch is beneficial is
that it centralizes the place where the HeapTupleHeader is accessed to a
single piece of code (slot_deform_tuple()). In a lot of cases that first
access will result in a cache miss in all layers, requiring a memory
access. In slot_getsomeattrs() there's very little that can be done in
an out-of-order manner, whereas slot_deform_tuple() can continue
execution a bit further. Also, the latter will then go and sequentially
access the rest (or a significant part of) the tuple, so a centralized
access is more prefetchable.


> And you are probably already working on this, but it would be helpful
> to get the following two patches in also:
> * 3c22065f Do execGrouping via expression eval
> * a9dde4aa Allow tupleslots to have a fixed tupledesc

Yes, I plan to resume working in whipping them up into shape as soon as
I've finished the move to a shared library. This weekend I'm at fosdem,
so that's going to be after...

Thanks for looking!

Andres Freund


pgsql-hackers by date:

Previous
From: Craig Ringer
Date:
Subject: Re: pie-in-sky idea: 'sensitive' function parameters
Next
From: Andres Freund
Date:
Subject: Re: JIT compiling with LLVM v9.1