Re: JIT compiling with LLVM v9.1 - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: JIT compiling with LLVM v9.1 |
Date | |
Msg-id | 20180203091321.in4zqz3dlc2kkt42@alap3.anarazel.de Whole thread Raw |
In response to | Re: JIT compiling with LLVM v9.1 (Jeff Davis <pgsql@j-davis.com>) |
Responses |
Re: JIT compiling with LLVM v9.1
|
List | pgsql-hackers |
Hi, On 2018-02-02 18:21:12 -0800, Jeff Davis wrote: > On Mon, Jan 29, 2018 at 1:53 AM, Andres Freund <andres@anarazel.de> wrote: > >> https://git.postgresql.org/git/users/andresfreund/postgres.git > > There's a patch in there to change the scan order. Yes - note it's "deactivated" at the moment in the series. I primarily have it in there because I found profiles to be a lot more useful if it's enabled, as otherwise the number of cache misses and related stalls from heap accesses completely swamp everything else. FWIW, there's http://archives.postgresql.org/message-id/20161030073655.rfa6nvbyk4w2kkpk%40alap3.anarazel.de > I suggest that you rename the GUC "synchronize_seqscans" to something > more generic like "optimize_scan_order", and use it to control your > feature as well (after all, it's the same trade-off: weird scan order > vs. performance). Then, go ahead and commit it. FWIW I see about a 7% > boost on my laptop[1] from that patch on master, without JIT or > anything else. Yea, that's roughly the same magnitude of what I'm seeing, some queries even bigger. I'm not sure I want to commit this right now - ISTM we couldn't default this to on without annoying a lot of people, and letting the performance wins on the table by default seems like a shame. I think we should probably either change the order we store things on the page by default or only use the faster order if the scan above doesn't care about order - the planner could figure that out easily. I personally don't think it is necessary to get this committed at the same time as the JIT stuff, so I'm not planning to push very hard on that front. Should you be interested in taking it up, please feel entirely free. > I also see you dropped "7ae518bf Centralize slot deforming logic a > bit.". Was that intentional? Do we want it? The problem is that there's probably some controversial things in there. I think the checks I dropped largely make no sense, but I don't really want to push for that hard. I suspect we probably still want it, but I do not want to put into the critical path right now. > I think I saw about a 2% gain here over master, but when I applied it > on top of the fast scans it did not seem to add anything on top of > fast scans. Seems reproducible, but I don't have an explanation. Yea, that makes sense. The primary reason the patch is beneficial is that it centralizes the place where the HeapTupleHeader is accessed to a single piece of code (slot_deform_tuple()). In a lot of cases that first access will result in a cache miss in all layers, requiring a memory access. In slot_getsomeattrs() there's very little that can be done in an out-of-order manner, whereas slot_deform_tuple() can continue execution a bit further. Also, the latter will then go and sequentially access the rest (or a significant part of) the tuple, so a centralized access is more prefetchable. > And you are probably already working on this, but it would be helpful > to get the following two patches in also: > * 3c22065f Do execGrouping via expression eval > * a9dde4aa Allow tupleslots to have a fixed tupledesc Yes, I plan to resume working in whipping them up into shape as soon as I've finished the move to a shared library. This weekend I'm at fosdem, so that's going to be after... Thanks for looking! Andres Freund
pgsql-hackers by date: