Re: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare() - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()
Date
Msg-id 1583D0EB-FCDB-4683-96BF-8D67974F9156@enterprisedb.com
Whole thread Raw
In response to Re: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()
Re: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()
List pgsql-hackers

> On Mar 2, 2020, at 5:29 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Sun, Mar 1, 2020 at 12:15 PM Floris Van Nee <florisvannee@optiver.com> wrote:
>> Minor conflict with that patch indeed. Attached is rebased version. I'm running some tests now - will post the
resultshere when finished. 
>
> Thanks.
>
> We're going to have to go back to my original approach to inlining. At
> least, it seemed to be necessary to do that to get any benefit from
> the patch on my comparatively modest workstation (using a similar
> pgbench SELECT benchmark to the one that you ran). Tom also had a
> concern about the portability of inlining without using "static
> inline" -- that is another reason to avoid the approach to inlining
> taken by v3 + v4.
>
> It's possible (though not very likely) that performance has been
> impacted by the deduplication patch (commit 0d861bbb), since it
> updated the definition of BTreeTupleGetNAtts() itself.
>
> Attached is v5, which inlines in a targeted fashion, pretty much in
> the same way as the earliest version. This is the same as v4 in every
> other way. Perhaps you can test this.

Hi Peter, just a quick code review:

The v5 patch files apply and pass the regression tests. I get no errors.  The performance impact is within the noise.
TheTPS with the patch are higher sometimes and lower other times, looking across the 27 subtests of the "select-only"
benchmark. Which subtest is slower or faster changes per run, so that doesn't seem to be relevant.  I ran the
"select-only"six times (three with the patch, three without).  The change you made to the loop is clearly visible in
thenbtsearch.s output, and the change to inline _bt_compare is even more visible, so there doesn't seem to be any
defeatingof your change due to the compiler ignoring the "inline" or such.  I compiled using gcc -O2 

% gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr
--with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 11.0.0 (clang-1100.0.33.17)
Target: x86_64-apple-darwin19.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

2.4 GHz 8-Core Intel Core i9
32 GB 2667 MHz DDR4

Reading this thread, I think the lack of a performance impact on laptop hardware was expected, but perhaps confirmation
thatit does not make things worse is useful? 

Since this patch doesn't seem to do any harm, I would mark it as "ready for committer", except that there doesn't yet
seemto be enough evidence that it is a net win. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Trouble with hashagg spill I/O pattern and costing
Next
From: Alexander Korotkov
Date:
Subject: Re: Operator class parameters and sgml docs