Re: Making all nbtree entries unique by having heap TIDs participate in comparisons - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date
Msg-id CAH2-Wzktw6L64=FnNjsDrZGW-hk3wqNx-f4UFrzUdnMSmYNzqg@mail.gmail.com
In response to Re: Making all nbtree entries unique by having heap TIDs participate in comparisons (Andrey Lepikhov <a.lepikhov@postgrespro.ru>)
Responses Re: Making all nbtree entries unique by having heap TIDs participate in comparisons (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Fri, Sep 28, 2018 at 10:58 PM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> I am reviewing this patch too. And join to Peter Eisentraut opinion
> about splitting the patch into a hierarchy of two or three patches:
> "functional" - tid stuff and "optimizational" - suffix truncation &
> splitting. My reasons are simplification of code review, investigation
> and benchmarking.

As I mentioned to Peter, I don't think that I can split out the heap
TID stuff from the suffix truncation stuff. At least not without
making the patch even more complicated, for no benefit. I will split
out the "brain" of the patch (the _bt_findsplitloc() stuff, which
decides on a split point using sophisticated rules) from the "brawn"
(the actual changes to how index scans work, including the heap TID
stuff, as well as the code for actually physically performing suffix
truncation). The brain of the patch is where most of the complexity
is, as well as most of the code. The brawn of the patch is _totally
unusable_ without intelligence around split points, but I'll split
things up along those lines anyway. Doing so should make the whole
design a little easier to follow.
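
For readers new to the thread, here is a minimal sketch of the two
"brawn" ideas named above: heap TID acting as a final tiebreaker in
comparisons, and suffix truncation keeping only as many leading
attributes as a separator key needs. This is not code from the patch.
The struct layout, the fixed two-column integer key, and every
sketch_* name are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified index tuple: two integer key columns plus a heap TID. */
#define NKEYATTS 2

typedef struct {
    uint32_t block;   /* heap block number */
    uint16_t offset;  /* item offset within the heap block */
} SketchTid;

typedef struct {
    int32_t   keys[NKEYATTS];
    SketchTid tid;
} SketchTuple;

/*
 * Compare the key columns first, then fall back on heap TID as a
 * tiebreaker, so that no two distinct entries ever compare as equal.
 */
static int
sketch_compare(const SketchTuple *a, const SketchTuple *b)
{
    for (int i = 0; i < NKEYATTS; i++) {
        if (a->keys[i] != b->keys[i])
            return (a->keys[i] < b->keys[i]) ? -1 : 1;
    }
    if (a->tid.block != b->tid.block)
        return (a->tid.block < b->tid.block) ? -1 : 1;
    if (a->tid.offset != b->tid.offset)
        return (a->tid.offset < b->tid.offset) ? -1 : 1;
    return 0;
}

/*
 * Return how many leading attributes a separator key must keep in order
 * to tell lastleft apart from firstright at a candidate split point.
 * A result of NKEYATTS + 1 means every key column is equal, so the heap
 * TID itself has to be kept in the separator.
 */
static int
sketch_keep_natts(const SketchTuple *lastleft, const SketchTuple *firstright)
{
    /* Tuples on the left of a split point must sort strictly lower. */
    assert(sketch_compare(lastleft, firstright) < 0);

    int keep = 1;
    for (int i = 0; i < NKEYATTS; i++) {
        if (lastleft->keys[i] != firstright->keys[i])
            return keep;
        keep++;
    }
    return keep;
}
```

In this toy model, a split point whose neighboring tuples differ in
the first column yields a one-attribute separator, while a split point
in the middle of a run of duplicates needs all key columns plus the
heap TID. That difference is the kind of thing a split point choice
can exploit.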

> Now benchmarking is not clear. Possible performance degradation from TID
> ordering interfere with positive effects from the optimizations in
> non-trivial way.

Is there any evidence of a regression in the last 2 versions? I've
been using pgbench, which didn't show any. That's not a sympathetic
case for the patch, though it would be nice to confirm if there was
some small improvement there. I've seen contradictory results (slight
improvements and slight regressions), but that was with a much earlier
version, so it just isn't relevant now. pgbench is mostly interesting
as a thing that we want to avoid regressing.

Once I post the next version, it would be great if somebody could use
HammerDB's OLTP test, which seems like the best fair use
implementation of TPC-C that's available. I would like to make that
the "this is why you should care, even if you happen to not believe in
the patch's strategic importance" benchmark. TPC-C is clearly the most
influential database benchmark ever, so I think that that's a fair
request. (See the TPC-C commentary at
https://www.hammerdb.com/docs/ch03s02.html, for example.)

-- 
Peter Geoghegan

