Re: Making all nbtree entries unique by having heap TIDs participate in comparisons - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date
Msg-id CAH2-Wznizred1To2x7uzkRD8ZYOpPbzYaoU5aG_g_UJCsiWAFQ@mail.gmail.com
In response to Re: Making all nbtree entries unique by having heap TIDs participate in comparisons  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
List pgsql-hackers
On Thu, Mar 14, 2019 at 2:21 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> It doesn't matter how often it happens, the code still needs to deal
> with it. So let's try to make it as readable as possible.

> Well, IMHO holding the buffer and the bounds in the new struct is more
> clean than the savebinsrch/restorebinsrch flags. That's exactly why I
> suggested it. I don't know what else to suggest. I haven't done any
> benchmarking, but I doubt there's any measurable difference.

Fair enough. Attached is v17, which does it using the approach taken
in your earlier prototype. I even came around to your view on
_bt_binsrch_insert() -- I kept that part, too. Note, however, that I
still pass checkingunique to _bt_findinsertloc(), because that's a
distinct condition from whether or not the bounds were cached (the two
happen to coincide right now, but I don't want to assume that).

This revision also integrates your approach to the "continuescan"
optimization patch, with the small tweak I mentioned yesterday (we
also pass ntupatts). I also prefer this approach.

I plan on committing the first few patches early next week, barring
any objections, or any performance problems noticed during an
additional, final round of performance validation. I won't expect
feedback from you until Monday at the earliest. It would be nice if
you could take a look at the amcheck "relocate" patch. My intention is
to push patches up to and including the amcheck "relocate" patch on
the same day (I'll leave a few hours between the first two patches, to
confirm that the first patch doesn't break the buildfarm).

BTW, my multi-day, large BenchmarkSQL benchmark continues, with some
interesting results. The first round of 12-hour runs showed the
patch nearly 6% ahead in terms of transaction throughput, with a
database that's almost 1 terabyte. The second round, which completed
yesterday and reuses the database initialized for the first round,
showed that the patch had 10.7% higher throughput. That's a new record
for the patch. I'm going to leave this benchmark running for a few
more days, at least until it stops being interesting. I wonder how
long it will be before the master branch throughput stops declining
relative to throughput with the patched version. I expect that the
master branch will reach "index bloat saturation point" sooner or
later. Indexes in the patch's data directory continue to get larger,
as expected, but the amount of bloat accumulated over time is barely
noticeable (i.e., the pages are packed tight with tuples, and that
density barely declines over time).

This version of the patch series has attributions/credits at the end
of the commit messages. I have listed you as a secondary author on a
couple of the patches, where code was lifted from your feedback
patches. Let me know if you think that I have it right.

Thanks
-- 
Peter Geoghegan
