Re: "Write amplification" is made worse by "getting tired" while inserting into nbtree secondary indexes (Was: Why B-Tree suffix truncation matters) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: "Write amplification" is made worse by "getting tired" while inserting into nbtree secondary indexes (Was: Why B-Tree suffix truncation matters)
Msg-id CAH2-Wz=JPzTqugerPW8huF35dMU16xj-ko18-xc=d5Qs8PbPww@mail.gmail.com
In response to Re: "Write amplification" is made worse by "getting tired" while inserting into nbtree secondary indexes (Was: Why B-Tree suffix truncation matters) (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
On Tue, Aug 28, 2018 at 11:32 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> Do you plan to submit this patch to the upcoming commit fest perhaps?  I
> have done some testing on it and it seems worth pursuing further.

I should make sure that this makes it into the September 'fest. Thanks
for reminding me. I've been distracted by other responsibilities, but
I think that this project is well worthwhile. I intend to spend lots
of time on it, as I think that it has real strategic importance. I
would be most grateful if you signed up to review the patch. I've been
testing it with quite a variety of real-world data, which seems like
the way to go until the code really matures.

As I indicated to Simon on August 2nd, it seems like I should further
refine my current working draft of the next version to have less
magic. I have a cumbersome test suite that proves that I have
something that improves fan-out on TPC-C and TPC-H indexes by quite a
bit (e.g. the main TPC-C order_line pkey is about 7% smaller after
initial preload, despite not even having smaller pivot tuples due to
alignment). It also proves that the benefits of not "getting tired" in
the event of many duplicates are preserved. We need to have both at
once.

The code to pick a split point during a page split is rather tricky.
That's what's holding the next version up. I can post what I have
right now on the other thread, to give you a better idea of the
direction of the patch. You'll have to hold your nose when you look at
the code that picks a split point, though. Let me know if you think
that that makes sense. I wouldn't want you to spend too much time on
old-ish code.

-- 
Peter Geoghegan

