Re: [PROPOSAL] Effective storage of duplicates in B-tree index. - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [PROPOSAL] Effective storage of duplicates in B-tree index.
Date
Msg-id CAM3SWZR9uvu97Xse8-iJgRzLt8RM0WRHnhZnfE_EMV5+-39VRg@mail.gmail.com
In response to Re: [PROPOSAL] Effective storage of duplicates in B-tree index.  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On Sun, Sep 27, 2015 at 4:11 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Debugging this stuff is sometimes like keyhole surgery. If you could
> just see at/get to the structure that you care about, it would be 10
> times easier. Hopefully this tool makes it easier to identify problems.

I should add that the way that the L&Y technique works, and the fact
that Postgres code is generally very robust/defensive, can make direct
testing difficult. I have seen cases where a completely messed
up B-Tree still gave correct results most of the time, and was just
slower. That can happen, for example, because the "move right" thing
results in a degenerate linear scan of the entire index. The
comparisons in the internal pages were totally messed up, but it
"didn't matter" once a scan could get to leaf pages and could move
right and find the value that way.
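
To make the "move right" effect concrete, here is a minimal toy sketch in Python. It is not the nbtree code: the `Leaf` class, the `descend` and `search` helpers, and the single-level list of separators are all hypothetical simplifications. It only models the case where bad internal-page comparisons send the descent too far left; a descent that lands too far right would simply miss the key.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Leaf:
    keys: List[int]
    high_key: Optional[int] = None   # None on the rightmost leaf
    right: Optional["Leaf"] = None   # right-sibling link, as in L&Y


def build_leaves(chunks: List[List[int]]) -> List[Leaf]:
    """Chain sorted chunks into leaves with right links and high keys."""
    leaves = [Leaf(keys=list(c)) for c in chunks]
    for left, right in zip(leaves, leaves[1:]):
        left.right = right
        left.high_key = left.keys[-1]
    return leaves


def descend(separators: List[int], leaves: List[Leaf], key: int) -> Leaf:
    """Toy internal page: separators[i] is the upper bound for leaves[i]."""
    for i, sep in enumerate(separators):
        if key <= sep:
            return leaves[i]
    return leaves[-1]


def search(separators: List[int], leaves: List[Leaf], key: int) -> Tuple[bool, int]:
    """Return (found, leaf pages visited), moving right past too-small high keys."""
    leaf = descend(separators, leaves, key)
    visited = 1
    while leaf.high_key is not None and key > leaf.high_key:
        leaf = leaf.right
        visited += 1
    return key in leaf.keys, visited


LEAVES = build_leaves([list(range(1, 11)), list(range(11, 21)),
                       list(range(21, 31)), list(range(31, 41))])
GOOD_SEPARATORS = [10, 20, 30]
BAD_SEPARATORS = [1000, 20, 30]   # corrupt: every lookup lands on the first leaf

good_result = search(GOOD_SEPARATORS, LEAVES, 35)
bad_result = search(BAD_SEPARATORS, LEAVES, 35)
```

Both searches find the key, so a test that only checks query results passes either way. The only visible difference is that the corrupt case walks every leaf to get there.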

I wrote amcheck because I thought it was scary how B-Tree indexes
could be *completely* messed up without it being obvious; what hope is
there of a test finding a subtle problem in their structure, then?
Testing the invariants directly seemed like the only way to have a
chance of not introducing bugs when adding new stuff to the B-Tree
code. I believe that adding optimizations to the B-Tree code will be
important in the next couple of years, and there is no other way to
approach it IMV.
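
As a rough illustration of what "testing the invariants directly" can mean, here is a toy Python sketch that walks one level of a tree and reports structural violations. It does not show how amcheck is implemented; the `Page` class and `check_level` function are hypothetical, and the three checks are just examples of the kind of invariant one might verify.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    keys: List[int]
    high_key: Optional[int] = None   # None on the rightmost page of a level
    right: Optional["Page"] = None   # right-sibling link


def check_level(leftmost: Page) -> List[str]:
    """Walk one level left to right and collect invariant violations."""
    problems = []
    page, n = leftmost, 0
    while page is not None:
        # Invariant 1: items on a page are in non-decreasing order.
        if any(a > b for a, b in zip(page.keys, page.keys[1:])):
            problems.append(f"page {n}: items out of order")
        # Invariant 2: no item on a page exceeds the page's high key.
        if page.high_key is not None and any(k > page.high_key for k in page.keys):
            problems.append(f"page {n}: item exceeds high key")
        # Invariant 3: the right sibling does not start below this high key.
        sibling = page.right
        if (sibling is not None and sibling.keys
                and page.high_key is not None
                and sibling.keys[0] < page.high_key):
            problems.append(f"page {n}: right sibling starts below high key")
        page, n = page.right, n + 1
    return problems


good = Page([1, 2, 3], high_key=3, right=Page([4, 5, 6]))
bad = Page([1, 5, 3], high_key=5, right=Page([4, 6]))

good_problems = check_level(good)
bad_problems = check_level(bad)
```

The point of checking structure rather than query results is that `bad` above could still answer many lookups correctly, yet the checker flags it immediately.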

-- 
Peter Geoghegan


