Re: UNDO and in-place update - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: UNDO and in-place update
Date
Msg-id CAEepm=275rxsvaM1pZTRg31=W7QBoWA_ShzH9StDmySC1B5emg@mail.gmail.com
Whole thread Raw
In response to Re: UNDO and in-place update  (Peter Geoghegan <pg@heroku.com>)
Responses Re: UNDO and in-place update  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Nov 23, 2016 at 6:01 PM, Peter Geoghegan <pg@heroku.com> wrote:
> * Our behavior with many duplicates in secondary indexes is pretty bad
> in general, I suspect.

From the pie-in-the-sky department:  I believe there are
snapshot-based systems that don't ever have more than one entry for a
logical row in a btree.  Instead, they do UNDO-log based snapshot
reconstruction for btrees too, generalising what is being discussed
for heap tables in this thread.  That means that whenever you descend
a btree, at each page you have to be prepared to go and look in the
UNDO log if necessary on a page-by-page basis.  Alternatively, some
systems use a shadow table rather than an UNDO log for the old
entries, but still privilege queries that need the latest version by
keeping only those in the btree.  I don't know the details but suspect
that both of those approaches might require CSN (= LSN?) based
snapshots, so that you can make page-level visibility determinations
while traversing the 'current' btree based on a page header.  That
would reduce both kinds of btree bloat: the primary kind of being
entries for dead tuples that still exist in the heap, and the
secondary kind being the resultant permanent btree fragmentation that
remains even after vacuum removes them.  Presumably once you have all
that, you can allow index-only-scans without a visibility map.  Also,
I suspect that such a "3D time travelling" btree would be a
requirement for index ordered tables in a snapshot-based RDBMS.

-- 
Thomas Munro
http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: DROP FUNCTION of multiple functions
Next
From: Thomas Munro
Date:
Subject: Re: amcheck (B-Tree integrity checking tool)