Re: Protecting against unexpected zero-pages: proposal - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: Protecting against unexpected zero-pages: proposal
Date	November 9, 2010 21:49:34
Msg-id	AANLkTintPQKjFBEMGc3Ww_db7wmTt8FeF=VqLgeQ69AG@mail.gmail.com
In response to	Re: Protecting against unexpected zero-pages: proposal (Greg Stark <gsstark@mit.edu>)
List	pgsql-hackers

On Tue, Nov 9, 2010 at 3:05 PM, Greg Stark <gsstark@mit.edu> wrote:
> On Tue, Nov 9, 2010 at 7:37 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> Well, most of the other MVCC-in-table DBMSes simply don't deal with
>> large, on-disk databases.  In fact, I can't think of one which does,
>> currently; while MVCC has been popular for the New Databases, they're
>> all focused on "in-memory" databases.  Oracle and InnoDB use rollback
>> segments.
>
> Well rollback segments are still MVCC. However Oracle's MVCC is
> block-based. So they only have to do the visibility check once per
> block, not once per row. Once they find the right block version they
> can process all the rows on it.
>
> Also Oracle's snapshots are just the log position. Instead of having
> to check whether every transaction committed or not, they just find
> the block version which was last modified before the log position for
> when their transaction started.
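
As a rough illustration of the block-version lookup Greg describes, here is a minimal Python sketch. It is not Oracle's implementation: the names `BlockVersion`, `lsn`, and `visible_version` are hypothetical, and it assumes every retained version of a block is tagged with the log position at which it was written.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class BlockVersion:
    # Hypothetical model: one retained version of a block.
    lsn: int                # log position at which this version was written
    rows: Tuple[str, ...]   # row payloads held by this version


def visible_version(versions: Sequence[BlockVersion],
                    snapshot_lsn: int) -> Optional[BlockVersion]:
    """Pick the newest version written strictly before snapshot_lsn.

    One comparison per block version replaces a commit-status check per row.
    Returns None when no version predates the snapshot.
    """
    candidates = [v for v in versions if v.lsn < snapshot_lsn]
    if not candidates:
        return None
    return max(candidates, key=lambda v: v.lsn)


versions = [
    BlockVersion(lsn=10, rows=("a",)),
    BlockVersion(lsn=20, rows=("a", "b")),
    BlockVersion(lsn=30, rows=("c",)),
]
chosen = visible_version(versions, snapshot_lsn=25)
```

Once a version is chosen, every row on it is visible to the snapshot, which is the per-block (rather than per-row) check mentioned above.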

That is cool.  One problem is that it might sometimes result in
additional I/O.  A transaction begins and writes a tuple.  We must
write a preimage of the page (or at least, sufficient information to
reconstruct a preimage of the page) to the undo segment.  If the
transaction commits relatively quickly, and all transactions which
took their snapshots before the commit end either by committing or by
aborting, we can discard that information from the undo segment
without ever writing it to disk.  However, if that doesn't happen, the
undo log page may get evicted, and we're now doing three writes (WAL,
page, undo) rather than just two (WAL, page).  That's no worse than an
update where the old and new tuples land on different pages, but it IS
worse than an update where the old and new tuples are on the same
page, or at least I think it is.
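
The write counts above can be tallied with a small Python sketch. It only models the cases named in this message; the function names and the `undo_evicted` / `same_page` flags are hypothetical, and it assumes one WAL write and one write per dirtied page.

```python
from typing import List


def undo_scheme_writes(undo_evicted: bool) -> List[str]:
    """Writes for one update when preimages go to an undo segment.

    The undo page costs a write only if it is evicted before it can be
    discarded (that is, before the transaction and all older snapshots end).
    """
    writes = ["WAL", "page"]
    if undo_evicted:
        writes.append("undo")
    return writes


def in_table_writes(same_page: bool) -> List[str]:
    """Writes for one update when old and new tuples both live in the table."""
    writes = ["WAL", "page"]
    if not same_page:
        # The new tuple landed on a different page, so two pages are dirtied.
        writes.append("second page")
    return writes


# Undo evicted: three writes, the same count as an in-table update whose
# old and new tuples land on different pages.
worst_undo = undo_scheme_writes(undo_evicted=True)
cross_page = in_table_writes(same_page=False)

# In-table update on a single page: two writes, one fewer than evicted undo.
same_page = in_table_writes(same_page=True)
```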

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
