Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers

From Jonah H. Harris
Subject Re: vacuum, performance, and MVCC
Date
Msg-id 36e682920606222020v55974e31wd537c2bb68f17446@mail.gmail.com
In response to Re: vacuum, performance, and MVCC  (Gavin Sherry <swm@linuxworld.com.au>)
Responses Re: vacuum, performance, and MVCC  (Gavin Sherry <swm@linuxworld.com.au>)
Re: vacuum, performance, and MVCC  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-hackers
On 6/22/06, Gavin Sherry <swm@linuxworld.com.au> wrote:
> There seems to be a bit of confusion about what MVCC is.

Yes, PostgreSQL implements MVTO-style concurrency, Oracle implements
MVRC, ...  Let's not go into theory here, because there are plenty of
papers and books on the subject.

> The other MVCC databases maintain UNDO logs outside of the
> table's data files.

*Most* other MVCC systems act this way, but not all.

> When an update occurs, the existing row version is
> copied to the UNDO file

Not in all systems.  A few now perform in-memory UNDO and only write
it to disk if and when it is required.
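
To make the distinction concrete, here is a rough sketch of an
overwriting store that keeps UNDO in memory and spills it only when a
limit is hit.  Everything in it (UndoTable, max_in_memory, the
"spilled" dict standing in for an on-disk UNDO file) is hypothetical
and not taken from any real system; it also assumes ordered
transaction ids and that every writer has already committed.

```python
# Illustrative only: update-in-place storage with an in-memory UNDO chain.
# All names are hypothetical; no real database works exactly like this.

class UndoTable:
    def __init__(self, max_in_memory=2):
        self.rows = {}      # key -> (value, writer_txid), overwritten in place
        self.undo = {}      # key -> in-memory old versions, oldest first
        self.spilled = {}   # key -> old versions "written to disk", oldest first
        self.max_in_memory = max_in_memory

    def update(self, key, value, txid):
        # Copy the existing version to UNDO before overwriting it.
        if key in self.rows:
            chain = self.undo.setdefault(key, [])
            chain.append(self.rows[key])
            # Spill the oldest version only when the memory limit is exceeded.
            if len(chain) > self.max_in_memory:
                self.spilled.setdefault(key, []).append(chain.pop(0))
        self.rows[key] = (value, txid)

    def read(self, key, snapshot_txid):
        # Walk versions newest first and return the first one that the
        # reader's snapshot is allowed to see.
        versions = (self.spilled.get(key, [])
                    + self.undo.get(key, [])
                    + [self.rows[key]])
        for value, writer_txid in reversed(versions):
            if writer_txid <= snapshot_txid:
                return value
        return None  # the row did not exist as of this snapshot


table = UndoTable(max_in_memory=1)
table.update("a", "v1", txid=10)
table.update("a", "v2", txid=20)
table.update("a", "v3", txid=30)
print(table.read("a", snapshot_txid=25))
```

A reader with an old snapshot only touches the spilled versions when
the in-memory chain does not reach back far enough.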

> Overwriting MVCC comes with its own baggage. Ask any Oracle user about
> error ORA-01555[1]. There's also the added cost of managing the UNDO logs,
> the cost of jumping around between files to get row versions and so on.

This seems to be going in the direction of our common MySQL
discussions: basing our assumptions about the current version on old
failures and mistakes.  Please stay apprised of current developments
in other systems.

As you probably haven't done consulting for Oracle recently, I can
tell you that the once-dreaded ORA-01555 is nearly a thing of the
past.  Continuing to dispense misinformation doesn't help anyone, so
let's stop using Oracle's past problems (which are largely due to
improper tuning) as a reason to prefer PostgreSQL's methods over
others.

> Also, it leads to inefficiency with variable size data types. The new
> version of a row might be longer or shorter than the previous version and
> this has to be causing them headaches and performance penalties.

I've seen many MVCC implementations using update-in-place (overwriting
storage manager) with plenty of variable-length attributes; not all
have these problems.

> J. Gray & A Reuter, Transaction Processing: Concepts and Techniques

Pretty much older than dirt, discusses locking, and barely touches on
MVCC.  Still has some good concepts though.

> US Patent Number 5,870,758 -- (one of?) Oracle's snapshot isolation patent

Again, not the optimal implementation of MVCC, but it certainly has
advantages.

> Tom Lane's MVCC talk:
> http://www.postgresql.org/files/developer/transactions.pdf

Always a good read.

I suggest reading Transactional Information Systems: Theory,
Algorithms, and the Practice of Concurrency Control by Weikum and
Vossen... it's basically the new Gray and Reuter.  It covers multiple
MVCC techniques as well as the differences between the various
UNDO/REDO database models.

-- 
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation            | fax: 732.331.1301
33 Wood Ave S, 2nd Floor            | jharris@enterprisedb.com
Iselin, New Jersey 08830            | http://www.enterprisedb.com/
