Home > mailing lists

Re: MVCC and index-only read - Mailing list pgsql-general

From	Sam Mason
Subject	Re: MVCC and index-only read
Date	November 18, 2008 16:48:52
Msg-id	20081118174847.GR2459@frubble.xen.chris-lamb.co.uk Whole thread Raw
In response to	MVCC and index-only read (Scara Maccai <m_lists@yahoo.it>)
Responses	Re: MVCC and index-only read (Tom Lane <tgl@sss.pgh.pa.us>) Re: MVCC and index-only read ("Jonah H. Harris" <jonah.harris@gmail.com>)
List	pgsql-general

Tree view

On Tue, Nov 18, 2008 at 04:49:35PM +0000, Scara Maccai wrote:
> if I got it right the reason some aggregates (such as COUNT) using
> only index columns are "slow" on postgresql is that it uses MVCC, so
> it has to read the data as well as the index.

Every aggregate (of which COUNT is just one example) has to read data
from both the index and the table.  The reason is that each row in a
table has two important identifiers; the transaction that created it and
the transaction that killed it.  Every time a query scans the table it
looks to see that both the transaction that created it COMMITed and that
transaction that killed it (if any) didn't COMMIT.  The index doesn't
contain these two identifiers so when scanning the index the code needs
to go and check what these are.

There are various optimizations in PG so that it doesn't need to
actually check the transaction numbers the whole time, thus speeding
things up a bit, but the semantics/behavior is the same.

> It makes sense to me,
> but I don't understand is how other databases (such as Oracle) do it.

I believe Oracle maintains a separate log (not sure how it's structured)
that contains this information and all the data in both the main table
and index can be considered committed.

There are tradeoffs in both directions; PG's implementation allows
greater concurrency, but Oracle's way is more optimized for read access.
Which implementation is better depends a lot on your work load.

There has been talk of adding the transaction identifiers into the
indexes in PG, which would mean that index scans wouldn't need to go
to the table.  The problem is that the indexes would be larger and
modifying data would incur larger overheads as both the data and index
would have to be updated.

I hope someone will point out any mistakes I've made!

  Sam

pgsql-general by date:

From: "Serge Fonville"
Date: 18 November 2008, 16:45:38
Subject: Re: High Availability for PostgreSQL on Windows 2003.

From: "Richard Broersma"
Date: 18 November 2008, 16:50:16
Subject: Re: MS Access and PostgreSQL - a warning to people thinking about it

Re: MVCC and index-only read - Mailing list pgsql-general

Previous

Next