Home > mailing lists

Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers

From	Mark Woodward
Subject	Re: vacuum, performance, and MVCC
Date	June 24, 2006 13:06:05
Msg-id	18926.24.91.171.78.1151155408.squirrel@mail.mohawksoft.com Whole thread Raw
In response to	Re: vacuum, performance, and MVCC (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses	Re: vacuum, performance, and MVCC Re: vacuum, performance, and MVCC Re: vacuum, performance, and MVCC Re: vacuum, performance, and MVCC
List	pgsql-hackers

Tree view

> On Sat, 24 Jun 2006, Mark Woodward wrote:
>
>> I'm probably mistaken, but aren't there already forward references in
>> tuples to later versions? If so, I'm only sugesting reversing the order
>> and referencing the latest version.
>
> I thought I understood your idea, but now you lost me again. I thought
> what you want is that the older heap tuple has a pointer to the
> newer one. Which it already has, it's called t_ctid.

Perfect!
>
> Can you try to explain more carefully how the whole thing would work?
> What would an index tuple point to? What pointers would a heap tuple
> have? What would an index scan do to find the row version it's interested
> in? What exactly would an update do?

Since we already allocate space for some notion of linked list, then all
I'm suggesting is we reverse the order, sort of. Currently it looks like
this:

ver001->ver002->ver003->...-verN

That's what t_ctid does now, right? Well, that's sort of stupid. Why not
have it do this:

ver001->verN->...->ver003->ver002->|^---------------------------------/

This will speed up almost *all* queries when there are more than two
version of rows.

OK, here is the behavior of an update:
(1) Find the latest version of the row
(2) Duplicate row and modify as per plan
(3) Set the t_ctid of the new row to the last "latest"
(4) Set the t_ctid of the first row to that of the new row
(5) Attempt to index the row
(6) If the first version of the row is in the index already (ver001) Don't
modify the index, otherwise, add the new version (just as before)

When you vacuum, simply make the latest version (verN) the key row (ver001).

There are, no doubt, issues that need to be resolved (I can think of a
coouple off the top of my head), but overall I think it is workable and
don't think this will affect performance in the simple case and improve
performance in the cases where there are more than one or two version of a
row.

pgsql-hackers by date:

From: Heikki Linnakangas
Date: 24 June 2006, 12:45:08
Subject: Re: vacuum, performance, and MVCC

From: "Jonah H. Harris"
Date: 24 June 2006, 13:13:35
Subject: Re: vacuum, performance, and MVCC

Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers

Previous

Next