Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: vacuum, performance, and MVCC
Date
Msg-id 449D8DFA.7010700@Yahoo.com
Whole thread Raw
In response to Re: vacuum, performance, and MVCC  ("Mark Woodward" <pgsql@mohawksoft.com>)
Responses Re: vacuum, performance, and MVCC
List pgsql-hackers
On 6/24/2006 9:23 AM, Mark Woodward wrote:

>> On Sat, 24 Jun 2006, Mark Woodward wrote:
>>
>>> I'm probably mistaken, but aren't there already forward references in
>>> tuples to later versions? If so, I'm only sugesting reversing the order
>>> and referencing the latest version.
>>
>> I thought I understood your idea, but now you lost me again. I thought
>> what you want is that the older heap tuple has a pointer to the
>> newer one. Which it already has, it's called t_ctid.
> 
> Perfect!
>>
>> Can you try to explain more carefully how the whole thing would work?
>> What would an index tuple point to? What pointers would a heap tuple
>> have? What would an index scan do to find the row version it's interested
>> in? What exactly would an update do?
> 
> 
> Since we already allocate space for some notion of linked list, then all
> I'm suggesting is we reverse the order, sort of. Currently it looks like
> this:
> 
> ver001->ver002->ver003->...-verN
> 
> That's what t_ctid does now, right? Well, that's sort of stupid. Why not
> have it do this:
> 
> ver001->verN->...->ver003->ver002->|
>  ^---------------------------------/
> 
> This will speed up almost *all* queries when there are more than two
> version of rows.
> 
> OK, here is the behavior of an update:
> (1) Find the latest version of the row
> (2) Duplicate row and modify as per plan
> (3) Set the t_ctid of the new row to the last "latest"
> (4) Set the t_ctid of the first row to that of the new row
> (5) Attempt to index the row
> (6) If the first version of the row is in the index already (ver001) Don't
> modify the index, otherwise, add the new version (just as before)
> 
> When you vacuum, simply make the latest version (verN) the key row (ver001).

This isn't done "simply". Currently, vacuum collects a trivial array of 
ctid's it is removing and every now and then does a bulk remove of the 
index tuples pointing to them. Now lets consider a table with two 
indexed columns with the following row versions resulting from an insert 
and 3 updates to that same row:
  v1:  a,b  v2:  a,c  v3:  a,d  v4:  b,d

In your new scheme, there would be two index tuples for column 1 (one 
pointing to v1, one pointing to v4) and 3 index tuples for column 2 (one 
for each different value pointing to v1, v2 and v3). Did I get that 
right so far?

If vacuum now can remove v1, it has to update index 1 to point to v2 and 
remove the pointer to v1 from index 2. If it can remove v1 and v2, it 
has to update index 1 to point to v3 and remove v1 and v2 from index 2. 
If it can remove v1, v2 and v3 it must delete the index 1 tuple pointing 
to v1, delete the index 2 entries pointing to v1 and v2 and update the 
index 2 entry for v3 to point to v4. Figuring out which index tuples to 
remove and which ones to update can only be done by comparing each and 
every indexed columns old and new values. To do so, vacuum will have to 
fetch all the row versions, which can be scattered all over the place, 
with all possible locking issues including but not limited to deadlocks.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: vacuum, performance, and MVCC
Next
From: Hannu Krosing
Date:
Subject: Re: vacuum, performance, and MVCC