Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: vacuum, performance, and MVCC
Date
Msg-id 449EC18D.2000605@Yahoo.com
Whole thread Raw
In response to Re: vacuum, performance, and MVCC  ("Mark Woodward" <pgsql@mohawksoft.com>)
List pgsql-hackers
On 6/25/2006 6:52 AM, Mark Woodward wrote:
>> On 6/24/2006 9:23 AM, Mark Woodward wrote:
>>
>>>> On Sat, 24 Jun 2006, Mark Woodward wrote:
>>>>
>>>>> I'm probably mistaken, but aren't there already forward references in
>>>>> tuples to later versions? If so, I'm only sugesting reversing the
>>>>> order
>>>>> and referencing the latest version.
>>>>
>>>> I thought I understood your idea, but now you lost me again. I thought
>>>> what you want is that the older heap tuple has a pointer to the
>>>> newer one. Which it already has, it's called t_ctid.
>>>
>>> Perfect!
>>>>
>>>> Can you try to explain more carefully how the whole thing would work?
>>>> What would an index tuple point to? What pointers would a heap tuple
>>>> have? What would an index scan do to find the row version it's
>>>> interested
>>>> in? What exactly would an update do?
>>>
>>>
>>> Since we already allocate space for some notion of linked list, then all
>>> I'm suggesting is we reverse the order, sort of. Currently it looks like
>>> this:
>>>
>>> ver001->ver002->ver003->...-verN
>>>
>>> That's what t_ctid does now, right? Well, that's sort of stupid. Why not
>>> have it do this:
>>>
>>> ver001->verN->...->ver003->ver002->|
>>>  ^---------------------------------/
>>>
>>> This will speed up almost *all* queries when there are more than two
>>> version of rows.
>>>
>>> OK, here is the behavior of an update:
>>> (1) Find the latest version of the row
>>> (2) Duplicate row and modify as per plan
>>> (3) Set the t_ctid of the new row to the last "latest"
>>> (4) Set the t_ctid of the first row to that of the new row
>>> (5) Attempt to index the row
>>> (6) If the first version of the row is in the index already (ver001)
>>> Don't
>>> modify the index, otherwise, add the new version (just as before)
>>>
>>> When you vacuum, simply make the latest version (verN) the key row
>>> (ver001).
>>
>> This isn't done "simply". Currently, vacuum collects a trivial array of
>> ctid's it is removing and every now and then does a bulk remove of the
>> index tuples pointing to them. Now lets consider a table with two
>> indexed columns with the following row versions resulting from an insert
>> and 3 updates to that same row:
>>
>>    v1:  a,b
>>    v2:  a,c
>>    v3:  a,d
>>    v4:  b,d
>>
>> In your new scheme, there would be two index tuples for column 1 (one
>> pointing to v1, one pointing to v4) and 3 index tuples for column 2 (one
>> for each different value pointing to v1, v2 and v3). Did I get that
>> right so far?
>>
>> If vacuum now can remove v1, it has to update index 1 to point to v2 and
>> remove the pointer to v1 from index 2. If it can remove v1 and v2, it
>> has to update index 1 to point to v3 and remove v1 and v2 from index 2.
>> If it can remove v1, v2 and v3 it must delete the index 1 tuple pointing
>> to v1, delete the index 2 entries pointing to v1 and v2 and update the
>> index 2 entry for v3 to point to v4. Figuring out which index tuples to
>> remove and which ones to update can only be done by comparing each and
>> every indexed columns old and new values. To do so, vacuum will have to
>> fetch all the row versions, which can be scattered all over the place,
>> with all possible locking issues including but not limited to deadlocks.
> 
> I'm not sure why vacuum can't run similarly to the way it does now.

What's that supposed to mean?


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


pgsql-hackers by date:

Previous
From: "Jonah H. Harris"
Date:
Subject: Re: vacuum row?
Next
From: Jan Wieck
Date:
Subject: Re: vacuum, performance, and MVCC