Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers

From Mark Woodward
Subject Re: vacuum, performance, and MVCC
Date
Msg-id 18595.24.91.171.78.1151232744.squirrel@mail.mohawksoft.com
Whole thread Raw
In response to Re: vacuum, performance, and MVCC  (Jan Wieck <JanWieck@Yahoo.com>)
Responses Re: vacuum, performance, and MVCC  (Jan Wieck <JanWieck@Yahoo.com>)
Re: vacuum, performance, and MVCC  (Hannu Krosing <hannu@skype.net>)
List pgsql-hackers
> On 6/24/2006 9:23 AM, Mark Woodward wrote:
>
>>> On Sat, 24 Jun 2006, Mark Woodward wrote:
>>>
>>>> I'm probably mistaken, but aren't there already forward references in
>>>> tuples to later versions? If so, I'm only sugesting reversing the
>>>> order
>>>> and referencing the latest version.
>>>
>>> I thought I understood your idea, but now you lost me again. I thought
>>> what you want is that the older heap tuple has a pointer to the
>>> newer one. Which it already has, it's called t_ctid.
>>
>> Perfect!
>>>
>>> Can you try to explain more carefully how the whole thing would work?
>>> What would an index tuple point to? What pointers would a heap tuple
>>> have? What would an index scan do to find the row version it's
>>> interested
>>> in? What exactly would an update do?
>>
>>
>> Since we already allocate space for some notion of linked list, then all
>> I'm suggesting is we reverse the order, sort of. Currently it looks like
>> this:
>>
>> ver001->ver002->ver003->...-verN
>>
>> That's what t_ctid does now, right? Well, that's sort of stupid. Why not
>> have it do this:
>>
>> ver001->verN->...->ver003->ver002->|
>>  ^---------------------------------/
>>
>> This will speed up almost *all* queries when there are more than two
>> version of rows.
>>
>> OK, here is the behavior of an update:
>> (1) Find the latest version of the row
>> (2) Duplicate row and modify as per plan
>> (3) Set the t_ctid of the new row to the last "latest"
>> (4) Set the t_ctid of the first row to that of the new row
>> (5) Attempt to index the row
>> (6) If the first version of the row is in the index already (ver001)
>> Don't
>> modify the index, otherwise, add the new version (just as before)
>>
>> When you vacuum, simply make the latest version (verN) the key row
>> (ver001).
>
> This isn't done "simply". Currently, vacuum collects a trivial array of
> ctid's it is removing and every now and then does a bulk remove of the
> index tuples pointing to them. Now lets consider a table with two
> indexed columns with the following row versions resulting from an insert
> and 3 updates to that same row:
>
>    v1:  a,b
>    v2:  a,c
>    v3:  a,d
>    v4:  b,d
>
> In your new scheme, there would be two index tuples for column 1 (one
> pointing to v1, one pointing to v4) and 3 index tuples for column 2 (one
> for each different value pointing to v1, v2 and v3). Did I get that
> right so far?
>
> If vacuum now can remove v1, it has to update index 1 to point to v2 and
> remove the pointer to v1 from index 2. If it can remove v1 and v2, it
> has to update index 1 to point to v3 and remove v1 and v2 from index 2.
> If it can remove v1, v2 and v3 it must delete the index 1 tuple pointing
> to v1, delete the index 2 entries pointing to v1 and v2 and update the
> index 2 entry for v3 to point to v4. Figuring out which index tuples to
> remove and which ones to update can only be done by comparing each and
> every indexed columns old and new values. To do so, vacuum will have to
> fetch all the row versions, which can be scattered all over the place,
> with all possible locking issues including but not limited to deadlocks.

I'm not sure why vacuum can't run similarly to the way it does now.




pgsql-hackers by date:

Previous
From: Michael Meskes
Date:
Subject: Re: [CORE] GPL Source and Copyright Questions
Next
From: Daniel Xavier de Sousa
Date:
Subject: Re: Buffer for inner and outer table