Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: vacuum, performance, and MVCC
Msg-id 200606261832.k5QIWxZ12469@momjian.us
In response to Re: vacuum, performance, and MVCC  ("Jim C. Nasby" <jnasby@pervasive.com>)
Responses Re: vacuum, performance, and MVCC  ("Jim C. Nasby" <jnasby@pervasive.com>)
List pgsql-hackers
It is certainly possible to do what you are suggesting, that is, to have
two index entries point to the same chain head and have the index access
routines figure out whether the index qualifications still hold, but that
seems like a lot of overhead.

Also, once there is only one visible row in the chain, removing the old
index entries seems quite complex, because vacuum would have to keep the
qualifications of each row to figure out which index tuple is the valid
one (seems messy).
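
To make the vacuum concern concrete, here is a rough sketch in Python. It
is not PostgreSQL code, and the names (split_index_entries, the
(key, chain_head) pairs) are made up for illustration. It only shows the
bookkeeping: with several index entries pointing at one chain head, vacuum
needs the surviving row's key before it can tell which entry to keep.

```python
from typing import List, Tuple

Ctid = Tuple[int, int]            # (page number, slot on page)
IndexEntry = Tuple[object, Ctid]  # (indexed key value, chain head it points to)


def split_index_entries(
    entries: List[IndexEntry], surviving_key: object
) -> Tuple[List[IndexEntry], List[IndexEntry]]:
    """Split entries that share a chain head into (keep, removable).

    Hypothetical helper: an entry is still valid only if its key matches
    the key of the one row that remains visible in the chain.
    """
    keep = [e for e in entries if e[0] == surviving_key]
    removable = [e for e in entries if e[0] != surviving_key]
    return keep, removable
```

The comparison itself is trivial; the messy part is that vacuum has to
have each row's key values on hand for every index in order to make it.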

---------------------------------------------------------------------------

Jim C. Nasby wrote:
> On Sun, Jun 25, 2006 at 09:13:48PM +0300, Heikki Linnakangas wrote:
> > >If you can't expire the old row because one of the indexed columns was
> > >modified, I see no reason to try to reduce the additional index entries.
> > 
> > It won't enable early expiration, but it means less work to do on update. 
> > If there's a lot of indexes, not having to add so many index tuples can be 
> > a significant saving.
> 
> While catching up on this thread, the following idea came to me that I
> think would allow for not updating an index on an UPDATE if its key
> doesn't change. If I understand Bruce's SITC proposal correctly, this
> would differ in that SITC requires that no index keys change.
> 
> My idea is that if an UPDATE places the new tuple on the same page as
> the old tuple, it will not create new index entries for any indexes
> where the key doesn't change. This means that when fetching tuples from
> the index, ctid would have to be followed until you found the version
> you wanted OR you found the first ctid that pointed to a different page
> (because that tuple will have its own index entry) OR you found a tuple
> with a different value for the key of the index you're using (because
> it'd be invalid, and there'd be a different index entry for it). I
> believe that the behavior of the index hint bits would also have to
> change somewhat, as each index entry would now essentially be pointing
> at all the tuples in the ctid chain that exist on a page, not just a
> single tuple.
> 
> In the case of an UPDATE that needs to put the new tuple on a different
> page, our current behavior would be used. This means that the hint bits
> would still be useful in limiting the number of heap pages you hit. I
> also believe this means that we wouldn't suffer any additional overhead
> from our current code when there isn't much free space on pages.
> 
> Since SITC allows for in-page space reuse without vacuuming only when no
> index keys change, it's most useful for very heavily updated tables such
> as session handlers or queue tables, because those tables typically have
> very few indexes, so it's pretty unlikely that an index key will change.
> For more general-purpose tables that have more indexes but still see a
> fair number of updates to a subset of rows, not having to update every
> index would likely be a win. I also don't see any reason why both
> options couldn't be used together.
> -- 
> Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
> Pervasive Software      http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
> 
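
The update and fetch rules in the quoted proposal can be sketched roughly
as follows. This is a toy model in Python, not PostgreSQL code: HeapTuple,
indexes_needing_entry, and fetch_via_index are invented names, the heap is
a plain dict keyed by ctid, and the "visible" flag stands in for a real
MVCC visibility check.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Ctid = Tuple[int, int]  # (page number, slot on page)


@dataclass
class HeapTuple:
    ctid: Ctid
    keys: Dict[str, object]           # index name -> key value for that index
    visible: bool                     # stand-in for a real visibility check
    next_ctid: Optional[Ctid] = None  # newer version in the chain, if any


def indexes_needing_entry(old: HeapTuple, new: HeapTuple) -> List[str]:
    """Return the indexes that need a new entry for the new version."""
    if new.ctid[0] != old.ctid[0]:
        # New version landed on a different page: current behavior,
        # every index gets an entry.
        return sorted(new.keys)
    # Same page: only indexes whose key actually changed get an entry.
    return sorted(n for n in new.keys if new.keys[n] != old.keys[n])


def fetch_via_index(
    heap: Dict[Ctid, HeapTuple], index_name: str, key: object, start: Ctid
) -> Optional[HeapTuple]:
    """Follow the ctid chain from the tuple an index entry points at."""
    page = start[0]
    ctid: Optional[Ctid] = start
    while ctid is not None:
        if ctid[0] != page:
            return None  # off-page version has its own index entry
        tup = heap[ctid]
        if tup.keys[index_name] != key:
            return None  # key changed: a different index entry covers it
        if tup.visible:
            return tup   # found the version we wanted
        ctid = tup.next_ctid
    return None
```

For example, if a row with keys a=10, b='x' is updated in place on the same
page to a=10, b='y', only the index on b gets a new entry, and a lookup
through the old index entry for a=10 walks the chain to the new version.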

--  Bruce Momjian   bruce@momjian.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +

