Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: vacuum, performance, and MVCC
Msg-id 200606260212.k5Q2C1S10045@momjian.us
In response to Re: vacuum, performance, and MVCC  (Jan Wieck <JanWieck@Yahoo.com>)
Responses Re: vacuum, performance, and MVCC  (Jan Wieck <JanWieck@Yahoo.com>)
List pgsql-hackers
Jan Wieck wrote:
> >     [item1]...[tuple1]
> > 
> > becomes on UPDATE:
> >            ---------->
> >     [item1]...[tuple1][tuple2]
> >                       ----->
> > 
> > on another UPDATE, if tuple1 is no longer visible:
> > 
> >            ------------------>
> >     [item1]...[tuple1][tuple2]
> >                       <------
> > 
> >> Another problem with this is that even if you find such a row, it doesn't
> >> spare you the index traversal. The dead row whose item id you're reusing
> >> might have resulted from an insert that aborted or crashed before it
> >> finished creating all index entries. Or some of its index entries might
> >> already be flagged known dead, and you had better reset those flags.
> > 
> > You can only reuse heap rows that were created and expired by committed
> > transactions.  In fact, you can only UPDATE a row that was created by a
> > committed transaction.  You cannot _reuse_ just any row, only a row
> > that is being UPDATEd.  Also, it cannot be known dead because you are
> > in the process of updating it.
> 
> Now you lost me. What do you mean "a row that is being UPDATEd"? The row 
> (version) being UPDATEd right now cannot be expired, or why would you 
> update that one? And if your transaction rolls back later, the row you 
> update right now must be the one surviving.

The row being reused can only be a non-visible version earlier in the
UPDATE chain, not the version actually being updated.
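
To make that rule concrete, here is a minimal Python sketch. It is not
backend code and not the pseudo-code mentioned further down. The names
TupleVersion, is_reusable, and find_reusable_slot are hypothetical, and
a single "oldest active xid" cutoff is assumed to stand in for the real
snapshot visibility checks.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TupleVersion:
    """One version of a row in an on-page UPDATE chain (hypothetical model)."""
    xmin: int                    # transaction that created this version
    xmax: Optional[int] = None   # transaction that expired it, None if live
    xmin_committed: bool = True
    xmax_committed: bool = False


def is_reusable(t: TupleVersion, oldest_active_xid: int) -> bool:
    """Created and expired by committed transactions, and expired before
    the oldest transaction that could still see it (assumed cutoff)."""
    return (
        t.xmin_committed
        and t.xmax is not None
        and t.xmax_committed
        and t.xmax < oldest_active_xid
    )


def find_reusable_slot(chain: List[TupleVersion],
                       oldest_active_xid: int) -> Optional[int]:
    """Return the position of an earlier, non-visible version in the chain,
    or None. The last element is the version being updated right now and
    is never a candidate."""
    for i, t in enumerate(chain[:-1]):
        if is_reusable(t, oldest_active_xid):
            return i
    return None
```

With a chain of two versions where the first was expired by committed
transaction 20, find_reusable_slot returns 0 once the cutoff has moved
past 20, and None while some transaction older than 20 may still see it.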

> Any row that was created by a committed transaction does indeed have all 
> the index entries created. But if it is deleted and expired, that means 
> that the transaction that stamped xmax has committed and is outside of 
> every existing snapshot. You can only reuse a slot that is used by a 
> tuple that satisfies the vacuum snapshot. And a tuple that satisfies 
> that snapshot potentially has index entries flagged known dead.

When you are using update chaining, you can't mark that index row as
dead because it actually points to more than one row on the page: some
are non-visible, some are visible.
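
One way to read this, as a hedged Python sketch: the index entry stands
for the whole chain, so a "known dead" flag is only safe when no version
in the chain can still be seen. The names ChainMember and
index_entry_may_be_flagged_dead are hypothetical, the single cutoff xid
is a simplification, and allowing the flag when every version is dead is
an assumption of this sketch rather than something stated above.

```python
from typing import List, NamedTuple, Optional


class ChainMember(NamedTuple):
    """One heap tuple reachable from a single index entry (hypothetical)."""
    xmax: Optional[int]      # expiring transaction, None if still live
    xmax_committed: bool     # whether that transaction committed


def member_is_dead(m: ChainMember, oldest_active_xid: int) -> bool:
    """Dead to everyone: expired by a committed transaction that is older
    than the assumed cutoff."""
    return (
        m.xmax is not None
        and m.xmax_committed
        and m.xmax < oldest_active_xid
    )


def index_entry_may_be_flagged_dead(chain: List[ChainMember],
                                    oldest_active_xid: int) -> bool:
    """The entry points at every member of the chain, so it may only be
    flagged when the chain is non-empty and all members are dead."""
    return bool(chain) and all(
        member_is_dead(m, oldest_active_xid) for m in chain
    )
```

A chain holding one expired version and one live version returns False
here, which is the mixed visible/non-visible case described above.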

> > I am thinking my idea was not fully understood.  Hopefully this email
> > helps.
> 
> I must be missing something because I still don't see how it can work.

I just posted pseudo-code.  Hope that helps.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +

