Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: vacuum, performance, and MVCC
Date
Msg-id 1151264521.4096.14.camel@localhost.localdomain
Whole thread Raw
In response to Re: vacuum, performance, and MVCC  (Bruce Momjian <bruce@momjian.us>)
Responses Re: vacuum, performance, and MVCC  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
Ühel kenal päeval, P, 2006-06-25 kell 14:24, kirjutas Bruce Momjian:
> Jan Wieck wrote:
> > >> Sure, but index reuse seems a lot easier, as there is nothing additional
> > >> to remember or clean out when doing it.
> > > 
> > > Yes, seems so.  TODO added:
> > > 
> > >     * Reuse index tuples that point to heap tuples that are not visible to
> > >       anyone?
> > > 
> > >> When reusing a heap tuple you have to clean out all index entries
> > >> pointing to it.
> > > 
> > > Well, not for UPDATE for no key changes on the same page, if we do that.
> > > 
> > 
> > An update that results in all the same values of every indexed column of 
> > a known deleted invisible tuple. This reused tuple can by definition not 
> > be the one currently updated. So unless it is a table without a primary 
> > key, this assumes that at least 3 versions of the same row exist within 
> > the same block. How likely is that to happen?
> 
> Good question.  You take the current tuple, and make another one on the
> same page.  Later, an update can reuse the original tuple if it is no
> longer visible to anyone (by changing the item id), so you only need two
> tuples, not three.  My hope is that a repeated update would eventually
> move to a page that enough free space for two (or more) versions.

I can confirm that this is exactly what happens when running an
update-heavy load with frequent vacuums. Eventually most rows get their
own db pages or share the same page with 2-3 rows. And there will be
lots of unused (filed up, or cleaned and not yet reused) pages.

The overall performance could be made a little better by tuning the
system to not put more than N new rows on the same page at initial
insert or when the row move to a new page during update. Currently
several new rows are initially put on the same page and then move around
during repeated updates until they slow(ish)ly claim their own page.

> Does that help explain it?
> 
-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




pgsql-hackers by date:

Previous
From: Jan Wieck
Date:
Subject: Re: vacuum, performance, and MVCC
Next
From: Hannu Krosing
Date:
Subject: Re: vacuum, performance, and MVCC