Re: vacuum, performance, and MVCC - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: vacuum, performance, and MVCC
Date
Msg-id 20060623135627.GE8900@svana.org
Whole thread Raw
In response to Re: vacuum, performance, and MVCC  (Csaba Nagy <nagy@ecircle-ag.com>)
Responses Re: vacuum, performance, and MVCC  ("A.M." <agentm@themactionfaction.com>)
List pgsql-hackers
On Fri, Jun 23, 2006 at 03:30:45PM +0200, Csaba Nagy wrote:
> > It sounds like you have a "big" problem and you need a "big" solution.
>
> Well, Postgres does a decent job as it is. The problem is under peek
> load, sometimes it gets bogged down and the usual things like vacuum
> will not help immediately. I think a few more features like the dead
> space map for quick vacuum and even something like the original post's
> proposition would make postgres fly under heavy load too...

I know there have a been a number of suggestions in the past to deal
with this thing. Some I don't remember being mentioned in this thread
are:

- Once a tuple has been determined to be invisible to everyone,
truncate it to just the header. This would probably work wonders for
frequently updated wide tables. However, this required keeping track of
the oldest active xact, I'm not sure how that works at the moment.

- Have the bgwriter do cleanup work before writing out a block. It
could probably do the truncation bit above, but totally removing old
tuples requires cleaning out the indexes too, which AIUI is the hard
part of vacuuming.

One totally whacked out idea I just thought of: Instead of just
truncating tuples when they're invisible, mark them "dying" and make
the data section store an array of CTIDs pointing to the index tuples
pointing to it. Lookups that find the tuple via an index could store
the CTID of the index tuple before continuing. If the bgwriter sees it
has a full set, it can efficiently remove the tuple straight away.

There are ofcourse drawbacks to this approach, you'd probably need
something like the half-dirty pages to avoid a large increase in write
load. If it's even beneficial at all given concurrency issues.

Still, VACUUM has gotten faster in CVS so the issues are slowly being
addressed...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

pgsql-hackers by date:

Previous
From: John DeSoi
Date:
Subject: Re: [CORE] GPL Source and Copyright Questions
Next
From: Tom Lane
Date:
Subject: Re: [CORE] GPL Source and Copyright Questions