Thread: Vacuuming dead rows on busy databases
Is there a general strategy out there for removing dead rows on busy
databases? I've learned that both a VACUUM and a VACUUM FULL will not
remove dead rows if there is an open transaction anywhere on the entire
cluster. Closing open transactions is not always a viable solution,
especially for periodic cron-driven vacuums.

What do people generally do to get around this problem? I can think of a
few possible solutions, but wanted to see if there was a best practice
before I went too much further. All my solutions are external as well:
I'd like to see something like VACUUM FULL WAIT. :)

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200612181331
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
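One way to build that "external" solution is to have the cron job check
for long-running transactions and skip or postpone the vacuum when it
finds any. A minimal sketch, assuming a recent PostgreSQL (the
pg_stat_activity columns used here, xact_start and pid, postdate the
releases current when this thread was written, and the ten-minute
threshold is purely illustrative):

-- Count transactions that have been open for more than ten minutes
-- anywhere in the cluster, excluding our own session.
SELECT count(*) AS blockers
FROM pg_stat_activity
WHERE xact_start < now() - interval '10 minutes'
  AND pid <> pg_backend_pid();

A wrapper script would run VACUUM (or VACUUM FULL) only when blockers
comes back zero, which approximates a VACUUM FULL WAIT from the client
side.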
"Greg Sabino Mullane" <greg@turnstep.com> writes: > I'd like to see something like VACUUM FULL WAIT. :) Sounds like a deadlock waiting to happen :-( AFAIK the general practice is to just accept the fact that vacuum can't remove recently-dead tuples. You should look into whether you can't shorten your transactions --- very-long-running transactions create other performance issues besides vacuum not removing stuff. regards, tom lane
Tom Lane wrote:
> "Greg Sabino Mullane" <greg@turnstep.com> writes:
>> I'd like to see something like VACUUM FULL WAIT. :)
>
> Sounds like a deadlock waiting to happen :-(
>
> AFAIK the general practice is to just accept the fact that vacuum can't
> remove recently-dead tuples. You should look into whether you can't
> shorten your transactions --- very-long-running transactions create
> other performance issues besides vacuum not removing stuff.

It seems to me that the most common support problem I keep seeing on the
mailing lists is VACUUM not working well because of long-running
transactions. If I understand it correctly, people have talked about
reducing the problem by tracking xmin (or something; sorry if I'm getting
this wrong) on a per-table basis rather than per cluster. Now, I'm sure
this is not simple, and I know I don't have the skills to do it, but I
think it would resolve (or at least significantly mitigate) what I
perceive as one of the biggest usage problems with PostgreSQL.

Comments?
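For anyone hitting this today, newer releases at least make the xmin
horizon visible, so you can see exactly which sessions are holding
VACUUM back. A sketch assuming PostgreSQL 9.4 or later (the
backend_xmin column did not exist when this thread was written):

-- Which backends are pinning the oldest xmin, and for how long?
SELECT pid, datname, usename, state,
       age(backend_xmin) AS xmin_age,
       now() - xact_start AS xact_duration
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;

The rows at the top are the sessions holding the horizon back. Whether a
session in a *different* database still matters depends on the release;
current versions compute the horizon per database for ordinary tables,
which addresses part of the complaint in this thread.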
>> ...VACUUM FULL WAIT
> Sounds like a deadlock waiting to happen :-(

Yeah, this sounds more and more like a job for a client application.

> AFAIK the general practice is to just accept the fact that vacuum can't
> remove recently-dead tuples. You should look into whether you can't
> shorten your transactions --- very-long-running transactions create
> other performance issues besides vacuum not removing stuff.

Sure, but does a long-running transaction on a different database in the
same cluster have any other consequences? At any rate, I suppose this is
something I can probably code around. If it gets too bad, I'll try to
coordinate the timing a bit more between the databases, increase the
frequency of vacuum, or simply kill some of the long-running transactions
before kicking off the vacuum.

Brian Hurt wrote:
> My understanding is that vacuum can not delete any row that was deleted
> after the oldest outstanding transaction.
[snip]

Thanks, that was a good explanation.

> Which is why having only a single transaction open, but it's been open
> for 24 hours, is a problem.

Well, 24 hours is a bit much :), but perhaps until a database-specific
xmin is enabled, I'll also consider using an entirely different cluster
for databases which do long-running queries.

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200612181419
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
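The "kill some of the long-running transactions before kicking off the
vacuum" step can itself be a couple of SQL statements. A minimal sketch,
again assuming a recent server (pg_terminate_backend and the xact_start
column both postdate this thread, and the one-hour threshold is only an
example; terminating a session rolls back its open work, so apply with
care):

-- Disconnect any backend whose transaction has been open for more
-- than an hour, except our own session.
SELECT pid, pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE xact_start < now() - interval '1 hour'
  AND pid <> pg_backend_pid();

-- ...then vacuum with a clear horizon.
VACUUM ANALYZE;

Recent releases (9.6 and later) also offer the
idle_in_transaction_session_timeout setting, which handles the most
common variant of this problem automatically.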