Re: Slow count(*) again... - Mailing list pgsql-performance

From Neil Whelchel
Subject Re: Slow count(*) again...
Date
Msg-id 201010131319.06792.neil.whelchel@gmail.com
Whole thread Raw
In response to Re: Slow count(*) again...  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-performance
On Wednesday 13 October 2010 06:27:34 you wrote:
> On Wed, Oct 13, 2010 at 4:38 AM, Neil Whelchel <neil.whelchel@gmail.com>
wrote:
> > There seems to be allot of discussion about VACUUM FULL, and its
> > problems. The overall buzz seems to be that VACUUM FULL is a bad idea (I
> > could be wrong here). It has been some time since I have read the
> > changelogs, but I seem to remember that there have been some major
> > changes to VACUUM FULL recently. Maybe this needs to be re-visited in
> > the documentation.
>
> In 9.0, VACUUM FULL does something similar to what CLUSTER does.  This
> is a much better idea than what it did in 8.4 and prior.
>
> > crash:~# time psql -U test test -c "VACUUM FULL log;"
> > VACUUM
> >
> > real    4m49.055s
> > user    0m0.000s
> > sys     0m0.000s
> >
> > crash:~# time psql -U test test -c "SELECT count(*) FROM log;"
> >  count
> > ----------
> >  10050886
> > (1 row)
> >
> > real    0m9.665s
> > user    0m0.000s
> > sys     0m0.004s
> >
> > A huge improvement from the minute and a half before the VACUUM FULL.
>
> This is a very surprising result that I would like to understand
> better.  Let's assume that your UPDATE statement bloated the table by
> 2x (you could use pg_relation_size to find out exactly; the details
> probably depend on fillfactor which you might want to lower if you're
> going to do lots of updates).  That ought to mean that count(*) has to
> grovel through twice as much data, so instead of taking 9 seconds it
> ought to take 18 seconds.  Where the heck is the other 1:12 going?
> This might sort of make sense if the original table was laid out
> sequentially on disk and the updated table was not, but how and why
> would that happen?
This is likely due to the table not fitting in memory before the VACUUM FULL.
I am glad that you suggested using pg_relation_size, I somehow didn't think of
it at the time. I will redo the test and publish the results of
pg_relation_size.
-Neil-

pgsql-performance by date:

Previous
From: Neil Whelchel
Date:
Subject: Re: Slow count(*) again...
Next
From: Tom Lane
Date:
Subject: Re: Bogus startup cost for WindowAgg