Eliot Gable wrote:
> I have a table which is about 12 GB in size. It has had a vacuum full
> analyze run on it, and then immediately after, I run vacuum analyze and it
> takes about 90 seconds to complete. If I continue to run vacuum analyze on
> it, it continues to take about 90 seconds each time. This system has a
> single 7200 RPM drive in it, so it's not a very fast drive. I was under the
> impression that if nothing had been done to the table since it was last
> vacuumed, that it would return immediately. Further, this is an append-only
> table, so why should it need to be vacuumed at all? We ran into cases where
> after writing to it long enough, the PGSQL autovacuum process would kick in
> and force a vacuum saying something about preventing wrap around. I don't
> understand why it would do this if it is append-only and we are using
> 64-bit sequences as IDs without OIDs turned on. What would be wrapping
> around without a vacuum? We tried to mitigate this by manually running
> vacuum programmatically, but then we end up using all the disk IO just
> running vacuum all the time, because it is constantly running through the
> entire table even though very little (if anything) has been done to it
> since the last vacuum.
Vacuuming a table requires first scanning the heap, then scanning each
indexes. The heap scan can be optimized using the visibility map: pages
can be skipped if they are marked all-visible. Vacuum will mark pages
all-visible, so a second vacuum immediately thereafter would need to
read few pages (and probably modify even fewer). But the index scans
cannot be optimized in that fashion; indexes must be scanned completely
each time. I would bet that that's where the time is going. Also note
that not all pages can be marked all-visible, and that pages are only
skipped if there are enough of them consecutive that it's worth when
considering disk readahead done by the operating system.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services