Thread: Still confused about VACUUM vs. VACUUM FULL

Still confused about VACUUM vs. VACUUM FULL

From
Jeff Boes
Date:
[Apologies if you have seen this before. I just discovered that posting
to the group via NNTP, at Teranews.com, apparently isn't working since
my posts aren't showing up in the mailing list archives.]


We are trying to run down some performance differences between our
production system (where we run a VACUUM FULL every night on nearly
every table) and our test system (where we have stopped doing the VACUUM
FULL, but still do VACUUM ANALYZE).  For large (>1 million rows) tables
which have a pretty high turn-over (average life span of a row is 3
days), should there be any query performance differences whether you
VACUUM FULL or not?


--
Jeff Boes                                      vox 269.226.9550 ext 24
Database Engineer                                     fax 269.349.9076
Nexcerpt, Inc.                                 http://www.nexcerpt.com
           ...Nexcerpt... Extend your Expertise


Re: Still confused about VACUUM vs. VACUUM FULL

From
Tom Lane
Date:
Jeff Boes <jboes@nexcerpt.com> writes:
> For large (>1 million rows) tables
> which have a pretty high turn-over (average life span of a row is 3
> days), should there be any query performance differences whether you
> VACUUM FULL or not?

How often do you VACUUM?  Do you have enough FSM space to support the
number of pages that get dirtied between vacuums?  If you don't, the
physical table size will bloat over time, leading to progressive
slowdown.

            regards, tom lane

Re: Still confused about VACUUM vs. VACUUM FULL

From
Jeff Boes
Date:
On Thu, 2003-06-12 at 12:42, Tom Lane wrote:
> Jeff Boes <jboes@nexcerpt.com> writes:
> > For large (>1 million rows) tables
> > which have a pretty high turn-over (average life span of a row is 3
> > days), should there be any query performance differences whether you
> > VACUUM FULL or not?
>
> How often do you VACUUM?  Do you have enough FSM space to support the
> number of pages that get dirtied between vacuums?  If you don't, the
> physical table size will bloat over time, leading to progressive
> slowdown.
>

We've gone from daily, to twice daily, to several times during the "peak
updates" period, and back to twice daily.

We're also ANALYZE-ing the largest 12-18 tables on a cycle: every twenty
minutes, a daemon wakes up and ANALYZEs until they're all done or two
minutes has elapsed, whichever comes first.

max_fsm_relations = 200
max_fsm_pages = 350000

We have around 220 tables total, only 40 of which have more than 1000
pg_class.reltuples.

--
Jeff Boes                                      vox 269.226.9550 ext 24
Database Engineer                                     fax 269.349.9076
Nexcerpt, Inc.                                 http://www.nexcerpt.com
           ...Nexcerpt... Extend your Expertise


Re: Still confused about VACUUM vs. VACUUM FULL

From
Tom Lane
Date:
Jeff Boes <jboes@nexcerpt.com> writes:
> On Thu, 2003-06-12 at 12:42, Tom Lane wrote:
>> How often do you VACUUM?

> We've gone from daily, to twice daily, to several times during the "peak
> updates" period, and back to twice daily.

I suspect that the real problem here is the old open transactions
discussed in your other message.  Those are preventing VACUUM from
reclaiming space.  Now that you've got that fixed, you need to revisit
your experiments about how often to VACUUM.

> We're also ANALYZE-ing the largest 12-18 tables on a cycle: every twenty
> minutes, a daemon wakes up and ANALYZEs until they're all done or two
> minutes has elapsed, whichever comes first.

That sounds a tad excessive; are the statistics really changing that
fast?

            regards, tom lane

Re: Still confused about VACUUM vs. VACUUM FULL

From
Jeff Boes
Date:
On Thu, 2003-06-12 at 13:16, Tom Lane wrote:

> > We're also ANALYZE-ing the largest 12-18 tables on a cycle: every twenty
> > minutes, a daemon wakes up and ANALYZEs until they're all done or two
> > minutes has elapsed, whichever comes first.
>
> That sounds a tad excessive; are the statistics really changing that
> fast?


Well, I have some convincing evidence on this. One table at the center
of some of our biggest, hairiest queries uses an index on a timestamp.
Generally, the queries run looking back about 24 hours. We are inserting
40,000 rows a day (and deleting the same number, but the deletes happen
all at once, and the inserts happen during nearly every part of the
clock).

I've done

  explain select * from foo where the_time < <some-timestamp>;

and found that I could slice it down to a one-minute interval or so:
before 11:42 AM, and the optimizer uses a sequential scan; after, and it
uses the index.

And of course it stays at that point, even if another 10,000 rows get
inserted with current timestamps, until it's ANALYZEd again. So two or
three ANALYZEs per hour is not excessive, if it will keep the index
usable under the "right" circumstances.

--
Jeff Boes                                      vox 269.226.9550 ext 24
Database Engineer                                     fax 269.349.9076
Nexcerpt, Inc.                                 http://www.nexcerpt.com
           ...Nexcerpt... Extend your Expertise


Re: Still confused about VACUUM vs. VACUUM FULL

From
Tom Lane
Date:
Jeff Boes <jboes@nexcerpt.com> writes:
> On Thu, 2003-06-12 at 13:16, Tom Lane wrote:
>> That sounds a tad excessive; are the statistics really changing that
>> fast?

> Well, I have some convincing evidence on this.

Ah.  Well, you might consider analyzing just that table (maybe even just
its timestamp column) every few minutes, but I still doubt that it's
buying you much for most of the database.

            regards, tom lane