db size and VACUUM ANALYZE

From: Marcin Krol
Date: Fri, 2010-02-12 18:09 +0100
Hello,

The db in the application I maintain but didn't write (it uses PG v8.3)
has been steadily growing in size, from about 600M to 1.6G.

At the same time, the performance of the app has degraded significantly
(it is several times slower).

So I ran VACUUM ANALYZE on the entire db. Nothing: the db did not
decrease in size, and the performance stayed the same.

So I backed it up using pg_dump, deleted the database, and recreated it
from the backup.

The size of the db on disk went down to 600M, and performance recovered
to the original level.

Why is that so? I thought that VACUUM ANALYZE did everything needed to
optimize disk usage.
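
By the way, the size can also be read from inside psql; a sketch using
the size functions that 8.3 ships with:

SELECT datname,
       pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database;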

Regards,
mk

Re: db size and VACUUM ANALYZE

From: Brad Nicholson
Date:
On Fri, 2010-02-12 at 18:09 +0100, Marcin Krol wrote:
> Hello,
>
> The db in the application I maintain but didn't write (it uses PG v8.3)
> has been steadily growing in size, from about 600M to 1.6G.
>
> At the same time, the performance of the app has degraded significantly
> (it is several times slower).

This is usually an indication that you are not vacuuming frequently
enough.

> So I ran VACUUM ANALYZE on the entire db. Nothing: the db did not
> decrease in size, and the performance stayed the same.

Regular vacuum will not decrease the size of the database.  It just
marks the space taken up by dead tuples as available for re-use.

VACUUM FULL will remove the dead tuples from the database, and make it
smaller.  It's a bad idea to get in the habit of doing this.  It's
better to tune autovacuum and avoid the issue altogether.
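
To make the distinction concrete (the table name here is just a
placeholder):

-- Plain VACUUM: marks the space from dead rows as reusable; the
-- table's files on disk do not shrink.
VACUUM ANALYZE some_table;

-- VACUUM FULL: compacts the table and returns the space to the OS,
-- but it takes an exclusive lock for the whole run, and on 8.3 it
-- tends to bloat indexes (a REINDEX often follows).
VACUUM FULL some_table;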

> So I backed it up using pg_dump, deleted the database, and recreated it
> from the backup.
>
> The size of the db on disk went down to 600M, and performance recovered
> to the original level.

Reloading the database eliminates the dead tuples, which is why the
database is smaller.
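
You can see roughly how many dead tuples each table is carrying via the
stats collector (available in 8.3, provided track_counts is on):

SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;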

> Why is that so? I thought that VACUUM ANALYZE did everything needed to
> optimize disk usage.

It may not do everything, but the idea is to VACUUM your tables often
enough that they don't grow out of control.

Are you running autovacuum?  It should take care of this for you.  You
may need to make it more aggressive than the default though.
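
A quick way to check from psql:

SHOW autovacuum;    -- should say 'on'
SHOW track_counts;  -- must also be 'on' for autovacuum to see activity

-- or list all the related settings at once:
SELECT name, setting FROM pg_settings WHERE name LIKE 'autovacuum%';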


--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.



Re: db size and VACUUM ANALYZE

From: Marcin Krol
Date: Fri, 2010-02-12 18:41 +0100
Brad Nicholson wrote:

First of all, I don't really care about the 1G of disk space; the main
question is why the performance degraded so much.

> Are you running autovacuum?

Apparently not. I have turned it on in the conf file and restarted PG;
I'll see how that works.
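
(Concretely, the change was this one line in postgresql.conf, followed
by the restart. I gather 8.3 also needs track_counts = on, its default,
so that autovacuum has statistics to work from.)

autovacuum = on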

> It should take care of this for you.  You
> may need to make it more aggressive than the default though.

Hmm, what do you mean by more aggressive? I haven't seen anything in the
parameters that suggests whether it would be more or less likely to
recover dead tuples:

#log_autovacuum_min_duration = -1       # -1 disables, 0 logs all actions and
                                        # their durations, > 0 logs only
                                        # actions running at least that
                                        # time.
#autovacuum_max_workers = 3             # max number of autovacuum
                                        # subprocesses
#autovacuum_naptime = 1min              # time between autovacuum runs
#autovacuum_vacuum_threshold = 50       # min number of row updates before
                                        # vacuum
#autovacuum_analyze_threshold = 50      # min number of row updates before
                                        # analyze
#autovacuum_vacuum_scale_factor = 0.2   # fraction of table size before
                                        # vacuum
#autovacuum_analyze_scale_factor = 0.1  # fraction of table size before
                                        # analyze
#autovacuum_freeze_max_age = 200000000  # maximum XID age before forced
                                        # vacuum
                                        # (change requires restart)
#autovacuum_vacuum_cost_delay = 20      # default vacuum cost delay for
                                        # autovacuum, -1 means use
                                        # vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1      # default vacuum cost limit for
                                        # autovacuum, -1 means use
                                        # vacuum_cost_limit


I don't see anything in here that looks like an equivalent of VACUUM FULL.

Regards,
mk


Re: db size and VACUUM ANALYZE

From: Brad Nicholson
Date: Fri, 2010-02-12 13:46 -0500
On Fri, 2010-02-12 at 18:41 +0100, Marcin Krol wrote:
> Brad Nicholson wrote:
>
> First of all, I don't really care about the 1G of disk space; the main
> question is why the performance degraded so much.

Because your data is now spread across a whole lot more blocks on disk.
It takes more work to find the data than it used to.
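
A rough way to watch this is the per-table page count that VACUUM and
ANALYZE record; a sketch (relpages is an estimate, in 8K blocks):

SELECT relname, relpages
FROM pg_class
WHERE relkind = 'r'
ORDER BY relpages DESC
LIMIT 10;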

> > Are you running autovacuum?
>
> Apparently not. I have turned it on in the conf file and restarted PG;
> I'll see how that works.
>
> > It should take care of this for you.  You
> > may need to make it more aggressive than the default though.
>
> Hmm, what do you mean by more aggressive? I haven't seen anything in the
> parameters that suggests whether it would be more or less likely to
> recover dead tuples:

I would start by turning autovacuum on and running it with the defaults.
This will most likely make this problem go away.  If you find that you
still have problems, try lowering autovacuum_vacuum_scale_factor a bit.
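
For example, something like this in postgresql.conf (the value is only
illustrative, not a recommendation for your workload):

autovacuum_vacuum_scale_factor = 0.05   # vacuum once ~5% of a table's
                                        # rows are dead (default is 20%)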

> #log_autovacuum_min_duration = -1       # -1 disables, 0 logs all actions and
>                                         # their durations, > 0 logs only
>                                         # actions running at least that
>                                         # time.
> #autovacuum_max_workers = 3             # max number of autovacuum
>                                         # subprocesses
> #autovacuum_naptime = 1min              # time between autovacuum runs
> #autovacuum_vacuum_threshold = 50       # min number of row updates before
>                                         # vacuum
> #autovacuum_analyze_threshold = 50      # min number of row updates before
>                                         # analyze
> #autovacuum_vacuum_scale_factor = 0.2   # fraction of table size before
>                                         # vacuum
> #autovacuum_analyze_scale_factor = 0.1  # fraction of table size before
>                                         # analyze
> #autovacuum_freeze_max_age = 200000000  # maximum XID age before forced
>                                         # vacuum
>                                         # (change requires restart)
> #autovacuum_vacuum_cost_delay = 20      # default vacuum cost delay for
>                                         # autovacuum, -1 means use
>                                         # vacuum_cost_delay
> #autovacuum_vacuum_cost_limit = -1      # default vacuum cost limit for
>                                         # autovacuum, -1 means use
>                                         # vacuum_cost_limit
>
>
> I don't see anything in here that looks like an equivalent of VACUUM FULL.

Autovacuum does not run VACUUM FULL; it runs a regular VACUUM.

I recommend reading this page in the documentation:

http://www.postgresql.org/docs/8.3/interactive/routine-vacuuming.html


--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.



Re: db size and VACUUM ANALYZE

From: Brad Nicholson
Date:
On Fri, 2010-02-12 at 13:46 -0500, Brad Nicholson wrote:
> On Fri, 2010-02-12 at 18:41 +0100, Marcin Krol wrote:
> > Brad Nicholson wrote:
> >
> > First of all, I don't really care about the 1G of disk space; the main
> > question is why the performance degraded so much.
>
> Because your data is now spread across a whole lot more blocks on disk.
> It takes more work to find the data than it used to.

Oh, I didn't realize the DB was so small.  How big is your shared buffer
pool?  It may very well be that at 600MB, the entire DB fit into the
buffer pool.  As the DB grows, that is no longer the case, and the speed
of your disks comes into play.
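
You can check it with:

SHOW shared_buffers;   -- 8.3's stock default is a modest 32MB

If it is comfortably above 600MB, the whole DB used to fit in memory
(plus whatever the OS cache holds).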

--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.