Thread: db size and VACUUM ANALYZE

db size and VACUUM ANALYZE

From
Marcin Krol
Date:
Hello,

The db in the application I maintain but didn't write (it obviously
makes use of PG, v 8.3) has been systematically growing in size from
about 600M to 1.6G.

At the same time, the performance of the app has degraded significantly
(several times over).

So I've done a VACUUM ANALYZE on the entire db. Nothing. The db did not
decrease in size, and the performance stayed the same.

So I backed it up using pg_dump, deleted the database, and recreated it
from the backup.

The size of the db on disk went down to 600M, and performance recovered
to the original level.

Why is that so? I thought that VACUUM ANALYZE does everything that is
needed to optimize disk usage?

Regards,
mk


Re: db size and VACUUM ANALYZE

From
Amitabh Kant
Date:
On Fri, Feb 12, 2010 at 10:40 PM, Marcin Krol <mrkafk@gmail.com> wrote:
> Hello,
>
> The db in the application I maintain but didn't write (it obviously
> makes use of PG, v 8.3) has been systematically growing in size from
> about 600M to 1.6G.
>
> At the same time, the performance of the app has degraded significantly
> (several times over).
>
> So I've done a VACUUM ANALYZE on the entire db. Nothing. The db did not
> decrease in size, and the performance stayed the same.
>
> So I backed it up using pg_dump, deleted the database, and recreated it
> from the backup.
>
> The size of the db on disk went down to 600M, and performance recovered
> to the original level.
>
> Why is that so? I thought that VACUUM ANALYZE does everything that is
> needed to optimize disk usage?
>
> Regards,
> mk
 

You need to do VACUUM FULL ANALYZE to reclaim the disk space, but this creates an exclusive lock on the tables.

See http://www.postgresql.org/docs/8.3/static/sql-vacuum.html
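
For reference, a minimal sketch of what that looks like (the table name below
is hypothetical; plan for the exclusive lock, e.g. a maintenance window):

VACUUM FULL ANALYZE;                  -- whole database, locks each table in turn
VACUUM FULL ANALYZE some_big_table;   -- or just one table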


With regards

Amitabh Kant

Re: db size and VACUUM ANALYZE

From
Marcin Krol
Date:
Amitabh Kant wrote:
> You need to do VACUUM FULL ANALYZE to reclaim the disk space, but this
> creates an exclusive lock on the tables.
>
> See http://www.postgresql.org/docs/8.3/static/sql-vacuum.html

Aha!

OK but why did the performance degrade so much? The same reason -- lack
of autovacuuming/vacuum full?

Regards,
mk


Re: db size and VACUUM ANALYZE

From
Joao Ferreira gmail
Date:
On Fri, 2010-02-12 at 18:43 +0100, Marcin Krol wrote:
> Amitabh Kant wrote:
> > You need to do VACUUM FULL ANALYZE to reclaim the disk space, but this
> > creates an exclusive lock on the tables.
> >
> > See http://www.postgresql.org/docs/8.3/static/sql-vacuum.html
>
> Aha!
>
> OK but why did the performance degrade so much? The same reason -- lack
> of autovacuuming/vacuum full?
>

if the application makes use of INDEXes then REINDEX will also play an
important role.... in that case REINDEXING your indexes once in a while
may give you immediate improvements in performance (may give... may not
give.... depends)
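
For illustration, a periodic reindex pass might look something like this (the
table and index names are hypothetical, and REINDEX blocks writes to the table
while it runs):

REINDEX TABLE some_table;         -- rebuild all indexes on one table
REINDEX INDEX some_table_pkey;    -- or rebuild a single index
REINDEX DATABASE mydb;            -- or everything, if you can afford the locks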

moreover, you should expect that in a few days/weeks/months the database
size can (probably will) grow up again... it's the way pg works

try using autovacuum.... if you are already using it, you can make it
more aggressive by decreasing the thresholds and so on....
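
As a rough illustration, making autovacuum more aggressive in postgresql.conf
(8.3) could look like this; the values are only examples, not recommendations:

autovacuum = on
autovacuum_naptime = 1min                 # how often workers wake up
autovacuum_vacuum_threshold = 50          # minimum changed rows before a vacuum
autovacuum_vacuum_scale_factor = 0.1      # default 0.2; lower means vacuum sooner
autovacuum_analyze_scale_factor = 0.05    # default 0.1; lower means analyze sooner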

Joao


> Regards,
> mk
>
>


Re: db size and VACUUM ANALYZE

From
Bill Moran
Date:
In response to Marcin Krol <mrkafk@gmail.com>:

> Amitabh Kant wrote:
> > You need to do VACUUM FULL ANALYZE to reclaim the disk space, but this
> > creates an exclusive lock on the tables.
> >
> > See http://www.postgresql.org/docs/8.3/static/sql-vacuum.html
>
> Aha!
>
> OK but why did the performance degrade so much? The same reason -- lack
> of autovacuuming/vacuum full?

Note that the "correct" disk size for your database is probably closer
to the 1.6G you were seeing before.  This allows PG some free space
within the data files to add/remove records.  vacuum full removes this
space, and you'll likely find that the files will simply expand to
use it again.  Vacuum (without full) keeps that space at an equilibrium.

As to performance degradation, you'll always see performance hits as
your database size increases.  I'm assuming from your need to ask about
this issue that the degradation was significant.  In that case, you first
want to make sure that the tables in the database have indexes in
all the right places -- in my experience, this is the biggest cause of
performance issues.  Using EXPLAIN ANALYZE on queries that are performing
slowly will usually indicate where indexes can help.
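
For example, something along these lines (the query, tables and join columns
here are made up purely for illustration):

EXPLAIN ANALYZE
SELECT *
FROM met_files f
JOIN met_versions v ON v.file_id = f.id          -- hypothetical join columns
WHERE f.created_at > now() - interval '7 days';  -- hypothetical filter
-- Look for big "Seq Scan" nodes with high actual times; those are usually
-- the places where an index (or a better one) would help.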

From there, you may simply have too little hardware for the database to
run at the speed you expect.  Giving it more RAM is cheap and tends to
work wonders.  Any time the system runs out of RAM, it needs to use disk
instead, which significantly hurts performance.

Hope this is helpful.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: db size and VACUUM ANALYZE

From
Marcin Krol
Date:
Bill Moran wrote:
> Note that the "correct" disk size for your database is probably closer
> to the 1.6G you were seeing before.

This might be the case, but how do I find out what the "correct" sizes are?

I have a script that runs the following queries:

SELECT relpages * 8192 AS size_in_bytes, relname FROM pg_class WHERE
relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public')
ORDER BY size_in_bytes DESC LIMIT 10;

SELECT SUM(sizes.size_in_bytes) AS total_size_for_top_10_tables FROM
(SELECT relpages * 8192 AS size_in_bytes, relname FROM pg_class WHERE
relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public')
ORDER BY size_in_bytes DESC LIMIT 10) AS sizes;

SELECT SUM(sizes.size_in_bytes) AS total_size_for_all_tables FROM
(SELECT relpages * 8192 AS size_in_bytes, relname FROM pg_class WHERE
relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public'))
AS sizes;

Result before (1.6G db):


  size_in_bytes |       relname
---------------+----------------------
      806387712 | cs_ver_digests_pkey
      103530496 | oai_edi_atts_pkey
       62021632 | cs_ver_paths
       61734912 | cs_ver_digests
       55721984 | cs_fil_paths
       45309952 | met_files
       38412288 | met_versions
       26247168 | cs_ver_content_types
       25444352 | met_edi_ver
       23724032 | met_edi_atts
(10 rows)

  total_size_for_top_10_tables
------------------------------
                    1248534528
(1 row)

  total_size_for_all_tables
---------------------------
                 1467809792


Results now (600M db):

  size_in_bytes |          relname
---------------+---------------------------
       62169088 | cs_ver_paths
       55828480 | cs_fil_paths
       45441024 | met_files
       42000384 | cs_ver_digests
       37552128 | met_versions
       25509888 | met_edi_ver
       24215552 | cs_ver_content_types
       20717568 | met_edi_atts
       18186240 | met_edi_ver_pkey
       13565952 | cs_ver_content_types_pkey
(10 rows)

  total_size_for_top_10_tables
------------------------------
                     345186304
(1 row)

  total_size_for_all_tables
---------------------------
                  467476480
(1 row)
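
As a side note, the relpages arithmetic above can also be done with the
built-in size functions (available in 8.3); a roughly equivalent query,
rolling index and TOAST size into each table, would be:

SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public')
  AND relkind = 'r'
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 10;

SELECT pg_size_pretty(pg_database_size(current_database())) AS db_size;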



> This allows PG some free space
> within the data files to add/remove records.  vacuum full removes this
> space, and you'll likely find that the files will simply expand to
> use it again.  Vacuum (without full) keeps that space at an equilibrium.

I don't mind slight performance degradation; the problem is that this is
the second time that, beyond a certain db size, the degradation has become
almost runaway.

> As to performance degradation, you'll always see performance hits as
> your database size increases.  I'm assuming from your need to ask about
> this issue that the degradation was significant.

Yes, to the point of being unacceptable (that is, queries took something
like 20-30 seconds).

> In that case, you first
> want to make sure that the tables in the database have indexes in
> all the right places -- in my experience, this is the biggest cause of
> performance issues.  Using EXPLAIN ANALYZE on queries that are performing
> slowly will usually indicate where indexes can help.

I'll try, though that will not be easy, as the queries are complex and
were not written by me (it's a closed system).


> From there, you may simply have too little hardware for the database to
> run at the speed you expect.

You see, that's the weird thing: the machine in question has 4 CPUs and
4G of RAM. When the performance was unacceptable, the loadavg was around
1, all CPUs were only lightly loaded, and iostat didn't show much happening
on the disks. The one thing I remember is that there were many
postmaster processes (around 20), which had huge virtual sizes (around 800M)
and large resident sizes (around 300M).

On top of having the pg_dump backup, I also copied the db's binary files
while PG was stopped, so I can play with those files (bring them up under
the same PG config on another machine).

> Giving it more RAM is cheap and tends to
> work wonders.  Any time the system runs out of RAM, it needs to use disk
> instead, which significantly hurts performance.

This is my memory config:

shared_buffers = 768MB

temp_buffers = 32MB                     # min 800kB

work_mem = 32MB                         # min 64kB

max_stack_depth = 256MB                 # min 100kB

max_fsm_pages = 153600


% sysctl -a | grep shm
kernel.shmmni = 4096
kernel.shmall = 262144
kernel.shmmax = 1073741824

Re: db size and VACUUM ANALYZE

From
Greg Smith
Date:
Amitabh Kant wrote:
> You need to do VACUUM FULL ANALYZE to reclaim the disk space, but this
> creates an exclusive lock on the tables.
> See http://www.postgresql.org/docs/8.3/static/sql-vacuum.html

First off, you don't need the ANALYZE in there.

Second, VACUUM FULL is a terrible way to fix a table that's seriously
screwed up--it will take forever to run.  Use CLUSTER to accomplish the
same thing much faster; it basically does the same thing as the
dump/restore step that's restoring good performance to the database.
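
A sketch of that, with made-up table and index names (CLUSTER also takes an
exclusive lock, and needs roughly enough free disk for a second copy of the
table while it runs):

CLUSTER some_table USING some_table_pkey;  -- 8.3 syntax; rewrites the table in index order
ANALYZE some_table;                        -- CLUSTER doesn't update the statistics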

Before doing that, I would run a VACUUM VERBOSE on the whole cluster and
see if there are any max_fsm_pages warnings in there.  That setting
might be too low, for example if large deletions are done in batches,
and could ultimately be the true cause of this problem.
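
That check is simply (ideally run as a superuser, so every table gets vacuumed):

VACUUM VERBOSE;
-- If I remember the 8.3 output right, the very end prints a summary of
-- free-space-map usage; a NOTICE/HINT there about max_fsm_pages means the
-- setting is too low for the number of page slots actually needed.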

In general, the answer to most "why is my database getting too big/slow
after it's been up for a while?" questions is "you aren't vacuuming
often enough".  Is autovacuum on?  Are there any long-running
transactions that keep it from working?  There are use patterns where
that's still not good enough, but those are less common than the case
where the basics (use autovacuum and make sure the FSM parameters are
set correctly) just aren't being done.
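
Two quick ways to check those (pg_stat_activity column names as in 8.3; the
ORDER BY just surfaces the oldest open transactions first):

SHOW autovacuum;

SELECT procpid, usename, xact_start, current_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 10;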

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com


Re: db size and VACUUM ANALYZE

From
Greg Smith
Date:
Marcin Krol wrote:
> Result before (1.6G db):
>
>
>  size_in_bytes |       relname
> ---------------+----------------------
>      806387712 | cs_ver_digests_pkey
>      103530496 | oai_edi_atts_pkey

There's your problem.  This is called "index bloat"; these are the two
biggest relations in the large and slow database, but don't even show up
in the top 10 on the smaller one.  It usually happens when your VACUUM
strategy is bad and you delete/update things all the time.  Notes on
this topic start at
http://www.postgresql.org/docs/8.3/static/sql-reindex.html

You can clean it up with REINDEX or CLUSTER, but not VACUUM FULL, which
actually makes the problem worse.  No need to rebuild the whole DB.
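
In this case that would be something like the following (expect writes to the
parent tables to block while each REINDEX runs):

REINDEX INDEX cs_ver_digests_pkey;
REINDEX INDEX oai_edi_atts_pkey;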

> max_fsm_pages = 153600

It's quite possible that's way too low for your workload.  I already
suggested VACUUM VERBOSE would dump info into the logs suggesting as
much if that's the case; try that out next time you see the database get
too big.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com


Re: db size and VACUUM ANALYZE

From
Scott Marlowe
Date:
On Sat, Feb 13, 2010 at 12:19 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> Amitabh Kant wrote:
>>
>> You need to do VACUUM FULL ANALYZE to reclaim the disk space, but this
>> creates an exclusive lock on the tables.
>> See http://www.postgresql.org/docs/8.3/static/sql-vacuum.html
>
> First off, you don't need the ANALYZE in there.
>
> Second, VACUUM FULL is a terrible way to fix a table that's seriously
> screwed up--it will take forever to run.  Use CLUSTER to accomplish the same
> thing much faster; it basically does the same thing as the dump/restore step
> that's restoring good performance to the database.

This is a bit of an oversimplification.  I've found that selecting the
contents of the table out, truncating the table, and inserting them
back in from a select with an order by can be orders of magnitude
faster than cluster IF the data in the table is basically random.
After that, cluster can perform reasonably well to keep the table
clustered, because it's mostly in order already.  Basically, unless
it's been fixed in 9.0, cluster reads the table by index entry one row
at a time and builds the new table.  This is very very slow for a
randomly ordered table.
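
A rough sketch of that approach, with hypothetical table and column names (do
it in a maintenance window, since the table is effectively unavailable while
it runs, and note that TRUNCATE will refuse to run if other tables reference
this one via foreign keys):

BEGIN;
LOCK TABLE some_table IN ACCESS EXCLUSIVE MODE;    -- keep everyone else out
CREATE TEMP TABLE some_table_copy AS
    SELECT * FROM some_table;
TRUNCATE some_table;
INSERT INTO some_table
    SELECT * FROM some_table_copy ORDER BY some_indexed_col;
COMMIT;
REINDEX TABLE some_table;   -- the old indexes are bloated too
ANALYZE some_table;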

> Before doing that, I would run a VACUUM VERBOSE on the whole cluster and see
> if there are any max_fsm_pages warnings in there.  Those settings might be
> too low, for example if large deletions are done in batches, and ultimately
> be the true cause of this problem.

Good point; if he's blowing out the fsm regularly then the fix above
will be temporary at best.

Since setting fsm pages / relations is basically very cheap, it's a
good idea to set them a few times higher than what you need, so if you
need 1M, set it to 10M to give a big buffer in case things get worse
over time.  That matters especially since max_fsm_pages is a change that
requires a restart.
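
For instance, in postgresql.conf (8.3; the numbers below are only an example,
and the server has to be restarted for them to take effect):

max_fsm_pages = 1500000       # page slots tracked cluster-wide, roughly 6 bytes of shared memory each
max_fsm_relations = 10000     # relations tracked (8.3 default is 1000)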