Re: db size - Mailing list pgsql-performance

From Craig Ringer
Subject Re: db size
Date
Msg-id 480334A7.1070102@postnewspapers.com.au
Whole thread Raw
In response to Re: db size  (Adrian Moisey <adrian@careerjunction.co.za>)
List pgsql-performance
Adrian Moisey wrote:
> Hi
>
>>     If you suspect your tables or indexes are bloated, restore your
>> dump to a test box.
>>     Use fsync=off during restore, you don't care about integrity on
>> the test box.
>>     This will avoid slowing down your production database.
>>     Then look at the size of the restored database.
>>     If it is much smaller than your production database, then you have
>> bloat.
>
> I have done that, and I get the following:
>
> the live one is 113G
> the restored one is 78G
>
> How should I get rid of the bloat?
> VACUUM FULL?

And/or REINDEX if you're not satisfied with the results of a VACUUM FULL.

http://www.postgresql.org/docs/8.3/interactive/vacuum.html
http://www.postgresql.org/docs/8.3/interactive/sql-reindex.html

Of course, all of these will have performance consequences while they're
running, and take out locks that prevent certain other operatons as
shown in table 13-2:

http://www.postgresql.org/docs/8.3/static/explicit-locking.html

and the explanation following it.

Note in particular:

----
ACCESS EXCLUSIVE

    Conflicts with locks of all modes (ACCESS SHARE, ROW SHARE, ROW
EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE,
EXCLUSIVE, and ACCESS EXCLUSIVE). This mode guarantees that the holder
is the only transaction accessing the table in any way.

    Acquired by the ALTER TABLE, DROP TABLE, TRUNCATE, REINDEX, CLUSTER,
and VACUUM FULL commands. This is also the default lock mode for LOCK
TABLE statements that do not specify a mode explicitly.

    Tip: Only an ACCESS EXCLUSIVE lock blocks a SELECT (without FOR
UPDATE/SHARE) statement.
----

In other words, you won't be doing much with a table/index while a
VACUUM FULL or a REINDEX is in progress on it.

Given that, you probably want to check your table/index sizes and see if
there are particular problem tables or indexes, rather than just using a
sledgehammer approach.

--
Craig Ringer

pgsql-performance by date:

Previous
From: PFC
Date:
Subject: Re: db size
Next
From: Adrian Moisey
Date:
Subject: Re: db size