Re: reducing postgresql disk space - Mailing list pgsql-general

From Steve Crawford
Subject Re: reducing postgresql disk space
Date
Msg-id 4BFD3E9B.90701@pinpointresearch.com
Whole thread Raw
In response to reducing postgresql disk space  (paladine <yasinmalli@gmail.com>)
Responses Re: reducing postgresql disk space  (paladine <yasinmalli@gmail.com>)
List pgsql-general
On 05/26/2010 07:16 AM, paladine wrote:
> Hi all,
>
> How can I reduce disk space postgresql used ?
> I tried to delete many rows from my database and
> I am running ' vacuum analyze reindexdb ' commands regularly
> but my disk space on my linux machine didn't reduce.
>
> I know that ' vacuum full ' command can do that but I don't want to use
> that command because of the disadvantages.
>
> Anyone know another method ?
>
Are you attempting a one-time space reduction or are you having general
bloat issues?

It is important to understand what is happening behind the scenes. Due
to MVCC (multi-version concurrency control), when you update a record,
PostgreSQL keeps the old one available until the transaction commits.
When no transaction needs the old record, it is not physically removed
but it is marked as dead. The basic vacuum process does not free
disk-space but rather identifies space within the files that hold the
table that has become available for reuse.

In a modern version of PostgreSQL with autovacuum running and set
appropriately for your workload, bloat should stay reasonably under
control (i.e. make sure you have upgraded and that autovacuum is enabled
and correctly tuned). But there are some things that can cause excess
table bloat like updates that hit all rows (this will roughly double the
size of a clean table) or deletes of substantial portions of a table.
Vacuum will allow this space to be reclaimed eventually, but you may
want to reduce disk-space sooner.

Your options:

Dump/restore. Not useful on a live, running database but can be useful
when you have yourself wedged in a corner on a machine out-of-space as
you can dump to another machine then do a clean restore back to your
server. Depending on your situation (especially foreign-key
constraints), you *may* be able to dump/restore just a specific
offending table.

Vacuum full. Reclaims the space, but is typically sloooow and requires
an exclusive table lock. IIRC, should be followed by a reindex of the
table. But vacuum-full runs "in-place" so it can be of use when you have
little free-space remaining on your device.

Cluster. Reclaims free-space and reindexes. Also reorders the table-data
to match the specified index which is often useful. Cluster must be run
on a table-by-table basis. Cluster also requires an exclusive lock but
is *way* faster than vacuum-full. Cluster requires enough free-space to
fully create the new clean copy of the table. This means a table can
require as much as double it's original space for clustering though a
heavily bloated table may require far less.

Both cluster and vacuum full are safe. If you are in a tight place, you
can carefully choose the method to use on a table-by-table basis:
vacuum-full if your hand is forced and cluster when you have made enough
free-space available.

Once things are cleaned up, examine how they got bad to begin with so
you aren't bitten again.

Cheers,
Steve


pgsql-general by date:

Previous
From: Palle Girgensohn
Date:
Subject: why doesn't insert into foo delete from bar returning baz work?
Next
From: Merlin Moncure
Date:
Subject: Re: why doesn't insert into foo delete from bar returning baz work?