Thread: reducing postgresql disk space

reducing postgresql disk space

From
paladine
Date:
Hi all,

How can I reduce disk space postgresql used ?
I tried to delete many rows from my database and
I am running ' vacuum analyze reindexdb ' commands regularly
but my disk space on my linux machine didn't reduce.

I know that ' vacuum full ' command can do that but I don't want to use
that command because of the disadvantages.

Anyone know another method ?

Thanks in advance.
--
View this message in context: http://old.nabble.com/reducing-postgresql-disk-space-tp28681415p28681415.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: reducing postgresql disk space

From
Thom Brown
Date:
On 26 May 2010 15:16, paladine <yasinmalli@gmail.com> wrote:
>
> Hi all,
>
> How can I reduce disk space postgresql used ?
> I tried to delete many rows from my database and
> I am running ' vacuum analyze reindexdb ' commands regularly
> but my disk space on my linux machine didn't reduce.
>
> I know that ' vacuum full ' command can do that but I don't want to use
> that command because of the disadvantages.
>
> Anyone know another method ?
>
> Thanks in advance.
> --
> View this message in context: http://old.nabble.com/reducing-postgresql-disk-space-tp28681415p28681415.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>

What have you got checkpoint_segments set to in your postgresql.conf
file?  And how big is your database cluster?

Thom

Re: reducing postgresql disk space

From
Vick Khera
Date:
On Wed, May 26, 2010 at 10:16 AM, paladine <yasinmalli@gmail.com> wrote:
> Anyone know another method ?
>

options to reclaim disk space:

vacuum full
dump/restore (sometimes faster than vacuum full)
cluster (not mvcc safe as far as i know)
alter a table column to its own type, like this:
 alter table foo alter column my_counter type integer; -- my_counter
is already an integer

sometimes all you need to do is reindex the table (or just the larger
indexes on the table selectively)

Re: reducing postgresql disk space

From
paladine
Date:
It is default value ( #checkpoint_segments = 3    # in logfile segments, min
1, 16MB each )
Many of my database configurations are default values. (plain TOAST  etc)
my database is a log database so, some tables of db grow everytime.
My ' /base ' directory contains a lot of compressed object (1GB size)
These are maybe normal operations but I don't understand that
although I delete many rows from my db and regularly vacuum , reindexing
operations,
how doesn't postgresql give back that deleted areas for reusing.





Thom Brown wrote:
>
> On 26 May 2010 15:16, paladine <yasinmalli@gmail.com> wrote:
>>
>> Hi all,
>>
>> How can I reduce disk space postgresql used ?
>> I tried to delete many rows from my database and
>> I am running ' vacuum analyze reindexdb ' commands regularly
>> but my disk space on my linux machine didn't reduce.
>>
>> I know that ' vacuum full ' command can do that but I don't want to use
>> that command because of the disadvantages.
>>
>> Anyone know another method ?
>>
>> Thanks in advance.
>> --
>> View this message in context:
>> http://old.nabble.com/reducing-postgresql-disk-space-tp28681415p28681415.html
>> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>>
>
> What have you got checkpoint_segments set to in your postgresql.conf
> file?  And how big is your database cluster?
>
> Thom
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>

--
View this message in context: http://old.nabble.com/reducing-postgresql-disk-space-tp28681415p28681867.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: reducing postgresql disk space

From
Thom Brown
Date:
On 26 May 2010 15:50, paladine <yasinmalli@gmail.com> wrote:
>
> It is default value ( #checkpoint_segments = 3    # in logfile segments, min
> 1, 16MB each )
> Many of my database configurations are default values. (plain TOAST  etc)
> my database is a log database so, some tables of db grow everytime.
> My ' /base ' directory contains a lot of compressed object (1GB size)
> These are maybe normal operations but I don't understand that
> although I delete many rows from my db and regularly vacuum , reindexing
> operations,
> how doesn't postgresql give back that deleted areas for reusing.
>

I'm just wondering if you're still building up the initial set of WAL
files which will begin to plateau if the data in your database in
roughly consistent in size over time.

Try:

select datname, pg_size_pretty(pg_database_size(datname)) from
pg_database order by datname;

That should give you the actual sizes of each database.  Also, how
verbose is the database logging?  If you're logging every query to a
log file that may also account for it.

Regards

Thom

Re: reducing postgresql disk space

From
Steve Crawford
Date:
On 05/26/2010 07:16 AM, paladine wrote:
> Hi all,
>
> How can I reduce disk space postgresql used ?
> I tried to delete many rows from my database and
> I am running ' vacuum analyze reindexdb ' commands regularly
> but my disk space on my linux machine didn't reduce.
>
> I know that ' vacuum full ' command can do that but I don't want to use
> that command because of the disadvantages.
>
> Anyone know another method ?
>
Are you attempting a one-time space reduction or are you having general
bloat issues?

It is important to understand what is happening behind the scenes. Due
to MVCC (multi-version concurrency control), when you update a record,
PostgreSQL keeps the old one available until the transaction commits.
When no transaction needs the old record, it is not physically removed
but it is marked as dead. The basic vacuum process does not free
disk-space but rather identifies space within the files that hold the
table that has become available for reuse.

In a modern version of PostgreSQL with autovacuum running and set
appropriately for your workload, bloat should stay reasonably under
control (i.e. make sure you have upgraded and that autovacuum is enabled
and correctly tuned). But there are some things that can cause excess
table bloat like updates that hit all rows (this will roughly double the
size of a clean table) or deletes of substantial portions of a table.
Vacuum will allow this space to be reclaimed eventually, but you may
want to reduce disk-space sooner.

Your options:

Dump/restore. Not useful on a live, running database but can be useful
when you have yourself wedged in a corner on a machine out-of-space as
you can dump to another machine then do a clean restore back to your
server. Depending on your situation (especially foreign-key
constraints), you *may* be able to dump/restore just a specific
offending table.

Vacuum full. Reclaims the space, but is typically sloooow and requires
an exclusive table lock. IIRC, should be followed by a reindex of the
table. But vacuum-full runs "in-place" so it can be of use when you have
little free-space remaining on your device.

Cluster. Reclaims free-space and reindexes. Also reorders the table-data
to match the specified index which is often useful. Cluster must be run
on a table-by-table basis. Cluster also requires an exclusive lock but
is *way* faster than vacuum-full. Cluster requires enough free-space to
fully create the new clean copy of the table. This means a table can
require as much as double it's original space for clustering though a
heavily bloated table may require far less.

Both cluster and vacuum full are safe. If you are in a tight place, you
can carefully choose the method to use on a table-by-table basis:
vacuum-full if your hand is forced and cluster when you have made enough
free-space available.

Once things are cleaned up, examine how they got bad to begin with so
you aren't bitten again.

Cheers,
Steve


Re: reducing postgresql disk space

From
Steve Crawford
Date:
On 05/26/2010 07:50 AM, paladine wrote:
> It is default value ( #checkpoint_segments = 3    # in logfile segments, min
> 1, 16MB each )
> Many of my database configurations are default values. (plain TOAST  etc)
> my database is a log database so, some tables of db grow everytime.
> My ' /base ' directory contains a lot of compressed object (1GB size)
> These are maybe normal operations but I don't understand that
> although I delete many rows from my db and regularly vacuum , reindexing
> operations,
> how doesn't postgresql give back that deleted areas for reusing.
>
What is your PostgreSQL version? In older versions, you needed to set
the free space map high enough to manage the space that vacuum
identified as available. If that isn't high enough, vacuum won't be able
to fully do its job.

Also, if you are doing a typical form of logging where you delete
entries older than some set age, you should read up on table
partitioning. For example, if you keep log data for a year, set up an
empty parent table and create child tables spanning the appropriate
subset of the year (month, week, ...). After the child table is
no-longer needed it can be dropped or truncated depending on your
situation. Dropping or truncating is far faster than "delete
from...where..." and causes no table or index bloat.

Cheers,
Steve


Re: reducing postgresql disk space

From
paladine
Date:
My PostgreSQL version is 8.1.11.

My log entries are on filesytem and I write these logs to db from unix
stream.
Another interesting situation is that while my logs on filesystem contains
about 5-6 GB,
db directory (/base/OID) contains 40 GB. Is that normal ?

I implement your suggestions by using shell scripts ( when the disk filled
up I delete it from db etc. )


Steve Crawford wrote:
>
> On 05/26/2010 07:50 AM, paladine wrote:
>> It is default value ( #checkpoint_segments = 3    # in logfile segments,
>> min
>> 1, 16MB each )
>> Many of my database configurations are default values. (plain TOAST  etc)
>> my database is a log database so, some tables of db grow everytime.
>> My ' /base ' directory contains a lot of compressed object (1GB size)
>> These are maybe normal operations but I don't understand that
>> although I delete many rows from my db and regularly vacuum , reindexing
>> operations,
>> how doesn't postgresql give back that deleted areas for reusing.
>>
> What is your PostgreSQL version? In older versions, you needed to set
> the free space map high enough to manage the space that vacuum
> identified as available. If that isn't high enough, vacuum won't be able
> to fully do its job.
>
> Also, if you are doing a typical form of logging where you delete
> entries older than some set age, you should read up on table
> partitioning. For example, if you keep log data for a year, set up an
> empty parent table and create child tables spanning the appropriate
> subset of the year (month, week, ...). After the child table is
> no-longer needed it can be dropped or truncated depending on your
> situation. Dropping or truncating is far faster than "delete
> from...where..." and causes no table or index bloat.
>
> Cheers,
> Steve
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>

--
View this message in context: http://old.nabble.com/reducing-postgresql-disk-space-tp28681415p28682916.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: reducing postgresql disk space

From
paladine
Date:
Firstly, thanks for your explanations...

> Are you attempting a one-time space reduction or are you having general
> bloat issues?

Unfortunately, I have growing bloat issues so I want to reduce space as it
filled up.
Thus I wrote a script but as I said before it doesn't reclaim disk space.

> make sure you have upgraded and that autovacuum is enabled and correctly
> tuned

In my pg configuration, ' autovacuum = off '
but I run a script ( daily cronjob ) that controls the number of table row
and
if it expires a determined limit, run ' delete command ' and then run '
vacuum analyse verbose '
In your opinion, Is autovacuuming more efficient way ?





Steve Crawford wrote:
>
> On 05/26/2010 07:16 AM, paladine wrote:
>> Hi all,
>>
>> How can I reduce disk space postgresql used ?
>> I tried to delete many rows from my database and
>> I am running ' vacuum analyze reindexdb ' commands regularly
>> but my disk space on my linux machine didn't reduce.
>>
>> I know that ' vacuum full ' command can do that but I don't want to use
>> that command because of the disadvantages.
>>
>> Anyone know another method ?
>>
> Are you attempting a one-time space reduction or are you having general
> bloat issues?
>
> It is important to understand what is happening behind the scenes. Due
> to MVCC (multi-version concurrency control), when you update a record,
> PostgreSQL keeps the old one available until the transaction commits.
> When no transaction needs the old record, it is not physically removed
> but it is marked as dead. The basic vacuum process does not free
> disk-space but rather identifies space within the files that hold the
> table that has become available for reuse.
>
> In a modern version of PostgreSQL with autovacuum running and set
> appropriately for your workload, bloat should stay reasonably under
> control (i.e. make sure you have upgraded and that autovacuum is enabled
> and correctly tuned). But there are some things that can cause excess
> table bloat like updates that hit all rows (this will roughly double the
> size of a clean table) or deletes of substantial portions of a table.
> Vacuum will allow this space to be reclaimed eventually, but you may
> want to reduce disk-space sooner.
>
> Your options:
>
> Dump/restore. Not useful on a live, running database but can be useful
> when you have yourself wedged in a corner on a machine out-of-space as
> you can dump to another machine then do a clean restore back to your
> server. Depending on your situation (especially foreign-key
> constraints), you *may* be able to dump/restore just a specific
> offending table.
>
> Vacuum full. Reclaims the space, but is typically sloooow and requires
> an exclusive table lock. IIRC, should be followed by a reindex of the
> table. But vacuum-full runs "in-place" so it can be of use when you have
> little free-space remaining on your device.
>
> Cluster. Reclaims free-space and reindexes. Also reorders the table-data
> to match the specified index which is often useful. Cluster must be run
> on a table-by-table basis. Cluster also requires an exclusive lock but
> is *way* faster than vacuum-full. Cluster requires enough free-space to
> fully create the new clean copy of the table. This means a table can
> require as much as double it's original space for clustering though a
> heavily bloated table may require far less.
>
> Both cluster and vacuum full are safe. If you are in a tight place, you
> can carefully choose the method to use on a table-by-table basis:
> vacuum-full if your hand is forced and cluster when you have made enough
> free-space available.
>
> Once things are cleaned up, examine how they got bad to begin with so
> you aren't bitten again.
>
> Cheers,
> Steve
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>

--
View this message in context: http://old.nabble.com/reducing-postgresql-disk-space-tp28681415p28690348.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: reducing postgresql disk space

From
paladine
Date:
command output for mentioned db :  ' my_db_name | 42 GB '

I don't print query results to logfile. I restore my system logs to db.
I have a lot of live ( growing ) logs on my machine and I register these
logs to db.
My essential question is that why don't I reclaim disk space though I run
this command
'delete from db ....' ?




Thom Brown wrote:
>
> On 26 May 2010 15:50, paladine <yasinmalli@gmail.com> wrote:
>>
>> It is default value ( #checkpoint_segments = 3    # in logfile segments,
>> min
>> 1, 16MB each )
>> Many of my database configurations are default values. (plain TOAST  etc)
>> my database is a log database so, some tables of db grow everytime.
>> My ' /base ' directory contains a lot of compressed object (1GB size)
>> These are maybe normal operations but I don't understand that
>> although I delete many rows from my db and regularly vacuum , reindexing
>> operations,
>> how doesn't postgresql give back that deleted areas for reusing.
>>
>
> I'm just wondering if you're still building up the initial set of WAL
> files which will begin to plateau if the data in your database in
> roughly consistent in size over time.
>
> Try:
>
> select datname, pg_size_pretty(pg_database_size(datname)) from
> pg_database order by datname;
>
> That should give you the actual sizes of each database.  Also, how
> verbose is the database logging?  If you're logging every query to a
> log file that may also account for it.
>
> Regards
>
> Thom
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>

--
View this message in context: http://old.nabble.com/reducing-postgresql-disk-space-tp28681415p28690076.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: reducing postgresql disk space

From
paladine
Date:
> alter a table column to its own type, like this:
> alter table foo alter column my_counter type integer; -- my_counter
> is already an integer

Is that really reclaim disk space and how ??
For example; if 'my_counter' column is already integer,
why do I alter this column to integer again ?



Vick Khera wrote:
>
> On Wed, May 26, 2010 at 10:16 AM, paladine <yasinmalli@gmail.com> wrote:
>> Anyone know another method ?
>>
>
> options to reclaim disk space:
>
> vacuum full
> dump/restore (sometimes faster than vacuum full)
> cluster (not mvcc safe as far as i know)
> alter a table column to its own type, like this:
>  alter table foo alter column my_counter type integer; -- my_counter
> is already an integer
>
> sometimes all you need to do is reindex the table (or just the larger
> indexes on the table selectively)
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>

--
View this message in context: http://old.nabble.com/reducing-postgresql-disk-space-tp28681415p28690159.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: reducing postgresql disk space

From
paladine
Date:
Firstly, thanks for your explanations...

> Are you attempting a one-time space reduction or are you having general
> bloat issues?

Unfortunately, I have growing bloat issues so I want to reduce space as it
filled up.
Thus I wrote a script but as I said before it doesn't reclaim disk space.

> make sure you have upgraded and that autovacuum is enabled and correctly
> tuned

In my pg configuration, ' autovacuum = off '
but I run a script ( daily cronjob ) that controls the number of table row
and
if it expires a determined limit, run ' delete command ' and then run '
vacuum analyse verbose '
In your opinion, Is autovacuuming more efficient way ?





Steve Crawford wrote:
>
> On 05/26/2010 07:16 AM, paladine wrote:
>> Hi all,
>>
>> How can I reduce disk space postgresql used ?
>> I tried to delete many rows from my database and
>> I am running ' vacuum analyze reindexdb ' commands regularly
>> but my disk space on my linux machine didn't reduce.
>>
>> I know that ' vacuum full ' command can do that but I don't want to use
>> that command because of the disadvantages.
>>
>> Anyone know another method ?
>>
> Are you attempting a one-time space reduction or are you having general
> bloat issues?
>
> It is important to understand what is happening behind the scenes. Due
> to MVCC (multi-version concurrency control), when you update a record,
> PostgreSQL keeps the old one available until the transaction commits.
> When no transaction needs the old record, it is not physically removed
> but it is marked as dead. The basic vacuum process does not free
> disk-space but rather identifies space within the files that hold the
> table that has become available for reuse.
>
> In a modern version of PostgreSQL with autovacuum running and set
> appropriately for your workload, bloat should stay reasonably under
> control (i.e. make sure you have upgraded and that autovacuum is enabled
> and correctly tuned). But there are some things that can cause excess
> table bloat like updates that hit all rows (this will roughly double the
> size of a clean table) or deletes of substantial portions of a table.
> Vacuum will allow this space to be reclaimed eventually, but you may
> want to reduce disk-space sooner.
>
> Your options:
>
> Dump/restore. Not useful on a live, running database but can be useful
> when you have yourself wedged in a corner on a machine out-of-space as
> you can dump to another machine then do a clean restore back to your
> server. Depending on your situation (especially foreign-key
> constraints), you *may* be able to dump/restore just a specific
> offending table.
>
> Vacuum full. Reclaims the space, but is typically sloooow and requires
> an exclusive table lock. IIRC, should be followed by a reindex of the
> table. But vacuum-full runs "in-place" so it can be of use when you have
> little free-space remaining on your device.
>
> Cluster. Reclaims free-space and reindexes. Also reorders the table-data
> to match the specified index which is often useful. Cluster must be run
> on a table-by-table basis. Cluster also requires an exclusive lock but
> is *way* faster than vacuum-full. Cluster requires enough free-space to
> fully create the new clean copy of the table. This means a table can
> require as much as double it's original space for clustering though a
> heavily bloated table may require far less.
>
> Both cluster and vacuum full are safe. If you are in a tight place, you
> can carefully choose the method to use on a table-by-table basis:
> vacuum-full if your hand is forced and cluster when you have made enough
> free-space available.
>
> Once things are cleaned up, examine how they got bad to begin with so
> you aren't bitten again.
>
> Cheers,
> Steve
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>

--
View this message in context: http://old.nabble.com/reducing-postgresql-disk-space-tp28681415p28690348.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: reducing postgresql disk space

From
paladine
Date:
> alter a table column to its own type, like this:
> alter table foo alter column my_counter type integer; -- my_counter
> is already an integer

Is that really reclaim disk space and how ??
For example; if 'my_counter' column is already integer,
why do I alter this column to integer again ?



Vick Khera wrote:
>
> On Wed, May 26, 2010 at 10:16 AM, paladine <yasinmalli@gmail.com> wrote:
>> Anyone know another method ?
>>
>
> options to reclaim disk space:
>
> vacuum full
> dump/restore (sometimes faster than vacuum full)
> cluster (not mvcc safe as far as i know)
> alter a table column to its own type, like this:
>  alter table foo alter column my_counter type integer; -- my_counter
> is already an integer
>
> sometimes all you need to do is reindex the table (or just the larger
> indexes on the table selectively)
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>

--
View this message in context: http://old.nabble.com/reducing-postgresql-disk-space-tp28681415p28690159.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: reducing postgresql disk space

From
paladine
Date:
command output for mentioned db :  ' my_db_name | 42 GB '

I don't print query results to logfile. I restore my system logs to db.
I have a lot of live ( growing ) logs on my machine and I register these
logs to db.
My essential question is that why don't I reclaim disk space though I run
this command
'delete from db ....' ?




Thom Brown wrote:
>
> On 26 May 2010 15:50, paladine <yasinmalli@gmail.com> wrote:
>>
>> It is default value ( #checkpoint_segments = 3    # in logfile segments,
>> min
>> 1, 16MB each )
>> Many of my database configurations are default values. (plain TOAST  etc)
>> my database is a log database so, some tables of db grow everytime.
>> My ' /base ' directory contains a lot of compressed object (1GB size)
>> These are maybe normal operations but I don't understand that
>> although I delete many rows from my db and regularly vacuum , reindexing
>> operations,
>> how doesn't postgresql give back that deleted areas for reusing.
>>
>
> I'm just wondering if you're still building up the initial set of WAL
> files which will begin to plateau if the data in your database in
> roughly consistent in size over time.
>
> Try:
>
> select datname, pg_size_pretty(pg_database_size(datname)) from
> pg_database order by datname;
>
> That should give you the actual sizes of each database.  Also, how
> verbose is the database logging?  If you're logging every query to a
> log file that may also account for it.
>
> Regards
>
> Thom
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>

--
View this message in context: http://old.nabble.com/reducing-postgresql-disk-space-tp28681415p28690076.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.