Thread: reducing postgresql disk space
Hi all,

How can I reduce the disk space PostgreSQL uses? I deleted many rows
from my database and I am running 'vacuum analyze' and 'reindexdb'
regularly, but the disk space on my Linux machine didn't shrink.

I know that 'vacuum full' can do that, but I don't want to use that
command because of its disadvantages.

Does anyone know another method?

Thanks in advance.
On 26 May 2010 15:16, paladine <yasinmalli@gmail.com> wrote:
>
> Hi all,
>
> How can I reduce the disk space PostgreSQL uses? I deleted many rows
> from my database and I am running 'vacuum analyze' and 'reindexdb'
> regularly, but the disk space on my Linux machine didn't shrink.
>
> I know that 'vacuum full' can do that, but I don't want to use that
> command because of its disadvantages.
>
> Does anyone know another method?

What have you got checkpoint_segments set to in your postgresql.conf
file? And how big is your database cluster?

Thom
On Wed, May 26, 2010 at 10:16 AM, paladine <yasinmalli@gmail.com> wrote:
> Does anyone know another method?

Options to reclaim disk space:

vacuum full
dump/restore (sometimes faster than vacuum full)
cluster (not MVCC safe as far as I know)
alter a table column to its own type, like this:
  alter table foo alter column my_counter type integer;  -- my_counter is already an integer

Sometimes all you need to do is reindex the table (or just the larger
indexes on the table, selectively).
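For illustration, here is roughly what the last two of those options look
like as SQL; the table and index names (access_log, entry_id,
access_log_created_idx) are made up for the example:

    -- On the 8.x releases current when this thread was written, 'altering'
    -- a column to the type it already has forces a complete rewrite of the
    -- table into a fresh, compact file and rebuilds its indexes. It takes
    -- an exclusive lock and needs enough free disk for the new copy.
    -- (Newer releases may skip the rewrite when nothing actually changes.)
    ALTER TABLE access_log ALTER COLUMN entry_id TYPE integer;

    -- When only the indexes are bloated, rebuilding them is enough:
    REINDEX TABLE access_log;              -- every index on the table
    REINDEX INDEX access_log_created_idx;  -- or one large index, selectively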
It is the default value ( #checkpoint_segments = 3   # in logfile
segments, min 1, 16MB each ).

Most of my database configuration is at the default values (plain TOAST
storage etc.). My database is a log database, so some tables grow all
the time. My base/ directory contains a lot of 1 GB files. Maybe these
are normal operations, but what I don't understand is this: although I
delete many rows from my database and run vacuum and reindex regularly,
why doesn't PostgreSQL give that deleted space back for reuse?


Thom Brown wrote:
>
> What have you got checkpoint_segments set to in your postgresql.conf
> file? And how big is your database cluster?
>
> Thom
On 26 May 2010 15:50, paladine <yasinmalli@gmail.com> wrote:
>
> It is the default value ( #checkpoint_segments = 3   # in logfile
> segments, min 1, 16MB each ).
> Most of my database configuration is at the default values (plain TOAST
> storage etc.). My database is a log database, so some tables grow all
> the time. My base/ directory contains a lot of 1 GB files. Maybe these
> are normal operations, but what I don't understand is this: although I
> delete many rows from my database and run vacuum and reindex regularly,
> why doesn't PostgreSQL give that deleted space back for reuse?

I'm just wondering if you're still building up the initial set of WAL
files, which will begin to plateau if the data in your database is
roughly consistent in size over time.

Try:

select datname, pg_size_pretty(pg_database_size(datname))
from pg_database order by datname;

That should give you the actual sizes of each database. Also, how
verbose is the database logging? If you're logging every query to a
log file, that may also account for it.

Regards

Thom
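If one database turns out to be the big consumer, a follow-up query along
these lines shows which tables inside it hold the space (assuming a release
with the dbsize functions, i.e. 8.1 or later; the schema filter is just one
reasonable choice):

    -- List the 20 largest tables, counting their indexes and TOAST data.
    SELECT n.nspname AS schema_name, c.relname AS table_name,
           pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
     WHERE c.relkind = 'r'
       AND n.nspname NOT IN ('pg_catalog', 'information_schema')
     ORDER BY pg_total_relation_size(c.oid) DESC
     LIMIT 20;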
On 05/26/2010 07:16 AM, paladine wrote:
>
> How can I reduce the disk space PostgreSQL uses? I deleted many rows
> from my database and I am running 'vacuum analyze' and 'reindexdb'
> regularly, but the disk space on my Linux machine didn't shrink.
>
> I know that 'vacuum full' can do that, but I don't want to use that
> command because of its disadvantages.
>
> Does anyone know another method?

Are you attempting a one-time space reduction or are you having general
bloat issues?

It is important to understand what is happening behind the scenes. Due
to MVCC (multi-version concurrency control), when you update a record,
PostgreSQL keeps the old one available until the transaction commits.
When no transaction needs the old record, it is not physically removed
but is marked as dead. The basic vacuum process does not free disk
space; rather, it identifies space within the files that hold the table
that has become available for reuse.

In a modern version of PostgreSQL with autovacuum running and set
appropriately for your workload, bloat should stay reasonably under
control (i.e. make sure you have upgraded and that autovacuum is enabled
and correctly tuned). But some things can cause excess table bloat, like
updates that hit all rows (this will roughly double the size of a clean
table) or deletes of substantial portions of a table. Vacuum will allow
this space to be reclaimed eventually, but you may want to reduce disk
space sooner.

Your options:

Dump/restore. Not useful on a live, running database, but can be useful
when you have wedged yourself into a corner on a machine that is out of
space, as you can dump to another machine and then do a clean restore
back to your server. Depending on your situation (especially foreign-key
constraints), you *may* be able to dump/restore just a specific
offending table.

Vacuum full. Reclaims the space, but is typically slow and requires an
exclusive table lock. IIRC, it should be followed by a reindex of the
table. But vacuum full runs "in place", so it can be of use when you
have little free space remaining on your device.

Cluster. Reclaims free space and reindexes. Also reorders the table data
to match the specified index, which is often useful. Cluster must be run
on a table-by-table basis. Cluster also requires an exclusive lock but
is *way* faster than vacuum full. Cluster requires enough free space to
fully create the new, clean copy of the table. This means a table can
require as much as double its original space for clustering, though a
heavily bloated table may require far less.

Both cluster and vacuum full are safe. If you are in a tight place, you
can carefully choose the method to use on a table-by-table basis:
vacuum full if your hand is forced, and cluster once you have made
enough free space available.

Once things are cleaned up, examine how they got bad to begin with so
you aren't bitten again.

Cheers,
Steve
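A minimal sketch of the table-by-table cleanup described above, using a
made-up table and index name (access_log, access_log_pkey); releases of
that era spell cluster as 'CLUSTER index ON table', while newer ones also
accept 'CLUSTER table USING index':

    -- Rewrite one bloated table in index order. Needs room for the new
    -- copy and an exclusive lock, but leaves table and indexes compact.
    CLUSTER access_log_pkey ON access_log;
    ANALYZE access_log;

    -- If there is not enough free disk for a second copy, fall back to
    -- the slower in-place route and rebuild the indexes afterwards.
    VACUUM FULL access_log;
    REINDEX TABLE access_log;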
On 05/26/2010 07:50 AM, paladine wrote:
>
> It is the default value ( #checkpoint_segments = 3   # in logfile
> segments, min 1, 16MB each ).
> Most of my database configuration is at the default values (plain TOAST
> storage etc.). My database is a log database, so some tables grow all
> the time. My base/ directory contains a lot of 1 GB files. Maybe these
> are normal operations, but what I don't understand is this: although I
> delete many rows from my database and run vacuum and reindex regularly,
> why doesn't PostgreSQL give that deleted space back for reuse?

What is your PostgreSQL version? In older versions, you needed to set
the free space map high enough to manage the space that vacuum
identified as available. If that isn't high enough, vacuum won't be
able to fully do its job.

Also, if you are doing a typical form of logging where you delete
entries older than some set age, you should read up on table
partitioning. For example, if you keep log data for a year, set up an
empty parent table and create child tables spanning the appropriate
subsets of the year (month, week, ...). Once a child table is no longer
needed, it can be dropped or truncated, depending on your situation.
Dropping or truncating is far faster than "delete from ... where ..."
and causes no table or index bloat.

Cheers,
Steve
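A minimal sketch of that scheme using inheritance-based partitioning (the
kind available in the 8.x era); the table, column, and child names are
invented for illustration:

    -- The parent stays empty; children carry non-overlapping date ranges.
    CREATE TABLE app_log (
        logged_at  timestamptz NOT NULL,
        message    text
    );

    CREATE TABLE app_log_2010_05 (
        CHECK (logged_at >= DATE '2010-05-01' AND logged_at < DATE '2010-06-01')
    ) INHERITS (app_log);

    CREATE TABLE app_log_2010_06 (
        CHECK (logged_at >= DATE '2010-06-01' AND logged_at < DATE '2010-07-01')
    ) INHERITS (app_log);

    -- Insert into the child for the current period (directly, or via a
    -- rule or trigger on the parent). Retiring old data is then instant
    -- and creates no dead rows at all:
    DROP TABLE app_log_2010_05;      -- or: TRUNCATE app_log_2010_05;

With constraint_exclusion = on (available since 8.1), queries that filter
on logged_at will also skip the child tables whose CHECK constraints rule
them out.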
My PostgreSQL version is 8.1.11.

My log entries are on the filesystem and I write them to the database
from a Unix stream. Another interesting thing is that while my logs on
the filesystem take up about 5-6 GB, the database directory (base/OID)
takes up 40 GB. Is that normal?

I implement your suggestions with shell scripts (when the disk fills up,
I delete from the database, and so on).


Steve Crawford wrote:
>
> What is your PostgreSQL version? In older versions, you needed to set
> the free space map high enough to manage the space that vacuum
> identified as available. If that isn't high enough, vacuum won't be
> able to fully do its job.
Firstly, thanks for your explanations...

Steve Crawford wrote:
>
> Are you attempting a one-time space reduction or are you having general
> bloat issues?

Unfortunately, I have a growing bloat issue, so I want to reclaim space
as it fills up. For that I wrote a script, but as I said before, it
doesn't reclaim disk space.

> make sure you have upgraded and that autovacuum is enabled and
> correctly tuned

In my configuration 'autovacuum = off', but I run a script (a daily cron
job) that checks the number of rows in a table and, if it exceeds a set
limit, runs a delete and then 'vacuum analyse verbose'.

In your opinion, is autovacuuming the more efficient way?
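For reference, a rough sketch of the postgresql.conf settings involved on
an 8.1-era server; the numbers are illustrative placeholders, not values
tuned for this workload:

    # 8.1 needs the row-level statistics collector for autovacuum to work.
    stats_start_collector = on
    stats_row_level = on

    # Let the built-in daemon vacuum/analyze tables as dead rows accumulate.
    autovacuum = on
    autovacuum_naptime = 60                  # seconds between checks
    autovacuum_vacuum_scale_factor = 0.2     # vacuum after ~20% of a table changes
    autovacuum_analyze_scale_factor = 0.1

    # The free space map must be large enough to remember the space vacuum
    # frees, or dead space cannot be reused and the files keep growing.
    max_fsm_pages = 500000
    max_fsm_relations = 1000

With something like this in place, routine deletes are vacuumed
automatically, and as long as the free space map is big enough the freed
space is reused instead of the table files growing further.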
The command output for the database in question is: ' my_db_name | 42 GB '

I don't log query results to a logfile. I store my system logs in the
database: I have a lot of live (growing) logs on this machine and I load
them into the database. My essential question is: why don't I get the
disk space back even though I run 'delete from db ...' ?


Thom Brown wrote:
>
> Try:
>
> select datname, pg_size_pretty(pg_database_size(datname))
> from pg_database order by datname;
>
> That should give you the actual sizes of each database. Also, how
> verbose is the database logging? If you're logging every query to a
> log file, that may also account for it.
Vick Khera wrote:
>
> alter a table column to its own type, like this:
>   alter table foo alter column my_counter type integer;  -- my_counter
>   is already an integer

Does that really reclaim disk space, and how? For example, if the
'my_counter' column is already an integer, why would I alter it to
integer again?