Thread: pg_xlog is getting bigger
Hi,
I am working on a Postgresql 9.0 server. I have no replication and archive mode setup. But I found that the pg_xlog is getting bigger and bigger.
Right now it is 20 GB.
How should I recover these spaces?
Please let me know.
Thanks.
AI Rumman wrote: > I am working on a Postgresql 9.0 server. I have no replication and archive > mode setup. But I found that the pg_xlog is getting bigger and bigger. > Right now it is 20 GB. > > How should I recover these spaces? Do you have archiving turned on? Are you getting errors in the server log related to failures of the archiving? -Kevin
AI Rumman wrote: > Kevin Grittner <kgrittn@mail.com> wrote: >> AI Rumman wrote: >>> Kevin Grittner <kgrittn@mail.com> wrote: >>>> AI Rumman wrote: >>>> >>>>> I am working on a Postgresql 9.0 server. I have no replication and >>>>> archive mode setup. But I found that the pg_xlog is getting bigger >>>>> and bigger. Right now it is 20 GB. >>>>> >>>>> How should I recover these spaces? >>>> >>>> Do you have archiving turned on? Are you getting errors in the server >>>> log related to failures of the archiving? >>> >>> I don't have archiving turned on. Can I remove some old xlog files? >> >> No. You can corrupt your database if you delete from that directory >> directly. Please post the results from running the query on this >> page: >> >> http://wiki.postgresql.org/wiki/Server_Configuration > version | PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, > compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit > bgwriter_delay | 300ms > bgwriter_lru_maxpages | 100 > bgwriter_lru_multiplier | 2 > bytea_output | escape > checkpoint_segments | 300 > checkpoint_warning | 1h > default_statistics_target | 250 > escape_string_warning | off > fsync | on > lc_collate | en_US.UTF-8 > lc_ctype | en_US.UTF-8 > listen_addresses | * > log_destination | stderr > log_directory | pg_log > log_filename | postgresql-%a.log > log_line_prefix | %t [%p]: [%l-1] host=%h user=%u,db=%d > log_min_duration_statement | 4s > log_rotation_age | 1d > log_rotation_size | 0 > log_truncate_on_rotation | on > logging_collector | on > maintenance_work_mem | 1GB > max_connections | 500 > max_stack_depth | 2MB > port | 5432 > server_encoding | UTF8 > shared_buffers | 512MB > TimeZone | US/Eastern > wal_sync_method | fdatasync > work_mem | 256MB > (31 rows) > > Please let me know if you find anything wrong here. > Thanks. I don't see anything obvious. Putting this back on the list, where it should have stayed all along. Maybe someone else has an idea; I've only seen such behavior when there were archiving problems which were showing up in the server log. -Kevin
>
> I don't see anything obvious. Putting this back on the list, where
> it should have stayed all along. Maybe someone else has an idea;
> I've only seen such behavior when there were archiving problems
> which were showing up in the server log.
>
> -Kevin
>
>
> --
Perhaps it cannot delete old log files because some file system error? Maybe permissions or a corrupted file somewhere?
On 12/19/2012 01:16 PM, Kevin Grittner wrote: > AI Rumman wrote: >> Kevin Grittner <kgrittn@mail.com> wrote: >>> AI Rumman wrote: >>>> Kevin Grittner <kgrittn@mail.com> wrote: >>>>> AI Rumman wrote: >>>>> >>>>>> I am working on a Postgresql 9.0 server. I have no replication and >>>>>> archive mode setup. But I found that the pg_xlog is getting bigger >>>>>> and bigger. Right now it is 20 GB. >>>>>> >>>>>> How should I recover these spaces? >>>>> >>>>> Do you have archiving turned on? Are you getting errors in the server >>>>> log related to failures of the archiving? >>>> >>>> I don't have archiving turned on. Can I remove some old xlog files? >>> >>> No. You can corrupt your database if you delete from that directory >>> directly. Please post the results from running the query on this >>> page: >>> >>> http://wiki.postgresql.org/wiki/Server_Configuration > >> version | PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, >> compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit >> bgwriter_delay | 300ms >> bgwriter_lru_maxpages | 100 >> bgwriter_lru_multiplier | 2 >> bytea_output | escape >> checkpoint_segments | 300 >> checkpoint_warning | 1h >> default_statistics_target | 250 >> escape_string_warning | off >> fsync | on >> lc_collate | en_US.UTF-8 >> lc_ctype | en_US.UTF-8 >> listen_addresses | * >> log_destination | stderr >> log_directory | pg_log >> log_filename | postgresql-%a.log >> log_line_prefix | %t [%p]: [%l-1] host=%h user=%u,db=%d >> log_min_duration_statement | 4s >> log_rotation_age | 1d >> log_rotation_size | 0 >> log_truncate_on_rotation | on >> logging_collector | on >> maintenance_work_mem | 1GB >> max_connections | 500 >> max_stack_depth | 2MB >> port | 5432 >> server_encoding | UTF8 >> shared_buffers | 512MB >> TimeZone | US/Eastern >> wal_sync_method | fdatasync >> work_mem | 256MB >> (31 rows) >> >> Please let me know if you find anything wrong here. >> Thanks. > > I don't see anything obvious. Putting this back on the list, where > it should have stayed all along. Maybe someone else has an idea; > I've only seen such behavior when there were archiving problems > which were showing up in the server log. Well just for grins and to go to the source, look in the postgresql.conf file itself. In particular what is the settings for: wal_keep_segments Also what does querying pg_stat_activity as the postgres user show? Looking for queries that have been held open a long time. > > -Kevin > > -- Adrian Klaver adrian.klaver@gmail.com
On 12/19/2012 01:39 PM, AI Rumman wrote: > > Well just for grins and to go to the source, look in the > postgresql.conf file itself. > > In particular what is the settings for: > > wal_keep_segments > > Also what does querying pg_stat_activity as the postgres user show? > > Looking for queries that have been held open a long time. > > > > > wal_keep_segment = 0 > and in pg_stat_activity all processes are idle. Please reply to the list also. Well the question is how long have those idle transactions been around? For a good blog on the subject see: http://www.depesz.com/2008/08/28/hunting-idle-in-transactions/ -- Adrian Klaver adrian.klaver@gmail.com
On 12/19/2012 02:00 PM, AI Rumman wrote: > > > Please reply to the list also. > > Well the question is how long have those idle transactions been around? > > > For a good blog on the subject see: > http://www.depesz.com/2008/08/__28/hunting-idle-in-__transactions/ > <http://www.depesz.com/2008/08/28/hunting-idle-in-transactions/> > > -- > Adrian Klaver > adrian.klaver@gmail.com <mailto:adrian.klaver@gmail.com> > > > Thanks for the link. But there is no connection showing 'Idle in > Transaction'. > All are showing only '<IDLE>'. Hmm, got ahead of myself. Does pg_locks show anything interesting? Please hit Reply All so the rest of the list can stay in the loop. More eyes = faster solution:) -- Adrian Klaver adrian.klaver@gmail.com
On 12/19/2012 01:16 PM, Kevin Grittner wrote: > AI Rumman wrote: >> Kevin Grittner <kgrittn@mail.com> wrote: >>> AI Rumman wrote: >>>> Kevin Grittner <kgrittn@mail.com> wrote: >>>>> AI Rumman wrote: >>>>> >>>>>> I am working on a Postgresql 9.0 server. I have no replication and >>>>>> archive mode setup. But I found that the pg_xlog is getting bigger >>>>>> and bigger. Right now it is 20 GB. >>>>>> >>>>>> How should I recover these spaces? >>>>> >>>>> Do you have archiving turned on? Are you getting errors in the server >>>>> log related to failures of the archiving? >>>> >>>> I don't have archiving turned on. Can I remove some old xlog files? >>> >>> No. You can corrupt your database if you delete from that directory >>> directly. Please post the results from running the query on this >>> page: >>> >>> http://wiki.postgresql.org/wiki/Server_Configuration > >> version | PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, >> compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit >> bgwriter_delay | 300ms >> bgwriter_lru_maxpages | 100 >> bgwriter_lru_multiplier | 2 >> bytea_output | escape >> checkpoint_segments | 300 Aah, just noticed the above. http://www.postgresql.org/docs/9.0/interactive/wal-configuration.html " Checkpoints are points in the sequence of transactions at which it is guaranteed that the heap and index data files have been updated with all information written before the checkpoint. At checkpoint time, all dirty data pages are flushed to disk and a special checkpoint record is written to the log file. (The changes were previously flushed to the WAL files.) In the event of a crash, the crash recovery procedure looks at the latest checkpoint record to determine the point in the log (known as the redo record) from which it should start the REDO operation. Any changes made to data files before that point are guaranteed to be already on disk. Hence, after a checkpoint, log segments preceding the one containing the redo record are no longer needed and can be recycled or removed. (When WAL archiving is being done, the log segments must be archived before being recycled or removed.)... ...A checkpoint is created every checkpoint_segments log segments,... " Given that log segments are 16MB this means you will get 16MB * 300 segments of logs before they can start to be recycled. Does this correspond with the directory size you are seeing in pg_xlog? FYI the default value is 3. -- Adrian Klaver adrian.klaver@gmail.com
Adrian Klaver <adrian.klaver@gmail.com> writes: > Well the question is how long have those idle transactions been around? Idle transactions shouldn't have anything to do with pg_xlog bloat. What causes xlog bloat is inability to release old WAL because either (a) we're not able to complete checkpoints, or (b) WAL archiving is enabled but malfunctioning, and the old WAL segments are being kept pending successful archiving. Either (a) or (b) should result in bleating in the postmaster log. regards, tom lane
On 12/19/2012 04:12 PM, Tom Lane wrote: > Adrian Klaver <adrian.klaver@gmail.com> writes: >> Well the question is how long have those idle transactions been around? > > Idle transactions shouldn't have anything to do with pg_xlog bloat. > What causes xlog bloat is inability to release old WAL because either > (a) we're not able to complete checkpoints, or (b) WAL archiving is > enabled but malfunctioning, and the old WAL segments are being kept > pending successful archiving. Its obvious I am missing something important about WAL. Scenario: 1) Transaction is opened and say many UPDATEs are done. 2) This means there is now an old tuple and a new tuple for the previous row. 3) The transaction is not committed. I assumed the WAL logs contained information necessary to either go forward to the new on commit or go back to the old on rollback. I further assumed the log segment(s) could not be released until either a commit/rollback was done. At this point I figure I the above assumption is wrong or my understanding of <IDLE in TRANSACTION> is wrong or both! > > Either (a) or (b) should result in bleating in the postmaster log. > > regards, tom lane > > -- Adrian Klaver adrian.klaver@gmail.com
On Wed, Dec 19, 2012 at 7:52 PM, Adrian Klaver <adrian.klaver@gmail.com> wrote:
On 12/19/2012 04:12 PM, Tom Lane wrote:Its obvious I am missing something important about WAL.Adrian Klaver <adrian.klaver@gmail.com> writes:Well the question is how long have those idle transactions been around?
Idle transactions shouldn't have anything to do with pg_xlog bloat.
What causes xlog bloat is inability to release old WAL because either
(a) we're not able to complete checkpoints, or (b) WAL archiving is
enabled but malfunctioning, and the old WAL segments are being kept
pending successful archiving.
Scenario:
1) Transaction is opened and say many UPDATEs are done.
2) This means there is now an old tuple and a new tuple for the previous row.
3) The transaction is not committed.
I assumed the WAL logs contained information necessary to either go forward to the new on commit or go back to the old on rollback. I further assumed the log segment(s) could not be released until either a commit/rollback was done.
At this point I figure I the above assumption is wrong or my understanding of <IDLE in TRANSACTION> is wrong or both!--
Either (a) or (b) should result in bleating in the postmaster log.
regards, tom lane
Adrian Klaver
adrian.klaver@gmail.com
I modified checkpoint_segment to 100 form 300 and then forced some CHECKPOINT and pg_switch_xlog() and now found that the pg_xlog file got almost 1 gb of space back.
Adrian Klaver <adrian.klaver@gmail.com> writes: > On 12/19/2012 04:12 PM, Tom Lane wrote: >> Idle transactions shouldn't have anything to do with pg_xlog bloat. >> What causes xlog bloat is inability to release old WAL because either >> (a) we're not able to complete checkpoints, or (b) WAL archiving is >> enabled but malfunctioning, and the old WAL segments are being kept >> pending successful archiving. > Its obvious I am missing something important about WAL. > Scenario: > 1) Transaction is opened and say many UPDATEs are done. > 2) This means there is now an old tuple and a new tuple for the previous > row. > 3) The transaction is not committed. > I assumed the WAL logs contained information necessary to either go > forward to the new on commit or go back to the old on rollback. No, you're thinking about an Oracle-style WAL implementation. We don't do rollback; both tuple versions are kept around until VACUUM decides it can reclaim one or the other. WAL only need be kept back to the latest checkpoint, even if there are open transactions that are much older. regards, tom lane
Adrian Klaver <adrian.klaver@gmail.com> writes: > On 12/19/2012 01:16 PM, Kevin Grittner wrote: >> AI Rumman wrote: >>> checkpoint_segments | 300 > Aah, just noticed the above. I saw that too, but it doesn't seem to explain 20GB worth of pg_xlog. The fine manual mentions somewhere that we can keep up to twice the specified number of xlog segments, so one would expect at most 16MB * 600 or about 9.6GB in pg_xlog. (And this is assuming that checkpoint_timeout is large enough to let checkpoint_segments be the controlling factor.) So there's something else going on. regards, tom lane
On 12/19/2012 04:55 PM, Tom Lane wrote: > Adrian Klaver <adrian.klaver@gmail.com> writes: >> On 12/19/2012 04:12 PM, Tom Lane wrote: >>> Idle transactions shouldn't have anything to do with pg_xlog bloat. >>> What causes xlog bloat is inability to release old WAL because either >>> (a) we're not able to complete checkpoints, or (b) WAL archiving is >>> enabled but malfunctioning, and the old WAL segments are being kept >>> pending successful archiving. > >> Its obvious I am missing something important about WAL. >> Scenario: >> 1) Transaction is opened and say many UPDATEs are done. >> 2) This means there is now an old tuple and a new tuple for the previous >> row. >> 3) The transaction is not committed. > >> I assumed the WAL logs contained information necessary to either go >> forward to the new on commit or go back to the old on rollback. > > No, you're thinking about an Oracle-style WAL implementation. We don't > do rollback; both tuple versions are kept around until VACUUM decides it > can reclaim one or the other. WAL only need be kept back to the latest > checkpoint, even if there are open transactions that are much older. So basically the WAL mechanism just records the raw data, including transaction information, to the log segments and leaves the cleanup to VACUUM. So uncommitted transactions can lead to table bloat not WAL bloat. > > regards, tom lane > > -- Adrian Klaver adrian.klaver@gmail.com
On 12/19/2012 04:54 PM, AI Rumman wrote: > > > I modified checkpoint_segment to 100 form 300 and then forced some > CHECKPOINT and pg_switch_xlog() and now found that the pg_xlog file got > almost 1 gb of space back. Per Toms post this is a start but probably not the answer. Have you looked in the pg_xlog directory itself to see if anything looks funny? One thing I could think of is file timestamps that do not look right. -- Adrian Klaver adrian.klaver@gmail.com
Go to database # psql -U postgres #checkpoint; exit now go to pg_xlog folder you can can see a pattern in file names of logs. Just remove the file other then current time stamp. Or for safety remove logs other then current date. -- View this message in context: http://postgresql.1045698.n5.nabble.com/pg-xlog-is-getting-bigger-tp5737207p5779313.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Pankaj <pankaj.mimit@gmail.com> wrote: > now go to pg_xlog folder you can can see a pattern in file names of logs. > Just remove the file other then current time stamp. Or for safety remove > logs other then current date. No!!! NEVER remove ANY files from the pg_xlog sub-directory! That will corrupt your database. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Dera Kevin,
If you run checkpoint in database then you can delete the pg_xlog other then current timestamp. We do this to our customer on regular basis when pg_xlog exceed more then 15 gb On Sat, Nov 23, 2013 at 1:48 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
Pankaj <pankaj.mimit@gmail.com> wrote:
> now go to pg_xlog folder you can can see a pattern in file names of logs.
> Just remove the file other then current time stamp. Or for safety remove
> logs other then current date.
No!!! NEVER remove ANY files from the pg_xlog sub-directory! That
will corrupt your database.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Pankaj Mishra escribió: > Dera Kevin, > > If you run checkpoint in database then you can delete the pg_xlog other > then current timestamp. We do this to our customer on regular basis when > pg_xlog exceed more then 15 gb That is "somewhat" correct (meaning you are less likely to see data corruption in that case), but it is still misleading advice. The system is prepared to delete the files it no longer needs automatically. If you have WAL files piling up for no apparent reason, the most likely explanation is that the server is misconfigured, for example due to a failing archive_command, or to a high "keep" setting for WAL files (which is used in some replication scenarios). In those cases, the real solution is to fix that problem, not delete files yourself. > On Sat, Nov 23, 2013 at 1:48 AM, Kevin Grittner <kgrittn@ymail.com> wrote: > > > Pankaj <pankaj.mimit@gmail.com> wrote: > > > > > now go to pg_xlog folder you can can see a pattern in file names of logs. > > > Just remove the file other then current time stamp. Or for safety remove > > > logs other then current date. > > > > No!!! NEVER remove ANY files from the pg_xlog sub-directory! That > > will corrupt your database. > > > > -- > > Kevin Grittner > > EDB: http://www.enterprisedb.com > > The Enterprise PostgreSQL Company > > -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Nov 25, 2013 at 03:00:56PM +0530, Pankaj Mishra wrote: > We do this to our customer on regular basis when > pg_xlog exceed more then 15 gb Wow. Your poor customers. But I'm glad you phrased it "do this to our customer" rather than "do this _for_ our customer", since your customer certainly is having something done to them. -- Andrew Sullivan ajs@crankycanuck.ca