Thread: pg_xlog is getting bigger

pg_xlog is getting bigger

From
AI Rumman
Date:
Hi,

I am working on a Postgresql 9.0 server. I have no replication and archive mode setup. But I found that  the pg_xlog is getting bigger and bigger.
Right now it is 20 GB.

How should I recover these spaces?

Please let me know.

Thanks.

Re: pg_xlog is getting bigger

From
"Kevin Grittner"
Date:
AI Rumman wrote:

> I am working on a Postgresql 9.0 server. I have no replication and archive
> mode setup. But I found that the pg_xlog is getting bigger and bigger.
> Right now it is 20 GB.
>
> How should I recover these spaces?

Do you have archiving turned on? Are you getting errors in the server
log related to failures of the archiving?

-Kevin


Re: pg_xlog is getting bigger

From
"Kevin Grittner"
Date:
AI Rumman wrote:
> Kevin Grittner <kgrittn@mail.com> wrote:
>> AI Rumman wrote:
>>> Kevin Grittner <kgrittn@mail.com> wrote:
>>>> AI Rumman wrote:
>>>>
>>>>> I am working on a Postgresql 9.0 server. I have no replication and
>>>>> archive mode setup. But I found that the pg_xlog is getting bigger
>>>>> and bigger. Right now it is 20 GB.
>>>>>
>>>>> How should I recover these spaces?
>>>>
>>>> Do you have archiving turned on? Are you getting errors in the server
>>>> log related to failures of the archiving?
>>>
>>> I don't have archiving turned on. Can I remove some old xlog files?
>>
>> No. You can corrupt your database if you delete from that directory
>> directly. Please post the results from running the query on this
>> page:
>>
>> http://wiki.postgresql.org/wiki/Server_Configuration

>  version | PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu,
> compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
>  bgwriter_delay | 300ms
>  bgwriter_lru_maxpages | 100
>  bgwriter_lru_multiplier | 2
>  bytea_output | escape
>  checkpoint_segments | 300
>  checkpoint_warning | 1h
>  default_statistics_target | 250
>  escape_string_warning | off
>  fsync | on
>  lc_collate | en_US.UTF-8
>  lc_ctype | en_US.UTF-8
>  listen_addresses | *
>  log_destination | stderr
>  log_directory | pg_log
>  log_filename | postgresql-%a.log
>  log_line_prefix | %t [%p]: [%l-1] host=%h user=%u,db=%d
>  log_min_duration_statement | 4s
>  log_rotation_age | 1d
>  log_rotation_size | 0
>  log_truncate_on_rotation | on
>  logging_collector | on
>  maintenance_work_mem | 1GB
>  max_connections | 500
>  max_stack_depth | 2MB
>  port | 5432
>  server_encoding | UTF8
>  shared_buffers | 512MB
>  TimeZone | US/Eastern
>  wal_sync_method | fdatasync
>  work_mem | 256MB
> (31 rows)
>
> Please let me know if you find anything wrong here.
> Thanks.

I don't see anything obvious. Putting this back on the list, where
it should have stayed all along. Maybe someone else has an idea;
I've only seen such behavior when there were archiving problems
which were showing up in the server log.

-Kevin


Re: pg_xlog is getting bigger

From
dabicho
Date:


>
> I don't see anything obvious. Putting this back on the list, where
> it should have stayed all along. Maybe someone else has an idea;
> I've only seen such behavior when there were archiving problems
> which were showing up in the server log.
>
> -Kevin
>
>
> --
Perhaps it cannot delete old log files because some file system error? Maybe permissions or a corrupted file somewhere?

Re: pg_xlog is getting bigger

From
Adrian Klaver
Date:
On 12/19/2012 01:16 PM, Kevin Grittner wrote:
> AI Rumman wrote:
>> Kevin Grittner <kgrittn@mail.com> wrote:
>>> AI Rumman wrote:
>>>> Kevin Grittner <kgrittn@mail.com> wrote:
>>>>> AI Rumman wrote:
>>>>>
>>>>>> I am working on a Postgresql 9.0 server. I have no replication and
>>>>>> archive mode setup. But I found that the pg_xlog is getting bigger
>>>>>> and bigger. Right now it is 20 GB.
>>>>>>
>>>>>> How should I recover these spaces?
>>>>>
>>>>> Do you have archiving turned on? Are you getting errors in the server
>>>>> log related to failures of the archiving?
>>>>
>>>> I don't have archiving turned on. Can I remove some old xlog files?
>>>
>>> No. You can corrupt your database if you delete from that directory
>>> directly. Please post the results from running the query on this
>>> page:
>>>
>>> http://wiki.postgresql.org/wiki/Server_Configuration
>
>>   version | PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu,
>> compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
>>   bgwriter_delay | 300ms
>>   bgwriter_lru_maxpages | 100
>>   bgwriter_lru_multiplier | 2
>>   bytea_output | escape
>>   checkpoint_segments | 300
>>   checkpoint_warning | 1h
>>   default_statistics_target | 250
>>   escape_string_warning | off
>>   fsync | on
>>   lc_collate | en_US.UTF-8
>>   lc_ctype | en_US.UTF-8
>>   listen_addresses | *
>>   log_destination | stderr
>>   log_directory | pg_log
>>   log_filename | postgresql-%a.log
>>   log_line_prefix | %t [%p]: [%l-1] host=%h user=%u,db=%d
>>   log_min_duration_statement | 4s
>>   log_rotation_age | 1d
>>   log_rotation_size | 0
>>   log_truncate_on_rotation | on
>>   logging_collector | on
>>   maintenance_work_mem | 1GB
>>   max_connections | 500
>>   max_stack_depth | 2MB
>>   port | 5432
>>   server_encoding | UTF8
>>   shared_buffers | 512MB
>>   TimeZone | US/Eastern
>>   wal_sync_method | fdatasync
>>   work_mem | 256MB
>> (31 rows)
>>
>> Please let me know if you find anything wrong here.
>> Thanks.
>
> I don't see anything obvious. Putting this back on the list, where
> it should have stayed all along. Maybe someone else has an idea;
> I've only seen such behavior when there were archiving problems
> which were showing up in the server log.

Well just for grins and to go to the source, look in the postgresql.conf
file itself.

In particular what is the settings for:

wal_keep_segments

Also what does querying pg_stat_activity as the postgres user show?

Looking for queries that have been held open a long time.

>
> -Kevin
>
>


--
Adrian Klaver
adrian.klaver@gmail.com


Re: pg_xlog is getting bigger

From
Adrian Klaver
Date:
On 12/19/2012 01:39 PM, AI Rumman wrote:

>
>     Well just for grins and to go to the source, look in the
>     postgresql.conf file itself.
>
>     In particular what is the settings for:
>
>     wal_keep_segments
>
>     Also what does querying pg_stat_activity as the postgres user show?
>
>     Looking for queries that have been held open a long time.
>
>
>
>
> wal_keep_segment = 0
> and in pg_stat_activity all processes are idle.


Please reply to the list also.

Well the question is how long have those idle transactions been around?


For a good blog on the subject see:
http://www.depesz.com/2008/08/28/hunting-idle-in-transactions/

--
Adrian Klaver
adrian.klaver@gmail.com


Re: pg_xlog is getting bigger

From
Adrian Klaver
Date:
On 12/19/2012 02:00 PM, AI Rumman wrote:
>

>
>     Please reply to the list also.
>
>     Well the question is how long have those idle transactions been around?
>
>
>     For a good blog on the subject see:
>     http://www.depesz.com/2008/08/__28/hunting-idle-in-__transactions/
>     <http://www.depesz.com/2008/08/28/hunting-idle-in-transactions/>
>
>     --
>     Adrian Klaver
>     adrian.klaver@gmail.com <mailto:adrian.klaver@gmail.com>
>
>
> Thanks for the link. But there is no connection showing 'Idle in
> Transaction'.
> All are showing only '<IDLE>'.

Hmm, got ahead of myself.
Does pg_locks show anything interesting?
Please hit Reply All so the rest of the list can stay in the loop. More
eyes = faster solution:)


--
Adrian Klaver
adrian.klaver@gmail.com


Re: pg_xlog is getting bigger

From
Adrian Klaver
Date:
On 12/19/2012 01:16 PM, Kevin Grittner wrote:
> AI Rumman wrote:
>> Kevin Grittner <kgrittn@mail.com> wrote:
>>> AI Rumman wrote:
>>>> Kevin Grittner <kgrittn@mail.com> wrote:
>>>>> AI Rumman wrote:
>>>>>
>>>>>> I am working on a Postgresql 9.0 server. I have no replication and
>>>>>> archive mode setup. But I found that the pg_xlog is getting bigger
>>>>>> and bigger. Right now it is 20 GB.
>>>>>>
>>>>>> How should I recover these spaces?
>>>>>
>>>>> Do you have archiving turned on? Are you getting errors in the server
>>>>> log related to failures of the archiving?
>>>>
>>>> I don't have archiving turned on. Can I remove some old xlog files?
>>>
>>> No. You can corrupt your database if you delete from that directory
>>> directly. Please post the results from running the query on this
>>> page:
>>>
>>> http://wiki.postgresql.org/wiki/Server_Configuration
>
>>   version | PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu,
>> compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit
>>   bgwriter_delay | 300ms
>>   bgwriter_lru_maxpages | 100
>>   bgwriter_lru_multiplier | 2
>>   bytea_output | escape
>>   checkpoint_segments | 300

Aah, just noticed the above.

http://www.postgresql.org/docs/9.0/interactive/wal-configuration.html

"
Checkpoints are points in the sequence of transactions at which it is
guaranteed that the heap and index data files have been updated with all
information written before the checkpoint. At checkpoint time, all dirty
data pages are flushed to disk and a special checkpoint record is
written to the log file. (The changes were previously flushed to the WAL
files.) In the event of a crash, the crash recovery procedure looks at
the latest checkpoint record to determine the point in the log (known as
the redo record) from which it should start the REDO operation. Any
changes made to data files before that point are guaranteed to be
already on disk. Hence, after a checkpoint, log segments preceding the
one containing the redo record are no longer needed and can be recycled
or removed. (When WAL archiving is being done, the log segments must be
archived before being recycled or removed.)...


...A checkpoint is created every checkpoint_segments log segments,...
"

Given that log segments are 16MB this means you will get 16MB * 300
segments of logs before they can start to be recycled. Does this
correspond with the directory size you are seeing in pg_xlog?  FYI the
default value is 3.


--
Adrian Klaver
adrian.klaver@gmail.com


Re: pg_xlog is getting bigger

From
Tom Lane
Date:
Adrian Klaver <adrian.klaver@gmail.com> writes:
> Well the question is how long have those idle transactions been around?

Idle transactions shouldn't have anything to do with pg_xlog bloat.
What causes xlog bloat is inability to release old WAL because either
(a) we're not able to complete checkpoints, or (b) WAL archiving is
enabled but malfunctioning, and the old WAL segments are being kept
pending successful archiving.

Either (a) or (b) should result in bleating in the postmaster log.

            regards, tom lane


Re: pg_xlog is getting bigger

From
Adrian Klaver
Date:
On 12/19/2012 04:12 PM, Tom Lane wrote:
> Adrian Klaver <adrian.klaver@gmail.com> writes:
>> Well the question is how long have those idle transactions been around?
>
> Idle transactions shouldn't have anything to do with pg_xlog bloat.
> What causes xlog bloat is inability to release old WAL because either
> (a) we're not able to complete checkpoints, or (b) WAL archiving is
> enabled but malfunctioning, and the old WAL segments are being kept
> pending successful archiving.

Its obvious I am missing something important about WAL.
Scenario:
1) Transaction is opened and say many UPDATEs are done.
2) This means there is now an old tuple and a new tuple for the previous
row.
3) The transaction is not committed.

I assumed the WAL logs contained information necessary to either go
forward to the new on commit or go back to the old on rollback. I
further assumed the log segment(s) could not be released until either a
commit/rollback was done.

At this point I figure I the above assumption is wrong or my
understanding of <IDLE in TRANSACTION> is wrong or both!


>
> Either (a) or (b) should result in bleating in the postmaster log.
>
>             regards, tom lane
>
>


--
Adrian Klaver
adrian.klaver@gmail.com


Re: pg_xlog is getting bigger

From
AI Rumman
Date:


On Wed, Dec 19, 2012 at 7:52 PM, Adrian Klaver <adrian.klaver@gmail.com> wrote:
On 12/19/2012 04:12 PM, Tom Lane wrote:
Adrian Klaver <adrian.klaver@gmail.com> writes:
Well the question is how long have those idle transactions been around?

Idle transactions shouldn't have anything to do with pg_xlog bloat.
What causes xlog bloat is inability to release old WAL because either
(a) we're not able to complete checkpoints, or (b) WAL archiving is
enabled but malfunctioning, and the old WAL segments are being kept
pending successful archiving.

Its obvious I am missing something important about WAL.
Scenario:
1) Transaction is opened and say many UPDATEs are done.
2) This means there is now an old tuple and a new tuple for the previous row.
3) The transaction is not committed.

I assumed the WAL logs contained information necessary to either go forward to the new on commit or go back to the old on rollback. I further assumed the log segment(s) could not be released until either a commit/rollback was done.

At this point I figure I the above assumption is wrong or my understanding of <IDLE in TRANSACTION> is wrong or both!




Either (a) or (b) should result in bleating in the postmaster log.

                        regards, tom lane




--
Adrian Klaver
adrian.klaver@gmail.com

I modified checkpoint_segment to 100 form 300 and then forced some CHECKPOINT and pg_switch_xlog() and now found that the pg_xlog file got almost 1 gb of space back.

Re: pg_xlog is getting bigger

From
Tom Lane
Date:
Adrian Klaver <adrian.klaver@gmail.com> writes:
> On 12/19/2012 04:12 PM, Tom Lane wrote:
>> Idle transactions shouldn't have anything to do with pg_xlog bloat.
>> What causes xlog bloat is inability to release old WAL because either
>> (a) we're not able to complete checkpoints, or (b) WAL archiving is
>> enabled but malfunctioning, and the old WAL segments are being kept
>> pending successful archiving.

> Its obvious I am missing something important about WAL.
> Scenario:
> 1) Transaction is opened and say many UPDATEs are done.
> 2) This means there is now an old tuple and a new tuple for the previous
> row.
> 3) The transaction is not committed.

> I assumed the WAL logs contained information necessary to either go
> forward to the new on commit or go back to the old on rollback.

No, you're thinking about an Oracle-style WAL implementation.  We don't
do rollback; both tuple versions are kept around until VACUUM decides it
can reclaim one or the other.  WAL only need be kept back to the latest
checkpoint, even if there are open transactions that are much older.

            regards, tom lane


Re: pg_xlog is getting bigger

From
Tom Lane
Date:
Adrian Klaver <adrian.klaver@gmail.com> writes:
> On 12/19/2012 01:16 PM, Kevin Grittner wrote:
>> AI Rumman wrote:
>>> checkpoint_segments | 300

> Aah, just noticed the above.

I saw that too, but it doesn't seem to explain 20GB worth of pg_xlog.
The fine manual mentions somewhere that we can keep up to twice the
specified number of xlog segments, so one would expect at most 16MB
* 600 or about 9.6GB in pg_xlog.  (And this is assuming that
checkpoint_timeout is large enough to let checkpoint_segments be the
controlling factor.)  So there's something else going on.

            regards, tom lane


Re: pg_xlog is getting bigger

From
Adrian Klaver
Date:
On 12/19/2012 04:55 PM, Tom Lane wrote:
> Adrian Klaver <adrian.klaver@gmail.com> writes:
>> On 12/19/2012 04:12 PM, Tom Lane wrote:
>>> Idle transactions shouldn't have anything to do with pg_xlog bloat.
>>> What causes xlog bloat is inability to release old WAL because either
>>> (a) we're not able to complete checkpoints, or (b) WAL archiving is
>>> enabled but malfunctioning, and the old WAL segments are being kept
>>> pending successful archiving.
>
>> Its obvious I am missing something important about WAL.
>> Scenario:
>> 1) Transaction is opened and say many UPDATEs are done.
>> 2) This means there is now an old tuple and a new tuple for the previous
>> row.
>> 3) The transaction is not committed.
>
>> I assumed the WAL logs contained information necessary to either go
>> forward to the new on commit or go back to the old on rollback.
>
> No, you're thinking about an Oracle-style WAL implementation.  We don't
> do rollback; both tuple versions are kept around until VACUUM decides it
> can reclaim one or the other.  WAL only need be kept back to the latest
> checkpoint, even if there are open transactions that are much older.

So basically the WAL mechanism just records the raw data, including
transaction information,  to the log segments and leaves the cleanup to
VACUUM. So uncommitted transactions can lead to table bloat not WAL bloat.

>
>             regards, tom lane
>
>


--
Adrian Klaver
adrian.klaver@gmail.com


Re: pg_xlog is getting bigger

From
Adrian Klaver
Date:
On 12/19/2012 04:54 PM, AI Rumman wrote:
>
>
> I modified checkpoint_segment to 100 form 300 and then forced some
> CHECKPOINT and pg_switch_xlog() and now found that the pg_xlog file got
> almost 1 gb of space back.

Per Toms post this is a start but probably not the answer.
Have you looked in the pg_xlog  directory itself to see if anything
looks funny? One thing I could think of is file timestamps that do not
look right.


--
Adrian Klaver
adrian.klaver@gmail.com


Re: pg_xlog is getting bigger

From
Pankaj
Date:
Go to database
# psql -U postgres
#checkpoint;

exit

now go to pg_xlog folder you can can see a pattern in file names of logs.
Just remove the file other then current time stamp. Or for safety remove
logs other then current date.




--
View this message in context: http://postgresql.1045698.n5.nabble.com/pg-xlog-is-getting-bigger-tp5737207p5779313.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: pg_xlog is getting bigger

From
Kevin Grittner
Date:
Pankaj <pankaj.mimit@gmail.com> wrote:

> now go to pg_xlog folder you can can see a pattern in file names of logs.
> Just remove the file other then current time stamp. Or for safety remove
> logs other then current date.

No!!!  NEVER remove ANY files from the pg_xlog sub-directory!  That
will corrupt your database.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pg_xlog is getting bigger

From
Pankaj Mishra
Date:
Dera Kevin,

If you run checkpoint in database then you can delete the pg_xlog other then current timestamp. We do this to our customer on regular basis when pg_xlog exceed more then 15 gb
Regards
Pankaj


On Sat, Nov 23, 2013 at 1:48 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
Pankaj <pankaj.mimit@gmail.com> wrote:

> now go to pg_xlog folder you can can see a pattern in file names of logs.
> Just remove the file other then current time stamp. Or for safety remove
> logs other then current date.

No!!!  NEVER remove ANY files from the pg_xlog sub-directory!  That
will corrupt your database.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: pg_xlog is getting bigger

From
Alvaro Herrera
Date:
Pankaj Mishra escribió:
> Dera Kevin,
>
> If you run checkpoint in database then you can delete the pg_xlog other
> then current timestamp. We do this to our customer on regular basis when
> pg_xlog exceed more then 15 gb

That is "somewhat" correct (meaning you are less likely to see data
corruption in that case), but it is still misleading advice.  The system
is prepared to delete the files it no longer needs automatically.  If
you have WAL files piling up for no apparent reason, the most likely
explanation is that the server is misconfigured, for example due to a
failing archive_command, or to a high "keep" setting for WAL files
(which is used in some replication scenarios).  In those cases, the real
solution is to fix that problem, not delete files yourself.


> On Sat, Nov 23, 2013 at 1:48 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
>
> > Pankaj <pankaj.mimit@gmail.com> wrote:
> >
> > > now go to pg_xlog folder you can can see a pattern in file names of logs.
> > > Just remove the file other then current time stamp. Or for safety remove
> > > logs other then current date.
> >
> > No!!!  NEVER remove ANY files from the pg_xlog sub-directory!  That
> > will corrupt your database.
> >
> > --
> > Kevin Grittner
> > EDB: http://www.enterprisedb.com
> > The Enterprise PostgreSQL Company
> >


--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: pg_xlog is getting bigger

From
Andrew Sullivan
Date:
On Mon, Nov 25, 2013 at 03:00:56PM +0530, Pankaj Mishra wrote:
> We do this to our customer on regular basis when
> pg_xlog exceed more then 15 gb

Wow.  Your poor customers.  But I'm glad you phrased it "do this to
our customer" rather than "do this _for_ our customer", since your
customer certainly is having something done to them.

--
Andrew Sullivan
ajs@crankycanuck.ca