Re: URGENT issue: pg-xlog growing on master! - Mailing list pgsql-performance

From Niels Kristian Schjødt
Subject Re: URGENT issue: pg-xlog growing on master!
Msg-id 44F3F6DC-A0BB-407A-AA4D-3879AE5DE221@autouncle.com
In response to Re: URGENT issue: pg-xlog growing on master!  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: URGENT issue: pg-xlog growing on master!
List pgsql-performance
Okay, cool

You mean that I should do the following, right?

1. Stop slave server
2. set archive_command = 'true' in postgresql.conf on the master server
3. restart master server
4. run psql -c "SELECT pg_start_backup('label', true)" on master
5. run rsync -av --exclude postmaster.pid --exclude pg_xlog /var/lib/postgresql/9.2/main/ postgres@192.168.0.2:/var/lib/postgresql/9.2/main/ on master server
6. run psql -c "SELECT pg_stop_backup();" on master server
7. change archive_command back on master
8. restart master
9. start slave

Just to confirm the approach :-)



On 10/06/2013 at 19:53, Jeff Janes <jeff.janes@gmail.com> wrote:

On Mon, Jun 10, 2013 at 8:35 AM, Niels Kristian Schjødt <nielskristian@autouncle.com> wrote:

WARNING:  pg_stop_backup still waiting for all required WAL segments to be archived (1920 seconds elapsed)
HINT:  Check that your archive_command is executing properly.  pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.

When looking at ps aux on the master, I see the following:

postgres 30930  0.0  0.0  98412  1632 ?        Ss   15:59   0:02 postgres: archiver process   failed on 0000000200000E1B000000A9

The file mentioned is the one that it was about to archive when the standby server failed. Somehow it must still be trying to "catch up" from that file, which of course isn't there any more, since I had to remove those files in order to free up space on the HDD.

So the archive_command is failing because it is trying to archive a file that no longer exists.

One way around this is to remove the .ready files in the pg_xlog/archive_status directory that correspond to the WAL files you manually removed.
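
As a rough sketch of that cleanup, assuming each marker is named after its WAL segment with a .ready suffix: the helper name remove_orphaned_ready is made up for illustration, and it only deletes markers whose segment is no longer present in pg_xlog.

```shell
# Hypothetical helper (not part of PostgreSQL): delete .ready markers in
# <xlog_dir>/archive_status whose WAL segment is missing from <xlog_dir>.
remove_orphaned_ready() {
    xlog_dir="$1"
    for ready in "$xlog_dir"/archive_status/*.ready; do
        [ -e "$ready" ] || continue            # glob matched nothing
        segment=$(basename "$ready" .ready)    # strip the .ready suffix
        if [ ! -e "$xlog_dir/$segment" ]; then
            echo "removing $ready"
            rm -- "$ready"
        fi
    done
}
```

It would be run as the postgres user against the data directory mentioned earlier in the thread, for example: remove_orphaned_ready /var/lib/postgresql/9.2/main/pg_xlog. Markers for segments that still exist are left alone, so those segments are still archived normally.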

Another way would be to temporarily replace the archive_command with one that reports success even when the archiving fails, until the archiver gets past this stretch.  In fact, you could just replace the command with 'true', so it reports success without doing anything at all.
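
To illustrate why 'true' works: the archiver runs archive_command through the shell and treats an exit status of zero as success. The sketch below uses a made-up helper, archive_exit_status, and an illustrative cp command standing in for whatever the real archive_command is (it is not shown in this thread).

```shell
# Hypothetical helper: run a candidate archive_command through the shell,
# as the archiver does, and print the exit status it would see.
archive_exit_status() {
    sh -c "$1" >/dev/null 2>&1
    echo "$?"
}

# Copying a segment that no longer exists fails (non-zero status),
# so the archiver keeps retrying the same file.
archive_exit_status 'cp /nonexistent/pg_xlog/0000000200000E1B000000A9 /tmp/'

# 'true' does nothing and exits with status 0, so the segment counts as archived.
archive_exit_status 'true'
```

Note that any segment "archived" this way is not actually stored anywhere, so the real command should be restored once the archiver has moved past the missing files.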

Cheers,

Jeff
