Re: pg_standby replication problem - Mailing list pgsql-general

From Khangelani Gama
Subject Re: pg_standby replication problem
Date
Msg-id 36e864716fcb063194f5f95e5fc0b35c@mail.gmail.com
In response to Re: pg_standby replication problem  (Alan Hodgson <ahodgson@simkin.ca>)
List pgsql-general
-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alan Hodgson
Sent: Monday, June 09, 2014 4:51 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] pg_standby replication problem

On Monday, June 09, 2014 04:28:53 PM Khangelani Gama wrote:
> Please help me with this, my secondary server shows a replication problem.
> It stopped at the file called *0000000500004BAF000000AF …*then from
> here primary server kept on sending walfiles, until the walfiles used
> up the disc space in the data directory. How do I fix this problem.
> It’s postgres 9.1.2.
>

It looks to me like your archive_command is probably failing on the primary
server. If that fails, the logs will build up and fill up your disk as
described. And they wouldn't be available to the slave to find.
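
One way to check whether archiving is stalled on the primary, sketched here under assumptions: PostgreSQL keeps a `<segment>.ready` marker in `pg_xlog/archive_status` for every WAL segment that has not yet been archived successfully, so a steadily growing count suggests that archive_command is failing or never returning. The function name `count_pending_wal` is made up for illustration, and the data directory `/pgsql2/data` is only inferred from the rsync source path in the script further down.

```shell
#!/bin/sh
# Count WAL segments still waiting to be archived.
# PostgreSQL creates <segment>.ready in pg_xlog/archive_status and renames
# it to <segment>.done once archive_command has returned success.
# count_pending_wal is a hypothetical helper name, not a PostgreSQL tool.
count_pending_wal() {
    status_dir="$1/pg_xlog/archive_status"
    # One output line per .ready file; wc -l counts them, tr strips padding.
    find "$status_dir" -maxdepth 1 -name '*.ready' 2>/dev/null | wc -l | tr -d ' '
}

# Example call (the data directory is an assumption):
# count_pending_wal /pgsql2/data
```

If the number keeps rising while the standby stays stuck on the same file, the archive step on the primary is the first place to look.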


I am sorry, I am still trying to understand all the settings; the person who
set up the servers has left the company.

On the primary server, postgresql.conf shows the following:

# WRITE AHEAD LOG
#------------------------------------------------------------------------------

# - Settings -

wal_level = archive
# - Checkpoints -

checkpoint_segments = 128
checkpoint_timeout = 15min
checkpoint_warning = 885s
# - Archiving -

archive_mode = on
#archive_mode = off             # allows archiving to be done
archive_command = '/home/cdbs/bin/run_replication.sh %p %f'

# REPLICATION
#------------------------------------------------------------------------------

# - Master Server -

# These settings are ignored on a standby server

max_wal_senders = 3



The archive_command setting points to a script, which is run with the %p and
%f values passed to it as arguments.
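
For reference, a minimal sketch of what the script receives: before running archive_command, the server replaces %p with the path of the WAL segment to archive (relative to the data directory) and %f with the bare file name. The function `expand_archive_command` below is hypothetical and only imitates that substitution so the resulting command line can be seen; it is not how PostgreSQL implements it.

```shell
#!/bin/sh
# Imitate the %p / %f substitution that PostgreSQL applies to archive_command.
# expand_archive_command is a hypothetical helper for illustration only.
expand_archive_command() {
    template="$1"                       # the archive_command string
    wal_path="$2"                       # what %p would be replaced with
    wal_file=$(basename "$wal_path")    # what %f would be replaced with
    # Use | as the sed delimiter because the path contains slashes.
    printf '%s\n' "$template" | sed -e "s|%p|${wal_path}|g" -e "s|%f|${wal_file}|g"
}

# Example call:
# expand_archive_command '/home/cdbs/bin/run_replication.sh %p %f' \
#     'pg_xlog/0000000500004BAF000000AF'
```

The server treats a segment as archived only when the command exits with status zero; any other exit status makes it retry the same segment later.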




The replication script running on the primary server contains the
following:


while [ $test = "false" ]
do
        rsync -a /pgsql2/data/${src} postgres@10.58.101.10:/pgsql2/walfiles/${dest} >> /tmp/run_replication.sh.out 2>> /tmp/run_replication.sh.out
        test=`ssh AB_CDS3 "if [ -f /pgsql2/walfiles/${dest} ];then echo 'true' ;else echo 'false';fi"`
        if [ ${test} = "false" ]
        then
                echo "Test is false for CDS3, sleeping 10" >> /tmp/run_replication.sh.out
                sleep 10
                cnt=$(( $cnt + 1 ))
                if [ ${cnt} -ge 60 ]
                then
                        message="Replication ERROR: Unable to send WAL file(${desc}) from CDS to CDS3"
                        echo "`date` : ${message}" >> /tmp/run_replication.sh.out
                        sendsms
                fi
        fi
done
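
As posted, the loop has no exit path when the file never shows up on the standby, so the script would keep running and the server would never see a failed archive attempt. A possible alternative, sketched under assumptions, is to retry a bounded number of times and then exit non-zero so that PostgreSQL itself retries the segment later. The name `archive_with_retry` and its arguments are hypothetical; the rsync arguments in the usage comment are copied from the script above.

```shell
#!/bin/sh
# Run a command up to max_attempts times, pausing between attempts.
# Returns 0 on the first success, or 1 if every attempt fails, so the
# caller (archive_command) can report the failure to PostgreSQL.
# archive_with_retry is a hypothetical helper, not part of the original script.
archive_with_retry() {
    max_attempts="$1"
    delay="$2"
    shift 2                      # the remaining arguments are the command
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        # Sleep only if another attempt will follow.
        if [ "$attempt" -le "$max_attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example call (paths and host taken from the script above):
# archive_with_retry 60 10 rsync -a "/pgsql2/data/${src}" \
#     "postgres@10.58.101.10:/pgsql2/walfiles/${dest}" || exit 1
```

With this shape, a standby that is unreachable for a long time still causes WAL to accumulate on the primary, but the failures become visible in the PostgreSQL log instead of being hidden inside a script that never returns.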









