Re: Postgres slave not catching up (on 9.2)

From: Andres Freund
Subject: Re: Postgres slave not catching up (on 9.2)
Date: ,
Msg-id: 20141109224446.GC28007@alap3.anarazel.de
(view: Whole thread, Raw)
In response to: Postgres slave not catching up (on 9.2)  (Ruben Domingo Gaspar Aparicio)
Responses: Re: Postgres slave not catching up (on 9.2)  (Ruben Domingo Gaspar Aparicio)
List: pgsql-performance

Tree view

Postgres slave not catching up (on 9.2)  (Ruben Domingo Gaspar Aparicio, )
 Re: Postgres slave not catching up (on 9.2)  (Robert Klemme, )
  Re: Postgres slave not catching up (on 9.2)  (Ruben Domingo Gaspar Aparicio, )
 Re: Postgres slave not catching up (on 9.2)  (Andres Freund, )
  Re: Postgres slave not catching up (on 9.2)  (Ruben Domingo Gaspar Aparicio, )
 Re: Postgres slave not catching up (on 9.2)  (Ruben Domingo Gaspar Aparicio, )
 Re: Postgres slave not catching up (on 9.2)  (Ruben Domingo Gaspar Aparicio, )

Hi,

On 2014-11-08 13:11:25 +0000, Ruben Domingo Gaspar Aparicio wrote:
> Hello,
>
> I have built up a hot-standby between a master running 9.2.4 and a slave running 9.2.9. I did the initial copy using
"pg_basebackup". My recovery.conf looks like: 
>
> standby_mode = 'on'
> primary_conninfo = 'host=pXXXXXXXX port=XXXX user=replicator'
> trigger_file = 'failover.now'
> restore_command = 'test -f /ORA/dbs02/PUPDBTST/archive/%f  && ln -fns /ORA/dbs02/PUPDBTST/archive/%f %p'
> archive_cleanup_command = '/usr/local/pgsql/postgresql-9.2.9/bin/pg_archivecleanup /ORA/dbs03/PUPDBTST/data %r'
>
> The slave (I don't have control on the master) is using 2 NFS file systems, one for WALs and another one for the
data,on Netapp controllers: 
>
> dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs
(rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)
> dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs
(rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)
>
> The streaming is working perfectly:
>
> /usr/local/pgsql/postgresql-9.2.9/bin/psql -U admin -h XXXXX -p XXXXX -d puppetdb -c "select pid,usesysid, usename,
application_name,client_addr, state, sent_location,write_location,replay_location,sync_state from pg_stat_replication;" 
>   pid   | usesysid |  usename   | application_name | client_addr |   state   | sent_location | write_location |
replay_location| sync_state 
>
--------+----------+------------+------------------+-------------+-----------+---------------+----------------+-----------------+------------
> 117659 |    16384 | replicator | walreceiver      | 10.16.7.137 | streaming | AA74/DD630978 | AA74/DD630978  |
A977/F84F0BE0  | async 
> (1 row)
>
> But the lag is increasing constantly, it looks the replay can not cope with:

I have a couple of questions:
1) Is the standby actually used for querying? Is it possible that replay
   frequently conflicts with active queries? As you don't have
   hot_standby_feedback enabled that seems quite possible.
2) Is the startup process on the standby CPU or IO bound?
3) Does the workload involve loads of temporary tables or generally
   transactions locking lots of tables exclusively in one transaction?

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



pgsql-performance by date:

From: Andres Freund
Date:
Subject: Re: Postgres slave not catching up (on 9.2)
From: Ruben Domingo Gaspar Aparicio
Date:
Subject: Re: Postgres slave not catching up (on 9.2)