Weird spikes in delay for async streaming replication on 9.1 - Mailing list pgsql-admin

From David F. Skoll
Subject Weird spikes in delay for async streaming replication on 9.1
Msg-id 20150214110050.2f032f05@ollie.roaringpenguin.com
Responses Re: Weird spikes in delay for async streaming replication on 9.1
List pgsql-admin
Hi,

I have a two-database cluster.  The machines are geographically
separated and the nature of my application is that many read-only
queries can tolerate being "behind the times" by a few seconds.  So
machines near the hot-standby connect to the hot-standby for these
delay-tolerant queries in order to reduce traffic over the relatively
slow link between geographical locations.

I have a monitoring script that tests the actual delay for a
transaction on the master to appear on the hot-standby.  Every few
minutes, my script runs an update on the master and then sits in a
loop checking how long it takes to appear on the hot-standby.  99% of
the time, it's less than a second.
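
For illustration, the measurement loop described above might look roughly
like the Python sketch below. It is not the actual monitoring script: the
function name, its parameters, and the two callables are hypothetical
stand-ins. write_marker is assumed to run and commit the update on the
master and return the value it wrote; read_marker is assumed to query the
hot-standby and return whatever value is currently visible there.

```python
import time
from typing import Callable, Optional


def measure_replication_delay(
    write_marker: Callable[[], str],
    read_marker: Callable[[], Optional[str]],
    timeout: float = 3600.0,
    poll_interval: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Return the seconds until a value written on the master becomes
    visible on the standby, or None if `timeout` seconds pass first.

    All names here are hypothetical; the database access is left to the
    two callables supplied by the caller.
    """
    # Assumption: write_marker commits on the master before returning.
    expected = write_marker()
    start = clock()
    while True:
        # Assumption: read_marker reads the same row from the hot-standby.
        if read_marker() == expected:
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(poll_interval)
```

The result is only as fine-grained as poll_interval, and the clock and
sleep parameters exist so the loop can be exercised without a database.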

But every once in a while, the time spikes dramatically, to hundreds
or thousands of seconds, and that's too long... the delay-tolerant
queries are not *that* delay-tolerant, so we switch to sending them
all to the master.

See the graph: http://ibin.co/1rdm4ekiWmpM

I've tried to figure out what causes this, and the only events I can
find that correlate are a pg_dump on the master and possibly some
autovacuum jobs kicking off.  So my questions:

1) Can a long-running transaction on the master block subsequent
transactions from being consumed on the hot-standby, or am I totally
out to lunch?

2) If (1) is correct, is it still true in 9.4?

3) If (1) is false, does anyone have plausible explanations for what
I'm seeing?  I don't think it's the link between the sites, because we
also monitor that and it seems to be fine.

Regards,

David.


