Thread: Race condition between hot_standby_feedback and wal_receiver_status_interval ?

From: Stuart Bishop
Date: 12 September 2012, 12:14:04

Hi.

I am running streaming replication with PostgreSQL 9.1.5, and using
hot_standby_feedback=on.

Per previous messages, I'm still experiencing query cancellations on
the hot standbys triggered by vacuums on the primary
(http://postgresql.1045698.n5.nabble.com/pg-dump-on-hot-standby-canceled-despite-hot-standby-feedback-on-td5719753.html).

pg_dump: Error message from server: ERROR: canceling statement due to
conflict with recovery
DETAIL: User was holding shared buffer pin for too long.
pg_dump: The command was: COPY public.webcatalog_machine (id,
owner_id, uuid, hostname, packages_checksum, package_list,
logo_checksum) TO stdout;

pg_dump: Error message from server: ERROR: canceling statement due to
conflict with recovery
DETAIL: User was holding a relation lock for too long.

Looking at the documentation for hot_standby_feedback, "this parameter
can be used to eliminate query cancels caused by cleanup records", but
"feedback messages will not be sent more frequently than once per
wal_receiver_status_interval".
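
For reference, a minimal sketch of the standby-side settings in
question. The values are illustrative assumptions (10s is the
documented default for wal_receiver_status_interval), not a copy of
the configuration on the affected servers:

```
# postgresql.conf on the hot standby (illustrative values only)
hot_standby = on                      # allow read-only queries during recovery
hot_standby_feedback = on             # report the standby's oldest xmin to the primary
wal_receiver_status_interval = 10s    # feedback is sent at most once per this interval
```

With these values, up to roughly ten seconds can pass between
feedback messages, which is the window the question below is about.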

I'm wondering if there is a race condition here: if vacuum kicks in
on the primary before feedback information has been sent, records
could be removed that are still needed on the hot standby.

My initial thought was that vacuum_defer_cleanup_age could be used to
check this, but its documentation concedes, correctly, that it is of
limited use because it is measured in transactions rather than as a
time interval, and it recommends hot_standby_feedback as an alternative.

--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/