On 2016-11-28 22:14:55 +0100, Torsten Förtsch wrote:
> Hi,
>
> I am in the process of reviewing our configs for a number of 9.3 databases
> and found a replica with hot_standby_feedback=on. I remember when we set it
> long ago we were fighting cancelled queries. I also remember that it never
> really worked for us. In the end we set up 2 replicas, one suitable for
> short queries where we prefer low replication lag, and another one where we
> allow for long running queries but sacrifice timeliness
> (max_standby_*_delay=-1).
There's a few kind of conflicts against which hs_feedback doesn't
protect. E.g. exclusive locks on tables that are in use and such
(e.g. by vacuum truncating a table or an explicit drop table).
There's a table with some information about the causes of cancellations,
pg_stat_database_conflicts - did you check that?
> I have a hunch why hot_standby_feedback=on didn't work. But I never
> verified it. So, here it is. The key is this sentence:
>
> "Feedback messages will not be sent more frequently than once per
> wal_receiver_status_interval."
>
> That interval is 10 sec. So, assuming a transaction on the replica uses a
> row right after the message has been sent. Then there is a 10 sec window in
> which the master cannot know that the row is needed on the replica and can
> vacuum it. If then the transaction on the replica takes longer than
> max_standby_*_delay, the only option is to cancel it.
>
> Is that explanation correct?
No. That just means that we don't update the value more frequently. The
value reported is a "horizon" meaning that nothing older than the
reported value can be accessed.
Greetings,
Andres Freund