On 26/06/15 15:43, marco.nenciarini@2ndquadrant.it wrote:
> The following bug has been logged on the website:
>
> Bug reference: 13473
> Logged by: Marco Nenciarini
> Email address: marco.nenciarini@2ndquadrant.it
> PostgreSQL version: 9.4.4
> Operating system: all
> Description:
>
> = Symptoms
>
> Consider a simple master -> standby setup with hot_standby_feedback
> enabled. If a backend on the standby is holding the cluster xmin and the
> master runs a VACUUM FREEZE on the same database as the standby's backend,
> the freeze generates a recovery conflict and the query running on the
> standby is canceled.
>
> = How to reproduce it
>
> Run the following steps on an idle cluster.
>
> 1) connect to the standby and simulate a long-running query:
>
> select pg_sleep(3600);
>
> 2) connect to the master and run the following script:
>
> create table t(id int primary key);
> insert into t select generate_series(1, 10000);
> vacuum freeze verbose t;
> drop table t;
>
> 3) after 30 seconds (the default max_standby_streaming_delay) the pg_sleep
> query on the standby will be canceled.
>
> = Expected output
>
> Hot standby feedback should have prevented the query cancellation.
>
> = Analysis
>
> I've run postgres at DEBUG2 logging level, and I can confirm that the vacuum
> correctly sees the OldestXmin propagated by the standby through hot
> standby feedback.
> The issue is in the heap_xlog_freeze function, which calls
> ResolveRecoveryConflictWithSnapshot as its first step, passing the
> cutoff_xid value as the first argument.
> The cutoff_xid is the OldestXmin that was active when the vacuum ran, so it
> represents a running xid.
> However, ResolveRecoveryConflictWithSnapshot expects its first argument to
> be latestRemovedXid, which represents the highest xid that has actually
> been removed, so there is an off-by-one error.
>
> I've been able to reproduce this issue on every version of postgres since
> 9.0 (9.0, 9.1, 9.2, 9.3, 9.4 and current master).
>
> = Proposed solution
>
> In heap_xlog_freeze we need to subtract one from the value of cutoff_xid
> before passing it to ResolveRecoveryConflictWithSnapshot.
>
>
>
Attached is a proposed patch that solves the issue.
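
To illustrate the off-by-one outside of the PostgreSQL tree, here is a
minimal, self-contained C sketch. It is not the attached patch and it does
not call any real PostgreSQL function: snapshot_conflicts,
xid_precedes_or_equals and xid_retreat are hypothetical helpers, and the
conflict rule (a standby snapshot conflicts when its xmin is not newer than
the xid passed in) is a simplified assumption about what
ResolveRecoveryConflictWithSnapshot checks. The xid arithmetic assumes 32-bit
xids that wrap around, with the values below 3 reserved as special xids.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: xids are 32-bit counters that wrap around. */
typedef uint32_t TransactionId;

/* Assumption: xids 0, 1 and 2 are reserved; 3 is the first normal xid. */
#define FirstNormalTransactionId ((TransactionId) 3)

/* Wraparound-aware "a <= b" for normal xids (comparison modulo 2^32). */
bool xid_precedes_or_equals(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) <= 0;
}

/*
 * Simplified model of the conflict check (hypothetical helper): a standby
 * snapshot conflicts when its xmin is not newer than the highest xid that
 * the replayed record claims to have removed.
 */
bool snapshot_conflicts(TransactionId standby_xmin,
                        TransactionId latest_removed_xid)
{
    return xid_precedes_or_equals(standby_xmin, latest_removed_xid);
}

/* Step back by one xid, skipping the reserved values after a wraparound. */
TransactionId xid_retreat(TransactionId xid)
{
    do {
        xid--;
    } while (xid < FirstNormalTransactionId);
    return xid;
}
```

With a standby xmin of 1000 fed back to the master, the vacuum uses
cutoff_xid = 1000. In this model snapshot_conflicts(1000, 1000) is true, so
the standby query is canceled, while snapshot_conflicts(1000,
xid_retreat(1000)) is false, which is the behavior the fix is after. A
snapshot that really is older than the cutoff (for example xmin 990) still
conflicts, as it should.
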
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it