On Thu, 2021-07-22 at 21:17 +0000, Bossart, Nathan wrote:
> As previously discussed [0], canceling synchronous replication waits
> can have the unfortunate side effect of making transactions visible on
> a primary server before they are replicated. A failover at this time
> would cause such transactions to be lost.
>
> AFAICT there are a variety of ways that the aforementioned problem may
> occur:
> 4. Query cancellations and backend terminations: This appears to be
> the only gap where there is no way to avoid potential data loss,
> and it is the main target of my proposal.
>
> Instead of blocking query cancellations and backend terminations, I
> think we should allow them to proceed, but we should keep the
> transactions marked in-progress so they do not yet become visible to
> sessions on the primary. Once replication has caught up to the
> necessary point, the transactions can be marked completed, and
> they would finally become visible.
>
> The main advantages of this approach are 1) it still allows for
> canceling waits for synchronous replication and 2) it provides an
> opportunity to view and manage waits for synchronous replication
> outside of the standard cancellation/termination functionality. The
> tooling for 2 could even allow a session to begin waiting for
> synchronous replication again if it "inadvertently interrupted a
> replication wait..." [4]. I think the main disadvantage of this
> approach is that transactions committed by a session may not be
> immediately visible to the session when the command returns after
> canceling the wait for synchronous replication. Instead, the
> transactions would become visible in the future once the change is
> replicated. This may cause problems for an application if it doesn't
> handle this scenario carefully.
>
> What are folks' opinions on this idea? Is this something that is
> worth prototyping?

But that would mean that changes ostensibly rolled back (because the
cancel request succeeded) will later turn out to be committed after all,
just as they are now, only later. Where is the advantage?

Besides, there is no room for another transaction status in the
commit log: each transaction gets two bits, and all four values are
already in use.
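
To illustrate the space problem, here is a rough standalone sketch. The
status values and the two-bit width match clog.h and clog.c as far as I
know; the flat byte-array addressing is a simplification of the real
page/slot lookup, and HYPOTHETICAL_STATUS_COMMIT_PENDING is made up
purely for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Existing commit log states; these occupy every 2-bit pattern. */
#define TRANSACTION_STATUS_IN_PROGRESS   0x00
#define TRANSACTION_STATUS_COMMITTED     0x01
#define TRANSACTION_STATUS_ABORTED       0x02
#define TRANSACTION_STATUS_SUB_COMMITTED 0x03

#define CLOG_BITS_PER_XACT  2
#define CLOG_XACTS_PER_BYTE 4
#define CLOG_XACT_BITMASK   ((1 << CLOG_BITS_PER_XACT) - 1)

/* Hypothetical fifth state (NOT in PostgreSQL), for illustration only. */
#define HYPOTHETICAL_STATUS_COMMIT_PENDING 0x04

/* The highest existing state already equals the 2-bit mask. */
static_assert(TRANSACTION_STATUS_SUB_COMMITTED == CLOG_XACT_BITMASK,
              "all four 2-bit values are taken");

/* A fifth state does not survive masking to two bits. */
static_assert((HYPOTHETICAL_STATUS_COMMIT_PENDING & CLOG_XACT_BITMASK)
              != HYPOTHETICAL_STATUS_COMMIT_PENDING,
              "a fifth state does not fit");

/* Simplified lookup: treat the log as one flat byte array. */
static int
clog_get_status(const uint8_t *page, uint32_t xid)
{
    int shift = (int) (xid % CLOG_XACTS_PER_BYTE) * CLOG_BITS_PER_XACT;

    return (page[xid / CLOG_XACTS_PER_BYTE] >> shift) & CLOG_XACT_BITMASK;
}

/* Simplified update: overwrite only the two bits belonging to xid. */
static void
clog_set_status(uint8_t *page, uint32_t xid, int status)
{
    int      shift = (int) (xid % CLOG_XACTS_PER_BYTE) * CLOG_BITS_PER_XACT;
    uint8_t *byte = &page[xid / CLOG_XACTS_PER_BYTE];

    *byte = (uint8_t) ((*byte & ~(CLOG_XACT_BITMASK << shift)) |
                       ((status & CLOG_XACT_BITMASK) << shift));
}
```

Storing the hypothetical value with clog_set_status() would silently
read back as "in progress", so a new state would presumably require
widening the per-transaction field, which changes the on-disk format.
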
Yours,
Laurenz Albe