Re: Error code for "terminating connection due to conflict with recovery" - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Error code for "terminating connection due to conflict with recovery"
Date
Msg-id 1295614119.1803.10554.camel@ebony
Whole thread Raw
In response to Re: Error code for "terminating connection due to conflict with recovery"  (Florian Pflug <fgp@phlo.org>)
Responses Re: Error code for "terminating connection due to conflict with recovery"  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, 2011-01-21 at 13:09 +0100, Florian Pflug wrote:
> > 
> >>> I'd also be in favor of changing the one that uses
> >>> ERRCODE_QUERY_CANCELLED to use ERRCODE_T_R_SERIALIZATION_FAILURE, as
> >>> the former might be taken to imply active user intervention, and for
> >>> consistency.
> >> 
> >> +1.
> > 
> > We already use ERRCODE_T_R_SERIALIZATION_FAILURE for retryable errors,
> > which is almost every error. So no change required there.
> > 
> > ERRCODE_ADMIN_SHUTDOWN is used only in situations where we cannot
> > reconnect or retry because the database we said we wished to connect to
> > no longer exists. That needs to be a different error code to a normal,
> > retryable error, so that pgpool can tell the difference between things
> > it can help with and things it cannot help with.
> 
> Yeah. Clients absolutely need to be able to distinguish transient and
> permanent errors. Otherwise, how would a client know when to retry
> a transaction (as he needs to in case of a serialization anomaly) and
> when to report the error to the user?
> 
> ERRCODE_T_R_SERIALIZATION_FAILURE  and ERRCODE_T_R_DEADLOCK_DETECTED
> are probably both assumed to be transient failure by client aready. So
> we should use those two for transient recovery conflicts (i.e. those
> which go away if you retry) and something else for the others (like
> database dropped)
> 
> This'd mean that the code is fine as it is, except that we should
> raise ERRCODE_T_R_DEADLOCK_DETECTED instead of ERRCODE_QUERY_CANCELED
> in CheckRecoveryConflictDeadlock(). I might be missing something though -
> Simon, what were your reasons for using ERRCODE_QUERY_CANCELED there?

Ah, thanks Florian. Now I understand. There are two related issues here.

1. The discussion around ERRCODE_ADMIN_SHUTDOWN is incorrect and the
specific patch should be rejected as is. No changes are required in
ProcessInterrupts(), nor new errcodes.

2. Robert is correct that CheckRecoveryConflictDeadlock() returns
ERRCODE_QUERY_CANCELED. Thanks to Florian for noting that we had
switched away from the original discussion onto another part of the
code, which confused me. I agree the use of ERRCODE_QUERY_CANCELED is a
mistake; CheckRecoveryConflictDeadlock() should return
ERRCODE_T_R_DEADLOCK_DETECTED. This was an omission from my commit of 12
May 2010.

Tatsuo, would you like to modify the patch to correct the issue in
CheckRecoveryConflictDeadlock() ? Or would you prefer me to fix?

This should be backpatched to 9.0.

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Sync Rep for 2011CF1
Next
From: Florian Pflug
Date:
Subject: Re: SSI and Hot Standby