Re: COPY IN/BOTH vs. extended query mode - Mailing list pgsql-hackers
| From | Jeff Davis |
|---|---|
| Subject | Re: COPY IN/BOTH vs. extended query mode |
| Date | |
| Msg-id | dac86f9315450b12d419c602281fe682f2b24e84.camel@j-davis.com |
| In response to | [HACKERS] COPY IN/BOTH vs. extended query mode (Robert Haas <robertmhaas@gmail.com>) |
| List | pgsql-hackers |
On Mon, 2017-01-23 at 21:12 -0500, Robert Haas wrote:
> According to the documentation for COPY IN mode, "If the COPY command
> was issued via an extended-query message, the backend will now discard
> frontend messages until a Sync message is received, then it will issue
> ReadyForQuery and return to normal processing." I added a similar
> note to the documentation for COPY BOTH mode in
> 91fa8532f4053468acc08534a6aac516ccde47b7, and the documentation
> accurately describes the behavior of the server. However, this seems
> to make fully correct error handling for clients using libpq almost
> impossible, because PQsendQueryGuts() sends
> Parse-Bind-Describe-Execute-Sync in one shot without regard to whether
> the command that was just sent invoked COPY mode (cf. the note in
> CopyGetData about why we ignore Flush and Sync in that function).
>
> So imagine that the client uses libpq to send (via the extended query
> protocol) a COPY IN command (or some hypothetical command that starts
> COPY BOTH mode to begin). If the server throws an error before the
> Sync message is consumed, it will bounce back to PostgresMain which
> will set doing_extended_query_message = true after which it will
> consume messages, find the Sync, reset that flag, and send
> ReadyForQuery. On the other hand, if the server enters CopyBoth mode,
> consumes the Sync message in CopyGetData (or a similar function), and
> *then* throws an ERROR, the server will wait for a second Sync message
> from the client before issuing ReadyForQuery. There is no sensible
> way of coping with this problem in libpq, because there is no way for
> the client to know which part of the server code consumed the Sync
> message that it already sent. In short, from the client's point of
> view, if it enters COPY IN or COPY BOTH mode via the extend query
> protocol, and an error occurs on the server, the server MAY OR MAY NOT
> expect a further Sync message before issuing ReadyForQuery, and the
> client has no way of knowing -- except maybe waiting for a while to
> see what happens.
I investigated a bit deeper here, and I'm not sure this is a real
problem (aside from ambiguity in the protocol docs).
If you send "COPY ... FROM STDIN" using the extended query protocol in
libpq, the non-error message flow is something like:
-> Parse + Bind + Describe + Execute + Sync
[ server processes Parse + Bind + Describe + Execute ]
[ server enters copy-in mode ]
<- CopyInResponse
[ server ignores Sync ]
-> CopyData
[ server processes CopyData ]
-> CopyDone
[ server processes CopyDone ]
[ server exits copy-in mode ]
-> Sync
[ server processes Sync ]
<- ReadyForQuery
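To make the "server ignores Sync" step concrete, here is a minimal sketch of a copy-in message loop. It is a toy model, not PostgreSQL source: the function name `copy_in_loop` and the (type, payload) tuples are assumptions made for illustration, and CopyFail is left out for brevity.

```python
def copy_in_loop(messages):
    """Toy model of the backend's copy-in message loop (not PostgreSQL source).

    `messages` is a list of (type, payload) pairs sent by the frontend after
    CopyInResponse. Returns the CopyData payloads that were processed and the
    messages left over for the normal extended-query loop.
    """
    rows = []
    for i, (kind, payload) in enumerate(messages):
        if kind in ("Flush", "Sync"):
            continue  # ignored while in copy-in mode
        if kind == "CopyData":
            rows.append(payload)
        elif kind == "CopyDone":
            return rows, messages[i + 1:]  # server exits copy-in mode
        else:
            raise ValueError(f"unexpected message in copy-in mode: {kind}")
    raise EOFError("stream ended before CopyDone")


# Everything the frontend sends after Execute in the non-error flow.
stream = [
    ("Sync", None),              # sent in the same batch as Execute
    ("CopyData", b"1\tone\n"),
    ("CopyDone", None),
    ("Sync", None),              # sent after CopyDone
]
rows, remaining = copy_in_loop(stream)
print(rows)
print(remaining)
```

In this model only the second Sync is left over for the normal loop, which is the one that gets answered with ReadyForQuery.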
If an error happens before the server enters copy-in mode (e.g. syntax
error), then you get something like:
-> Parse + Bind + Describe + Execute + Sync
[ server processes Parse, encounters error ]
<- ErrorResponse
[ server ignores Bind + Describe + Execute ]
[ server processes Sync ]
<- ReadyForQuery
[ client never got CopyInResponse, so never sent copy messages ]
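The recovery rule in this flow (after an error, discard everything up to the next Sync, then send ReadyForQuery) can be sketched as below. This is a simplified model under stated assumptions: `replay` and its `failing` parameter are hypothetical names, and successful responses such as ParseComplete are omitted.

```python
def replay(messages, failing):
    """Toy model of extended-query error recovery (not PostgreSQL source).

    `failing` is a hypothetical knob naming the message type that raises an
    error. Returns a trace in the same notation as the flows in this message.
    """
    trace = []
    skipping = False
    for kind in messages:
        if kind == "Sync":
            skipping = False  # Sync always ends the skip-until-Sync state
            trace.append("[ server processes Sync ]")
            trace.append("<- ReadyForQuery")
        elif skipping:
            trace.append(f"[ server ignores {kind} ]")
        elif kind == failing:
            skipping = True
            trace.append(f"[ server processes {kind}, encounters error ]")
            trace.append("<- ErrorResponse")
        else:
            trace.append(f"[ server processes {kind} ]")
    return trace


batch = ["Parse", "Bind", "Describe", "Execute", "Sync"]
for line in replay(batch, failing="Parse"):
    print(line)
```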
If an error happens after the CopyInResponse is sent (e.g. malformed
data), you get something like:
-> Parse + Bind + Describe + Execute + Sync
[ server processes Parse + Bind + Describe + Execute ]
[ server enters copy-in mode ]
<- CopyInResponse
[ server ignores Sync ]
-> CopyData
[ server processes CopyData, encounters error ]
[ server exits copy-in mode ]
<- ErrorResponse
-> CopyDone
[ server ignores CopyDone ]
-> Sync
[ server processes Sync ]
<- ReadyForQuery
If the backend is canceled after the server sends CopyInResponse but
before it consumes (and ignores) the Sync, you get something like:
-> Parse + Bind + Describe + Execute + Sync
[ server processes Parse + Bind + Describe + Execute ]
[ server enters copy-in mode ]
<- CopyInResponse
[ SIGINT, server encounters error ]
<- ErrorResponse
[ server exits copy-in mode ]
[ server processes Sync ]
<- ReadyForQuery
-> CopyData
[ server ignores CopyData ]
-> CopyDone
[ server ignores CopyDone ]
-> Sync
[ server processes Sync ]
<- ReadyForQuery
The last case is not great: I can imagine it confusing a client,
though I'm not sure exactly how. Maybe it's something we can
document around?
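One way to see how the last case differs is to tally the four flows above. The sketch below only transcribes those traces as data; the names `FLOWS` and `summarize` are made up for illustration, and the initial Parse + Bind + Describe + Execute batch is represented by its Sync alone. The "early ReadyForQuery" check follows the message order as written in the traces, which is an assumption about what the client observes.

```python
F, B = "frontend", "backend"

# The four flows from this message, in the order the traces list them.
FLOWS = {
    "no error": [
        (F, "Sync"), (B, "CopyInResponse"), (F, "CopyData"),
        (F, "CopyDone"), (F, "Sync"), (B, "ReadyForQuery"),
    ],
    "error before copy-in": [
        (F, "Sync"), (B, "ErrorResponse"), (B, "ReadyForQuery"),
    ],
    "error during copy-in": [
        (F, "Sync"), (B, "CopyInResponse"), (F, "CopyData"),
        (B, "ErrorResponse"), (F, "CopyDone"), (F, "Sync"),
        (B, "ReadyForQuery"),
    ],
    "cancel before Sync is consumed": [
        (F, "Sync"), (B, "CopyInResponse"), (B, "ErrorResponse"),
        (B, "ReadyForQuery"), (F, "CopyData"), (F, "CopyDone"),
        (F, "Sync"), (B, "ReadyForQuery"),
    ],
}


def summarize(flow):
    """Return (Syncs sent, ReadyForQuery received, early ReadyForQuery?)."""
    syncs = sum(1 for m in flow if m == (F, "Sync"))
    ready = sum(1 for m in flow if m == (B, "ReadyForQuery"))
    in_copy = False      # client saw CopyInResponse and has not sent CopyDone
    early_ready = False  # a ReadyForQuery arrived while in_copy was true
    for m in flow:
        if m == (B, "CopyInResponse"):
            in_copy = True
        elif m == (F, "CopyDone"):
            in_copy = False
        elif m == (B, "ReadyForQuery") and in_copy:
            early_ready = True
    return syncs, ready, early_ready


for name, flow in FLOWS.items():
    print(name, summarize(flow))
```

In this tally only the cancel case yields two ReadyForQuery messages, the first of which arrives before the client has sent CopyDone.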
Regards,
Jeff Davis