Re: COPY IN/BOTH vs. extended query mode - Mailing list pgsql-hackers
From | Jeff Davis |
---|---|
Subject | Re: COPY IN/BOTH vs. extended query mode |
Date | |
Msg-id | dac86f9315450b12d419c602281fe682f2b24e84.camel@j-davis.com Whole thread Raw |
In response to | [HACKERS] COPY IN/BOTH vs. extended query mode (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Mon, 2017-01-23 at 21:12 -0500, Robert Haas wrote: > According to the documentation for COPY IN mode, "If the COPY command > was issued via an extended-query message, the backend will now > discard > frontend messages until a Sync message is received, then it will > issue > ReadyForQuery and return to normal processing." I added a similar > note to the documentation for COPY BOTH mode in > 91fa8532f4053468acc08534a6aac516ccde47b7, and the documentation > accurately describes the behavior of the server. However, this seems > to make fully correct error handling for clients using libpq almost > impossible, because PQsendQueryGuts() sends > Parse-Bind-Describe-Execute-Sync in one shot without regard to > whether > the command that was just sent invoked COPY mode (cf. the note in > CopyGetData about why we ignore Flush and Sync in that function). > > So imagine that the client uses libpq to send (via the extended query > protocol) a COPY IN command (or some hypothetical command that starts > COPY BOTH mode to begin). If the server throws an error before the > Sync message is consumed, it will bounce back to PostgresMain which > will set doing_extended_query_message = true after which it will > consume messages, find the Sync, reset that flag, and send > ReadyForQuery. On the other hand, if the server enters CopyBoth > mode, > consumes the Sync message in CopyGetData (or a similar function), and > *then* throws an ERROR, the server will wait for a second Sync > message > from the client before issuing ReadyForQuery. There is no sensible > way of coping with this problem in libpq, because there is no way for > the client to know which part of the server code consumed the Sync > message that it already sent. In short, from the client's point of > view, if it enters COPY IN or COPY BOTH mode via the extend query > protocol, and an error occurs on the server, the server MAY OR MAY > NOT > expect a further Sync message before issuing ReadyForQuery, and the > client has no way of knowing -- except maybe waiting for a while to > see what happens. I investigated a bit deeper here, and I'm not sure this is a real problem (aside from ambiguity in the protocol docs). If you send "COPY ... FROM STDIN" using the extended query protocol in libpq, the non-error message flow is something like: -> Parse + Bind + Describe + Execute + Sync [ server processes Parse + Bind + Describe + Execute ] [ server enters copy-in mode ] <- CopyInResponse [ server ignores Sync ] -> CopyData [ server processes CopyData ] -> CopyDone [ server processes CopyDone ] [ server exits copy-in mode ] -> Sync [ server processes Sync ] <- ReadyForQuery If an error happens before the server enters copy-in mode (e.g. syntax error), then you get something like: -> Parse + Bind + Describe + Execute + Sync [ server processes Parse, encounters error ] <- ErrorResponse [ server ignores Bind + Describe + Execute ] [ server processes Sync ] <- ReadyForQuery [ client never got CopyInResponse, so never sent copy messages ] If an error happens after the CopyInResponse is sent (e.g. malformed data), you get something like: -> Parse + Bind + Describe + Execute + Sync [ server processes Bind + Describe + Execute ] [ server enters copy-in mode ] <- CopyInResponse [ server ignores Sync ] -> CopyData [ server processes CopyData, encounters error ] [ server exits copy-in mode ] <- ErrorResponse -> CopyDone [ server ignores CopyDone ] -> Sync [ server processes Sync ] <- ReadyForQuery If the backend is canceled after the server sends CopyInResponse but before it consumes (and ignores) the Sync, you get something like: -> Parse + Bind + Describe + Execute + Sync [ server processes Bind + Describe + Execute ] [ server enters copy-in mode ] <- CopyInResponse [ SIGINT, server encounters error ] <- ErrorResponse [ server exits copy-in mode ] [ server processes Sync ] <- ReadyForQuery -> CopyData [ server ignores CopyData ] -> CopyDone [ server ignores CopyDone ] -> Sync [ server processes Sync ] <- ReadyForQuery The last case is not great, because I could imagine it confusing a client, but I'm not sure about exactly how, and maybe it's something we can document around? Regards, Jeff Davis
pgsql-hackers by date: