Re: [HACKERS] COPY IN/BOTH vs. extended query mode - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] COPY IN/BOTH vs. extended query mode
Date
Msg-id CA+TgmoahKj-7WP2Ax3pHqSfwdVJn_EKvH5=+bsUtS7AOFC43vg@mail.gmail.com
Whole thread Raw
In response to [HACKERS] COPY IN/BOTH vs. extended query mode  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] COPY IN/BOTH vs. extended query mode
List pgsql-hackers
On Mon, Jan 23, 2017 at 9:12 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> According to the documentation for COPY IN mode, "If the COPY command
> was issued via an extended-query message, the backend will now discard
> frontend messages until a Sync message is received, then it will issue
> ReadyForQuery and return to normal processing."  I added a similar
> note to the documentation for COPY BOTH mode in
> 91fa8532f4053468acc08534a6aac516ccde47b7, and the documentation
> accurately describes the behavior of the server.  However, this seems
> to make fully correct error handling for clients using libpq almost
> impossible, because PQsendQueryGuts() sends
> Parse-Bind-Describe-Execute-Sync in one shot without regard to whether
> the command that was just sent invoked COPY mode (cf. the note in
> CopyGetData about why we ignore Flush and Sync in that function).
>
> So imagine that the client uses libpq to send (via the extended query
> protocol) a COPY IN command (or some hypothetical command that starts
> COPY BOTH mode to begin).  If the server throws an error before the
> Sync message is consumed, it will bounce back to PostgresMain which
> will set doing_extended_query_message = true after which it will
> consume messages, find the Sync, reset that flag, and send
> ReadyForQuery.  On the other hand, if the server enters CopyBoth mode,
> consumes the Sync message in CopyGetData (or a similar function), and
> *then* throws an ERROR, the server will wait for a second Sync message
> from the client before issuing ReadyForQuery.  There is no sensible
> way of coping with this problem in libpq, because there is no way for
> the client to know which part of the server code consumed the Sync
> message that it already sent.  In short, from the client's point of
> view, if it enters COPY IN or COPY BOTH mode via the extend query
> protocol, and an error occurs on the server, the server MAY OR MAY NOT
> expect a further Sync message before issuing ReadyForQuery, and the
> client has no way of knowing -- except maybe waiting for a while to
> see what happens.
>
> It does not appear to me that there is any good solution to this
> problem.  Fixing it on the server side would require a wire protocol
> change - e.g. one kind of Sync message that is used in a
> Parse-Bind-Describe-Execute-Sync sequence that only terminates
> non-COPY commands and another kind that is used to signal the end even
> of COPY.  Fixing it on the client side would require all clients to
> know prior to initiating an extended-query-protocol sequence whether
> or not the command was going to initiate COPY, which is an awful API
> even if didn't constitute an impossible-to-contemplate backward
> compatibility break.  Perhaps we will have to be content to document
> the fact that this part of the protocol is depressingly broken...
>
> ...unless of course somebody can see something that I'm missing here
> and the situation isn't as bad as it currently appears to me to be.

Anybody have any thoughts on this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] log_autovacuum_min_duration doesn't log VACUUMs
Next
From: Fujii Masao
Date:
Subject: [HACKERS] Re: [COMMITTERS] pgsql: Remove all references to "xlog" fromSQL-callable functions in p