Re: PQexec() hangs on OOM - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: PQexec() hangs on OOM
Date
Msg-id CAB7nPqSBwLkJdj2VV=czMUERuE+TKyNmu3bCbfYr2YFyhOUVMw@mail.gmail.com
Whole thread Raw
In response to Re: PQexec() hangs on OOM  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: PQexec() hangs on OOM  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-bugs
On Mon, Sep 7, 2015 at 6:45 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Sep 7, 2015 at 1:40 PM, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>>
>> On Sat, Sep 5, 2015 at 9:45 PM, Amit Kapila <amit.kapila16@gmail.com>
>> wrote:
>> >
>>
>> So, I think that the right approach would be to leave immediately
>> pqParseInput3 should an error happen, switching asyncStatus to leave
>> the loop in PQgetResult.
>
> Sure, if you think that works, then do it that way.
>
>> This seems as well to work correctly with
>> PGRES_COPY_BOTH (I emulated failures with pg_basebackup and errors
>> were caught up correctly. This brings back of course the introduction
>> of a new flag PGASYNC_FATAL
>
> I think we should try not to introduce this new flag, as that is not merely
> a flag, but rather a state in query-execution state machine.  Now if we
> introduce a new error state into that state-machine, then it doesn't seem
> like a good idea to do that only for some of the paths and doing it for all
> other paths is a call for somewhat larger verification cycle (to see if it
> works in all paths as previously).

Well, the previous patch you sent added code paths to manage the
failures with COPY protocol directly in the COPY routines in a way
similar to what the introduction of PGASYNC_FATAL is doing, except
that PGASYNC_FATAL has the advantage to let the end of PQgetResult
know that a failure has happened, centralizing the error check. Also,
it seems to me that switching getParamDescriptions to an error state
is going to need this flag, or at least a new flag to exit the loop
when parsing the input.

Regarding the other messages that could use this flag, I just
double-checked and it seems that all the other calls return an EOF
when they do not have enough data received from the backend or they
directly switch to another PGASYNC state, so they do not need a
specific fatal error handling. Please let me know if I missed
something.

In any case, attached are two patches:
- 0001 adds the OOM/failure handling for the BIND and COPY start
messages. This time the connection is not dropped. After a failure,
successive commands work as well, this addresses the previous issue
you reported.
- 0002 is a cleanup bonus, getRowDescriptions and getAnotherTuple have
some dead code that I think would be better removed, those are
remnants from a copy/paste from the similar code of protocol 2.
Regards,
--
Michael

Attachment

pgsql-bugs by date:

Previous
From: Amit Kapila
Date:
Subject: Re: PQexec() hangs on OOM
Next
From: Michael Paquier
Date:
Subject: Re: PQexec() hangs on OOM