Re: OOM in libpq and infinite loop with getCopyStart() - Mailing list pgsql-hackers

From Tom Lane
Subject Re: OOM in libpq and infinite loop with getCopyStart()
Date
Msg-id 24249.1459524615@sss.pgh.pa.us
In response to Re: OOM in libpq and infinite loop with getCopyStart()  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: OOM in libpq and infinite loop with getCopyStart()  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
I wrote:
> So the core of my complaint is that we need to fix things so that, whether
> or not we are able to create the PGRES_FATAL_ERROR PGresult (and we'd
> better consider the behavior when we cannot), ...

BTW, the real Achilles' heel of any attempt to ensure sane behavior at
the OOM limit is this possibility of being unable to create a PGresult
with which to inform the client that we failed.

I wonder if we could make things better by keeping around an emergency
backup PGresult struct.  Something along these lines:

1. Add a field "PGresult *emergency_result" to PGconn.

2. At the very beginning of any PGresult-returning libpq function, check
to see if we have an emergency_result, and if not make one, ensuring
there's room in it for a reasonable-size error message; or maybe even
preload it with "out of memory" if we assume that's the only condition
it'll ever be used for.  If malloc fails at this point, just return NULL
without doing anything or changing any libpq state.  (Since a NULL result
is documented as possibly caused by OOM, this isn't violating any API.)

3. Subsequent operations never touch the emergency_result unless we're
up against an OOM, but it can be used to return a failure indication
to the client so long as we leave libpq in a state where additional
calls to PQgetResult would return NULL.
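
To make the three steps concrete, here is a rough, self-contained sketch of
the idea.  It does not use libpq's real structs: Conn, Result,
conn_exec, conn_get_result, xmalloc and allocs_until_failure are all
hypothetical stand-ins invented for illustration, and the preloaded
"out of memory" message assumes OOM is the only condition the spare result
is ever used for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT libpq's real PGresult/PGconn. */
typedef enum { RES_OK, RES_FATAL_ERROR } ResStatus;

typedef struct Result {
    ResStatus status;
    char      errmsg[64];
} Result;

typedef struct Conn {
    Result *emergency_result;   /* step 1: preallocated spare result */
    bool    busy;               /* true while an operation still owes results */
} Conn;

/* Fault-injection hook: allocations allowed before simulated OOM (-1 = never fail). */
static int allocs_until_failure = -1;

static void *xmalloc(size_t n)
{
    if (allocs_until_failure == 0)
        return NULL;
    if (allocs_until_failure > 0)
        allocs_until_failure--;
    return malloc(n);
}

/* Step 2: make sure a spare exists before touching any connection state. */
static bool ensure_emergency_result(Conn *conn)
{
    Result *r;

    if (conn->emergency_result != NULL)
        return true;
    r = xmalloc(sizeof(Result));
    if (r == NULL)
        return false;
    r->status = RES_FATAL_ERROR;
    strcpy(r->errmsg, "out of memory");     /* preloaded message */
    conn->emergency_result = r;
    return true;
}

/* Step 3: hand the spare to the caller and leave the connection idle. */
static Result *take_emergency_result(Conn *conn)
{
    Result *r = conn->emergency_result;

    assert(r != NULL);              /* guaranteed by step 2 */
    conn->emergency_result = NULL;  /* caller owns it now */
    conn->busy = false;             /* later conn_get_result() returns NULL */
    return r;
}

/* A result-returning entry point, loosely analogous to PQexec. */
Result *conn_exec(Conn *conn)
{
    Result *r;

    if (!ensure_emergency_result(conn))
        return NULL;                /* OOM in a clean state: nothing changed */

    conn->busy = true;
    r = xmalloc(sizeof(Result));    /* stands in for work deep inside the library */
    if (r == NULL)
        return take_emergency_result(conn);

    r->status = RES_OK;
    r->errmsg[0] = '\0';
    conn->busy = false;
    return r;
}

/* Loosely analogous to PQgetResult: nothing further once the connection is idle. */
Result *conn_get_result(Conn *conn)
{
    if (!conn->busy)
        return NULL;
    return take_emergency_result(conn);     /* placeholder for real result handling */
}
```

In this sketch the spare is refilled lazily: after it has been handed out,
the next call to conn_exec allocates a new one in step 2, or returns NULL
if it cannot.  Setting allocs_until_failure to 0 exercises the "return NULL
without changing state" path, and setting it to 1 exercises the path where
the spare is returned as the error result.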

Basically this shifts the point where an unreportable OOM could happen
from somewhere in the depths of libpq to the very start of an operation,
where we're presumably in a clean state and OOM failure doesn't leave
us with a mess we can't clean up.
        regards, tom lane


