Home > mailing lists

Re: OOM in libpq and infinite loop with getCopyStart() - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: OOM in libpq and infinite loop with getCopyStart()
Date	April 1, 2016 06:48:48
Msg-id	25412.1459482512@sss.pgh.pa.us Whole thread Raw
In response to	Re: OOM in libpq and infinite loop with getCopyStart() (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: OOM in libpq and infinite loop with getCopyStart()
List	pgsql-hackers

Tree view

I thought about this patch a bit more...

When I first looked at the patch, I didn't believe that it worked at
all: it failed to return a PGRES_COPY_XXX result to the application,
and yet nonetheless switched libpq's asyncStatus to PGASYNC_COPY_XXX?
Wouldn't things be hopelessly confused?  I tried it out and saw that
indeed it seemed to work in psql, and after tracing through that found
that psql has no idea what's going on, but when psql issues its next
command PQexecStart manages to get us out of the copy state (barring
more OOM failures, anyway).  That seems a bit accidental, though,
and for sure it wasn't documented in the patch.  In any case, having
libpq in that state would have side-effects on how it responds to
application actions, which is the core of my complaint about
PQgetResult behavior.  Those side effects only make sense if the app
knows (or should have known) that the connection is in COPY state.

I think it might be possible to get saner behavior if we invent a
set of new asyncStatus values PGASYNC_COPY_XXX_FAILED, having the
semantics that we got the relevant CopyStart message from the backend
but were unable to report it to the application.  PQexecStart would
treat these the same as the corresponding normal PGASYNC_COPY_XXX
states and do what's needful to get out of the copy mode.  But in
PQgetResult, we would *not* treat those states as a reason to return a
PGRES_COPY_XXX result to the application.  Probably we'd just return
NULL, with the implication being "we're done so far as you're concerned,
please give a new command".  I might be underthinking it though.

Anyway, the short of my review is that we need more clarity of thought
about what state libpq is in after a failure like this, and what that
state looks like to the application, and how that should affect how
libpq reacts to application calls.

I'll set this back to Waiting on Author ...
        regards, tom lane

pgsql-hackers by date:

From: Amit Kapila
Date: 01 April 2016, 06:31:56
Subject: Re: Parallel Queries and PostGIS

From: Pavel Stehule
Date: 01 April 2016, 06:50:16
Subject: Re: IF (NOT) EXISTS in psql-completion

Re: OOM in libpq and infinite loop with getCopyStart() - Mailing list pgsql-hackers

Previous

Next