Re: Unhappy with error handling in psql's handleCopyOut() - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Unhappy with error handling in psql's handleCopyOut()
Date
Msg-id 20140211220311.GV2921@tamriel.snowman.net
Whole thread Raw
In response to Re: Unhappy with error handling in psql's handleCopyOut()  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > I've not gotten back to it yet, but I ran into a related-seeming issue
> > where psql would happily chew up 2G of memory trying to send "COPY
> > failed" notices when it gets disconnected from a server that it's trying
> > to send data to mid-COPY.  conn->sock was -1, connection was
> > 'CONNECTION_BAD', but the loop towards the end of handleCopyIn doesn't
> > care and nothing in libpq is changing PQresultStatus():
>
> [ scratches head... ]  Offhand I'd have expected PQgetResult to start
> returning error indications.  It definitely will do that if it gets
> error indications from the I/O layer.  Perhaps it didn't see any
> because asyncStatus had already been reset from PGASYNC_BUSY?

The I/O layer was definitely returning errors, I traced through that
using gdb (tho I don't have it open any more, had to move on to other
things; been planning to reproduce it).  Here's the full p *conn:

$9 = {pghost = 0x7fa742753300 "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX", pghostaddr = 0x0, pgport = 0x7fa742753340 "XXXXX",
pgunixsocket= 0x0, pgtty = 0x7fa742753360 "", connect_timeout = 0x0, client_encoding_initial = 0x0, pgoptions =
0x7fa742753380"", appname = 0x0, fbappname = 0x7fa7427533a0 "psql", dbName = 0x7fa7427532e0 "XXXX", replication = 0x0,
pguser= 0x7fa7427532c0 "XXXXXXXXXX", pgpass = 0x7fa742755070 "XXXXXXXXXXX", keepalives = 0x0, keepalives_idle = 0x0,
keepalives_interval= 0x0, keepalives_count = 0x0, sslmode = 0x7fa7427533c0 "prefer", sslcompression = 0x7fa7427533e0
"1",sslkey = 0x0, sslcert = 0x0, sslrootcert = 0x0, sslcrl = 0x0, requirepeer = 0x0, krbsrvname = 0x7fa742753400
"postgres",Pfdebug = 0x0, noticeHooks = {noticeRec = 0x7fa73fed3ec0 <defaultNoticeReceiver>, noticeRecArg = 0x0,
noticeProc= 0x7fa740572f20 <NoticeProcessor>, noticeProcArg = 0x0}, events = 0x0, nEvents = 0, eventArraySize = 0,
status= CONNECTION_BAD, asyncStatus = PGASYNC_COPY_IN, xactStatus = PQTRANS_INTRANS, queryclass = PGQUERY_SIMPLE,
last_query= 0x7fa7427809f0 "COPY XXXXXXXXXXXXXXXXXXXXXXXXXX (XXXXXX, XXXXXX, XXXXXX, XXXXXX, XXXXXX) FROM stdin;",
last_sqlstate= "\000\000\000\000\000", options_valid = 1 '\001', nonblocking = 0 '\000', singleRowMode = 0 '\000',
copy_is_binary= 0 '\000', copy_already_done = 0, notifyHead = 0x0, notifyTail = 0x0, sock = -1, laddr = {addr =
{ss_family= 2, __ss_align = 0,     __ss_padding = '\000' <repeats 111 times>}, salen = 16}, raddr = {addr = {ss_family
=2, __ss_align = 0, __ss_padding = '\000' <repeats 111 times>}, salen = 16}, pversion = 196608, sversion = 90302,
auth_req_received= 1 '\001', password_needed = 1 '\001', dot_pgpass_used = 1 '\001', sigpipe_so = 0 '\000',
sigpipe_flag= 0 '\000', addrlist = 0x0, addr_cur = 0x0, addrlist_family = 0, setenv_state = SETENV_STATE_IDLE, next_eo
=0x0, send_appname = 1 '\001', be_pid = 25857, be_key = 393001970, md5Salt = "XXXXXX", pstatus = 0x7fa742788630,
client_encoding= 6, std_strings = 1 '\001', verbosity = PQERRORS_DEFAULT, lobjfuncs = 0x0, inBuffer = 0x7fa74274a740
"G",inBufSize = 16384, inStart = 0, inCursor = 0, inEnd = 0, outBuffer = 0x7fa63a495010 "00229878919\t1", '0' <repeats
12times>, "13462\tsupplemental\tt\nd", outBufSize = 2147475456, outCount = 2147475445, outMsgStart = 2147475446,
outMsgEnd= 2147475450, rowBuf = 0x7fa742752760, rowBufLen = 32, result = 0x0, next_result = 0x0, allow_ssl_try = 1
'\001',wait_ssl_try = 0 '\000', ssl = 0x0, peer = 0x0, engine = 0x0, gctx = 0x0, gtarg_nam = 0x0, ginbuf = {length = 0,
value= 0x0}, goutbuf = {length = 0, value = 0x0}, errorMessage = {data = 0x7fa742752970 "cannot allocate memory for
outputbuffer\n", len = 41, maxlen = 256}, workBuffer = {data = 0x7fa742752a80 "SAVEPOINT", len = 9, maxlen = 256}} 

You can see asyncStatus is still PGASYNC_COPY_IN and also how
outBufSize grew up to the 2G mark, along with the 'cannot allocate'
error.

The p *res looked like:

$2 = {ntups = 0, numAttributes = 0, attDescs = 0x0, tuples = 0x0, tupArrSize = 0, numParameters = 0, paramDescs = 0x0,
resultStatus= PGRES_COPY_IN, cmdStatus =
"\000\177\000\000@\000\000\000\000\000\000\000!\000\000\000\000\000\000\000\210ף?\247\177\000\000\000BxB\247\177\000\000`\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\300dxB\247\177\000\000\000\000\000",
binary= 0, noticeHooks = {noticeRec = 0x7fa73fed3ec0 <defaultNoticeReceiver>, noticeRecArg = 0x0, noticeProc =
0x7fa740572f20<NoticeProcessor>, noticeProcArg = 0x0}, events = 0x0, nEvents = 0, client_encoding = 6, errMsg = 0x0,
errFields= 0x0, null_field = "", curBlock = 0x0, curOffset = 0, spaceLeft = 0} 

> If so, maybe we need an explicit test on the connection being good before
> we return valid PGRES_COPY_IN etc results.

Right, PQresultStatus() just checks if the PGresult is non-NULL, it's
not doing any checking of the connection state itself.  I had been
trying to work out if something else should have been responsible for
changing asyncStatus away from PGASYNC_COPY_IN when the connection
failure is detected, but having PQresultStatus() check the connection
status seems like it would work.
Thanks,
    Stephen

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Unhappy with error handling in psql's handleCopyOut()
Next
From: Alvaro Herrera
Date:
Subject: Re: truncating pg_multixact/members