Thread: BUG #18210: libpq: PQputCopyData sometimes fails in non-blocking mode over GSSAPI encrypted connection
The following bug has been logged on the website:

Bug reference:      18210
Logged by:          Lars Kanis
Email address:      lars@greiz-reinsdorf.de
PostgreSQL version: 16.1
Operating system:   Ubuntu 23.10
Description:

I'm the maintainer of the Ruby binding of libpq, and this issue happens in
Ruby, but it isn't related to the Ruby environment. It is only that the Ruby
binding always runs with libpq's non-blocking mode enabled internally, which
is why the issue shows up there.

The error "GSSAPI caller failed to retransmit all data needing to be
retried" is raised here:
https://github.com/postgres/postgres/blob/eeb0ebad79d9350305d9e111fbac76e20fa4b2fe/src/interfaces/libpq/fe-secure-gssapi.c#L110

It happens only in non-blocking mode over GSSAPI encrypted connections. It
isn't reliable and depends on the network timing. When sending a 7MB file in
alternating pieces of 27KB and 180 bytes per PQputCopyData(), there is a 50%
chance of getting the failure over the local network. It doesn't happen if
TLS is used instead.

I added some debug prints into libpq to show the amounts of data
transferred. The output below also shows the gdb backtrace of where the
error is raised:

```
[...lots of data blocks successfully transferred...]
PQputCopyData() size=186
PQputCopyData() size=27714
pqPutMsgEnd() conn->outCount=30162 toSend=24576
pg_GSS_write() len=24576 PqGSSSendConsumed=0
PQputCopyData() size=189
PQputCopyData() size=27405
pqPutMsgEnd() conn->outCount=33190 toSend=32768
pg_GSS_write() len=32768 PqGSSSendConsumed=0
PQputCopyData() size=185
PQputCopyData() size=27481
pqPutMsgEnd() conn->outCount=28098 toSend=24576
pg_GSS_write() len=24576 PqGSSSendConsumed=0
PQputCopyData() size=182
pqPutMsgEnd() conn->outCount=11965 toSend=8192
pg_GSS_write() len=8192 PqGSSSendConsumed=8256

Breakpoint 1, 0x00007ffff20fe811 in appendPQExpBufferStr ()
   from /home/kanis/ruby-pg/ports/x86_64-linux/lib/libpq.so.5
(gdb) bt
#0  0x00007ffff20fe811 in appendPQExpBufferStr () from /home/kanis/ruby-pg/ports/x86_64-linux/lib/libpq.so.5
#1  0x00007ffff2103545 in pg_GSS_write () from /home/kanis/ruby-pg/ports/x86_64-linux/lib/libpq.so.5
#2  0x00007ffff20fbbad in pqsecure_write () from /home/kanis/ruby-pg/ports/x86_64-linux/lib/libpq.so.5
#3  0x00007ffff20f569d in pqSendSome () from /home/kanis/ruby-pg/ports/x86_64-linux/lib/libpq.so.5
#4  0x00007ffff20f517c in pqPutMsgEnd () from /home/kanis/ruby-pg/ports/x86_64-linux/lib/libpq.so.5
#5  0x00007ffff20f07ba in PQputCopyData () from /home/kanis/ruby-pg/ports/x86_64-linux/lib/libpq.so.5
#6  0x00007ffff1f259e2 in ?? () from /home/kanis/.rvm/gems/ruby-3.2.1/gems/pg-1.5.4-x86_64-linux/lib/3.2/pg_ext.so
#7  0x00007ffff7c67002 in rb_nogvl () from /home/kanis/.rvm/rubies/ruby-3.2.1/bin/../lib/libruby.so.3.2
#8  0x00007ffff1f261ac in gvl_PQputCopyData () from /home//kanis/.rvm/gems/ruby-3.2.1/gems/pg-1.5.4-x86_64-linux/lib/3.2/pg_ext.so
#9  0x00007ffff1f2af90 in ?? () from /home/kanis/.rvm/gems/ruby-3.2.1/gems/pg-1.5.4-x86_64-linux/lib/3.2/pg_ext.so
#10 0x00007ffff7c9ebff in vm_call_cfunc_with_frame () from /home//kanis/.rvm/rubies/ruby-3.2.1/bin/../lib/libruby.so.3.2
#11 0x00007ffff7caefa0 in vm_exec_core () from /home/kanis/.rvm/rubies/ruby-3.2.1/bin/../lib/libruby.so.3.2
#12 0x00007ffff7cb4a23 in rb_vm_exec () from /home/kanis/.rvm/rub
[...more ruby stack...]
```

The problem occurs when pqPutMsgEnd() sends some data per pqSendSome() while
there is more than 8192 bytes in the output buffer. The data size is
truncated down to the next lower 8KB boundary. This happens here:
https://github.com/postgres/postgres/blob/eeb0ebad79d9350305d9e111fbac76e20fa4b2fe/src/interfaces/libpq/fe-misc.c#L544

Due to the truncation it can happen that this boundary lies before the
boundary of the previous call of pqSendSome(). The error is then raised by
pg_GSS_write(), which cannot transfer any data in this case. So the error
message "GSSAPI caller failed to retransmit all data needing to be retried"
is not about the application calling libpq. In fact, the 7MB file always
fits into libpq's internal buffer memory, so no retransmits are done at the
application level or at the level of the Ruby binding.

The issue is fixed if I change libpq at either pqPutMsgEnd() or
pg_GSS_write(). So either avoid the truncation of the send size to the next
8KB boundary in pqPutMsgEnd(), like so:

```
	if (conn->outCount >= 8192)
	{
		int			toSend = conn->outCount; // - (conn->outCount % 8192);

		if (pqSendSome(conn, toSend) < 0)
			return EOF;
```

... or just return 0 bytes in pg_GSS_write() instead of raising an error:

```
	if (len < PqGSSSendConsumed)
	{
		return 0;
		appendPQExpBufferStr(&conn->errorMessage,
							 "GSSAPI caller failed to retransmit all data needing to be retried\n");
```

Unfortunately, so far I'm unable to provide a repro script. Maybe the above
information is enough to understand the problem.

Regards, Lars Kanis
Re: BUG #18210: libpq: PQputCopyData sometimes fails in non-blocking mode over GSSAPI encrypted connection
PG Bug reporting form <noreply@postgresql.org> writes:
> The error "GSSAPI caller failed to retransmit all data needing to be
> retried" is raised here:
> https://github.com/postgres/postgres/blob/eeb0ebad79d9350305d9e111fbac76e20fa4b2fe/src/interfaces/libpq/fe-secure-gssapi.c#L110
> It happens only in non-blocking mode over GSSAPI encrypted connections. It
> isn't reliable and depends on the network timing. When sending a 7MB file in
> alternating pieces of 27KB and 180 bytes per PQputCopyData(), there is a 50%
> chance of getting the failure over the local network. It doesn't happen if
> TLS is used instead.

Hmm.

> The issue is fixed if I change libpq at either pqPutMsgEnd() or
> pg_GSS_write().

Both of those seem like pretty obviously wrong "fixes". However, I'm not
quite sure I see the problem clearly enough to propose a fix I'd believe in.

> Unfortunately, so far I'm unable to provide a repro script. Maybe the above
> information is enough to understand the problem.

A repro script would be really really helpful here.

			regards, tom lane
Re: BUG #18210: libpq: PQputCopyData sometimes fails in non-blocking mode over GSSAPI encrypted connection
I wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
>> The error "GSSAPI caller failed to retransmit all data needing to be
>> retried" is raised here:
>> https://github.com/postgres/postgres/blob/eeb0ebad79d9350305d9e111fbac76e20fa4b2fe/src/interfaces/libpq/fe-secure-gssapi.c#L110
>> It happens only in non-blocking mode over GSSAPI encrypted connections. It
>> isn't reliable and depends on the network timing. When sending a 7MB file in
>> alternating pieces of 27KB and 180 bytes per PQputCopyData(), there is a 50%
>> chance of getting the failure over the local network. It doesn't happen if
>> TLS is used instead.

> A repro script would be really really helpful here.

After consuming more caffeine, I was able to repro it with the
not-intended-for-commit hack in 0001 attached. (On my machine, the test
just hangs up upon failing, because the hacked-up logic in copy.c doesn't
cope very well with the failure. I don't think that is copy.c's fault
though; it's just an incomplete hack.)

I concur with the conclusion that it's really pqPutMsgEnd's fault. By
deciding not to send the last partial block that's in the outBuffer, it
runs the risk of not presenting some data that it did present the last
time, and that can trigger the "failed to retransmit all data" error.

This happens because GSS's gss_MaxPktSize is a bit less than 16K (it's
16320 on my machine). So if we initially present 24K of data (3 blocks),
and pg_GSS_write successfully encrypts and sends one packet of data, then
it will encrypt all the rest. But if its second pqsecure_raw_write call
fails with EINTR, it will return with bytes_sent = 16320 (and
PqGSSSendConsumed = 8256), causing us to reduce the outBuffer contents to
8256 bytes plus whatever partial block we didn't try to send. If we don't
fill outBuffer to at least 16K before trying again, we'll try to send just
8192 bytes, and kaboom. (This is why the alternating-long-and-short-lines
business is important.)

The quick hack in 0002 attached fixes it, but I can't say that I like this
solution: it's propagating a bit of ugliness that ought to be localized in
pg_GSS_write out to callers. I wonder if we should drop the idea of
returning a positive bytecount after a partial write, and just return the
pqsecure_raw_write result, and not reset PqGSSSendConsumed until we write
everything presented. In edge cases maybe that would result in some buffer
bloat, but it doesn't seem worse than what happens when the very first
pqsecure_raw_write returns EINTR.

In any case, the backend needs a look to see whether it requires a similar
fix. We don't do nonblock mode there, but I don't think that means we can
never get EINTR.

			regards, tom lane

diff --git a/src/bin/psql/copy.c b/src/bin/psql/copy.c
index dbbbdb8898..2b9c278ef7 100644
--- a/src/bin/psql/copy.c
+++ b/src/bin/psql/copy.c
@@ -505,7 +505,7 @@ handleCopyOut(PGconn *conn, FILE *copystream, PGresult **res)
  */
 
 /* read chunk size for COPY IN - size is not critical */
-#define COPYBUFSIZ 8192
+#define COPYBUFSIZ 32768
 
 bool
 handleCopyIn(PGconn *conn, FILE *copystream, bool isbinary, PGresult **res)
@@ -531,6 +531,8 @@ handleCopyIn(PGconn *conn, FILE *copystream, bool isbinary, PGresult **res)
 		goto copyin_cleanup;
 	}
 
+	(void) PQsetnonblocking(conn, 1);
+
 	/* Prompt if interactive input */
 	if (isatty(fileno(copystream)))
 	{
@@ -649,16 +651,17 @@ handleCopyIn(PGconn *conn, FILE *copystream, bool isbinary, PGresult **res)
 			}
 
 			/*
-			 * If the buffer is full, or we've reached the EOF, flush it.
-			 *
-			 * Make sure there's always space for four more bytes in the
-			 * buffer, plus a NUL terminator.  That way, an EOF marker is
-			 * never split across two fgets() calls, which simplifies the
-			 * logic.
+			 * Flush after every line for test purposes.
 			 */
-			if (buflen >= COPYBUFSIZ - 5 || (copydone && buflen > 0))
+			if (buflen > 0)
 			{
-				if (PQputCopyData(conn, buf, buflen) <= 0)
+				int			pcdres;
+
+				while ((pcdres = PQputCopyData(conn, buf, buflen)) == 0)
+				{
+					/* busy-wait */
+				}
+				if (pcdres < 0)
 				{
 					OK = false;
 					break;
@@ -686,6 +689,8 @@ handleCopyIn(PGconn *conn, FILE *copystream, bool isbinary, PGresult **res)
 
 copyin_cleanup:
 
+	(void) PQsetnonblocking(conn, 0);
+
 	/*
 	 * Clear the EOF flag on the stream, in case copying ended due to an EOF
 	 * signal.  This allows an interactive TTY session to perform another COPY
diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl
index 0deb9bffc8..711bbc1e29 100644
--- a/src/test/kerberos/t/001_auth.pl
+++ b/src/test/kerberos/t/001_auth.pl
@@ -419,6 +419,25 @@ test_query(
 	'gssencmode=require',
 	'sending 100K lines works');
 
+my $bigdata = '';
+for (my $rpt = 0; $rpt < 1000; $rpt++)
+{
+	$bigdata .= 'xyzzy' x (27*1024/5) . "\n";
+	$bigdata .= 'qwert' x (180/5) . "\n";
+}
+
+test_query(
+	$node,
+	'test1',
+	"CREATE TEMP TABLE mytab (f1 text);\n"
+	  . "COPY mytab FROM STDIN;\n"
+	  . $bigdata
+	  . "\\.\n"
+	  . "SELECT COUNT(*) FROM mytab;",
+	qr/^2000$/s,
+	'gssencmode=require',
+	'sending alternating-size lines works');
+
 # require_auth=gss succeeds if required.
 $node->connect_ok(
 	$node->connstr('postgres')
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index 660cdec93c..5a74a95968 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -541,7 +541,19 @@ pqPutMsgEnd(PGconn *conn)
 
 	if (conn->outCount >= 8192)
 	{
-		int			toSend = conn->outCount - (conn->outCount % 8192);
+		int			toSend = conn->outCount;
+
+		/*
+		 * Normally we only send full 8K blocks.  However, if we don't present
+		 * the last partial block we can trigger pg_GSS_write's "failed to
+		 * retransmit all data" failure, since rounding toSend down could
+		 * result in not presenting some data that we did present on an
+		 * earlier try.  The full-block heuristic is pretty useless anyway
+		 * when encrypting, since encryption overhead will throw off the
+		 * number.  Hence, in GSSAPI mode just send all we have.
+		 */
+		if (!conn->gssenc)
+			toSend -= toSend % 8192;
 
 		if (pqSendSome(conn, toSend) < 0)
 			return EOF;
Re: BUG #18210: libpq: PQputCopyData sometimes fails in non-blocking mode over GSSAPI encrypted connection
On 22.11.23 at 16:11, Tom Lane wrote:
> A repro script would be really really helpful here.
After fiddling a bit longer, I was able to write a reproducer in C. It needs a Kerberos-enabled environment, and it reliably fails as described when running over the Internet (via a VPN connection). The output is then:
PQputCopyData text2 command failed: GSSAPI caller failed to retransmit all data needing to be retried
The program is here: https://gist.github.com/larskanis/5546a20365b0ca64e294a32b138f441e
I wasn't able to reproduce the error on a LAN connection with the above program. Even adding a trigger on the copy table to slow the server down, or adding PQflush/PQconsumeInput calls, didn't trigger the GSSAPI error. In contrast, the original program sometimes fails even on a LAN connection.
Regards,
Lars Kanis
Re: BUG #18210: libpq: PQputCopyData sometimes fails in non-blocking mode over GSSAPI encrypted connection
I wrote:
> I wonder if we should drop the idea of returning a positive bytecount
> after a partial write, and just return the pqsecure_raw_write result,
> and not reset PqGSSSendConsumed until we write everything presented.
> In edge cases maybe that would result in some buffer bloat, but it
> doesn't seem worse than what happens when the very first
> pqsecure_raw_write returns EINTR.

Here's a patch that fixes it along those lines. I like this better, I
think, not least because it removes the assumption that "interesting"
pqsecure_raw_write failures will recur on the next try.

Still need to look at syncing the backend with this.

			regards, tom lane

diff --git a/src/interfaces/libpq/fe-secure-gssapi.c b/src/interfaces/libpq/fe-secure-gssapi.c
index 7e373236e9..ea8b0020d2 100644
--- a/src/interfaces/libpq/fe-secure-gssapi.c
+++ b/src/interfaces/libpq/fe-secure-gssapi.c
@@ -79,8 +79,8 @@
  * On success, returns the number of data bytes consumed (possibly less than
  * len).  On failure, returns -1 with errno set appropriately.  If the errno
  * indicates a non-retryable error, a message is added to conn->errorMessage.
- * For retryable errors, caller should call again (passing the same data)
- * once the socket is ready.
+ * For retryable errors, caller should call again (passing the same or more
+ * data) once the socket is ready.
  */
 ssize_t
 pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
@@ -90,19 +90,25 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
 	gss_buffer_desc input,
 				output = GSS_C_EMPTY_BUFFER;
 	ssize_t		ret = -1;
-	size_t		bytes_sent = 0;
 	size_t		bytes_to_encrypt;
 	size_t		bytes_encrypted;
 	gss_ctx_id_t gctx = conn->gctx;
 
 	/*
-	 * When we get a failure, we must not tell the caller we have successfully
-	 * transmitted everything, else it won't retry.  Hence a "success"
-	 * (positive) return value must only count source bytes corresponding to
-	 * fully-transmitted encrypted packets.  The amount of source data
-	 * corresponding to the current partly-transmitted packet is remembered in
+	 * When we get a retryable failure, we must not tell the caller we have
+	 * successfully transmitted everything, else it won't retry.  For
+	 * simplicity, we claim we haven't transmitted anything until we have
+	 * successfully transmitted all "len" bytes.  Between calls, the amount of
+	 * the current input data that's already been encrypted and placed into
+	 * PqGSSSendBuffer (and perhaps transmitted) is remembered in
 	 * PqGSSSendConsumed.  On a retry, the caller *must* be sending that data
 	 * again, so if it offers a len less than that, something is wrong.
+	 *
+	 * Note: it may seem attractive to report partial write completion once
+	 * we've successfully sent any encrypted packets.  However, that can cause
+	 * problems for callers; notably, pqPutMsgEnd's heuristic to send only
+	 * full 8K blocks interacts badly with such a hack.  We won't save much,
+	 * typically, by letting callers discard data early, so don't risk it.
 	 */
 	if (len < PqGSSSendConsumed)
 	{
@@ -140,33 +146,20 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
 			retval = pqsecure_raw_write(conn,
 										PqGSSSendBuffer + PqGSSSendNext,
 										amount);
 			if (retval <= 0)
-			{
-				/*
-				 * Report any previously-sent data; if there was none, reflect
-				 * the pqsecure_raw_write result up to our caller.  When there
-				 * was some, we're effectively assuming that any interesting
-				 * failure condition will recur on the next try.
-				 */
-				if (bytes_sent)
-					return bytes_sent;
 				return retval;
-			}
 
 			/*
 			 * Check if this was a partial write, and if so, move forward that
 			 * far in our buffer and try again.
 			 */
-			if (retval != amount)
+			if (retval < amount)
 			{
 				PqGSSSendNext += retval;
 				continue;
 			}
 
-			/* We've successfully sent whatever data was in that packet. */
-			bytes_sent += PqGSSSendConsumed;
-
-			/* All encrypted data was sent, our buffer is empty now. */
-			PqGSSSendLength = PqGSSSendNext = PqGSSSendConsumed = 0;
+			/* We've successfully sent whatever data was in the buffer. */
+			PqGSSSendLength = PqGSSSendNext = 0;
 		}
 
 		/*
@@ -192,7 +185,7 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
 
 		/*
 		 * Create the next encrypted packet.  Any failure here is considered a
-		 * hard failure, so we return -1 even if bytes_sent > 0.
+		 * hard failure, so we return -1 even if some data has been sent.
 		 */
 		major = gss_wrap(&minor, gctx, 1, GSS_C_QOP_DEFAULT,
 						 &input, &conf_state, &output);
@@ -236,10 +229,13 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
 	}
 
 	/* If we get here, our counters should all match up. */
-	Assert(bytes_sent == len);
-	Assert(bytes_sent == bytes_encrypted);
+	Assert(len == PqGSSSendConsumed);
+	Assert(len == bytes_encrypted);
+
+	/* We're reporting all the data as sent, so reset PqGSSSendConsumed. */
+	PqGSSSendConsumed = 0;
 
-	ret = bytes_sent;
+	ret = bytes_encrypted;
 
 cleanup:
 	/* Release GSSAPI buffer storage, if we didn't already */
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index c745facfec..22bc682ffc 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -583,8 +583,8 @@ struct pg_conn
 	int			gss_SendLength; /* End of data available in gss_SendBuffer */
 	int			gss_SendNext;	/* Next index to send a byte from
 								 * gss_SendBuffer */
-	int			gss_SendConsumed;	/* Number of *unencrypted* bytes consumed
-									 * for current contents of gss_SendBuffer */
+	int			gss_SendConsumed;	/* Number of source bytes encrypted but
+									 * not yet reported as sent */
 	char	   *gss_RecvBuffer; /* Received, encrypted data */
 	int			gss_RecvLength; /* End of data available in gss_RecvBuffer */
 	char	   *gss_ResultBuffer;	/* Decryption of data in gss_RecvBuffer */
Re: BUG #18210: libpq: PQputCopyData sometimes fails in non-blocking mode over GSSAPI encrypted connection
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I wrote:
> > I wonder if we should drop the idea of returning a positive bytecount
> > after a partial write, and just return the pqsecure_raw_write result,
> > and not reset PqGSSSendConsumed until we write everything presented.
> > In edge cases maybe that would result in some buffer bloat, but it
> > doesn't seem worse than what happens when the very first
> > pqsecure_raw_write returns EINTR.
>
> Here's a patch that fixes it along those lines.  I like this better,
> I think, not least because it removes the assumption that "interesting"
> pqsecure_raw_write failures will recur on the next try.

Thanks for looking into this. I agree with this approach and took a look
through the patch; it all makes sense to me.

> Still need to look at syncing the backend with this.

I see you made similar changes to the backend and those also look good.

Thanks again,

Stephen