Thread: Full socket send buffer prevents cancel, timeout
I've recently been investigating long-running statements that,
despite statement_timeout settings and pg_cancel_backend() attempts,
remain visible in pg_stat_activity and continue to hold locks.  When
this happens, a process trace and a debugger show that the backend
is blocked at the send() in secure_write(), netstat shows that the
backend's send buffer is full, and a packet sniffer shows that the
client TCP stack is sending "win 0", suggesting that the client has
a full receive buffer because the application has stopped reading
data.  Unfortunately we have limited ability to continue the
investigation at the client.

Here's an excerpt from internal_flush():

    while (bufptr < bufend)
    {
        int         r;

        r = secure_write(MyProcPort, bufptr, bufend - bufptr);

        if (r <= 0)
        {
            if (errno == EINTR)
                continue;       /* Ok if we were interrupted */

If the write is interrupted by a timeout or cancel, can anything
be done here or elsewhere to abort the statement and release its
locks?  I realize that the full send buffer complicates the matter
because of the inability to send any more data to the client, but
I'm wondering if the backend can do anything to get rid of statements
from such misbehaving applications.  We've reluctantly tried SIGTERM
but that doesn't work either; SIGQUIT and SIGABRT would kill the
entire backend.

Since these statements won't go away, they hold locks that can block
other transactions and they cause vacuum to leave behind dead rows
that it could otherwise clean up.

I noticed this problem in 8.1.14 but I can reproduce it in later
versions as well.  I can provide a test case if anybody's interested
(it's easy: use PQsendQuery() to execute a query that returns enough
data to fill both the client's and the server's socket buffers, then
go to sleep without reading the response).

-- 
Michael Fuhr
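[The full-buffer condition described above can be reproduced in
miniature with an ordinary socket pair, no PostgreSQL involved — a
sketch of the mechanism only, not the PQsendQuery() test case itself:]

```python
import socket

# A connected pair stands in for the backend/client TCP connection;
# "client" never reads, just like the sleeping application.
backend, client = socket.socketpair()

# Non-blocking so we can observe the buffers filling, instead of
# blocking forever the way the backend's send() does.
backend.setblocking(False)

total = 0
chunk = b"x" * 4096
while True:
    try:
        total += backend.send(chunk)
    except BlockingIOError:
        # Both kernel buffers are now full; a *blocking* send() would
        # hang at this point, which is exactly where the backend sits.
        break

print("send buffers filled after", total, "bytes")
```

[Once send() blocks this way, EINTR from a cancel signal just sends
internal_flush() around its retry loop and straight back into send().]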
Michael Fuhr <mike@fuhr.org> writes:
> I've recently been investigating long-running statements that,
> despite statement_timeout settings and pg_cancel_backend() attempts,
> remain visible in pg_stat_activity and continue to hold locks.  When
> this happens, a process trace and a debugger show that the backend
> is blocked at the send() in secure_write(), netstat shows that the
> backend's send buffer is full, and a packet sniffer shows that the
> client TCP stack is sending "win 0", suggesting that the client has
> a full receive buffer because the application has stopped reading
> data.

> If the write is interrupted by a timeout or cancel, can anything
> be done here or elsewhere to abort the statement and release its
> locks?

The best thing would really be to kill the client.  The backend can't
take it upon itself to interrupt the send, because that would result
in loss of protocol message sync, and without knowing how many bytes
got sent there's really no way to recover.  The only escape from the
backend side would be to abort the session --- and even that's a bit
problematic, since we'd probably try to issue an error message
somewhere on the way out, which isn't going to work either if the
send buffer is full.

			regards, tom lane
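[The sync-loss Tom describes can be illustrated with a toy
length-prefixed framing in the style of the FE/BE protocol — the
framing and reader below are illustrative stand-ins, not actual
PostgreSQL code:]

```python
import struct

def frame(msg_type: bytes, payload: bytes) -> bytes:
    # PostgreSQL-style framing: a type byte, then an int32 length
    # that counts itself plus the payload.
    return msg_type + struct.pack("!I", 4 + len(payload)) + payload

def read_messages(stream: bytes):
    # A reader that trusts the framing, as the client must.
    msgs, i = [], 0
    while i < len(stream):
        msg_type = stream[i:i + 1]
        (length,) = struct.unpack("!I", stream[i + 1:i + 5])
        msgs.append((msg_type, stream[i + 5:i + 1 + length]))
        i += 1 + length
    return msgs

stream = frame(b"D", b"row one") + frame(b"D", b"row two")
good = read_messages(stream)        # two cleanly framed messages

# If the backend abandons the first send partway through (here, after
# 8 of its 12 bytes), the reader swallows later bytes as payload and
# never finds a message boundary again.
truncated = stream[:8] + frame(b"D", b"row two")
bad = read_messages(truncated)      # boundaries lost, payloads garbled
```

[With no record of how many bytes made it out before the interrupt,
the backend can't even resynthesize a valid boundary afterwards.]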
On Sat, Oct 25, 2008 at 12:36:24PM -0400, Tom Lane wrote:
> Michael Fuhr <mike@fuhr.org> writes:
> > If the write is interrupted by a timeout or cancel, can anything
> > be done here or elsewhere to abort the statement and release its
> > locks?
> 
> The best thing would really be to kill the client.

Unfortunately the people running the database don't have control
over the client.  They'd like to kill the connection at the database
end, but we haven't yet found a reliable way to do that short of an
immediate shutdown, which interrupts service and can lead to a long
recovery.  Are we missing any other possibilities?

> The backend can't take it upon itself to interrupt the send, because
> that would result in loss of protocol message sync, and without
> knowing how many bytes got sent there's really no way to recover.
> The only escape from the backend side would be to abort the session ---
> and even that's a bit problematic since we'd probably try to issue an
> error message somewhere on the way out, which isn't going to work
> either if the send buffer is full.

Yeah, I've already explained those difficulties.  I was hoping that
discussion might generate ideas on how to deal with them.

-- 
Michael Fuhr
Michael Fuhr wrote:
>On Sat, Oct 25, 2008 at 12:36:24PM -0400, Tom Lane wrote:
>> The backend can't take it upon itself to interrupt the send, because
>> that would result in loss of protocol message sync, and without
>> knowing how many bytes got sent there's really no way to recover.
>> The only escape from the backend side would be to abort the session ---
>> and even that's a bit problematic since we'd probably try to issue an
>> error message somewhere on the way out, which isn't going to work
>> either if the send buffer is full.

>Yeah, I've already explained those difficulties.  I was hoping that
>discussion might generate ideas on how to deal with them.

What about simply closing the filedescriptor upon discovering a
non-empty sendbuffer upon timeout/querycancel?

-- 
Sincerely,
           Stephen R. van den Berg.

Teamwork is essential -- it allows you to blame someone else.
"Stephen R. van den Berg" <srb@cuci.nl> writes:
> What about simply closing the filedescriptor upon discovering a
> non-empty sendbuffer upon timeout/querycancel?

So in other words, convert any network glitch, no matter how small,
into an instant fatal error?

			regards, tom lane
Tom Lane wrote:
>"Stephen R. van den Berg" <srb@cuci.nl> writes:
>> What about simply closing the filedescriptor upon discovering a
>> non-empty sendbuffer upon timeout/querycancel?

>So in other words, convert any network glitch, no matter how small,
>into an instant fatal error?

The fact that a timeout or querycancel has taken place indicates that
this does *not* act on just any network glitch.

The preferred exact logic would look something like:

a. Take note of the time the last write returned as Tlastwritten.
b. Whenever a timeout occurs, or a querycancel is being requested:
c. Check if the sendbuffer is empty.
d. If the sendbuffer is non-empty *and* Tlastwritten is longer ago
   than some Tkill (say 128 seconds), then close the filedescriptor.

In all other cases, just hang on tight.

-- 
Sincerely,
           Stephen R. van den Berg.

Teamwork is essential -- it allows you to blame someone else.
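[Stephen's steps a-d boil down to a small predicate; a sketch, where
`should_close` and the 128-second `KILL_AFTER` constant are assumed
names standing in for his Tkill heuristic:]

```python
import time

KILL_AFTER = 128  # seconds; Stephen's suggested Tkill

def should_close(send_buffer_bytes, last_write_time, now=None):
    """Decide, per steps a-d, whether to drop the connection.

    send_buffer_bytes: bytes still queued in the kernel send buffer (step c)
    last_write_time:   Tlastwritten, when a write last returned (step a)
    """
    if now is None:
        now = time.monotonic()
    # Step d: close only if data is stuck *and* no write has
    # completed for longer than Tkill; otherwise hang on tight.
    return send_buffer_bytes > 0 and (now - last_write_time) > KILL_AFTER
```

[The backend would consult this check only on timeout or query cancel
(step b), which is what distinguishes a wedged client from a
transient glitch — modulo Tom's objection that a slow-but-alive
client can still look identical at that instant.]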
"Stephen R. van den Berg" <srb@cuci.nl> writes:
> Tom Lane wrote:
>> "Stephen R. van den Berg" <srb@cuci.nl> writes:
>>> What about simply closing the filedescriptor upon discovering a
>>> non-empty sendbuffer upon timeout/querycancel?

>> So in other words, convert any network glitch, no matter how small,
>> into an instant fatal error?

> The fact that a timeout or querycancel has taken place, indicates that
> this does *not* act on just any network glitch.

No, just any one that is unfortunate enough to occur at the moment
of a query timeout or cancel.

			regards, tom lane