Thread: BUG #3855: backend sends corrupted data on EHOSTDOWN error
The following bug has been logged online:

Bug reference: 3855
Logged by: Scot Loach
Email address: sloach@sandvine.com
PostgreSQL version: 8.2.4
Operating system: FreeBSD 6.1
Description: backend sends corrupted data on EHOSTDOWN error
Details:

On FreeBSD, it is possible for a send() call on the backend socket to return an error code of EHOSTDOWN. This error can happen, for example, if a host on the local LAN is temporarily unreachable. In this case, the socket is not closed, and it may recover from this state. If it recovers, the backend may continue sending results from a query, but it will have dropped some data from the reply. This leaves the client out of sync with the server, which usually causes it to read an invalid message length. This can cause various issues, such as clients crashing or, more commonly, blocking forever while trying to read a large response the server will never send.

This is due to the way the backend handles errors. The following code (pqcomm.c:1075) is what happens when an error occurs on the write:

    /*
     * We drop the buffered data anyway so that processing can
     * continue, even though we'll probably quit soon.
     */
    PqSendPointer = 0;
    return EOF;

This sets PqSendPointer to 0, which effectively clears any data that was waiting to be sent. This EOF error propagates up the stack to pqformat.c:

    void
    pq_endmessage(StringInfo buf)
    {
        /* msgtype was saved in cursor field */
        (void) pq_putmessage(buf->cursor, buf->data, buf->len);
        /* no need to complain about any failure, since pqcomm.c already did */
        pfree(buf->data);
        buf->data = NULL;
    }

In other words, postgres seems to be expecting that the connection will somehow be closed. For most errors that does happen: the stack closes the TCP connection and no harm is done. But in the case of this particular error, the connection stays open, the client is waiting forever for bytes the server will never send, and the server is idle in its transaction, holding locks and waiting for a command from the client that will never come.

The backend should either close the connection itself in this case, or handle the error better by not clearing the send buffer.
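The failure mode in this report can be modeled with a small simulation. This is only a sketch under stated assumptions, not PostgreSQL code: `FakeSocket`, `SendBuffer`, `fake_send`, `put_bytes`, and `flush_dropping` are hypothetical names, the "socket" is an in-memory array, and messages use a one-byte length prefix purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; this is a model, not PostgreSQL code. */
enum { WIRE_MAX = 64, BUF_MAX = 64 };

typedef struct {
    char   wire[WIRE_MAX];  /* bytes the client would actually receive */
    size_t wire_len;
    int    fail_next;       /* nonzero: next send fails, link stays open */
} FakeSocket;

typedef struct {
    char   data[BUF_MAX];   /* pending output */
    size_t len;
} SendBuffer;

/* Simulated send(): all-or-nothing; returns 0 on success, -1 on error. */
static int fake_send(FakeSocket *s, const char *p, size_t n)
{
    if (s->fail_next) {
        s->fail_next = 0;   /* transient error: the next call works again */
        return -1;
    }
    assert(n <= WIRE_MAX - s->wire_len);
    memcpy(s->wire + s->wire_len, p, n);
    s->wire_len += n;
    return 0;
}

/* Append raw bytes to the pending output. */
static void put_bytes(SendBuffer *b, const char *p, size_t n)
{
    assert(n <= BUF_MAX - b->len);
    memcpy(b->data + b->len, p, n);
    b->len += n;
}

/* Flush modeled on the reported behavior: the buffer is emptied
 * whether or not the send succeeded. */
static int flush_dropping(SendBuffer *b, FakeSocket *s)
{
    int rc = fake_send(s, b->data, b->len);
    b->len = 0;
    return rc;
}
```

If a message with length byte 8 is queued in two halves and the first flush fails, only the second half ("EFGH") reaches the wire. A client that reads the first byte as a length then sees 'E' (69) rather than 8, which is the out-of-sync state the report describes.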
"Scot Loach" <sloach@sandvine.com> writes:
> On FreeBSD, it is possible for a send() call on the backend socket to return
> an error code of EHOSTDOWN.

That's fine as long as the error condition is reasonably persistent. I think what you are describing is a bug in FreeBSD's TCP stack: it obviously isn't making adequately good-faith efforts to deliver the data it's been handed.

regards, tom lane
On Tue, 2008-01-08 at 01:50 +0000, Scot Loach wrote:
> Bug reference: 3855
> Description: backend sends corrupted data on EHOSTDOWN error

This is a FreeBSD bug.
http://www.freebsd.org/cgi/query-pr.cgi?pr=100172

It has been fixed here:
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_output.c
in revision 1.112.2.1.

I ran into this bug too, and it was very frustrating! For me, it manifested itself as SSL errors.

You can demonstrate the problem with SSH as well (inducing an ARP failure will terminate the SSH session, when TCP should protect you against that), so it is clearly not a PostgreSQL bug.

Thanks to "Andrew - Supernews" (a PostgreSQL user) for tracking this bug down.

Regards,
Jeff Davis
This may be true, but I still think PostgreSQL should be more defensive and actively terminate the connection when this happens (like ssh does).

scot.
On Tue, 2008-01-08 at 12:57 -0500, Scot Loach wrote:
> This may be true, but I still think PostgreSQL should be more defensive
> and actively terminate the connection when this happens (like ssh does)

I think PostgreSQL's behavior is well within reason. Let me explain:

What is happening is that FreeBSD *actually sends the data* before returning EHOSTDOWN as an error, and leaving the TCP connection open! At the time I was tracking this problem down, I wrote a C program to demonstrate that fact. This is the core of the reason why it's a protocol violation in PostgreSQL (or SSL error) rather than a disconnection.

I think PostgreSQL is making the assumption here that an unrecognized error code from send() that leaves the connection in a good state is a temporary error that may be resolved. Thus, PostgreSQL assumes that due to the error, no data was written, and re-sends the data, succeeding this time. I reason that the openssl library makes similar assumptions (i.e. assuming an error means the data wasn't sent, and resets some internal SSL protocol state), otherwise I wouldn't get SSL errors afterward, but it would manifest itself as a PostgreSQL protocol violation regardless of whether you're using SSL or not.

If the OS leaves a TCP connection open, I think it is perfectly reasonable for an application to assume that the OS has sent exactly as many bytes as it said it sent; no more, no less.

I would lean toward the opinion that PostgreSQL works just fine now, and that TCP is explicitly designed to prevent these kinds of problems, and we only see this problem because FreeBSD 6.1 TCP is broken.

Regards,
Jeff Davis
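The assumption that the OS "has sent exactly as many bytes as it said it sent" is what an ordinary write loop relies on. The following is a minimal sketch of such a loop, not code from PostgreSQL or from the C program mentioned in this message. The callback type `write_fn`, the in-memory `Sink`, and `sink_write` are hypothetical stand-ins for a real socket so the loop can be exercised without a network.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

#ifndef EHOSTDOWN        /* not in POSIX; fall back where it is missing */
#define EHOSTDOWN EHOSTUNREACH
#endif

/* Hypothetical write callback with send()-like return values. */
typedef ssize_t (*write_fn)(void *ctx, const void *buf, size_t len);

/* Send len bytes, trusting only the counts the callback reports.
 * On return, *sent is the number of bytes confirmed written.
 * Returns 0 when everything was written, -1 on a non-retryable error. */
static int send_all(write_fn wr, void *ctx, const char *buf, size_t len,
                    size_t *sent)
{
    *sent = 0;
    while (*sent < len) {
        ssize_t r = wr(ctx, buf + *sent, len - *sent);
        if (r > 0) {
            *sent += (size_t) r;      /* partial write: continue after it */
            continue;
        }
        if (r < 0 && errno == EINTR)
            continue;                 /* interrupted before any data moved */
        return -1;                    /* any other error, or no progress */
    }
    return 0;
}

/* Hypothetical in-memory sink used only to exercise send_all(). */
typedef struct {
    char   out[32];
    size_t out_len;
    size_t chunk;        /* max bytes accepted per call */
    int    fail_errno;   /* if nonzero, fail once with this errno */
} Sink;

static ssize_t sink_write(void *ctx, const void *buf, size_t len)
{
    Sink *s = ctx;
    if (s->fail_errno) {
        errno = s->fail_errno;
        s->fail_errno = 0;
        return -1;
    }
    if (len > s->chunk)
        len = s->chunk;
    if (len > sizeof s->out - s->out_len)
        len = sizeof s->out - s->out_len;
    memcpy(s->out + s->out_len, buf, len);
    s->out_len += len;
    return (ssize_t) len;
}
```

The loop is only correct if an error return really means that none of the bytes in that call were transmitted. A stack that transmits the data and then reports EHOSTDOWN breaks that contract, and no amount of bookkeeping in the caller can detect it.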
I agree this would be fine if PostgreSQL worked the way you describe.

However, PostgreSQL does not look at the number of bytes written and continue sending after that many bytes. PostgreSQL simply clears its buffer of bytes to send on this error, in this code:

pqcomm.c:1075

    /*
     * We drop the buffered data anyway so that processing can
     * continue, even though we'll probably quit soon.
     */
    PqSendPointer = 0;
    return EOF;

The result, as I saw on a system where this was occurring, was that when PostgreSQL was sending back a large result set, a fragment of it was simply missing.

scot.
On Tue, 2008-01-08 at 14:06 -0500, Scot Loach wrote:
> However, PostgreSQL does not look at the # of bytes written and continue
> sending after that many bytes. PostgreSQL actually simply clears its
> buffer of bytes to send on this error, in this code:
>
> pqcomm.c:1075
>     /*
>      * We drop the buffered data anyway so that processing can
>      * continue, even though we'll probably quit soon.
>      */
>     PqSendPointer = 0;
>     return EOF;
>
> The result as I saw on a system where this was occurring, was that when
> PostgreSQL was sending back a large result set, there was simply a
> fragment of it missing.

I think I see what you are saying. I was thinking about fe-misc.c, where it explicitly says (in the default case of a switch statement of the return value):

    /* We don't assume it's a fatal error... */
    conn->outCount = 0;
    return -1;

(but that's on the frontend, obviously)

I think the problem you're talking about comes from the callers of pq_putmessage, which simply ignore any return value at all (and thus do not retransmit the message). I agree that is a problem (assuming I understand what's going on).

Regards,
Jeff Davis
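One way to picture the alternative to ignoring the return value is a caller that records the failure and refuses to send anything further on that connection. The sketch below is an illustration only, not a proposed patch. `Conn`, `conn_putmessage`, and `conn_endmessage` are hypothetical names, and the low-level write is a stub that only counts delivered messages.

```c
#include <stddef.h>

/* Hypothetical connection state; not a PostgreSQL structure. */
typedef struct {
    int broken;     /* set once a write fails; never cleared */
    int fail_next;  /* test hook: make the next write fail once */
    int delivered;  /* number of messages that reached the wire */
} Conn;

/* Stub for the low-level write: returns 0 on success, -1 on failure. */
static int conn_putmessage(Conn *c, const char *data, size_t len)
{
    (void) data;
    (void) len;
    if (c->fail_next) {
        c->fail_next = 0;
        return -1;
    }
    c->delivered++;
    return 0;
}

/* Caller that checks the result instead of casting it to void.
 * After the first failure, nothing more is sent on this connection,
 * so the client can never receive a stream with a hole in it. */
static int conn_endmessage(Conn *c, const char *data, size_t len)
{
    if (c->broken)
        return -1;              /* already lost: do not resume sending */
    if (conn_putmessage(c, data, len) != 0) {
        c->broken = 1;          /* the owner should now close the socket */
        return -1;
    }
    return 0;
}
```

With this shape, a transient error in the middle of a result set ends the conversation instead of silently skipping a message. That corresponds to the first of the two remedies suggested in the original report (close the connection), not the second (keep the buffer and retry).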
Yes, that is what I am trying to explain. So I think this is still a bug that should be fixed in the backend code.

scot.
Email removed from patch queue; Tom indicates this is an operating system bug. Perhaps if we get more bug reports we will have to address it.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com