Thread: Backend message type 0x50 arrived while idle
Hi...

We're using an older version of PostgreSQL (6.5.1, to be precise), and we're running into a problem where during certain inserts and queries we receive a message "Backend message type 0x50 arrived while idle". We've also seen backend message 0x45. Are these defined somewhere? We're trying to figure out why this is happening, and if upgrading to a newer version of PostgreSQL might fix the problem, and I'm not finding much in the way of documentation that gets into the meaning of the various backend messages. Any information you can give me would be appreciated, because this is getting really frustrating. Thanks.

Robin

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Robin Wynn
Northrop Grumman Information Technology
Email: rwynn@northropgrumman.com
Phone: (315) 336-0500 Ext 2247
Fax: (315) 336-4455
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
On Tue, 3 Dec 2002, Wynn, Robin wrote:

> Hi...
>
> We're using an older version of PostgreSQL (6.5.1, to be precise), and we're
> running into a problem where during certain inserts and queries we receive a
> message "Backend message type 0x50 arrived while idle". We've also seen
> backend message 0x45. Are these defined somewhere? We're trying to figure
> out why this is happening, and if upgrading to a newer version of PostgreSQL
> might fix the problem, and I'm not finding much in the way of documentation
> that gets into the meaning of the various backend messages. Any
> information you can give me would be appreciated, because this is getting
> really frustrating. Thanks.

I'm not sure what those messages mean, but 6.5.x is VERY old. Upgrading is worth it in so many ways that even if it didn't fix this it would be worth it. Most likely this will be fixed by upgrading, but it's not likely you'll get a lot of help on troubleshooting a version that old.

Since the 6.5.x release, 7.0.x, 7.1.x, 7.2.x and now 7.3 have come out. For one thing, 7.3 is literally about 100 times faster than 6.5.x, and it's definitely more reliable and has a lot fewer bugs in it.
I know... the problem is two-fold. One, I had to have that in writing so I could justify upgrading, and two, we release in a few days and can't really change the database until that's complete (requires too many changes to existing things). I was just hoping someone here would know what that message meant so I could either document it, or prove it's not the database that's got a problem, it's our interaction with it. I've seen a number of people ask this question on a number of different versions of PostgreSQL (I believe the latest I've seen is 7.1.x), so I suspect that it's something we're doing... I just need to know what that message means, so I know how to troubleshoot it on our end. It's not incredibly descriptive :^P

Robin
On Tue, 3 Dec 2002, Wynn, Robin wrote:

> I know... the problem is two-fold. One, I had to have that in writing so I
> could justify upgrading, and two, we release in a few days and can't really
> change the database until that's complete (requires too many changes to
> existing things). I was just hoping someone here would know what that
> message meant so I could either document it, or prove it's not the database
> that's got a problem, it's our interaction with it. I've seen a number of
> people ask this question on a number of different versions of PostgreSQL (I
> believe the latest I've seen is 7.1.x), so I suspect that it's something
> we're doing... I just need to know what that message means, so I know how to
> troubleshoot it on our end. It's not incredibly descriptive :^P

OK, I was just wandering around the source code, and this same error can happen in 6.5.x or 7.3.whatever. The docs inside the source of postgresql-7.3/src/interfaces/libpq/fe-exec.c say:

    /*
     * NOTIFY and WARNING messages can happen in any state besides
     * COPY OUT; always process them right away.
     *
     * Most other messages should only be processed while in BUSY state.
     * (In particular, in READY state we hold off further parsing
     * until the application collects the current PGresult.)
     *
     * However, if the state is IDLE then we got trouble; we need to deal
     * with the unexpected message somehow.
     */

And:

    /*
     * Unexpected message in IDLE state; need to recover somehow.
     * ERROR messages are displayed using the notice processor;
     * anything else is just dropped on the floor after displaying
     * a suitable warning notice. (An ERROR is very possibly the
     * backend telling us why it is about to close the connection,
     * so we don't want to just discard it...)
     */

So, are there any other messages to go with this error? Are you using libpq to interface? It sounds like maybe your client app is sending data when it shouldn't.
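For what it's worth, the two codes reported in this thread are printable ASCII: 0x50 is 'P' and 0x45 is 'E'. The sketch below is a rough Python model of the dispatch described in the source comments above, not libpq code. The names in V2_MESSAGE_NAMES are my reading of the version 2 frontend/backend protocol and should be checked against the protocol documentation for your server version. ConnState, describe and handle_message are made-up names for illustration, and the COPY OUT exception is left out.

```python
from enum import Enum

# Assumed names for a few backend message type bytes (protocol version 2).
# Verify these against the protocol documentation for your server version.
V2_MESSAGE_NAMES = {
    "A": "NotificationResponse",
    "N": "NoticeResponse",
    "E": "ErrorResponse",
    "P": "CursorResponse",
    "C": "CompletedResponse",
    "Z": "ReadyForQuery",
}


class ConnState(Enum):
    """Hypothetical client-side connection states, after the comments above."""
    IDLE = "idle"
    BUSY = "busy"
    READY = "ready"


def describe(type_byte: int) -> str:
    """Render a message type byte in hex, with its ASCII character and name."""
    char = chr(type_byte)
    name = V2_MESSAGE_NAMES.get(char, "unknown")
    return f"0x{type_byte:02x} ('{char}', {name})"


def handle_message(state: ConnState, type_byte: int) -> str:
    """Toy model of what the client does with one incoming message."""
    char = chr(type_byte)
    if char in ("A", "N"):
        # NOTIFY and WARNING messages are processed right away.
        return "process now"
    if state is ConnState.BUSY:
        # Normal case: a query is in flight, so this belongs to its result.
        return "process as query result"
    if state is ConnState.READY:
        # Hold off until the application collects the current result.
        return "defer"
    # IDLE: no query is outstanding, so the message is unexpected.
    if char == "E":
        return "show error via notice processor"
    return f"drop with warning: message type 0x{type_byte:02x} arrived while idle"


print(describe(0x50))
print(handle_message(ConnState.IDLE, 0x50))
```

Read this way, a 'P' arriving while idle looks like the start of a query response that the client did not think it had asked for, which fits the idea that something on the client side is using the connection out of turn.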
On Tue, Dec 03, 2002 at 12:43:09PM -0800, Wynn, Robin wrote:

> I know... the problem is two-fold. One, I had to have that in writing so I
> could justify upgrading, and two, we release in a few days and can't really
> change the database until that's complete (requires too many changes to
> existing things). I was just hoping someone here would know what that
> message meant so I could either document it, or prove it's not the database
> that's got a problem, it's our interaction with it. I've seen a number of
> people ask this question on a number of different versions of PostgreSQL (I
> believe the latest I've seen is 7.1.x), so I suspect that it's something
> we're doing... I just need to know what that message means, so I know how to
> troubleshoot it on our end. It's not incredibly descriptive :^P

To be honest, upgrading is totally worth it and nowhere near as hard as you think. A while ago we moved from 6.5.x to 7.0.x and we didn't have to change a line of code. Since then we've tested our stuff with 7.2.x and again, it's worked without a single modification.

If you get the time, set up a test machine, just install everything as usual but put 7.3.x on instead. I think you'll be pleasantly surprised.

FWIW, those errors probably relate to actual errors signalled by the backend but not correctly flagged/noticed by the client. If you have logging turned on in the backend you'll probably get much more helpful messages out of the log.

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Support bacteria! They're the only culture some people have.
There are no other messages that I can see... that doesn't mean there aren't any, of course, just that the legacy code I've inherited doesn't process them.

I actually suspect that I have a handle on what's going on... what I think it boils down to is that there are two threads (there are more, but these two are what I think are causing the problem). One periodically updates an element called pct_complete in a table called JTRANSACTION, the other periodically does a select from that same table to get the current value of pct_complete. Both have the same connection ID. Randomly, we hit (what I suspect is) a race condition where both things are happening at the same time, and the one performing the SELECT hangs. That happens to be the parent thread, for what that's worth. At any rate, it always seems to be the case (at least, so far) that when the backend message comes through, the parent thread is hung.

So, I'm going to dig around some more and see what's been recommended for avoiding this condition... I'll also try making a new connection with one of the threads (thus, a different backend, from what I understand) and see if that avoids this problem. Any other suggestions? Could this theoretically happen with an INSERT/SELECT combination, or is it unique to the UPDATE/SELECT pairing?

Robin
Reading about updates and selects rings a bell: does not a select wait until the update either commits or rolls back?
"Wynn, Robin" <RWynn@northropgrumman.com> writes: > I actually suspect that I have a handle on what's going on... what I > think it boils down to is that there are two threads (there are more, but > these two are what I think are causing the problem). One periodically > updates an element called pct_complete in a table called JTRANSACTION, the > other periodically does a select from that same table to get the current > value of pct_complete. Both have the same connection ID. Randomly, we hit > (what I suspect is) a race condition where both things are happening at the > same time, and the one performing the SELECT hangs. That happens to be the > parent thread, for what that's worth. At any rate, it always seems to be > the case (at least, so far) that when the backend message comes through, the > parent thread is hung. So, I'm going to dig around some more and see what's > been recommended for avoiding this condition... I'll also try making a new > connection with one of the threads (thus, a different backend, from what I > understand) and see if that avoids this problem. Any other suggestions? > Could this theoretically happen with an INSERT/SELECT combination, or is it > unique to the UPDATE/SELECT pairing? AFAIK it's extremely bad practice in general to share a connection between two threads, unless you protect it with some kind of lock to avoid simultaneous use. Using a connection per thread is a much better idea. The only issue with that is that one thread won't see results of an in-progress transaction until the other thread commits. -Doug
"Hegyvari Krisztian" <Hegyvari.Krisztian@ardents.hu> writes: > Reading about updates and selects rings a bell: does not a select > wait until the update either commits or rolls back? No, that's what MVCC is all about. ;) -Doug
Doug McNaught <doug@mcnaught.org> writes:

> AFAIK it's extremely bad practice in general to share a connection
> between two threads, unless you protect it with some kind of lock to
> avoid simultaneous use.

I suspect Doug's put his finger on the problem --- are you trying to use the same PGconn object in both threads? Not a good idea at all. libpq isn't thread-aware (mainly because of the portability problems that would ensue), and it *will* break if you try to use the same PGconn concurrently in two different threads.

regards, tom lane
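If the two threads really must share one connection, the lock Doug mentions could look roughly like this. It is a Python sketch with a stand-in FakeConnection class (hypothetical, not libpq or any real driver). The only point is that each request/response round trip happens under one lock, so two threads never interleave on the same connection.

```python
import threading


class FakeConnection:
    """Stand-in for a connection that is not safe for concurrent use."""

    def __init__(self):
        self.in_use = False
        self.overlaps = 0  # times a caller entered while another was inside

    def execute(self, sql):
        if self.in_use:
            self.overlaps += 1
        self.in_use = True
        result = f"ok: {sql}"
        self.in_use = False
        return result


class LockedConnection:
    """Serializes all use of one connection behind a mutex."""

    def __init__(self, conn):
        self._conn = conn
        self._lock = threading.Lock()

    def execute(self, sql):
        # Hold the lock for the whole request/response round trip.
        with self._lock:
            return self._conn.execute(sql)


shared = LockedConnection(FakeConnection())


def worker(sql, count):
    for _ in range(count):
        shared.execute(sql)


threads = [
    threading.Thread(
        target=worker,
        args=("UPDATE jtransaction SET pct_complete = 50", 1000),
    ),
    threading.Thread(
        target=worker,
        args=("SELECT pct_complete FROM jtransaction", 1000),
    ),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The simpler alternative suggested in the thread is one connection per thread, in which case no lock is needed at all.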
Doug McNaught wrote:

>"Wynn, Robin" <RWynn@northropgrumman.com> writes:
>
>AFAIK it's extremely bad practice in general to share a connection
>between two threads, unless you protect it with some kind of lock to
>avoid simultaneous use. Using a connection per thread is a much
>better idea. The only issue with that is that one thread won't see
>results of an in-progress transaction until the other thread commits.

I agree with you on separate connections. As for the multiple threads seeing the latest values of data or data sets, no new problem has been added; it's just that now the source of the data object(s) is a TCP connection away, instead of a RAM memory controller (or memory manager) away.

>-Doug