Thread: 7.3.2 closing connections, sometimes
I hate to post as vague a description as this, but I don't think the devil is in the details this time. I may be wrong ...

This project is running 7.3.2 on a RedHat 9 system. We plan to upgrade in a few weeks to Fedora Core and Postgres 8, so maybe this problem is not worth wasting too much time on right now.

This is a SOAP server, Apache with mod_perl, connecting to Postgres via DBI/DBD::Pg. Sometimes it gets in a mood, for want of a better term, where a specific SQL statement fails with the good ole message "server closed the connection unexpectedly". It will fail like this for several hours, then suddenly start working again. The SQL that it fails on works perfectly in psql; it always returns the exact data expected. It's a small table of perhaps a dozen lines, and does not change very often. I would suspect hardware except that a new machine behaves just the same.

One of the puzzles is that nothing shows up in the log. The log is configured thusly:

    server_min_messages = notice
    client_min_messages = notice
    log_min_error_statement = error

And yet the only messages that show up are start and stop. I changed log_min_error_statement to notice like the others, and it hasn't failed since, but I doubt this is the cause, because it has gone thru these mood swings before without having changed the log level.

It's not the SOAP server disconnecting from the SOAP client, because the server continues to log things.

Did 7.3.2 have any problems that might cause random disconnects, or disconnects for some obscure but documented reason? Google found some, but none of them apply here, as far as I can tell. Or are there useful changes to logging that might track this down? Strace generated about 50MB of log file, too much for me!

--
... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
Felix Finch: scarecrow repairman & rocket surgeon / felix@crowfix.com
GPG = E987 4493 C860 246C 3B1E 6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o
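On the 50MB strace log: one way to make it readable is to keep only the lines that mention a signal or a process exit and read the full trace around those hits. The following is a minimal sketch, not a definitive tool. It assumes strace's usual text output, where signal and termination notes start with `---` or `+++`, optionally preceded by a pid as written by `strace -f`. The exact format varies between strace versions, and timestamp options would need a looser pattern. The names `INTERESTING` and `interesting_lines` are made up for this illustration.

```python
import re
from typing import Iterable, Iterator

# Lines worth a first look: signal deliveries ("--- SIGSEGV ... ---"),
# termination notes ("+++ killed by ... +++"), and exit system calls.
# Assumption: an optional pid prefix ("1234 " or "[pid 1234] ") and no timestamps.
INTERESTING = re.compile(
    r"^\s*(?:\[pid\s+\d+\]\s+|\d+\s+)?"        # optional pid prefix
    r"(?:---\s|\+\+\+\s|_?exit(?:_group)?\()"  # signal, termination, or exit call
)


def interesting_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the trace lines that mention a signal or a process exit."""
    for line in lines:
        if INTERESTING.match(line):
            yield line.rstrip("\n")
```

To use it, open the trace file and loop over `interesting_lines(trace_file)`, printing each hit. The hits show whether the backend received a signal or exited on its own.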
felix@crowfix.com wrote:
> I hate to post as vague a description as this, but I don't think the
> devil is in the details this time. I may be wrong ...
>
> This project is running 7.3.2 on a RedHat 9 system. We plan to
> upgrade in a few weeks to Fedora Core and Postgres 8, so maybe this
> problem is not worth wasting too much time on, right now.
>
> This is a SOAP server, Apache with mod_perl, connecting to Postgres
> via DBI/DBD::Pg. Sometimes it gets in a mood, for want of a better
> term, where a specific SQL statement fails with the good ole message
> "server closed the connection unexpectedly". It will fail like this

This message is from the backend exiting abruptly. It isn't an "ERROR" as we define it for logging purposes. That's why there is nothing in the logs. I recommend turning on log_statement, which prints each query before it is run.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
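As a rough sketch of the logging suggested above, the fragment below assumes the boolean logging switches of the 7.3-series postgresql.conf. Later releases rename or retype several of these (8.0, for instance, uses log_line_prefix and a non-boolean log_statement), so check the documentation for the version actually running.

```conf
# Assumed 7.3-era settings; verify the names against your version's documentation.
log_statement = true      # log each query as it is received, before it runs
log_pid = true            # prefix each message with the backend's process ID
log_timestamp = true      # prefix each message with a timestamp
log_connections = true    # log each successful connection
```

With the process ID on every line, the last statement logged by a backend that then disappears is the most likely trigger.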
> felix@crowfix.com wrote:
>> This is a SOAP server, Apache with mod_perl, connecting to Postgres
>> via DBI/DBD::Pg. Sometimes it gets in a mood, for want of a better
>> term, where a specific SQL statement fails with the good ole message
>> "server closed the connection unexpectedly". It will fail like this

The specific statement being what exactly?

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> This message is from the backend exiting abruptly. It isn't an "ERROR"
> as we define it for logging purposes. That's why there is nothing in
> the logs.

Nonetheless I'd expect there to be at least a postmaster complaint about a crashed backend --- assuming that that's what's going on. Do the other active connections get forcibly closed when this happens?

regards, tom lane
On Wed, Jul 06, 2005 at 02:32:31PM -0400, Bruce Momjian wrote:
>
> This message is from the backend exiting abruptly. It isn't an "ERROR"
> as we define it for logging purposes. That's why there is nothing in
> the logs. I recommend turning on log_statement which prints before the
> query is run.

I hadn't thought of the error that way. I do have query logging on, and if I run that query directly, it finds the data I'd expect. It's a small table, or really three of them, all small for the time being.

    select it.id, it.it_class_id, it.it_code_version_id,
           it.it_data_version, it.note, it_class.class, it_class.id,
           it_code_version.version, it_code_version.id, it_class.id,
           it_code_version.id
      from it
      join it_class on (it_class.id = it.it_class_id)
      join it_code_version on (it_code_version.id = it.it_code_version_id)
     where class = ? AND version = ? AND it_data_version > ?
On Wed, Jul 06, 2005 at 03:10:40PM -0400, Tom Lane wrote:
> > felix@crowfix.com wrote:
> >> This is a SOAP server, Apache with mod_perl, connecting to Postgres
> >> via DBI/DBD::Pg. Sometimes it gets in a mood, for want of a better
> >> term, where a specific SQL statement fails with the good ole message
> >> "server closed the connection unexpectedly". It will fail like this
>
> The specific statement being what exactly?

    select it.id, it.it_class_id, it.it_code_version_id,
           it.it_data_version, it.note, it_class.class, it_class.id,
           it_code_version.version, it_code_version.id, it_class.id,
           it_code_version.id
      from it
      join it_class on (it_class.id = it.it_class_id)
      join it_code_version on (it_code_version.id = it.it_code_version_id)
     where class = ? AND version = ? AND it_data_version > ?

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > This message is from the backend exiting abruptly. It isn't an "ERROR"
> > as we define it for logging purposes. That's why there is nothing in
> > the logs.
>
> Nonetheless I'd expect there to be at least a postmaster complaint about
> a crashed backend --- assuming that that's what's going on. Do the
> other active connections get forcibly closed when this happens?

I haven't had any others open; it's a dev system. But I'll try leaving a psql session open. Right now it's gotten itself into the mood of always working, so it might have to wait a while.

Could a corrupt db cause these mood swings? And if so, would that persist even across dropdb / createdb?
felix@crowfix.com wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > This message is from the backend exiting abruptly. It isn't an "ERROR"
> > > as we define it for logging purposes. That's why there is nothing in
> > > the logs.
> >
> > Nonetheless I'd expect there to be at least a postmaster complaint about
> > a crashed backend --- assuming that that's what's going on. Do the
> > other active connections get forcibly closed when this happens?
>
> Haven't had any others open, it's a dev system. But I'll try leaving
> a psql session open. Right now it's gotten itself into the mood of
> always working, so it might have to wait a while.
>
> Could a corrupt db cause these mood swings? And if so, would that
> persist even across dropdb / createdb?

Yes, that is possible, but usually it would fail consistently. Have you run memtest and disk diagnostics?
On Wed, Jul 06, 2005 at 05:44:44PM -0400, Bruce Momjian wrote:
> felix@crowfix.com wrote:
> >
> > Could a corrupt db cause these mood swings? And if so, would that
> > persist even across dropdb / createdb?
>
> Yes, that is possible, but usually it would fail consistently. Have you
> run memtest and disk diagnostics?

I moved the disks to a new machine and got the same problem, which doesn't rule out disk problems. We were getting a second machine ready for testing this problem, but my boss has decided to upgrade to 8.0.3 tonight for himself, and probably very soon after for the rest of us. The problem is in its working mood right now, so we will no doubt follow the general principle of changing many things at once to make tracking things down more fun :-)