Thread: 7.3.2 closing connections, sometimes

From: felix@crowfix.com

I hate to post as vague a description as this, but I don't think the
devil is in the details this time.  I may be wrong ...

This project is running 7.3.2 on a RedHat 9 system.  We plan to
upgrade in a few weeks to Fedora Core and Postgres 8, so maybe this
problem is not worth wasting too much time on, right now.

This is a SOAP server, Apache with mod_perl, connecting to Postgres
via DBI/DBD::Pg.  Sometimes it gets in a mood, for want of a better
term, where a specific SQL statement fails with the good ole message
"server closed the connection unexpectedly".  It will fail like this
for several hours, then suddenly start working again.  The SQL that it
fails on works perfectly in psql; it always returns the exact data
expected.  It's a small table of perhaps a dozen lines, and does not
change very often.  I would suspect hardware except that a new machine
behaves just the same.

One of the puzzles is that nothing shows up in the log.  The log is
configured thusly:

    server_min_messages = notice
    client_min_messages = notice
    log_min_error_statement = error

And yet the only messages that show up are start and stop.  I
changed log_min_error_statement to notice like the others, and it
hasn't failed since, but I doubt this is the cause, because it has
gone thru these mood swings before without my having changed the log
level.  It's not the SOAP server disconnecting from the SOAP client,
because the server continues to log things.

Did 7.3.2 have any problems that might cause random disconnects, or
disconnects for some obscure but documented reason?  Google found some,
but none of them apply here, as far as I can tell.

Or are there useful changes to logging that might track this down?
Strace generated about 50MB of log file, too much for me!

--
            ... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
     Felix Finch: scarecrow repairman & rocket surgeon / felix@crowfix.com
  GPG = E987 4493 C860 246C 3B1E  6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o
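
One way to make a 50MB strace log tractable is to keep only the lines that report signals, process exits, and socket errors. The following is a minimal Python sketch, not a definitive tool. The names interesting_lines, EVENT, and SOCKET_ERRORS are hypothetical, and the patterns assume strace's usual "--- SIG... ---" and "+++ ... +++" markers, optionally preceded by a PID (from -f) and a timestamp (from -t or -tt).

```python
import re
from typing import Iterable, Iterator

# Assumed strace markers: "--- SIGSEGV ... ---" for a delivered signal and
# "+++ killed by SIGSEGV +++" or "+++ exited with 0 +++" for process exit.
# A leading PID ("1234 " or "[pid 1234] ") and a timestamp are both optional.
EVENT = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(?:[\d:.]+\s+)?(?:---|\+\+\+)\s"
)

# errno names that typically accompany a dropped socket.
SOCKET_ERRORS = ("ECONNRESET", "EPIPE")


def interesting_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only signal, exit, and socket-error lines from strace output."""
    for line in lines:
        if EVENT.match(line) or any(err in line for err in SOCKET_ERRORS):
            yield line.rstrip("\n")
```

Passing an open file object for the trace to interesting_lines and printing each result should reduce the output to a handful of lines, which may show whether a backend was killed by a signal or simply exited.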

Re: 7.3.2 closing connections, sometimes

From: Bruce Momjian

felix@crowfix.com wrote:
> I hate to post as vague a description as this, but I don't think the
> devil is in the details this time.  I may be wrong ...
>
> This project is running 7.3.2 on a RedHat 9 system.  We plan to
> upgrade in a few weeks to Fedora Core and Postgres 8, so maybe this
> problem is not worth wasting too much time on, right now.
>
> This is a SOAP server, Apache with mod_perl, connecting to Postgres
> via DBI/DBD::Pg.  Sometimes it gets in a mood, for want of a better
> term, where a specific SQL statement fails with the good ole message
> "server closed the connection unexpectedly".  It will fail like this

This message is from the backend exiting abruptly.  It isn't an "ERROR"
as we define it for logging purposes.  That's why there is nothing in
the logs.  I recommend turning on log_statement which prints before the
query is run.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: 7.3.2 closing connections, sometimes

From: Tom Lane

> felix@crowfix.com wrote:
>> This is a SOAP server, Apache with mod_perl, connecting to Postgres
>> via DBI/DBD::Pg.  Sometimes it gets in a mood, for want of a better
>> term, where a specific SQL statement fails with the good ole message
>> "server closed the connection unexpectedly".  It will fail like this

The specific statement being what exactly?

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> This message is from the backend exiting abruptly.  It isn't an "ERROR"
> as we define it for logging purposes.  That's why there is nothing in
> the logs.

Nonetheless I'd expect there to be at least a postmaster complaint about
a crashed backend --- assuming that that's what's going on.  Do the
other active connections get forcibly closed when this happens?

            regards, tom lane
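
The postmaster complaint mentioned above can be searched for mechanically. This is a minimal Python sketch under stated assumptions: the function name find_backend_exits and the CRASH pattern are hypothetical, and the pattern assumes the postmaster reports a dead backend with wording like "server process (pid N) was terminated by signal N" or "exited with exit code N". The exact text differs between PostgreSQL versions, so the pattern may need adjusting for a 7.3 log.

```python
import re
from typing import Iterable, List, Tuple

# Assumed wording of the postmaster's report about a dead child process,
# e.g. "LOG:  server process (pid 4242) was terminated by signal 11".
# Adjust the pattern if your server version words it differently.
CRASH = re.compile(
    r"server process \(pid (\d+)\) "
    r"(was terminated by signal \d+|exited with (?:exit code )?\d+)"
)


def find_backend_exits(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (pid, reason) for each backend exit reported in the log."""
    hits = []
    for line in lines:
        m = CRASH.search(line)
        if m:
            hits.append((int(m.group(1)), m.group(2)))
    return hits
```

An empty result while clients still see "server closed the connection unexpectedly" would suggest that the backend is not crashing in a way the postmaster notices, which narrows the search.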

Re: 7.3.2 closing connections, sometimes

From: felix@crowfix.com

On Wed, Jul 06, 2005 at 02:32:31PM -0400, Bruce Momjian wrote:
>
> This message is from the backend exiting abruptly.  It isn't an "ERROR"
> as we define it for logging purposes.  That's why there is nothing in
> the logs.  I recommend turning on log_statement which prints before the
> query is run.

I hadn't thought of the error that way.  I do have query logging on,
and if I run that query directly, it finds the data I'd expect.  It's
a small table, or really three of them, all small for the time being.

    select it.id, it.it_class_id, it.it_code_version_id,
    it.it_data_version, it.note, it_class.class, it_class.id,
    it_code_version.version, it_code_version.id, it_class.id,
    it_code_version.id from it join it_class on (it_class.id =
    it.it_class_id) join it_code_version on (it_code_version.id =
    it.it_code_version_id) where class = ? AND version = ? AND
    it_data_version > ?

Re: 7.3.2 closing connections, sometimes

From: felix@crowfix.com

On Wed, Jul 06, 2005 at 03:10:40PM -0400, Tom Lane wrote:
> > felix@crowfix.com wrote:
> >> This is a SOAP server, Apache with mod_perl, connecting to Postgres
> >> via DBI/DBD::Pg.  Sometimes it gets in a mood, for want of a better
> >> term, where a specific SQL statement fails with the good ole message
> >> "server closed the connection unexpectedly".  It will fail like this
>
> The specific statement being what exactly?

    select it.id, it.it_class_id, it.it_code_version_id,
    it.it_data_version, it.note, it_class.class, it_class.id,
    it_code_version.version, it_code_version.id, it_class.id,
    it_code_version.id from it join it_class on (it_class.id =
    it.it_class_id) join it_code_version on (it_code_version.id =
    it.it_code_version_id) where class = ? AND version = ? AND
    it_data_version > ?

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > This message is from the backend exiting abruptly.  It isn't an "ERROR"
> > as we define it for logging purposes.  That's why there is nothing in
> > the logs.
>
> Nonetheless I'd expect there to be at least a postmaster complaint about
> a crashed backend --- assuming that that's what's going on.  Do the
> other active connections get forcibly closed when this happens?

Haven't had any others open, it's a dev system.  But I'll try leaving
a psql session open.  Right now it's gotten itself into the mood of
always working, so it might have to wait a while.

Could a corrupt db cause these mood swings?  And if so, would that
persist even across dropdb / createdb?

Re: 7.3.2 closing connections, sometimes

From: Bruce Momjian

felix@crowfix.com wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > This message is from the backend exiting abruptly.  It isn't an "ERROR"
> > > as we define it for logging purposes.  That's why there is nothing in
> > > the logs.
> >
> > Nonetheless I'd expect there to be at least a postmaster complaint about
> > a crashed backend --- assuming that that's what's going on.  Do the
> > other active connections get forcibly closed when this happens?
>
> Haven't had any others open, it's a dev system.  But I'll try leaving
> a psql session open.  Right now it's gotten itself into the mood of
> always working, so it might have to wait a while.
>
> Could a corrupt db cause these mood swings?  And if so, would that
> persist even across dropdb / createdb?

Yes, that is possible, but usually it would fail consistently.  Have you
run memtest and disk diagnostics?

Re: 7.3.2 closing connections, sometimes

From: felix@crowfix.com

On Wed, Jul 06, 2005 at 05:44:44PM -0400, Bruce Momjian wrote:
> felix@crowfix.com wrote:
> >
> > Could a corrupt db cause these mood swings?  And if so, would that
> > persist even across dropdb / createdb?
>
> Yes, that is possible, but usually it would fail consistently.  Have you
> run memtest and disk diagnostics?

I moved the disks to a new machine, same problem, which doesn't rule
out disk problems.  We were getting a second machine ready for testing
this problem, but my boss has decided to upgrade to 8.0.3 tonight for
himself, and probably very soon after for the rest of us, and the
problem is in the work mood right now, so we will no doubt follow the
general principle of changing many things at once to make tracking
things down more fun :-)
