Thread: Backend crashes (6.5.2 linux)

Backend crashes (6.5.2 linux)

From
Michael Simms
Date:
Hi

I am seeing the backend crash on a table I am using for a
search engine, and I cannot find any answers in the list archives.

Welcome to the POSTGRESQL interactive sql monitor:
  Please read the file COPYRIGHT for copyright terms of POSTGRESQL
[PostgreSQL 6.5.2 on i686-pc-linux-gnu, compiled by gcc egcs-2.91.66]

   type \? for help on slash commands
   type \q to quit
   type \g or terminate with semicolon to execute query
 You are currently connected to the database: search
 

search=> update search_url set stale=941424005 where lowerurl='http://criswell.bizland.com';
pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.
We have lost the connection to the backend, so further processing is impossible.
  Terminating.

psql search
Welcome to the POSTGRESQL interactive sql monitor:
  Please read the file COPYRIGHT for copyright terms of POSTGRESQL
[PostgreSQL 6.5.2 on i686-pc-linux-gnu, compiled by gcc egcs-2.91.66]

   type \? for help on slash commands
   type \q to quit
   type \g or terminate with semicolon to execute query
 You are currently connected to the database: search
 

search=> select count(*) from search_url;   
count
-----
20334
(1 row)

The postmaster stderr says:

/usr/bin/postmaster: reaping dead processes...
/usr/bin/postmaster: CleanupProc: pid 9788 exited with status 0
FATAL 1:  my bits moved right off the end of the world!
/usr/bin/postmaster: reaping dead processes...
/usr/bin/postmaster: CleanupProc: pid 9792 exited with status 0

The postmaster stdout says:

StartTransactionCommand
ProcessQuery
proc_exit(0) [#0]
shmem_exit(0) [#0]
exit(0)
CommitTransactionCommand

The stdout IS moving by very quickly, so I can't be sure this message matches up
with this error, but it certainly seems to (all the rest are
CommitTransactionCommand/StartTransactionCommand/ProcessQuery).

My biggest problem is that I am using the C libraries, and PQexec()
does not return; gdb shows it is sitting in a select() inside:
#0  0xc91954e in __select ()
#1  0xc851428 in pgresStatus ()
#2  0xc84a9ea in PQgetResult ()
#3  0xc84ab77 in PQexec ()

and it hangs there forever (well, 10 minutes so far)

Now, the line I am doing an update on, I can select quite happily, and
it returns the value I expect.

Extra info:
#uname -a
Linux ewtoo.org 2.2.12 #2 SMP Fri Oct 1 21:50:14 BST 1999 i686 unknown

#free -k
             total       used       free     shared    buffers     cached
Mem:        387476     381112       6364     761628      52904     166224
-/+ buffers/cache:     161984     225492
Swap:       526168      19816     506352

Anyone else seen this?
                    ~Michael
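
[Editor's sketch, not part of the original message.] The hang described above appears to be the client blocking in select() with no timeout while it waits for a reply that never comes. One way a C client might bound that wait is to send the query asynchronously and wait on the connection's socket itself. The helper below, wait_readable (a hypothetical name), shows only the bounded wait on a plain file descriptor. Wiring it into libpq through PQsendQuery(), PQsocket(), PQconsumeInput(), and PQisBusy() is an assumption about how the client would be restructured and is not shown.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Wait until fd becomes readable or timeout_sec seconds pass.
 * Returns 1 if fd is readable, 0 on timeout, -1 on error.
 *
 * Assumption: in a libpq client, fd would be the connection's socket.
 * Here it can be any descriptor (socket, pipe, ...).
 */
int wait_readable(int fd, int timeout_sec)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE && timeout_sec >= 0);

    do {
        /* select() may modify both arguments, so reset them each time. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
        rc = select(fd + 1, &readfds, NULL, NULL, &tv);
        /* If a signal interrupts the wait, retry with the full timeout. */
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;          /* real error, see errno */
    return rc > 0 ? 1 : 0;  /* 1 = data ready, 0 = timed out */
}
```

A caller that gets 0 back can decide to give up on the connection and reconnect, instead of sitting in the wait indefinitely.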


Re: [HACKERS] Backend crashes (6.5.2 linux)

From
Tom Lane
Date:
Michael Simms <grim@argh.demon.co.uk> writes:
> The postmaster stderr says:
> FATAL 1:  my bits moved right off the end of the world!

Hmm.  That error is coming out of the btree index code.  Vadim knows
that code better than anyone else, so he might have something to say
here, but my past-midnight recollection is that we've seen that error
being triggered when there are oversize entries in the index (where
"oversize" = "more than half a disk page").  It's a bug, for sure,
but what you probably want right now is a workaround.  Do you have any
entries in indexed columns that are over 4K, and can you get rid of them?

> My biggest problem is that I am using the C libraries, and PQexec()
> does not return, gdb shows it is sitting in a select() inside
> #0  0xc91954e in __select ()
> #1  0xc851428 in pgresStatus ()
> #2  0xc84a9ea in PQgetResult ()
> #3  0xc84ab77 in PQexec ()

Huh?  PQgetResult does not call pgresStatus ... not least because the
latter is an array, not a function.  Your gdb is lying to you.  Maybe
you have a problem with gdb looking at a different version of the
library than what's actually executing?
        regards, tom lane
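
[Editor's sketch, not part of the original message.] The workaround suggested above is to keep values over 4K out of indexed columns. A client could apply that check before it inserts or updates a row. The function name key_fits_index and the constant MAX_INDEXED_KEY_BYTES are hypothetical, and the 4096 figure is simply the "over 4K" number from the message; the exact limit depends on the page size and per-entry overhead, so treat it as an approximation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: "over 4K" from the message, i.e. half of an 8K disk page. */
#define MAX_INDEXED_KEY_BYTES 4096

/*
 * Returns 1 if value is short enough to store in an indexed column
 * under the 4K rule of thumb, 0 if it is oversize.
 */
int key_fits_index(const char *value)
{
    assert(value != NULL);
    return strlen(value) <= MAX_INDEXED_KEY_BYTES;
}
```

For example, key_fits_index("http://criswell.bizland.com") returns 1, while a 5000-byte URL would be rejected and could be truncated or skipped by the caller.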


Re: [HACKERS] Backend crashes (6.5.2 linux)

From
Bruce Momjian
Date:
> Michael Simms <grim@argh.demon.co.uk> writes:
> > The postmaster stderr says:
> > FATAL 1:  my bits moved right off the end of the world!
> 

That's my favorite error message.  Can we make it print more often?
Pru-Hahaha...  :-)

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: [HACKERS] Backend crashes (6.5.2 linux)

From
Vadim Mikheev
Date:
Tom Lane wrote:
> 
> Michael Simms <grim@argh.demon.co.uk> writes:
> > The postmaster stderr says:
> > FATAL 1:  my bits moved right off the end of the world!
> 
> Hmm.  That error is coming out of the btree index code.  Vadim knows
> that code better than anyone else, so he might have something to say
> here, but my past-midnight recollection is that we've seen that error
> being triggered when there are oversize entries in the index (where
> "oversize" = "more than half a disk page").  It's a bug, for sure,
> but what you probably want right now is a workaround.  Do you have any
> entries in indexed columns that are over 4K, and can you get rid of them?

This FATAL means that the index is broken (some previous insertion
was interrupted by elog(ERROR) or a backend crash) - try to rebuild it.
WAL should fix this bug.

Vadim