Thread: 7.0.3 _bt_restscan: my bits moved right off the end of the world!

7.0.3 _bt_restscan: my bits moved right off the end of the world!

From
Matthew
Date:
Postgre 7.0.3, on RedHat Linux 6.2 stock 2.2.16 kernel.  Nothing special I
can think of, this server has been up and in use for the last 128 days with
no problem.  Last night while cron was performing the nightly vacuuming of
all databases on one of our servers, I got this from cron.

Vacuuming cms
FATAL 1:  _bt_restscan: my bits moved right off the end of the world!Recreate index history_id_key.
pqReadData() -- backend closed the channel unexpectedly.This probably means the backend terminated abnormallybefore or
whileprocessing the request.
 
connection to server was lost

So this morning I did the following:

cms=# drop index history_id_key;
DROP
cms=# create unique index history_id_key on history(id);
CREATE
cms=# vacuum;
NOTICE:  Index locks_operator_id_ndx: pointer to EmptyPage (blk 44 off 2) -
fixing
NOTICE:  Index locks_operator_id_ndx: pointer to EmptyPage (blk 44 off 1) -
fixing
NOTICE:  Index locks_operator_id_ndx: pointer to EmptyPage (blk 44 off 4) -
fixing
NOTICE:  Index locks_operator_id_ndx: pointer to EmptyPage (blk 44 off 3) -
fixing
NOTICE:  Index locks_operator_id_ndx: pointer to EmptyPage (blk 44 off 5) -
fixing
NOTICE:  Index locks_id_key: pointer to EmptyPage (blk 44 off 2) - fixing
NOTICE:  Index locks_id_key: pointer to EmptyPage (blk 44 off 1) - fixing
NOTICE:  Index locks_id_key: pointer to EmptyPage (blk 44 off 4) - fixing
NOTICE:  Index locks_id_key: pointer to EmptyPage (blk 44 off 3) - fixing
NOTICE:  Index locks_id_key: pointer to EmptyPage (blk 44 off 5) - fixing
NOTICE:  Index locks_case_id_ndx: pointer to EmptyPage (blk 44 off 5) -
fixing
NOTICE:  Index locks_case_id_ndx: pointer to EmptyPage (blk 44 off 2) -
fixing
NOTICE:  Index locks_case_id_ndx: pointer to EmptyPage (blk 44 off 1) -
fixing
NOTICE:  Index locks_case_id_ndx: pointer to EmptyPage (blk 44 off 4) -
fixing
NOTICE:  Index locks_case_id_ndx: pointer to EmptyPage (blk 44 off 3) -
fixing
NOTICE:  Message from PostgreSQL backend:       The Postmaster has informed me that some other backend died
abnormally and possibly corrupted shared memory.       I have rolled back the current transaction and am going to
terminate
your database system connection and exit.       Please reconnect to the database system and repeat your query.
pqReadData() -- backend closed the channel unexpectedly.       This probably means the backend terminated abnormally
  before or while processing the request.
 
The connection to the server was lost. Attempting reset: Failed.

I have since stopped the database server and all my users are dead in the
water at the moment.  I took postgres down to single user mode and I'm doing
a vacuum and was considering doing an iccpclean.  Any other suggestions?
dump & restore? Any Idea what happened?

Thank you,

Matthew O'Connor




Re: 7.0.3 _bt_restscan: my bits moved right off the end of the world!

From
Tom Lane
Date:
Matthew <matt@ctlno.com> writes:
> [ a tale of woe ]

It looks like dropping and rebuilding *all* the indexes on your history
table would be a good move (possibly with a vacuum of the table while
the indexes are removed).  You might want to do a COPY out to try to
save the table data before the vacuum, in case there is corruption in
the table as well as the indexes.

Before you do all that, though, how big is the database?  Would you be
able/willing to tar up the whole $PGDATA tree and let some of us analyze
it?
        regards, tom lane


Re: 7.0.3 _bt_restscan: my bits moved right off the end of the world!

From
Tom Lane
Date:
Matthew <matt@ctlno.com> writes:
>     FYI now when I try to use psql to connect to the database I get this
> error:
>     bash$ psql cms
>     psql: FATAL 1:  cannot find attribute 1 of relation pg_trigger

So the indexes on pg_attribute are hosed too.  I wonder whether that was
the original source of the problem, and the rest of this is
side-effects?

I am starting to think that you'd best initdb and reload, but there is
one more thing to try: run REINDEX on the whole database in standalone
mode.  See the documentation for the procedure; I'm not too clear on it
since I've never had to do it myself.
        regards, tom lane


Re: 7.0.3 _bt_restscan: my bits moved right off the end of the world!

From
Tom Lane
Date:
Matthew <matt@ctlno.com> writes:
>     What do you mean by the whole database?  I have already executed:

>     reindex database cms force

(checks manual...)  That appears to be the right syntax.  If you did
that in a standalone backend with the appropriate command line options
(-O and -P) then I think you've done all you can.  Time for a reload :-(
        regards, tom lane