Re: [HACKERS] still getting FATAL errors on btree's... - Mailing list pgsql-hackers

From The Hermit Hacker
Subject Re: [HACKERS] still getting FATAL errors on btree's...
Msg-id Pine.NEB.3.95.980413131420.22892O-100000@hub.org
In response to Re: [HACKERS] still getting FATAL errors on btree's...  (The Hermit Hacker <scrappy@hub.org>)
Responses Re: [HACKERS] still getting FATAL errors on btree's...
List pgsql-hackers
On Mon, 13 Apr 1998, The Hermit Hacker wrote:

> On Mon, 13 Apr 1998, Bruce Momjian wrote:
>
> > >
> > >
> > > I'm still getting the following BTP_CHAIN errors on my btree index.  Funny
> > > thing is that it's the *same* index each time, and this is after dropping
> > > and rebuilding it...
> > >
> > > ...where next to investigate?  Recommendations?  IMHO, this is critical
> > > enough to hold off a v6.3.2 release :(
> >
> > Obviously there is something strange going on, or many more people would
> > be seeing it.  The question is what.
> >
> > Could it be the data?  Concentrate on that table, load only half, and
> > see if it happens.  Try loading the first half of the file twice, so the
> > file is the same size, but the data is only from the first half.  Try it
> > with the second half too.  Does the problem change?  If so, there is
> > something in the data that is causing the problem.
>
>     Well, I kinda figured it had something to do with the data, but
> narrowing it down (500+k records) is something that isn't that easy :(
>
>     I know it's the radhist_userid index, which is indexed on one
> field, userid...if there was some way of translating a location in the
> index into a record number...?
>
>     Oh well...will continue to investigate and use your ideas...

This is very quickly going downhill ;(

I took all entries in radhist newer than 01/01/98 and copied them into
radnew, then deleted those entries (first bad move), then I did an 'alter
table' to move radhist to radhist_old, and another 'alter table' to move
radnew back to radhist...
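
For reference, the sequence described above would look roughly like the
following.  This is only a sketch: the post does not say which column holds
the entry date, so "stop_time" is an assumed, hypothetical column name, and
the exact statements that were run are not shown in the original.

```sql
-- Sketch only; "stop_time" is a hypothetical column name.
-- 1. Copy the entries newer than 01/01/98 into a new table.
SELECT * INTO TABLE radnew FROM radhist WHERE stop_time > '01/01/98';

-- 2. Delete those entries from the original table (the "first bad move").
DELETE FROM radhist WHERE stop_time > '01/01/98';

-- 3. Move the old table aside, then move the new one into its place.
ALTER TABLE radhist RENAME TO radhist_old;
ALTER TABLE radnew RENAME TO radhist;
```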

Totally locked up postmaster, so I had to kill off the processes (second
bad move)...

 ls -lt rad*
-rw-------  1 postgres  wheel  77144064 Apr 13 13:10 radhist_old
-rw-------  1 postgres  wheel   3842048 Apr 13 13:08 radlog
-rw-------  1 postgres  wheel   1073152 Apr 13 13:07 radlog_userid
-rw-------  1 postgres  wheel   1646592 Apr 13 13:07 radlog_uniq_id
-rw-------  1 postgres  wheel    999424 Apr 13 13:07 radlog_stop_time
-rw-------  1 postgres  wheel   1294336 Apr 13 13:07 radlog_start_time
-rw-------  1 postgres  wheel  36921344 Apr 13 12:55 radhist
-rw-------  1 postgres  wheel   6864896 Apr  6 10:14 radold


Now, I can't access radhist, even though the database is there:

acctng=> select * from radhist;
ERROR:  radhist: Table does not exist.

Checked the pg_class table, and radnew still existed, but radhist didn't,
so did the following to "fix" it...

update pg_class set relname = 'radhist' where relname = 'radnew';

Any particular reason why that was a bad idea?  It appears to have
worked...