Thread: still getting FATAL errors on btree's...

still getting FATAL errors on btree's...

From
The Hermit Hacker
Date:
I'm still getting the following BTP_CHAIN errors on my btree index.  Funny
thing is that it's the *same* index each time, and this is after dropping
and rebuilding it...

...where next to investigate?  Recommendations?  IMHO, this is critical
enough to hold off a v6.3.2 release :(


FATAL 1:  btree: BTP_CHAIN flag was expected in radhist_userid (access =
bt_read)
FATAL 1:  btree: BTP_CHAIN flag was expected in radhist_userid (access =
bt_read)
FATAL 1:  btree: BTP_CHAIN flag was expected in radhist_userid (access =
bt_read)
FATAL 1:  btree: BTP_CHAIN flag was expected in radhist_userid (access =
bt_read)
FATAL 1:  btree: BTP_CHAIN flag was expected in radhist_userid (access =
bt_read)



Re: [HACKERS] still getting FATAL errors on btree's...

From
Bruce Momjian
Date:
>
>
> I'm still getting the following BTP_CHAIN errors on my btree index.  Funny
> thing is that it's the *same* index each time, and this is after dropping
> and rebuilding it...
>
> ...where next to investigate?  Recommendations?  IMHO, this is critical
> enough to hold off a v6.3.2 release :(

Obviously there is something strange going on, or many more people would
be seeing it.  The question is what.

Could it be the data?  Concentrate on that table, load only half, and
see if it happens.  Try loading the first half of the file twice, so the
file is the same size, but the data is only from the first half.  Try it
with the second half too.  Does the problem change?  If so, there is
something in the data that is causing the problem.

Is it something that we can repeat?  Can you put it on the ftp server
with a script so others can check it?  If you load just that table into
an empty database, does the problem still occur?
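
The half-and-half test described above could be prepared with something
like the following. This is only a rough sketch: it assumes the table has
been dumped to a plain text file with one row per line and no header or
terminator line, and the function name and output file names are made up
for illustration.

```python
from pathlib import Path


def write_doubled_halves(dump_path, out_dir):
    """Split a one-row-per-line dump in half and write each half twice."""
    # Read as bytes so unusual characters in the data pass through untouched.
    rows = Path(dump_path).read_bytes().splitlines()
    mid = len(rows) // 2
    halves = {"first": rows[:mid], "second": rows[mid:]}

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = {}
    for name, half in halves.items():
        # Hypothetical file names; pick whatever suits the load script.
        target = out / f"radhist_{name}_x2.txt"
        # Writing the half twice keeps the row count close to the original,
        # so only the content of the data changes, not its size.
        target.write_bytes(b"".join(row + b"\n" for row in half + half))
        written[name] = target
    return written
```

Loading each output file into an empty copy of the table and rebuilding
the index would then show whether the error follows one half of the data.
If it does, the same function can be run again on that half.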


>
>
> FATAL 1:  btree: BTP_CHAIN flag was expected in radhist_userid (access =
> bt_read)
> FATAL 1:  btree: BTP_CHAIN flag was expected in radhist_userid (access =
> bt_read)
> FATAL 1:  btree: BTP_CHAIN flag was expected in radhist_userid (access =
> bt_read)
> FATAL 1:  btree: BTP_CHAIN flag was expected in radhist_userid (access =
> bt_read)
> FATAL 1:  btree: BTP_CHAIN flag was expected in radhist_userid (access =
> bt_read)
>
>
>
>


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] still getting FATAL errors on btree's...

From
The Hermit Hacker
Date:
On Mon, 13 Apr 1998, Bruce Momjian wrote:

> >
> >
> > I'm still getting the following BTP_CHAIN errors on my btree index.  Funny
> > thing is that it's the *same* index each time, and this is after dropping
> > and rebuilding it...
> >
> > ...where next to investigate?  Recommendations?  IMHO, this is critical
> > enough to hold off a v6.3.2 release :(
>
> Obviously there is something strange going on, or many more people would
> be seeing it.  The question is what.
>
> Could it be the data?  Concentrate on that table, load only half, and
> see if it happens.  Try loading the first half of the file twice, so the
> file is the same size, but the data is only from the first half.  Try it
> with the second half too.  Does the problem change?  If so, there is
> something in the data that is causing the problem.

    Well, I kinda figured it had something to do with the data, but
narrowing it down (500+k records) is something that isn't that easy :(

    I know it's the radhist_userid index, which is indexed on one
field, userid...if there was some way of translating a location in the index
into a record number...?

    Oh well...will continue to investigate and use your ideas...
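
Short of mapping index pages back to rows, one cheap way to look at the
indexed field is to count how many rows share each userid in a dump of
the table. Nothing in this thread establishes that duplicate keys are the
cause, so treat this as one guess to test. The sketch assumes a
tab-delimited dump with userid in a known column; the function name, the
column number and the delimiter are all assumptions.

```python
from collections import Counter


def top_userids(dump_path, column=0, delimiter=b"\t", limit=10):
    """Return the most frequent values of one column in a delimited dump."""
    counts = Counter()
    with open(dump_path, "rb") as dump:
        for row in dump:
            fields = row.rstrip(b"\r\n").split(delimiter)
            # Skip short or blank lines rather than failing on them.
            if column < len(fields):
                counts[fields[column]] += 1
    # List of (value, row count) pairs, most frequent first.
    return counts.most_common(limit)
```

If a handful of userid values account for a large share of the 500+k
records, loading the table with and without those rows is a narrower test
than splitting the file in half.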



Re: [HACKERS] still getting FATAL errors on btree's...

From
The Hermit Hacker
Date:
On Mon, 13 Apr 1998, The Hermit Hacker wrote:

> On Mon, 13 Apr 1998, Bruce Momjian wrote:
>
> > >
> > >
> > > > I'm still getting the following BTP_CHAIN errors on my btree index.  Funny
> > > > thing is that it's the *same* index each time, and this is after dropping
> > > > and rebuilding it...
> > >
> > > ...where next to investigate?  Recommendations?  IMHO, this is critical
> > > enough to hold off a v6.3.2 release :(
> >
> > > Obviously there is something strange going on, or many more people would
> > be seeing it.  The question is what.
> >
> > > Could it be the data?  Concentrate on that table, load only half, and
> > > see if it happens.  Try loading the first half of the file twice, so the
> > > file is the same size, but the data is only from the first half.  Try it
> > > with the second half too.  Does the problem change?  If so, there is
> > > something in the data that is causing the problem.
>
>     Well, I kinda figured it had something to do with the data, but
> narrowing it down (500+k records) is something that isn't that easy :(
>
>     I know it's the radhist_userid index, which is indexed on one
> field, userid...if there was some way of translating a location in the index
> into a record number...?
>
>     Oh well...will continue to investigate and use your ideas...

This is very quickly going downhill ;(

I took all entries in radhist newer than 01/01/98 and copied them into
radnew, then deleted those entries (first bad move), then I did an 'alter
table' to move radhist to radhist_old, and another 'alter table' to move
radnew back to radhist...

Totally locked up postmaster, so I had to kill off the processes (second
bad move)...

 ls -lt rad*
-rw-------  1 postgres  wheel  77144064 Apr 13 13:10 radhist_old
-rw-------  1 postgres  wheel   3842048 Apr 13 13:08 radlog
-rw-------  1 postgres  wheel   1073152 Apr 13 13:07 radlog_userid
-rw-------  1 postgres  wheel   1646592 Apr 13 13:07 radlog_uniq_id
-rw-------  1 postgres  wheel    999424 Apr 13 13:07 radlog_stop_time
-rw-------  1 postgres  wheel   1294336 Apr 13 13:07 radlog_start_time
-rw-------  1 postgres  wheel  36921344 Apr 13 12:55 radhist
-rw-------  1 postgres  wheel   6864896 Apr  6 10:14 radold


Now, I can't access radhist, even though the database is there:

acctng=> select * from radhist;
ERROR:  radhist: Table does not exist.

Checked the pg_class table, and radnew still existed, but radhist didn't,
so did the following to "fix" it...

update pg_class set relname = 'radhist' where relname = 'radnew';

Any particular reason why that was a bad idea?  It appears to have
worked...




Re: [HACKERS] still getting FATAL errors on btree's...

From
Bruce Momjian
Date:
> Checked the pg_class table, and radnew still existed, but radhist didn't,
> so did the following to "fix" it...
>
> update pg_class set relname = 'radhist' where relname = 'radnew';
>
> Any particular reason why that was a bad idea?  It appears to have
> worked...

I believe this is what alter table does.


Re: [HACKERS] still getting FATAL errors on btree's...

From
The Hermit Hacker
Date:
On Mon, 13 Apr 1998, Bruce Momjian wrote:

> > Checked the pg_class table, and radnew still existed, but radhist didn't,
> > so did the following to "fix" it...
> >
> > update pg_class set relname = 'radhist' where relname = 'radnew';
> >
> > Any particular reason why that was a bad idea?  It appears to have
> > worked...
>
> I believe this is what alter table does.

    That's what I think too...I was just worried that it might do
something else on top of it all:(



Re: [HACKERS] still getting FATAL errors on btree's...

From
Maarten Boekhold
Date:
On Mon, 13 Apr 1998, Bruce Momjian wrote:

> >
> >
> > I'm still getting the following BTP_CHAIN errors on my btree index.  Funny
> > thing is that it's the *same* index each time, and this is after dropping
> > and rebuilding it...
> >
> > ...where next to investigate?  Recommendations?  IMHO, this is critical
> > enough to hold off a v6.3.2 release :(
>
> Obviously there is something strange going on, or many more people would
> be seeing it.  The question is what.

I have seen the same message on a 6.2 system. However, after I had
dropped and rebuilt the indices, the problem disappeared completely and I
haven't seen it since.

Maarten

_____________________________________________________________________________
| TU Delft, The Netherlands, Faculty of Information Technology and Systems  |
|                   Department of Electrical Engineering                    |
|           Computer Architecture and Digital Technique section             |
|                          M.Boekhold@et.tudelft.nl                         |
-----------------------------------------------------------------------------