Re: catalog corruption bug - Mailing list pgsql-hackers

From Jeremy Drake
Subject Re: catalog corruption bug
Date
Msg-id Pine.LNX.4.63.0601071650370.15097@garibaldi.apptechsys.com
In response to Re: catalog corruption bug  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: catalog corruption bug
List pgsql-hackers
On Sat, 7 Jan 2006, Tom Lane wrote:

> Jeremy Drake <pgsql@jdrake.com> writes:
> > On Sat, 7 Jan 2006, Tom Lane wrote:
> >> I'll go fix CatCacheRemoveCList, but I think this is not the bug
> >> we're looking for.
>
> A bit of a leap in the dark, but: maybe the triggering event for this
> situation is not a "VACUUM pg_amop" but a global cache reset due to
> sinval message buffer overrun.  It's fairly clear how that would lead
> to the CatCacheRemoveCList bug.  The duplicate-key failure could be an
> unrelated bug triggered by the same condition.  I have no idea yet what
> the mechanism could be, but cache reset is a sufficiently seldom-exercised
> code path that it's entirely plausible that there are bugs lurking in it.
>
> If this is correct then we could vastly increase the probability of
> seeing the bug by setting up something to force cache resets at a high
> rate.  If you're interested I could put together a code patch for that.

I ran the function you sent alongside my other code.  The run died, but
not in the same way: none of my processes hit the unique constraint error,
but two failed during commit.  Both of them died in the same place as the
last one, on pg_amop.

This time I am going to run without the function and see whether it
reproduces the duplicate type error and whether it generates two cores.




-- 
To kick or not to kick... -- Somewhere on IRC, inspired by Shakespeare

