Re: catalog corruption bug - Mailing list pgsql-hackers

From Jeremy Drake
Subject Re: catalog corruption bug
Msg-id Pine.LNX.4.63.0601071106090.15097@garibaldi.apptechsys.com
In response to Re: catalog corruption bug  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: catalog corruption bug
List pgsql-hackers
On Sat, 7 Jan 2006, Tom Lane wrote:

> Jeremy Drake <pgsql@jdrake.com> writes:
> > Am I correct in interpreting this as the hash opclass for Oid?
>
> However, AFAICS the only consequence of this bug is to trigger
> that Assert failure if you've got Asserts enabled.  Dead catcache
> entries aren't actually harmful except for wasting some space.
> So I don't think this is related to your pg_type duplicate key
> problem.
>
> One weak spot in this theory is the assumption that somebody was
> vacuuming pg_amop.  It seems unlikely that autovacuum would do so
> since the table never changes (unless you had reached the point
> where an anti-XID-wraparound vacuum was needed, which is unlikely
> in itself).  Do you have any background processes that do full-database
> VACUUMs?

No.  Just the autovacuum, which is actually the process which had the
assert failure.

This appears to give the current xid:
(gdb) p *s
$10 = {transactionId = 13568516, subTransactionId = 1, name = 0x0, savepointLevel = 0, state = TRANS_COMMIT,
  blockState = TBLOCK_STARTED, nestingLevel = 1, curTransactionContext = 0x9529c0, curTransactionOwner = 0x92eb40,
  childXids = 0x0, currentUser = 0, prevXactReadOnly = 0 '\0', parent = 0x0}

>
> I'll go fix CatCacheRemoveCList, but I think this is not the bug
> we're looking for.

Incidentally, one of my processes did get that error at the same time.
All of the other processes had an error:
DBD::Pg::st execute failed: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

But this one had the DBD::Pg::st execute failed: ERROR:  duplicate key
violates unique constraint "pg_type_typname_nsp_index"

It looks like my kernel did not have the option to append the pid to core
files, so perhaps they both croaked at the same time but only this one got
to write a core file?

I will enable this and try again, see if I can't get it to make 2 cores.
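
For anyone following along: the message does not name the kernel option, so the following is only a sketch under the assumption that it is the Linux sysctl kernel.core_uses_pid, exposed as /proc/sys/kernel/core_uses_pid. The helper name core_uses_pid() is hypothetical. It only reports whether the setting is on; it does not change it.

```python
from pathlib import Path
from typing import Optional


def core_uses_pid(path: str = "/proc/sys/kernel/core_uses_pid") -> Optional[bool]:
    """Report whether the kernel appends the PID to core file names.

    Assumption: the option discussed is Linux's kernel.core_uses_pid.
    Returns True or False when the file can be read and parsed,
    and None when it is missing or unreadable (e.g. not on Linux).
    """
    try:
        text = Path(path).read_text().strip()
    except OSError:
        return None
    try:
        # Any non-zero value means core files are written as core.<pid>.
        return int(text) != 0
    except ValueError:
        return None


if __name__ == "__main__":
    print("core_uses_pid:", core_uses_pid())
```

If this reports False, two backends crashing in the same directory would both write a file named "core", and only one would survive, which would be consistent with finding a single core file.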

BTW, nothing of any interest made it into the backend log regarding what
assert(s) failed.


