Re: [HACKERS] Server crash (FailedAssertion) due to catcache refcount mis-handling - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: [HACKERS] Server crash (FailedAssertion) due to catcache refcount mis-handling
Date	August 8, 2017 18:36:17
Msg-id	4244.1502206577@sss.pgh.pa.us
In response to	[HACKERS] Server crash (FailedAssertion) due to catcache refcount mis-handling (Jeevan Chalke <jeevan.chalke@enterprisedb.com>)
Responses	Re: [HACKERS] Server crash (FailedAssertion) due to catcache refcount mis-handling
List	pgsql-hackers

Jeevan Chalke <jeevan.chalke@enterprisedb.com> writes:
> We have observed a random server crash (FailedAssertion), while running few
> tests at our end. Stack-trace is attached.

> By looking at the stack-trace, and as discussed it with my team members;
> what we have observed that in SearchCatCacheList(), we are incrementing
> refcount and then decrementing it at the end. However for some reason, if
> we are in TRY() block (where we increment the refcount), and hit with any
> interrupt, we failed to decrement the refcount due to which later we get
> assertion failure.

Hm.  So SearchCatCacheList has a PG_TRY block that is meant to release
those refcounts, but if you hit the backend with a SIGTERM while it's
in that function, control goes out through elog(FATAL) which doesn't
execute the PG_CATCH cleanup.  But it does do AbortTransaction which
calls AtEOXact_CatCache, and that is expecting that all the cache
refcounts have reached zero.
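
To make that control flow concrete, here is a toy model in plain C. It is only a sketch: every toy_* name is made up for illustration, setjmp/longjmp stands in for the PG_TRY machinery, and none of it is the actual catcache or elog code. It assumes that an ERROR jumps to the innermost catch block while a FATAL jumps straight to the exit path.

```c
#include <assert.h>
#include <setjmp.h>

/* Toy stand-ins only; none of these are PostgreSQL's real definitions. */
enum toy_level { TOY_NONE, TOY_ERROR, TOY_FATAL };

static int     toy_refcount = 0;   /* models one cache entry's refcount */
static jmp_buf toy_error_handler;  /* models the innermost PG_CATCH target */
static jmp_buf toy_proc_exit;      /* models the exit path taken by FATAL */

/* Models elog(): ERROR lands in the catch block, FATAL bypasses it. */
static void toy_elog(enum toy_level level)
{
    if (level == TOY_ERROR)
        longjmp(toy_error_handler, 1);
    if (level == TOY_FATAL)
        longjmp(toy_proc_exit, 1);
}

/* Models a function that pins an entry inside a try block. */
static void toy_search_list(enum toy_level interrupt)
{
    if (setjmp(toy_error_handler) == 0)
    {
        toy_refcount++;        /* temporary pin taken inside the "try" */
        toy_elog(interrupt);   /* an interrupt may arrive here */
    }
    else
    {
        toy_refcount--;        /* "catch" cleanup; runs for ERROR only */
        return;                /* real code would re-throw here */
    }
    toy_refcount--;            /* normal-path release */
}

/* Returns the refcount an end-of-transaction check would see. */
static int toy_run(enum toy_level interrupt)
{
    toy_refcount = 0;
    if (setjmp(toy_proc_exit) == 0)
        toy_search_list(interrupt);
    return toy_refcount;
}
```

In this model toy_run() leaves the count at zero for the normal and ERROR cases, and only the FATAL case leaves a pin behind, which is the state an end-of-transaction refcount check would complain about.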

We could respond to this by using PG_ENSURE_ERROR_CLEANUP there instead
of plain PG_TRY.  But I have an itchy feeling that there may be a lot
of places with similar issues.  Should we be revisiting the basic way
that elog(FATAL) works, to make it less unlike elog(ERROR)?
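
As a rough picture of what the PG_ENSURE_ERROR_CLEANUP approach buys, the following toy model registers the cleanup as an exit-time callback in addition to the catch block. Again this is only a sketch under assumptions: the sim_* names are hypothetical, the single callback slot is a simplification of a real callback list, and the real macro's details are not reproduced here.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Toy stand-ins only; the sim_* names are not PostgreSQL APIs. */
enum sim_level { SIM_NONE, SIM_ERROR, SIM_FATAL };

typedef void (*sim_exit_callback) (void *arg);

static int     sim_refcount = 0;
static jmp_buf sim_error_handler;             /* models the catch target */
static jmp_buf sim_proc_exit;                 /* models the FATAL exit path */
static sim_exit_callback sim_pending_cb = NULL;  /* one-slot callback list */
static void   *sim_pending_arg = NULL;

/* Cleanup shared by the catch block and the exit path. */
static void sim_release_refcount(void *arg)
{
    int *count = arg;

    (*count)--;
}

/* Models elog(): FATAL runs the registered callback before exiting. */
static void sim_elog(enum sim_level level)
{
    if (level == SIM_ERROR)
        longjmp(sim_error_handler, 1);
    if (level == SIM_FATAL)
    {
        if (sim_pending_cb != NULL)
        {
            sim_exit_callback cb = sim_pending_cb;

            sim_pending_cb = NULL;
            cb(sim_pending_arg);
        }
        longjmp(sim_proc_exit, 1);
    }
}

static void sim_search_list(enum sim_level interrupt)
{
    /* Register the cleanup so that the exit path can find it. */
    sim_pending_cb = sim_release_refcount;
    sim_pending_arg = &sim_refcount;

    if (setjmp(sim_error_handler) == 0)
    {
        sim_refcount++;
        sim_elog(interrupt);   /* in this toy, interrupts arrive only here */
    }
    else
    {
        sim_pending_cb = NULL;                /* unregister */
        sim_release_refcount(&sim_refcount);  /* ERROR-path cleanup */
        return;                               /* real code would re-throw */
    }
    sim_pending_cb = NULL;     /* unregister on the normal path */
    sim_refcount--;
}

/* Returns the refcount an end-of-transaction check would see. */
static int sim_run(enum sim_level interrupt)
{
    sim_refcount = 0;
    sim_pending_cb = NULL;
    if (setjmp(sim_proc_exit) == 0)
        sim_search_list(interrupt);
    return sim_refcount;
}
```

With the callback registered, all three paths in the model end with a zero count. The cost is that every such site has to register and unregister its own cleanup, which is why a more general change to FATAL handling might be the better answer.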

			regards, tom lane
