Strange failure in LWLock on skink in REL9_5_STABLE - Mailing list pgsql-hackers

From Thomas Munro
Subject Strange failure in LWLock on skink in REL9_5_STABLE
Msg-id CAEepm=0vLh5oX2Ve+wS5BiauEoRA2j5R3G7CRwBRxYzc+9zg5g@mail.gmail.com
List pgsql-hackers
Hello,

Andres pinged me off-list to point out this failure after my commit fb389498be:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2018-09-20%2005%3A24%3A34

Change Set for this build:
fb389498be Tue Sep 18 11:19:22 2018 UTC  Allow DSM allocation to be interrupted.

The failure looks like this:

! FATAL:  semop(id=332464133) failed: Invalid argument
! CONTEXT:  SQL statement "CREATE TEMP TABLE brin_result (cid tid)"
! PL/pgSQL function inline_code_block line 22 at SQL statement
! PANIC:  queueing for lock while waiting on another one
! server closed the connection unexpectedly
! This probably means the server terminated abnormally
! before or while processing the request.
! connection to server was lost

I don't immediately see any connection between that particular commit,
which relates to the treatment of signals while allocating a DSM
segment, and the location of the first failure, which is in a
statement that is creating a temporary table.  On the other hand, skink
has been very stable lately.  I'm also not sure how the FATAL error
and the PANIC are related (the PANIC means LWLockQueueSelf() has
discovered that MyProc->lwWaiting is already set).  It's possible that
the root problem was something happening in one of the other parallel
tests running at the time (lock security_label tablesample
object_address rowsecurity collate spgist privileges matview
replica_identity brin gin gist groupingsets), but I don't see how any
of those would reach code touched by that commit in 9.5.  I don't
currently have any other ideas about what happened here.
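
For readers less familiar with the check behind the PANIC, here is a
toy model of it.  This is only a sketch: ToyProc, QueueResult and the
toy_* functions are hypothetical stand-ins made up for illustration,
not PostgreSQL code, and the model says nothing about how the flag
came to be set in this particular failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the one per-backend field that matters here. */
typedef struct ToyProc
{
    bool lwWaiting;     /* true while queued on some lock's wait list */
} ToyProc;

typedef enum
{
    QUEUE_OK,
    QUEUE_PANIC         /* the real server would PANIC at this point */
} QueueResult;

/*
 * Model of the guard: a backend may sit on at most one wait queue, so
 * finding the flag already set means an earlier wait was never finished.
 */
static QueueResult
toy_queue_self(ToyProc *proc)
{
    if (proc->lwWaiting)
    {
        fprintf(stderr,
                "PANIC:  queueing for lock while waiting on another one\n");
        return QUEUE_PANIC;
    }
    proc->lwWaiting = true;
    return QUEUE_OK;
}

/* Model of the normal wake-up path, which clears the flag again. */
static void
toy_wakeup(ToyProc *proc)
{
    assert(proc->lwWaiting);    /* waking a non-waiter would be a bug */
    proc->lwWaiting = false;
}
```

In this model, queue / wake-up / queue succeeds, while two calls to
toy_queue_self() with no toy_wakeup() in between report QUEUE_PANIC.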

-- 
Thomas Munro
http://www.enterprisedb.com
