Re: Assertion failing in master, predicate.c - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: Assertion failing in master, predicate.c
Date
Msg-id c166215f-ef57-0890-2501-8ab374d2c5a9@gmail.com
Whole thread Raw
In response to Re: Assertion failing in master, predicate.c  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Assertion failing in master, predicate.c  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

On 11/21/19 8:03 PM, Tom Lane wrote:
> Mark Dilger <hornschnorter@gmail.com> writes:
>> I have winnowed down the test a bit further.  The attached
>> smaller patch still triggers the same assertion as the prior
>> patch did.
> 
> FWIW, I can reproduce the assertion failure with your first test,
> but not with this simplified one.

Thanks for checking!

> I also confirm that it only happens in HEAD, not v12.  I've not
> actually bisected, but a look at the git history for predicate.c
> sure makes it look like db2687d1f ("Optimize PredicateLockTuple")
> must be to blame.

`git bisect` shows the problem occurs earlier than that, and by
chance the first bad commit was one of yours.  I'm not surprised
that your commit was regarding LISTEN/NOTIFY, as the error is
always triggered with a LISTEN statement.  (I've now hit this
many times in many tests of multiple SQL statements, and the
last statement before the error is always a LISTEN.)

I'm cc'ing Martijn because he is mentioned in that commit.


51004c7172b5c35afac050f4d5849839a06e8d3b is the first bad commit
commit 51004c7172b5c35afac050f4d5849839a06e8d3b
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Sun Sep 22 11:46:29 2019 -0400

     Make some efficiency improvements in LISTEN/NOTIFY.

     Move the responsibility for advancing the NOTIFY queue tail pointer
     from the listener(s) to the notification sender, and only have the
     sender do it once every few queue pages, rather than after every batch
     of notifications as at present.  This reduces the number of times we
     execute asyncQueueAdvanceTail, and reduces contention when there are
     multiple listeners (since that function requires exclusive lock).
     This change relies on the observation that we don't really need the 
tail
     pointer to be exactly up-to-date.  It's certainly not necessary to
     attempt to release disk space more often than once per SLRU segment.
     The only other usage of the tail pointer is that an incoming listener,
     if it's the only listener in its database, will need to scan the queue
     forward from the tail; but that's surely a less performance-critical
     path than routine sending and receiving of notifies.  We compromise by
     advancing the tail pointer after every 4 pages of output, so that it
     shouldn't get more than a few pages behind.

     Also, when sending signals to other backends after adding notify
     message(s) to the queue, recognize that only backends in our own
     database are going to care about those messages, so only such
     backends really need to be awakened promptly.  Backends in other
     databases should get kicked if they're well behind on reading the
     queue, else they'll hold back the global tail pointer; but wakening
     them for every single message is pointless.  This change can
     substantially reduce signal traffic if listeners are spread among
     many databases.  It won't help for the common case of only a single
     active database, but the extra check costs very little.

     Martijn van Oosterhout, with some adjustments by me

     Discussion: 
https://postgr.es/m/CADWG95vtRBFDdrx1JdT1_9nhOFw48KaeTev6F_LtDQAFVpSPhA@mail.gmail.com
     Discussion: 
https://postgr.es/m/CADWG95uFj8rLM52Er80JnhRsTbb_AqPP1ANHS8XQRGbqLrU+jA@mail.gmail.com


-- 
Mark Dilger



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: backup manifests
Next
From: Robert Haas
Date:
Subject: Re: backup manifests