Re: [HACKERS] [PATCH] Assert that the correct locks are held whencalling PageGetLSN() - Mailing list pgsql-hackers

From Jacob Champion
Subject Re: [HACKERS] [PATCH] Assert that the correct locks are held whencalling PageGetLSN()
Date
Msg-id CABAq_6G3TGnBLFRrMXsO8Z_p8eF1atjTj0J4QVG+TQ9c0_S-Tw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [PATCH] Assert that the correct locks are held whencalling PageGetLSN()  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
Hi Michael,

On Mon, Nov 6, 2017 at 6:18 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> Did you really test WAL replay?

Is there a way to test this other than installcheck-world? The only
failure we've run into at the moment is in the snapshot-too-old tests.
Maybe we're not configuring with all the tests enabled?

>> The assertion added caught at least one code path where TestForOldSnapshot
>> calls PageGetLSN without holding any lock.  The snapshot_too_old test in
>> "check-world" failed due to the assertion failure.  This needs to be fixed,
>> see the open question in the opening mail on this thread.
>
> Good point. I am looping Kevin Grittner here for his input, as the
> author and committer of old_snapshot_threshold. Things can be
> addressed with a separate patch by roughly moving the check on
> PageGetLSN() to TestForOldSnapshot_impl() and use BufferGetLSNAtomic()
> instead.

It still doesn't pass the tests, as not all code paths ensure that a
content lock is held before calling TestForOldSnapshot.
BufferGetLSNAtomic only adds the spinlock.

> The commit fest has lost view of this entry, and so did I. So I have
> added a new entry:
> https://commitfest.postgresql.org/16/1371/

Thank you!

> BufferGetLSNAtomic() could really use LWLockHeldByMe(). Could you
> consider it with an extra patch on top of 0001?

The LWLockHeldByMe() calls are added to BufferGetLSNAtomic in patch
0002 (because it does all its work through PageGetLSN).

> It seems to me that 0001 is good for a committer lookup, that will get
> rid of all existing bugs. For 0002, what you are proposing is still
> not a good idea for anything using page copies.

I think there is still significant confusion here. To be clear: this
patch is intended to add no changes for local page copies. As I've
tried to say (three or four times now): to our understanding, local
page copies are not allocated in the shared BufferBlocks region and
are therefore not subject to the new assertions. Am I missing a corner
case, or completely misunderstanding your point? I never got a direct
response to my earlier questions on this.

--Jacob


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: [HACKERS] Remove secondary checkpoint
Next
From: Andres Freund
Date:
Subject: Re: [HACKERS] [POC] Faster processing at Gather node