Re: Get rid of WALBufMappingLock - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Get rid of WALBufMappingLock
Date
Msg-id CAPpHfds==kM3YJoTX64rV-CVsan8ARk3w32tgdCZ69BzD01i_g@mail.gmail.com
Whole thread Raw
In response to Re: Get rid of WALBufMappingLock  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
Hi!

On Fri, Feb 14, 2025 at 4:11 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> 14.02.2025 17:09, Yura Sokolov пишет:
> > 14.02.2025 13:24, Alexander Korotkov пишет:
> >> On Fri, Feb 14, 2025 at 11:45 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> >>> On Fri, 14 Feb 2025 at 00:59, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >>>> On Thu, Feb 13, 2025 at 6:39 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> >>>>> On Thu, 13 Feb 2025 at 14:08, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >>>>>>
> >>>>>> On Thu, Feb 13, 2025 at 11:45 AM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> >>>>>>> 13.02.2025 12:34, Alexander Korotkov пишет:
> >>>>>>>> On Wed, Feb 12, 2025 at 8:16 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> >>>>>>>>> 08.02.2025 13:07, Alexander Korotkov пишет:
> >>>>>>>>>> On Fri, Feb 7, 2025 at 1:39 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >>>>>>>>>>> Good, thank you.  I think 0001 patch is generally good, but needs some
> >>>>>>>>>>> further polishing, e.g. more comments explaining how does it work.
> >>>>>>>>>
> >>>>>>>>> I tried to add more comments. I'm not good at, so recommendations are welcome.
> >>>>>>>>>
> >>>>>>>>>> Two things comes to my mind worth rechecking about 0001.
> >>>>>>>>>> 1) Are XLogCtl->InitializeReserved, XLogCtl->InitializedUpTo and
> >>>>>>>>>> XLogCtl->xlblocks always page-aligned?  Because algorithm seems to be
> >>>>>>>>>> sensitive to that.  If so, I would propose to explicitly comment that
> >>>>>>>>>> and add corresponding asserts.
> >>>>>>>>>
> >>>>>>>>> They're certainly page aligned, since they are page borders.
> >>>>>>>>> I added assert on alignment of InitializeReserved for the sanity.
> >>>>>>>>>
> >>>>>>>>>> 2) Check if there are concurrency issues between
> >>>>>>>>>> AdvanceXLInsertBuffer() and switching to the new WAL file.
> >>>>>>>>>
> >>>>>>>>> There are no issues:
> >>>>>>>>> 1. CopyXLogRecordToWAL for isLogSwitch follows same protocol, ie uses
> >>>>>>>>> GetXLogBuffer to zero-out WAL page.
> >>>>>>>>> 2. WALINSERT_SPECIAL_SWITCH forces exclusive lock on all insertion locks,
> >>>>>>>>> so switching wal is not concurrent. (Although, there is no need in this
> >>>>>>>>> exclusiveness, imho.)
> >>>>>>>>
> >>>>>>>> Good, thank you.  I've also revised commit message and comments.
> >>>>>>>>
> >>>>>>>> But I see another issue with this patch.  In the worst case, we do
> >>>>>>>> XLogWrite() by ourselves, and it could potentially could error out.
> >>>>>>>> Without patch, that would cause WALBufMappingLock be released and
> >>>>>>>> XLogCtl->InitializedUpTo not advanced.  With the patch, that would
> >>>>>>>> cause other processes infinitely waiting till we finish the
> >>>>>>>> initialization.
> >>>>>>>>
> >>>>>>>> Possible solution would be to save position of the page to be
> >>>>>>>> initialized, and set it back to XLogCtl->InitializeReserved on error
> >>>>>>>> (everywhere we do LWLockReleaseAll()).  We also must check that on
> >>>>>>>> error we only set XLogCtl->InitializeReserved to the past, because
> >>>>>>>> there could be multiple concurrent failures.  Also we need to
> >>>>>>>> broadcast XLogCtl->InitializedUpToCondVar to wake up waiters.
> >>>>>>>
> >>>>>>> The single place where AdvanceXLInsertBuffer is called outside of critical
> >>>>>>> section is in XLogBackgroundFlush. All other call stacks will issue server
> >>>>>>> restart if XLogWrite will raise error inside of AdvanceXLInsertBuffer.
> >>>>>>>
> >>>>>>> XLogBackgroundFlush explicitely avoids writing buffers by passing
> >>>>>>> opportunistic=true parameter.
> >>>>>>>
> >>>>>>> Therefore, error in XLogWrite will not cause hang in AdvanceXLInsertBuffer
> >>>>>>> since server will shutdown/restart.
> >>>>>>>
> >>>>>>> Perhaps, we just need to insert `Assert(CritSectionCount > 0);` before call
> >>>>>>> to XLogWrite here?
> >>>>>>
> >>>>>> You're correct.  I just reflected this in the next revision of the patch.
> >>>>>
> >>>>> I've looked at the patchset v6.
> >>>>
> >>>> Oh, sorry, I really did wrong.  I've done git format-patch for wrong
> >>>> local branch for v5 and v6.  Patches I've sent for v5 and v6 are
> >>>> actually the same as my v1.  This is really pity.  Please, find the
> >>>> right version of patchset attached.
> >>>
> >>> I've rechecked v7. In v6 a proposal from [1] was not reflected. Now it
> >>> landed in v7.
> >>>
> >>> Other changes are not regarding code behavior. The things from my
> >>> previous review that still could apply to v7:
> >>>
> >>> For 0001:
> >>>
> >>> Comment change proposed:
> >>> "lock-free with cooperation with" -> "lock-free accompanied by changes
> >>> to..." (maybe other variant)
> >>
> >> Good catch.  I've rephrased this comment even more.
> >>
> >>> I propose a new define:
> >>> #define FirstValidXLogRecPtr 1
> >>> While FirstValidXLogRecPtr = InvalidXLogRecPtr + 1 is true in the code
> >>> that has no semantical meaning and it's better to avoid using direct
> >>> arithmetics to relate meaning of FirstValidXLogRecPtr from
> >>> InvalidXLogRecPtr.
> >>
> >> Makes sense, but I'm not sure if this change is required at all.  I've
> >> reverted this to the state of master, and everything seems to work.
> >>
> >>> For 0002 both comments proposals from my message applied to v6 apply
> >>> to v7 as well
> >>
> >> Thank you for pointing.  For now, I'm concentrated on improvements on
> >> 0001.  Probably Yura could work on your notes to 0002.
> >
> > I wrote good commit message for 0002 with calculated probabilities and
> > simple Ruby program which calculates them to explain choice of 2
> > conditional attempts. (At least I hope the message is good). And added
> > simple comment before `int attempts = 2;`
> >
> > Also I simplified 0002 a bit to look a bit prettier (ie without goto), and
> > added static assert on NUM_XLOGINSERT_LOCKS being power of 2.
> >
> > (0001 patch is same as for v8)
>
> Oops, forgot to add StaticAssert into v9-0002.

Thank you.  I'm planning to push 0001 if there is no objections.  And
I'm planning to do more review/revision of 0002.

------
Regards,
Alexander Korotkov
Supabase



pgsql-hackers by date:

Previous
From: Andrei Lepikhov
Date:
Subject: Re: explain analyze rows=%.0f
Next
From: Andrei Lepikhov
Date:
Subject: Re: Proposal - Allow extensions to set a Plan Identifier