Re: Assertion failure with summarize_wal enabled during pg_createsubscriber - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Assertion failure with summarize_wal enabled during pg_createsubscriber
Date
Msg-id CAPpHfdtT=xAwEQL-fgRLz7vGXABFnnFReienNBJmfPwRZP8-NA@mail.gmail.com
Whole thread Raw
In response to Re: Assertion failure with summarize_wal enabled during pg_createsubscriber  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Assertion failure with summarize_wal enabled during pg_createsubscriber
List pgsql-hackers
On Mon, Jul 29, 2024 at 7:20 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jul 26, 2024 at 4:10 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > 0002 could also use pg_indent and pgperltidy.  And I have couple other
> > notes regarding 0002.
>
> As Tom said elsewhere, we don't have a practice of requiring perltidy
> for every commit, and I normally don't bother. The tree is
> pgindent-clean currently so I believe I got that part right.

Got it, thanks.

> > >    In the process of fixing these bugs, I realized that the logic to wait
> > >    for WAL summarization to catch up was spread out in a way that made
> > >    it difficult to reuse properly, so this code refactors things to make
> > >    it easier.
> >
> > It would be nice to split refactoring out of material logic changed.
> > This way it would be easier to review and would look cleaner in the
> > git history.
>
> I support that idea in general but felt it was overkill in this case:
> it's new code, and there was only one existing caller of the function
> that got refactored, and I'm not a huge fan of cluttering the git
> history with a bunch of tiny little refactoring commits to fix a
> single bug. I might have changed it if I'd seen this note before
> committing, though.

I understand your point.  I'm also not huge fan of a flood of small
commits.  Nevertheless, I find splitting refactoring from other
changes generally useful.  That could be a single commit of many small
refactorings, not many small commits.  The point for me is easier
review: you can expect refactoring commit to contain "isomorphic"
changes, while other commits implementing material logic changes.  But
that might be a committer preference though.

> > >    To make this fix work, also teach the WAL summarizer that after a
> > >    promotion has occurred, no more WAL can appear on the previous
> > >    timeline: previously, the WAL summarizer wouldn't switch to the new
> > >    timeline until we actually started writing WAL there, but that meant
> > >    that when the startup process was waiting for the WAL summarizer, it
> > >    was waiting for an action that the summarizer wasn't yet prepared to
> > >    take.
> >
> > I think this is a pretty long sentence, and I'm not sure I can
> > understand it correctly.  Does startup process wait for the WAL
> > summarizer without this patch?  If not, it's not clear for me that
> > word "previously" doesn't distribute to this part of sentence.
> > Breaking this into multiple sentences could improve the clarity for
> > me.
>
> Yes, I think that phrasing is muddled. It's too late to amend the
> commit message, but for posterity: I initially thought that I could
> fix this bug by just teaching the startup process to wait for WAL
> summarization before performing the .partial renaming, but that was
> not enough by itself. The problem is that the WAL summarizer would
> read all the records that were present in the final WAL file on the
> old timeline, but it wouldn't actually write out a summary file,
> because that only happens when it reaches XLOG_CHECKPOINT_REDO or a
> timeline switch point. Since no WAL had appeared on the new timeline
> yet, it didn't view the end of the old timeline as a switch point, so
> it just sat there waiting, without ever writing out a file; and the
> startup process sat there waiting for it. So the second part of the
> fix is to make the WAL summarizer realize that once the startup
> process has chosen a new timeline ID, no more WAL is going to appear
> on the old timeline, and thus it can (and should) write out the final
> summary file for the old timeline and prepare to read WAL from the new
> timeline.

Great, thank you for the explanation.  Now that's clear.

------
Regards,
Alexander Korotkov
Supabase



pgsql-hackers by date:

Previous
From: "Daniel Verite"
Date:
Subject: Re: Fixing backslash dot for COPY FROM...CSV
Next
From: "David E. Wheeler"
Date:
Subject: Re: Proposal: Document ABI Compatibility