Thread: Assertion failure with summarize_wal enabled during pg_createsubscriber

Hi,

In HEAD, I encountered the following assertion failure when I enabled summarize_wal
and ran pg_createsubscriber.

2024-07-01 14:42:15.697 JST [19195] LOG:  database system is ready to accept connections
TRAP: failed Assert("switchpoint >= state->EndRecPtr"), File: "walsummarizer.c", Line: 1382, PID: 19200
0   postgres                            0x0000000105c46c5d ExceptionalCondition + 189
1   postgres                            0x000000010590b1e4 summarizer_read_local_xlog_page + 340
2   postgres                            0x00000001054e401e ReadPageInternal + 542
3   postgres                            0x00000001054e24c0 XLogDecodeNextRecord + 464
4   postgres                            0x00000001054e2283 XLogReadAhead + 67
5   postgres                            0x00000001054e2185 XLogReadRecord + 53
6   postgres                            0x000000010590a3ab SummarizeWAL + 1115
7   postgres                            0x000000010590963a WalSummarizerMain + 1242
8   postgres                            0x00000001058fd10a postmaster_child_launch + 234
9   postgres                            0x000000010590133d StartChildProcess + 29
10  postgres                            0x0000000105904582 MaybeStartWalSummarizer + 82
11  postgres                            0x0000000105901af1 ServerLoop + 1153
12  postgres                            0x00000001059007ca PostmasterMain + 6554
13  postgres                            0x00000001057a3782 main + 818
14  dyld                                0x00007ff80e5e2366 start + 1942
2024-07-01 14:42:15.912 JST [19195] LOG:  WAL summarizer process (PID 19200) was terminated by signal 6: Abort trap: 6
2024-07-01 14:42:15.913 JST [19195] LOG:  terminating any other active server processes

Here are the steps to reproduce this issue.

--------------------------------
initdb -D pub
cat <<EOF >> pub/postgresql.conf
wal_level = 'logical'
summarize_wal = on
EOF
pg_ctl -D pub start
pgbench -i
pgbench -T 600 &
pg_basebackup -D sub -c fast -R
pg_createsubscriber -d postgres -D sub -p 5433 -P "port=5432"
--------------------------------

Is this the known issue?

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



On Mon, Jul 01, 2024 at 02:54:56PM +0900, Fujii Masao wrote:
> In HEAD, I encountered the following assertion failure when I enabled summarize_wal
> and ran pg_createsubscriber.
>
> Is this the known issue?

Nope. So, Open Item, here we go.
--
Michael

Attachment
On Mon, Jul 1, 2024 at 2:08 AM Michael Paquier <michael@paquier.xyz> wrote:
> Nope. So, Open Item, here we go.

Some initial investigation:

In this test case, pg_subscriber, during the "starting the subscriber"
section of its work, starts up the database in the "sub" directory as
a standby. It enters standby mode, begins redo, and is then promoted,
selecting timeline 2. The WAL summarizer is supposed to end
summarization at the point at which timeline 1 ended and then resume
summarizing from the beginning of timeline 2. But instead, it fails an
assertion:

                    Assert(switchpoint >= state->EndRecPtr);

This assertion is trying to verify that, when a new timeline is
spawned, we don't read past the switchpoint on the original timeline.
Here, we have apparently done that. In one test, I got switchpoint =
0/51000510, state->EndRecPtr = 0/51000600. According to pg_waldump, on
timeline 1, we have this record at that LSN:

rmgr: Heap        len (rec/tot):     54/    54, tx:    2313637, lsn:
0/51000510, prev 0/510004D0, desc: DELETE xmax: 2313637, off: 3,
infobits: [KEYS_UPDATED], flags: 0x00, blkref #0: rel 1663/5/6104 blk
0

And on timeline 2, we have this at that LSN:

rmgr: XLOG        len (rec/tot):    114/   114, tx:          0, lsn:
0/51000510, prev 0/510004D0, desc: CHECKPOINT_SHUTDOWN redo
0/51000510; tli 2; prev tli 1; fpw true; xid 0:2313638; oid 24576;
multi 1; offset 0; oldest xid 730 in DB 1; oldest multi 1 in DB 1;
oldest/newest commit timestamp xid: 0/0; oldest running xid 0;
shutdown

It appears that pg_subscriber creates a recovery.conf that includes:

recovery_target_timeline = 'latest'
recovery_target_inclusive = true
recovery_target_lsn = '%X/%X'

...where %X/%X represents a valid LSN.

I think the problem here is that the WAL summarizer believes that when
a new timeline appears, it should pick up from where the old timeline
ended. And here, that doesn't happen: the new timeline branches off
before the end of the old timeline, because of the recovery target.

I'm not yet sure what should be done about this. The obvious answer is
"remove the assertion," and maybe that is all we need to do. However,
I'm not quite sure what the actual behavior will be if we just do
that, so I think more investigation is needed. I'll keep looking at
this, although given the US holiday I may not have results until next
week.

--
Robert Haas
EDB: http://www.enterprisedb.com



On Wed, Jul 3, 2024 at 1:07 PM Robert Haas <robertmhaas@gmail.com> wrote:
> I think the problem here is that the WAL summarizer believes that when
> a new timeline appears, it should pick up from where the old timeline
> ended. And here, that doesn't happen: the new timeline branches off
> before the end of the old timeline, because of the recovery target.
>
> I'm not yet sure what should be done about this. The obvious answer is
> "remove the assertion," and maybe that is all we need to do. However,
> I'm not quite sure what the actual behavior will be if we just do
> that, so I think more investigation is needed. I'll keep looking at
> this, although given the US holiday I may not have results until next
> week.

Here is a draft patch that attempts to fix this problem. I'm not
certain that it's completely correct, but it does seem to fix the
reported issue.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachment
On Wed, Jul 10, 2024 at 5:01 PM Robert Haas <robertmhaas@gmail.com> wrote:
> Here is a draft patch that attempts to fix this problem. I'm not
> certain that it's completely correct, but it does seem to fix the
> reported issue.

I tried to write a test case for this and discovered that there are
actually two separate problems in this area. First, as shown by the
assertion failure reported by Fujii Masao, the WAL summarizer thinks
that it should never need to back up to an earlier LSN, and the test
case he provided shows that this is incorrect. Second, the WAL
summarizer can end up in a bad state after the startup process renames
the last WAL file on the old timeline to a .partial file. If this
happens before the file has been summarized, then the WAL summarizer
can't access it any more and errors out. Promotion also removes WAL
files from the old timeline completely, but only if they're after the
switch point, and summarization doesn't care about those anyway. So
the partial file seems to be the only problem case.

In theory, the problem with the partial file could be handled in a
variety of ways: we could teach summarization to read the partial
file, perhaps, or postpone adding the .partial suffix until after
summarization has happened. But in practice, given where we are in the
release cycle, the only reasonable approach that I can see is to have
promotion wait for summarization to catch up, so that's what I did in
0003.

0002 is the same as what I posted previously as 0001, and teaches the
summarizer about backing up when we switch timelines. 0001 adds a
missing call to ConditionVariableCancelSleep; AFAIK, that omission has
no important consequences, but still seems like it should be fixed.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachment
I went ahead and committed 0001, which is pretty trivial. Hopefully,
at least, it's trivial enough that I didn't mess it up.

Here's a rebase of 0002 and 0003, now 0001 and 0002, with one minor
fix to hopefully avoid annoying CI.

I'm still hoping for some review/feedback/testing of these before I
commit them, but I also don't want to wait too long.

Thanks,

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Attachment
On Mon, Jul 22, 2024 at 11:02 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I'm still hoping for some review/feedback/testing of these before I
> commit them, but I also don't want to wait too long.

Hearing nothing, pushed 0001.

--
Robert Haas
EDB: http://www.enterprisedb.com



On Fri, Jul 26, 2024 at 10:09:22AM -0400, Robert Haas wrote:
> On Mon, Jul 22, 2024 at 11:02 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> I'm still hoping for some review/feedback/testing of these before I
>> commit them, but I also don't want to wait too long.
> 
> Hearing nothing, pushed 0001.

nitpick:  I think this one needs a pgindent.

-- 
nathan



On Fri, Jul 26, 2024 at 11:51 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
> nitpick:  I think this one needs a pgindent.

Ugh, sorry. I thought of that while I was working on the commit but
then I messed up some other aspect of it and this went out of my head.

Fixed now, I hope.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Assertion failure with summarize_wal enabled during pg_createsubscriber

From
Alexander Korotkov
Date:
On Fri, Jul 26, 2024 at 7:02 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jul 26, 2024 at 11:51 AM Nathan Bossart
> <nathandbossart@gmail.com> wrote:
> > nitpick:  I think this one needs a pgindent.
>
> Ugh, sorry. I thought of that while I was working on the commit but
> then I messed up some other aspect of it and this went out of my head.
>
> Fixed now, I hope.

0002 could also use pg_indent and pgperltidy.  And I have couple other
notes regarding 0002.

>    In the process of fixing these bugs, I realized that the logic to wait
>    for WAL summarization to catch up was spread out in a way that made
>    it difficult to reuse properly, so this code refactors things to make
>    it easier.

It would be nice to split refactoring out of material logic changed.
This way it would be easier to review and would look cleaner in the
git history.

>    To make this fix work, also teach the WAL summarizer that after a
>    promotion has occurred, no more WAL can appear on the previous
>    timeline: previously, the WAL summarizer wouldn't switch to the new
>    timeline until we actually started writing WAL there, but that meant
>    that when the startup process was waiting for the WAL summarizer, it
>    was waiting for an action that the summarizer wasn't yet prepared to
>    take.

I think this is a pretty long sentence, and I'm not sure I can
understand it correctly.  Does startup process wait for the WAL
summarizer without this patch?  If not, it's not clear for me that
word "previously" doesn't distribute to this part of sentence.
Breaking this into multiple sentences could improve the clarity for
me.

------
Regards,
Alexander Korotkov
Supabase



On Fri, Jul 26, 2024 at 4:10 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> 0002 could also use pg_indent and pgperltidy.  And I have couple other
> notes regarding 0002.

As Tom said elsewhere, we don't have a practice of requiring perltidy
for every commit, and I normally don't bother. The tree is
pgindent-clean currently so I believe I got that part right.

> >    In the process of fixing these bugs, I realized that the logic to wait
> >    for WAL summarization to catch up was spread out in a way that made
> >    it difficult to reuse properly, so this code refactors things to make
> >    it easier.
>
> It would be nice to split refactoring out of material logic changed.
> This way it would be easier to review and would look cleaner in the
> git history.

I support that idea in general but felt it was overkill in this case:
it's new code, and there was only one existing caller of the function
that got refactored, and I'm not a huge fan of cluttering the git
history with a bunch of tiny little refactoring commits to fix a
single bug. I might have changed it if I'd seen this note before
committing, though.

> >    To make this fix work, also teach the WAL summarizer that after a
> >    promotion has occurred, no more WAL can appear on the previous
> >    timeline: previously, the WAL summarizer wouldn't switch to the new
> >    timeline until we actually started writing WAL there, but that meant
> >    that when the startup process was waiting for the WAL summarizer, it
> >    was waiting for an action that the summarizer wasn't yet prepared to
> >    take.
>
> I think this is a pretty long sentence, and I'm not sure I can
> understand it correctly.  Does startup process wait for the WAL
> summarizer without this patch?  If not, it's not clear for me that
> word "previously" doesn't distribute to this part of sentence.
> Breaking this into multiple sentences could improve the clarity for
> me.

Yes, I think that phrasing is muddled. It's too late to amend the
commit message, but for posterity: I initially thought that I could
fix this bug by just teaching the startup process to wait for WAL
summarization before performing the .partial renaming, but that was
not enough by itself. The problem is that the WAL summarizer would
read all the records that were present in the final WAL file on the
old timeline, but it wouldn't actually write out a summary file,
because that only happens when it reaches XLOG_CHECKPOINT_REDO or a
timeline switch point. Since no WAL had appeared on the new timeline
yet, it didn't view the end of the old timeline as a switch point, so
it just sat there waiting, without ever writing out a file; and the
startup process sat there waiting for it. So the second part of the
fix is to make the WAL summarizer realize that once the startup
process has chosen a new timeline ID, no more WAL is going to appear
on the old timeline, and thus it can (and should) write out the final
summary file for the old timeline and prepare to read WAL from the new
timeline.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Assertion failure with summarize_wal enabled during pg_createsubscriber

From
Alexander Korotkov
Date:
On Mon, Jul 29, 2024 at 7:20 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jul 26, 2024 at 4:10 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > 0002 could also use pg_indent and pgperltidy.  And I have couple other
> > notes regarding 0002.
>
> As Tom said elsewhere, we don't have a practice of requiring perltidy
> for every commit, and I normally don't bother. The tree is
> pgindent-clean currently so I believe I got that part right.

Got it, thanks.

> > >    In the process of fixing these bugs, I realized that the logic to wait
> > >    for WAL summarization to catch up was spread out in a way that made
> > >    it difficult to reuse properly, so this code refactors things to make
> > >    it easier.
> >
> > It would be nice to split refactoring out of material logic changed.
> > This way it would be easier to review and would look cleaner in the
> > git history.
>
> I support that idea in general but felt it was overkill in this case:
> it's new code, and there was only one existing caller of the function
> that got refactored, and I'm not a huge fan of cluttering the git
> history with a bunch of tiny little refactoring commits to fix a
> single bug. I might have changed it if I'd seen this note before
> committing, though.

I understand your point.  I'm also not huge fan of a flood of small
commits.  Nevertheless, I find splitting refactoring from other
changes generally useful.  That could be a single commit of many small
refactorings, not many small commits.  The point for me is easier
review: you can expect refactoring commit to contain "isomorphic"
changes, while other commits implementing material logic changes.  But
that might be a committer preference though.

> > >    To make this fix work, also teach the WAL summarizer that after a
> > >    promotion has occurred, no more WAL can appear on the previous
> > >    timeline: previously, the WAL summarizer wouldn't switch to the new
> > >    timeline until we actually started writing WAL there, but that meant
> > >    that when the startup process was waiting for the WAL summarizer, it
> > >    was waiting for an action that the summarizer wasn't yet prepared to
> > >    take.
> >
> > I think this is a pretty long sentence, and I'm not sure I can
> > understand it correctly.  Does startup process wait for the WAL
> > summarizer without this patch?  If not, it's not clear for me that
> > word "previously" doesn't distribute to this part of sentence.
> > Breaking this into multiple sentences could improve the clarity for
> > me.
>
> Yes, I think that phrasing is muddled. It's too late to amend the
> commit message, but for posterity: I initially thought that I could
> fix this bug by just teaching the startup process to wait for WAL
> summarization before performing the .partial renaming, but that was
> not enough by itself. The problem is that the WAL summarizer would
> read all the records that were present in the final WAL file on the
> old timeline, but it wouldn't actually write out a summary file,
> because that only happens when it reaches XLOG_CHECKPOINT_REDO or a
> timeline switch point. Since no WAL had appeared on the new timeline
> yet, it didn't view the end of the old timeline as a switch point, so
> it just sat there waiting, without ever writing out a file; and the
> startup process sat there waiting for it. So the second part of the
> fix is to make the WAL summarizer realize that once the startup
> process has chosen a new timeline ID, no more WAL is going to appear
> on the old timeline, and thus it can (and should) write out the final
> summary file for the old timeline and prepare to read WAL from the new
> timeline.

Great, thank you for the explanation.  Now that's clear.

------
Regards,
Alexander Korotkov
Supabase



On Wed, Jul 31, 2024 at 04:49:54PM +0300, Alexander Korotkov wrote:
> On Mon, Jul 29, 2024 at 7:20 PM Robert Haas <robertmhaas@gmail.com> wrote:
>> I support that idea in general but felt it was overkill in this case:
>> it's new code, and there was only one existing caller of the function
>> that got refactored, and I'm not a huge fan of cluttering the git
>> history with a bunch of tiny little refactoring commits to fix a
>> single bug. I might have changed it if I'd seen this note before
>> committing, though.
>
> I understand your point.  I'm also not huge fan of a flood of small
> commits.  Nevertheless, I find splitting refactoring from other
> changes generally useful.  That could be a single commit of many small
> refactorings, not many small commits.  The point for me is easier
> review: you can expect refactoring commit to contain "isomorphic"
> changes, while other commits implementing material logic changes.

For review, it also tends to matter a lot to me, especially if the
same areas of code are changed across multiple commits.  That's more
annoying for authors as the splits are annoying to maintain.  For a
single caller introduced, what Robert has done is fine IMO.

> But that might be a committer preference though.

I tend to prefer refactorings if it comes to a cleaner git history,
still that's always case-by-case, and all of us have our own habits.
--
Michael

Attachment