RE: POC: enable logical decoding when wal_level = 'replica' without a server restart - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: POC: enable logical decoding when wal_level = 'replica' without a server restart
Msg-id OSCPR01MB14966ABF290EF426FC93BCF34F5E6A@OSCPR01MB14966.jpnprd01.prod.outlook.com
In response to Re: POC: enable logical decoding when wal_level = 'replica' without a server restart  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
List pgsql-hackers
Dear Sawada-san,

> If we implement these ideas, we can simplify the patch quite well as
> we no longer need the lazy behavior nor wait for the recovery to
> complete. I've attached a PoC patch that can be applied on top of the
> v15 patch.

In 0002, I found an assertion failure. Steps to reproduce:

0. Set up a streaming replication system in which only the primary has a logical slot.
1. Attach to the startup process on the standby and set a breakpoint at UpdateLogicalDecodingStatusEndOfRecovery.
2. Send a promote signal to the standby and confirm that the startup process stops at the breakpoint.
3. Establish a new connection to the standby.
4. Attach to the backend process and set a breakpoint at create_logical_replication_slot.
5. Try to create a new slot on the standby and confirm that the backend stops at the breakpoint.
6. Advance the startup process to WaitForProcSignalBarrier().
7. Advance the backend process to WaitForProcSignalBarrier(). Both processes can then proceed.
8. Advance the backend until ReplicationSlotReserveWal() has set restart_lsn.
9. Detach from the startup process. The recovery state becomes "DONE".
10. Detach from the backend. It crashes in xlog_decode().

Some data obtained with gdb is shown in [1].

The direct cause is that the slot's restart_lsn points to a position before the
STATUS_CHANGE (false) record. Per my analysis, ReplicationSlotReserveWal() uses
GetXLogReplayRecPtr(NULL) as the initial decoding point, which corresponds to the
last record the standby replayed from the primary. However, the standby itself can
generate additional records, STATUS_CHANGE (false) in this case. After recovery
ends, the decoder reads the STATUS_CHANGE record, which breaks our assumption that
such a record is only decoded while recovery is in progress (the
Assert(RecoveryInProgress()) in xlog_decode()).

Per my understanding, this cannot happen with 0001 because EnsureLogicalDecodingEnabled()
waits until RecoveryInProgress() becomes false.
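
The ordering problem can be illustrated with a small standalone model. This is only a sketch, not PostgreSQL code: the type and function names (Lsn, WalRecord, model_reserve_wal, model_decoder_sees_status_change, model_demo) are hypothetical, and the insert position 0x03000148 is an assumed value chosen to lie after the STATUS_CHANGE record. The two record LSNs are the ones shown by pg_waldump in [1]. The model assumes that a logical slot starts from the replay position while recovery is in progress and from the insert position otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* stand-in for XLogRecPtr */

typedef enum { REC_END_OF_RECOVERY, REC_STATUS_CHANGE } RecKind;

typedef struct
{
    Lsn     lsn;                /* start position of the record */
    RecKind kind;
} WalRecord;

/* Records written by the promoting standby; LSNs are taken from [1]. */
const WalRecord promotion_wal[] = {
    {0x030000F0, REC_END_OF_RECOVERY},
    {0x03000128, REC_STATUS_CHANGE},
};
const size_t promotion_wal_len = sizeof(promotion_wal) / sizeof(promotion_wal[0]);

/*
 * Simplified choice of the initial decoding point for a logical slot:
 * the replay position during recovery, the insert position afterwards.
 */
Lsn
model_reserve_wal(bool in_recovery, Lsn replay_ptr, Lsn insert_ptr)
{
    return in_recovery ? replay_ptr : insert_ptr;
}

/*
 * Returns true if a decoder starting at restart_lsn would read a
 * STATUS_CHANGE record, i.e. the record that trips the assertion once
 * recovery has finished.
 */
bool
model_decoder_sees_status_change(const WalRecord *wal, size_t n, Lsn restart_lsn)
{
    for (size_t i = 0; i < n; i++)
    {
        if (wal[i].lsn >= restart_lsn && wal[i].kind == REC_STATUS_CHANGE)
            return true;
    }
    return false;
}

/* Walks through both cases with the values from the report. */
void
model_demo(void)
{
    const Lsn replay_ptr = 0x030000F0;  /* restart_lsn observed in gdb */
    const Lsn insert_ptr = 0x03000148;  /* assumed: after STATUS_CHANGE */

    /* Slot created while still in recovery: starts before the record. */
    Lsn early = model_reserve_wal(true, replay_ptr, insert_ptr);
    assert(model_decoder_sees_status_change(promotion_wal, promotion_wal_len, early));

    /* Slot created after waiting for recovery to end: starts after it. */
    Lsn late = model_reserve_wal(false, replay_ptr, insert_ptr);
    assert(!model_decoder_sees_status_change(promotion_wal, promotion_wal_len, late));
}
```

In this model, the first case corresponds to the reproduction steps above, where restart_lsn is taken before recovery finishes. The second corresponds to waiting until recovery is over before reserving WAL, so the decoder never reads the records generated during promotion.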

How should we fix the issue? One approach is to replace the Assert() with ereport(ERROR),
but even in that case the slot may not be able to establish a consistent snapshot.

[1]
```
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6,
    no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007f432e08bf43 in __pthread_kill_internal (signo=6, threadid=<optimized out>)
    at pthread_kill.c:78
#2  0x00007f432e03eb46 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007f432e028833 in __GI_abort () at abort.c:79
#4  0x0000000000b96227 in ExceptionalCondition (conditionName=0xdb295d "RecoveryInProgress()",
    fileName=0xdb2928 "../postgres/src/backend/replication/logical/decode.c", lineNumber=174)
    at ../postgres/src/backend/utils/error/assert.c:65
#5  0x000000000090f986 in xlog_decode (ctx=0x2b28430, buf=0x7ffd4a2ebf10)
    at ../postgres/src/backend/replication/logical/decode.c:174
#6  0x000000000090f77f in LogicalDecodingProcessRecord (ctx=0x2b28430, record=0x2b287c8)
    at ../postgres/src/backend/replication/logical/decode.c:116
#7  0x000000000091590b in DecodingContextFindStartpoint (ctx=0x2b28430)
    at ../postgres/src/backend/replication/logical/logical.c:644
#8  0x00000000008fd9ed in create_logical_replication_slot (name=0x2a3f6c8 "slot_sync",
    plugin=0x2a3f768 "test_decoding", temporary=false, two_phase=false, failover=false,
    restart_lsn=0, find_startpoint=true) at ../postgres/src/backend/replication/slotfuncs.c:166
#9  0x00000000008fdb02 in pg_create_logical_replication_slot (fcinfo=0x2b20bd8)
    at ../postgres/src/backend/replication/slotfuncs.c:196
...
(gdb) f 7
#7  0x000000000091590b in DecodingContextFindStartpoint (ctx=0x2b28430)
    at ../postgres/src/backend/replication/logical/logical.c:644
644                     LogicalDecodingProcessRecord(ctx, ctx->reader);
(gdb) printf "%X\n", slot->data.restart_lsn
30000F0
(gdb) q
$ pg_waldump data_sta/pg_wal/000000020000000000000003  | grep "30000F0"
pg_waldump: error: error in WAL record at 0/03000318: invalid record length at 0/03000398: expected at least 24, got 0
rmgr: XLOG        len (rec/tot):     50/    50, tx:          0, lsn: 0/030000F0, prev 0/030000B8, desc: END_OF_RECOVERY tli2; prev tli 1; time 2025-10-01 18:54:53.971277 JST; wal_level replica
rmgr: XLOG        len (rec/tot):     27/    27, tx:          0, lsn: 0/03000128, prev 0/030000F0, desc: LOGICAL_DECODING_STATUS_CHANGEfalse
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED



