I tried to replicate the problem described in [1] about logical
replication from a v14 source DB to a v11 target. It fails as
described there; I've not yet tracked down why, but it looks like
the v11 apply worker fails and closes the connection after sending
CREATE_REPLICATION_SLOT. The v14 walsender doesn't have a problem
with that. However, if I do the same thing using HEAD as the source,
I see an assertion failure on the way out of the walsender. The
postmaster log shows
2022-02-14 15:18:29.890 EST [2922064] LOG: starting logical decoding for slot "sub01"
2022-02-14 15:18:29.890 EST [2922064] DETAIL: Streaming transactions committing after 0/FEA23B0, reading WAL from
0/FEA23B0.
2022-02-14 15:18:29.890 EST [2922064] STATEMENT: START_REPLICATION SLOT "sub01" LOGICAL 0/0 (proto_version '1',
publication_names'"pub01"')
2022-02-14 15:18:29.890 EST [2922064] LOG: logical decoding found consistent point at 0/FEA23B0
2022-02-14 15:18:29.890 EST [2922064] DETAIL: There are no running transactions.
2022-02-14 15:18:29.890 EST [2922064] STATEMENT: START_REPLICATION SLOT "sub01" LOGICAL 0/0 (proto_version '1',
publication_names'"pub01"')
2022-02-14 15:18:29.895 EST [2922066] LOG: logical decoding found consistent point at 0/FEA2460
2022-02-14 15:18:29.895 EST [2922066] DETAIL: There are no running transactions.
2022-02-14 15:18:29.895 EST [2922066] STATEMENT: CREATE_REPLICATION_SLOT "sub01_129343_sync_129337" TEMPORARY LOGICAL
pgoutputUSE_SNAPSHOT
TRAP: FailedAssertion("pgstat_is_initialized && !pgstat_is_shutdown", File: "pgstat.c", Line: 5140, PID: 2922066)
postgres: walsender postgres [local] idle in transaction(ExceptionalCondition+0x7c)[0x95126c]
postgres: walsender postgres [local] idle in transaction[0x7a66dd]
postgres: walsender postgres [local] idle in transaction[0x7a68fb]
postgres: walsender postgres [local] idle in transaction(pgstat_report_replslot_drop+0x32)[0x7aad72]
postgres: walsender postgres [local] idle in transaction[0x7e7220]
postgres: walsender postgres [local] idle in transaction(ReplicationSlotCleanup+0xf1)[0x7e7561]
postgres: walsender postgres [local] idle in transaction[0x833982]
postgres: walsender postgres [local] idle in transaction(shmem_exit+0x85)[0x816195]
postgres: walsender postgres [local] idle in transaction[0x816277]
postgres: walsender postgres [local] idle in transaction(proc_exit+0x8)[0x816308]
postgres: walsender postgres [local] idle in transaction(PostgresMain+0x4b4)[0x83e2f4]
postgres: walsender postgres [local] idle in transaction[0x7b1872]
postgres: walsender postgres [local] idle in transaction(PostmasterMain+0xd2c)[0x7b282c]
postgres: walsender postgres [local] idle in transaction(main+0x1e9)[0x509489]
/lib64/libc.so.6(__libc_start_main+0xf3)[0x7fec53a86493]
postgres: walsender postgres [local] idle in transaction(_start+0x2e)[0x50976e]
2022-02-14 15:18:29.974 EST [2921719] LOG: server process (PID 2922066) was terminated by signal 6: Aborted
2022-02-14 15:18:29.974 EST [2921719] LOG: terminating any other active server processes
Stack trace looks like
#2 0x000000000095128b in ExceptionalCondition (
conditionName=conditionName@entry=0xabe798 "pgstat_is_initialized && !pgstat_is_shutdown",
errorType=errorType@entry=0x9a4757"FailedAssertion",
fileName=fileName@entry=0xabe554 "pgstat.c",
lineNumber=lineNumber@entry=5140) at assert.c:69
#3 0x00000000007a66dd in pgstat_assert_is_up () at pgstat.c:5140
#4 0x00000000007a68fb in pgstat_assert_is_up () at pgstat.c:3244
#5 pgstat_send (msg=<optimized out>, len=<optimized out>) at pgstat.c:3228
#6 0x00000000007aad72 in pgstat_report_replslot_drop (
slotname=slotname@entry=0x7fec53654320 "sub01_129343_sync_129337")
at pgstat.c:1948
#7 0x00000000007e7220 in ReplicationSlotDropPtr (slot=0x7fec53654308)
at slot.c:696
#8 0x00000000007e7561 in ReplicationSlotCleanup () at slot.c:547
#9 0x0000000000833982 in ProcKill (code=<optimized out>, arg=<optimized out>)
at proc.c:839
#10 ProcKill (code=<optimized out>, arg=<optimized out>) at proc.c:804
#11 0x0000000000816195 in shmem_exit (code=code@entry=0) at ipc.c:272
#12 0x0000000000816277 in proc_exit_prepare (code=code@entry=0) at ipc.c:194
#13 0x0000000000816308 in proc_exit (code=code@entry=0) at ipc.c:107
#14 0x000000000083e2f4 in PostgresMain (dbname=<optimized out>,
I guess this indicates that walsenders don't initialize their pgstats
support soon enough?
regards, tom lane
[1] https://www.postgresql.org/message-id/CANhtRiamAgYt1A-Nh4%3DmU3E1UhG9XPgB%2BX6mW1DWqa93vUXW9A%40mail.gmail.com