Re: Excessive LOG messages from replication slot sync worker - Mailing list pgsql-hackers
From | Japin Li |
---|---|
Subject | Re: Excessive LOG messages from replication slot sync worker |
Date | |
Msg-id | ME0P300MB04452400335D1F925CCDC865B62BA@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM Whole thread Raw |
In response to | Re: Excessive LOG messages from replication slot sync worker (shveta malik <shveta.malik@gmail.com>) |
Responses |
Re: Excessive LOG messages from replication slot sync worker
|
List | pgsql-hackers |
On Tue, Aug 12, 2025 at 12:24:10PM +0530, shveta malik wrote: > On Tue, Aug 12, 2025 at 8:52 AM Fujii Masao <masao.fujii@gmail.com> wrote: > > > > Hi, > > > > While testing the replication slot synchronization feature, I set up > > three instances on the same machine: > > > > - Physical replication primary (also Logical replication publisher) > > - Physical replication standby > > - Logical replication subscriber > > > > ---------------------------------------- > > initdb -D pri --encoding=UTF8 --locale=C > > cat <<EOF >> pri/postgresql.conf > > wal_level = logical > > log_line_prefix = '%m [pri][%b] ' > > logging_collector = on > > EOF > > pg_ctl -D pri start > > pgbench -i > > cat <<EOF | psql > > CREATE PUBLICATION mypub FOR ALL TABLES; > > SELECT pg_create_physical_replication_slot('pslot'); > > EOF > > cat <<EOF >> pri/postgresql.conf > > synchronized_standby_slots = 'pslot' > > EOF > > pg_ctl -D pri reload > > > > pg_basebackup -D sby -c fast > > touch sby/standby.signal > > cat <<EOF >> sby/postgresql.conf > > port = 5433 > > primary_conninfo = 'host=localhost port=5432 dbname=postgres' > > primary_slot_name = 'pslot' > > sync_replication_slots = on > > hot_standby_feedback = on > > log_line_prefix = '%m [sby][%b] ' > > EOF > > pg_ctl -D sby start > > > > initdb -D sub --encoding=UTF8 --locale=C > > cat <<EOF >> sub/postgresql.conf > > port = 5434 > > wal_level = logical > > log_line_prefix = '%m [sub][%b] ' > > logging_collector = on > > EOF > > pg_ctl -D sub start > > pgbench -i -Itp -p 5434 > > cat <<EOF | psql -p 5434 > > CREATE SUBSCRIPTION mysub CONNECTION 'host=localhost port=5432 > > dbname=postgres' PUBLICATION mypub WITH (failover); > > EOF > > ---------------------------------------- > > > > > > I noticed that the replication slot sync worker on the standby generates > > a large number of LOG messages, about two per 200–300 ms, during > > pgbench runs (pgbench -c 10 -j 10 -T 60). For example, > > > > ---------------------------------------- > > 2025-08-12 11:48:21.067 JST [sby][slotsync worker] LOG: starting > > logical decoding for slot "mysub" > > 2025-08-12 11:48:21.067 JST [sby][slotsync worker] DETAIL: Streaming > > transactions committing after 0/04F317B8, reading WAL from 0/04006FC0. > > 2025-08-12 11:48:21.067 JST [sby][slotsync worker] LOG: logical > > decoding found consistent point at 0/04006FC0 > > 2025-08-12 11:48:21.067 JST [sby][slotsync worker] DETAIL: There are > > no running transactions. > > 2025-08-12 11:48:21.296 JST [sby][slotsync worker] LOG: starting > > logical decoding for slot "mysub" > > 2025-08-12 11:48:21.296 JST [sby][slotsync worker] DETAIL: Streaming > > transactions committing after 0/04F9DC90, reading WAL from 0/04006FC0. > > 2025-08-12 11:48:21.297 JST [sby][slotsync worker] LOG: logical > > decoding found consistent point at 0/04006FC0 > > 2025-08-12 11:48:21.297 JST [sby][slotsync worker] DETAIL: There are > > no running transactions. > > 2025-08-12 11:48:21.526 JST [sby][slotsync worker] LOG: starting > > logical decoding for slot "mysub" > > 2025-08-12 11:48:21.526 JST [sby][slotsync worker] DETAIL: Streaming > > transactions committing after 0/0506A5C0, reading WAL from 0/04006FC0. > > 2025-08-12 11:48:21.527 JST [sby][slotsync worker] LOG: logical > > decoding found consistent point at 0/04006FC0 > > 2025-08-12 11:48:21.527 JST [sby][slotsync worker] DETAIL: There are > > no running transactions. > > 2025-08-12 11:48:21.762 JST [sby][slotsync worker] LOG: starting > > logical decoding for slot "mysub" > > 2025-08-12 11:48:21.762 JST [sby][slotsync worker] DETAIL: Streaming > > transactions committing after 0/050C7F70, reading WAL from 0/04006FC0. > > 2025-08-12 11:48:21.763 JST [sby][slotsync worker] LOG: logical > > decoding found consistent point at 0/04006FC0 > > 2025-08-12 11:48:21.763 JST [sby][slotsync worker] DETAIL: There are > > no running transactions. > > 2025-08-12 11:48:21.999 JST [sby][slotsync worker] LOG: starting > > logical decoding for slot "mysub" > > 2025-08-12 11:48:21.999 JST [sby][slotsync worker] DETAIL: Streaming > > transactions committing after 0/05111420, reading WAL from 0/04006FC0. > > 2025-08-12 11:48:21.999 JST [sby][slotsync worker] LOG: logical > > decoding found consistent point at 0/04006FC0 > > 2025-08-12 11:48:21.999 JST [sby][slotsync worker] DETAIL: There are > > no running transactions. > > ---------------------------------------- > > > > > > Is this expected behavior, a configuration mistake on my side, or perhaps > > an oversight in the commit of the replication slot synchronization feature? > > > > This seems more like debug output left over from development rather than > > something useful for users. If this is true, would it make sense to change > > these messages from LOG to DEBUG? > > > > Thank You for reporting the issue. > > It looks like commit 2633dae (mistakenly) introduced a change ([1]) in > SnapBuildSnapshotExists(), altering the format used for snapshot file > names during the search. However, SnapBuildSerialize still uses the > old format ("%s/%X-%X.snap"), which led to the slot-sync worker being > unable to locate existing snapshot files. Reverting the format > resolved the issue for me. Could you please check if the attached > patch works for you as well? > > [1]: > SnapBuildSnapshotExists() > - sprintf(path, "%s/%X-%X.snap", > + sprintf(path, "%s/%08X-%08X.snap", > Sorry, it's my fault. I forgot to update the format in SnapBuildSerialize(). I'd prefer to update the format in SnapBuildSerialize() instead of reverting this change. diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c index 8532bfd27e5..db7051170fc 100644 --- a/src/backend/replication/logical/snapbuild.c +++ b/src/backend/replication/logical/snapbuild.c @@ -1529,7 +1529,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn) * unless the user used pg_resetwal or similar. If a user did so, there's * no hope continuing to decode anyway. */ - sprintf(path, "%s/%X-%X.snap", + sprintf(path, "%s/%08X-%08X.snap", PG_LOGICAL_SNAPSHOTS_DIR, LSN_FORMAT_ARGS(lsn)); @@ -1573,7 +1573,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn) elog(DEBUG1, "serializing snapshot to %s", path); /* to make sure only we will write to this tempfile, include pid */ - sprintf(tmppath, "%s/%X-%X.snap.%d.tmp", + sprintf(tmppath, "%s/%08X-%08X.snap.%d.tmp", PG_LOGICAL_SNAPSHOTS_DIR, LSN_FORMAT_ARGS(lsn), MyProcPid); -- Best regards, Japin Li ChengDu WenWu Information Technology Co., LTD.
pgsql-hackers by date: