Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Skipping logical replication transactions on subscriber side |
Date | |
Msg-id | CALDaNm3RRM-0FajnQ+QXGTEPckXaoAbNCWhTANd3_dRKYki2dw@mail.gmail.com Whole thread Raw |
In response to | Re: Skipping logical replication transactions on subscriber side (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Skipping logical replication transactions on subscriber side
|
List | pgsql-hackers |
On Mon, Nov 15, 2021 at 2:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > > On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > I've attached an updated patch that incorporates all comments I got so > > > far. Please review it. > > > > > > > Thanks for the updated patch. > > A few minor comments: > > > > doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml > > > > (1) tab in doc updates > > > > There's a tab before "Otherwise,": > > > > + copy of the relation with <parameter>relid</parameter>. > > Otherwise, > > Fixed. > > > > > src/backend/utils/adt/pgstatfuncs.c > > > > (2) The function comment for "pg_stat_reset_subscription_worker_sub" > > seems a bit long and I expected it to be multi-line (did you run > > pg_indent?) > > I ran pg_indent on pgstatfuncs.c but it didn't become a multi-line comment. > > > > > src/include/pgstat.h > > > > (3) Remove PgStat_StatSubWorkerEntry.dbid? > > > > The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't > > seem to be used, so I think it should be removed. > > (I could remove it and everything builds OK and tests pass). > > > > Fixed. > > Thank you for the comments! I've updated an updated version patch. Thanks for the updated patch. I found one issue: This Assert can fail in few cases: +void +pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid, + LogicalRepMsgType command, TransactionId xid, + const char *errmsg) +{ + PgStat_MsgSubWorkerError msg; + int len; + + Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN); + len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1; + I could reproduce the problem with the following scenario: Publisher: create table t1 (c1 varchar); create publication pub1 for table t1; insert into t1 values(repeat('abcd', 5000)); Subscriber: create table t1(c1 smallint); create subscription sub1 connection 'dbname=postgres port=5432' publication pub1 with ( two_phase = true); postgres=# select * from pg_stat_subscription_workers; WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. Subscriber logs: 2021-11-15 19:27:56.380 IST [15685] LOG: logical replication apply worker for subscription "sub1" has started 2021-11-15 19:27:56.384 IST [15687] LOG: logical replication table synchronization worker for subscription "sub1", table "t1" has started TRAP: FailedAssertion("strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN", File: "pgstat.c", Line: 1946, PID: 15687) postgres: logical replication worker for subscription 16387 sync 16384 (ExceptionalCondition+0xd0)[0x55a18f3c727f] postgres: logical replication worker for subscription 16387 sync 16384 (pgstat_report_subworker_error+0x7a)[0x55a18f126417] postgres: logical replication worker for subscription 16387 sync 16384 (ApplyWorkerMain+0x493)[0x55a18f176611] postgres: logical replication worker for subscription 16387 sync 16384 (StartBackgroundWorker+0x23c)[0x55a18f11f7e2] postgres: logical replication worker for subscription 16387 sync 16384 (+0x54efc0)[0x55a18f134fc0] postgres: logical replication worker for subscription 16387 sync 16384 (+0x54f3af)[0x55a18f1353af] postgres: logical replication worker for subscription 16387 sync 16384 (+0x54e338)[0x55a18f134338] /lib/x86_64-linux-gnu/libpthread.so.0(+0x141f0)[0x7feef84371f0] /lib/x86_64-linux-gnu/libc.so.6(__select+0x57)[0x7feef81e3ac7] postgres: logical replication worker for subscription 16387 sync 16384 (+0x5498c2)[0x55a18f12f8c2] postgres: logical replication worker for subscription 16387 sync 16384 (PostmasterMain+0x134c)[0x55a18f12f1dd] postgres: logical replication worker for subscription 16387 sync 16384 (+0x43c3d4)[0x55a18f0223d4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xd5)[0x7feef80fd565] postgres: logical replication worker for subscription 16387 sync 16384 (_start+0x2e)[0x55a18ecaf4fe] 2021-11-15 19:27:56.483 IST [15645] LOG: background worker "logical replication worker" (PID 15687) was terminated by signal 6: Aborted 2021-11-15 19:27:56.483 IST [15645] LOG: terminating any other active server processes 2021-11-15 19:27:56.485 IST [15645] LOG: all server processes terminated; reinitializing Here it fails because of a long error message ""invalid input syntax for type smallint: \"abcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabc...." because we try to insert varchar type data into smallint type. Maybe we should trim the error message in this case. Regards, Vignesh
pgsql-hackers by date: