Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Skipping logical replication transactions on subscriber side |
Date | |
Msg-id | CAD21AoA5jupM6O=pYsyfaxQ1aMX-en8=QNgpW6KfXsg7_CS0CQ@mail.gmail.com Whole thread Raw |
In response to | Re: Skipping logical replication transactions on subscriber side (vignesh C <vignesh21@gmail.com>) |
Responses |
RE: Skipping logical replication transactions on subscriber side
Re: Skipping logical replication transactions on subscriber side Re: Skipping logical replication transactions on subscriber side Re: Skipping logical replication transactions on subscriber side RE: Skipping logical replication transactions on subscriber side RE: Skipping logical replication transactions on subscriber side Re: Skipping logical replication transactions on subscriber side |
List | pgsql-hackers |
On Mon, Nov 15, 2021 at 11:43 PM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, Nov 15, 2021 at 2:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow <gregn4422@gmail.com> wrote: > > > > > > On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > I've attached an updated patch that incorporates all comments I got so > > > > far. Please review it. > > > > > > > > > > Thanks for the updated patch. > > > A few minor comments: > > > > > > doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml > > > > > > (1) tab in doc updates > > > > > > There's a tab before "Otherwise,": > > > > > > + copy of the relation with <parameter>relid</parameter>. > > > Otherwise, > > > > Fixed. > > > > > > > > src/backend/utils/adt/pgstatfuncs.c > > > > > > (2) The function comment for "pg_stat_reset_subscription_worker_sub" > > > seems a bit long and I expected it to be multi-line (did you run > > > pg_indent?) > > > > I ran pg_indent on pgstatfuncs.c but it didn't become a multi-line comment. > > > > > > > > src/include/pgstat.h > > > > > > (3) Remove PgStat_StatSubWorkerEntry.dbid? > > > > > > The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't > > > seem to be used, so I think it should be removed. > > > (I could remove it and everything builds OK and tests pass). > > > > > > > Fixed. > > > > Thank you for the comments! I've updated an updated version patch. > > Thanks for the updated patch. > I found one issue: > This Assert can fail in few cases: > +void > +pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid, > + > LogicalRepMsgType command, TransactionId xid, > + const char *errmsg) > +{ > + PgStat_MsgSubWorkerError msg; > + int len; > + > + Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN); > + len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + > strlen(errmsg) + 1; > + > > I could reproduce the problem with the following scenario: > Publisher: > create table t1 (c1 varchar); > create publication pub1 for table t1; > insert into t1 values(repeat('abcd', 5000)); > > Subscriber: > create table t1(c1 smallint); > create subscription sub1 connection 'dbname=postgres port=5432' > publication pub1 with ( two_phase = true); > postgres=# select * from pg_stat_subscription_workers; > WARNING: terminating connection because of crash of another server process > DETAIL: The postmaster has commanded this server process to roll back > the current transaction and exit, because another server process > exited abnormally and possibly corrupted shared memory. > HINT: In a moment you should be able to reconnect to the database and > repeat your command. > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > The connection to the server was lost. Attempting reset: Failed. > > Subscriber logs: > 2021-11-15 19:27:56.380 IST [15685] LOG: logical replication apply > worker for subscription "sub1" has started > 2021-11-15 19:27:56.384 IST [15687] LOG: logical replication table > synchronization worker for subscription "sub1", table "t1" has started > TRAP: FailedAssertion("strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN", > File: "pgstat.c", Line: 1946, PID: 15687) > postgres: logical replication worker for subscription 16387 sync 16384 > (ExceptionalCondition+0xd0)[0x55a18f3c727f] > postgres: logical replication worker for subscription 16387 sync 16384 > (pgstat_report_subworker_error+0x7a)[0x55a18f126417] > postgres: logical replication worker for subscription 16387 sync 16384 > (ApplyWorkerMain+0x493)[0x55a18f176611] > postgres: logical replication worker for subscription 16387 sync 16384 > (StartBackgroundWorker+0x23c)[0x55a18f11f7e2] > postgres: logical replication worker for subscription 16387 sync 16384 > (+0x54efc0)[0x55a18f134fc0] > postgres: logical replication worker for subscription 16387 sync 16384 > (+0x54f3af)[0x55a18f1353af] > postgres: logical replication worker for subscription 16387 sync 16384 > (+0x54e338)[0x55a18f134338] > /lib/x86_64-linux-gnu/libpthread.so.0(+0x141f0)[0x7feef84371f0] > /lib/x86_64-linux-gnu/libc.so.6(__select+0x57)[0x7feef81e3ac7] > postgres: logical replication worker for subscription 16387 sync 16384 > (+0x5498c2)[0x55a18f12f8c2] > postgres: logical replication worker for subscription 16387 sync 16384 > (PostmasterMain+0x134c)[0x55a18f12f1dd] > postgres: logical replication worker for subscription 16387 sync 16384 > (+0x43c3d4)[0x55a18f0223d4] > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xd5)[0x7feef80fd565] > postgres: logical replication worker for subscription 16387 sync 16384 > (_start+0x2e)[0x55a18ecaf4fe] > 2021-11-15 19:27:56.483 IST [15645] LOG: background worker "logical > replication worker" (PID 15687) was terminated by signal 6: Aborted > 2021-11-15 19:27:56.483 IST [15645] LOG: terminating any other active > server processes > 2021-11-15 19:27:56.485 IST [15645] LOG: all server processes > terminated; reinitializing > > Here it fails because of a long error message ""invalid input syntax > for type smallint: Good catch! > \"abcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabc...." > because we try to insert varchar type data into smallint type. Maybe > we should trim the error message in this case. Right. I've fixed this issue and attached an updated patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Attachment
pgsql-hackers by date: