TRUNCATE is made to wait when logical synchronous replication - Mailing list pgsql-bugs
From | Keisuke Kuroda |
---|---|
Subject | TRUNCATE is made to wait when logical synchronous replication |
Date | |
Msg-id | CANDwggJ5Q2mFFGNfpyAxhci4qL87C4ch-S9+H68pLFfgB6aA6A@mail.gmail.com |
List | pgsql-bugs |
Hi hackers.

I have found a problem where TRUNCATE is made to wait when logical replication is synchronous.
It looks like a deadlock between the AccessExclusiveLock (TRUNCATE) and the AccessShareLock (walsender) on the PRIMARY KEY.

# version
master-HEAD(8f9b6d40570bd8991f18a089a8445cc5275c1f49)
It occurs in PG11 or later.

# testcase
I attach 010_truncate_sync.patch, which adds a test case (TRUNCATE under synchronous replication).

# backtrace(walsender)
The following is the walsender process backtrace.
'send_relation_and_attrs' tried to read the schema info (primary key), but TRUNCATE had already locked the primary key.

#0  0x00007f038bd96463 in __epoll_wait_nocancel () from /lib64/libc.so.6
#1  0x00000000008f9307 in WaitEventSetWaitBlock (set=0x1399f60, cur_timeout=-1, occurred_events=0x7fff4eb89740, nevents=1) at latch.c:1294
#2  0x00000000008f91e3 in WaitEventSetWait (set=0x1399f60, timeout=-1, occurred_events=0x7fff4eb89740, nevents=1, wait_event_info=50331648) at latch.c:1246
#3  0x00000000008f88b9 in WaitLatchOrSocket (latch=0x7f03856bc094, wakeEvents=33, sock=-1, timeout=-1, wait_event_info=50331648) at latch.c:428
#4  0x00000000008f872c in WaitLatch (latch=0x7f03856bc094, wakeEvents=33, timeout=0, wait_event_info=50331648) at latch.c:368
#5  0x000000000091e655 in ProcSleep (locallock=0x12f9e38, lockMethodTable=0xc657c0 <default_lockmethod>) at proc.c:1292
#6  0x000000000090d7bc in WaitOnLock (locallock=0x12f9e38, owner=0x1307100) at lock.c:1859
#7  0x000000000090c3ff in LockAcquireExtended (locktag=0x7fff4eb89bb0, lockmode=1, sessionLock=false, dontWait=false, reportMemoryError=true, locallockp=0x7fff4eb89ba8) at lock.c:1101
#8  0x0000000000909a26 in LockRelationOid (relid=16387, lockmode=1) at lmgr.c:116
#9  0x0000000000491a00 in relation_open (relationId=16387, lockmode=1) at relation.c:56
#10 0x000000000050e9bf in index_open (relationId=16387, lockmode=1) at indexam.c:136
#11 0x0000000000a9cca8 in RelationGetIndexAttrBitmap (relation=0x7f038cb62528, attrKind=INDEX_ATTR_BITMAP_IDENTITY_KEY) at relcache.c:5018
#12 0x00000000008a03df in logicalrep_write_attrs (out=0x1385f80, rel=0x7f038cb62528) at proto.c:580
#13 0x000000000089fcb2 in logicalrep_write_rel (out=0x1385f80, rel=0x7f038cb62528) at proto.c:368
#14 0x00007f038500409e in send_relation_and_attrs (relation=0x7f038cb62528, ctx=0x13003d0) at pgoutput.c:352
#15 0x00007f0385003f9a in maybe_send_schema (ctx=0x13003d0, relation=0x7f038cb62528, relentry=0x139da20) at pgoutput.c:315
#16 0x00007f038500462f in pgoutput_truncate (ctx=0x13003d0, txn=0x139fc08, nrelations=1, relations=0x13a9d98, change=0x13b0038) at pgoutput.c:511
#17 0x000000000089b47e in truncate_cb_wrapper (cache=0x1391c90, txn=0x139fc08, nrelations=1, relations=0x13a9d98, change=0x13b0038) at logical.c:797
#18 0x00000000008a443d in ReorderBufferCommit (rb=0x1391c90, xid=519, commit_lsn=23097744, end_lsn=23097984, commit_time=647075285768993, origin_id=0, origin_lsn=0) at reorderbuffer.c:1751
#19 0x00000000008974d5 in DecodeCommit (ctx=0x13003d0, buf=0x7fff4eb8a2d0, parsed=0x7fff4eb8a130, xid=519) at decode.c:637
#20 0x00000000008969c5 in DecodeXactOp (ctx=0x13003d0, buf=0x7fff4eb8a2d0) at decode.c:245
#21 0x0000000000896697 in LogicalDecodingProcessRecord (ctx=0x13003d0, record=0x13006d0) at decode.c:114
#22 0x00000000008c8d7c in XLogSendLogical () at walsender.c:2871
#23 0x00000000008c80b4 in WalSndLoop (send_data=0x8c8ce0 <XLogSendLogical>) at walsender.c:2298
#24 0x00000000008c6b22 in StartLogicalReplication (cmd=0x1351eb8) at walsender.c:1214
#25 0x00000000008c73b5 in exec_replication_command (cmd_string=0x12db4f0 "START_REPLICATION SLOT \"sub1\" LOGICAL 0/0 (proto_version '1', publication_names '\"pub1\"')") at walsender.c:1641
#26 0x000000000092c1ef in PostgresMain (argc=1, argv=0x1306548, dbname=0x1306460 "postgres", username=0x12d80e8 "postgres") at postgres.c:4311
#27 0x000000000087c6eb in BackendRun (port=0x12fe1f0) at postmaster.c:4523
#28 0x000000000087bedb in BackendStartup (port=0x12fe1f0) at postmaster.c:4215
#29 0x00000000008784e8 in ServerLoop () at postmaster.c:1727
#30 0x0000000000877dbf in PostmasterMain (argc=4, argv=0x12d5f90) at postmaster.c:1400
#31 0x000000000077f26d in main (argc=4, argv=0x12d5f90) at main.c:209

This problem does not occur if the publisher table is set to REPLICA IDENTITY FULL.

---
$node_publisher->safe_psql('postgres',
	"ALTER TABLE tab1 REPLICA IDENTITY FULL");
---

I think there is no need to send the full schema info for TRUNCATE.

-- 
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3862@gmail.com
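
The wait reported in this message can be pictured as a two-process wait cycle. The sketch below is a toy Python model, not PostgreSQL code. It rests on two assumptions that the report itself does not spell out: that the TRUNCATE backend keeps its AccessExclusiveLock while it waits for the synchronous replication acknowledgement, and that with REPLICA IDENTITY FULL the walsender does not need to lock the primary key index. The names `truncate_backend`, `walsender`, `pk_index`, `build_waits`, and `find_cycle` are hypothetical labels chosen for illustration.

```python
# Toy model only: this is NOT PostgreSQL code. Lock mode names mirror the report.
from typing import Dict, List, Optional

# Minimal conflict table covering just the two lock modes named in the report.
CONFLICTS = {
    ("AccessExclusiveLock", "AccessShareLock"),
    ("AccessShareLock", "AccessExclusiveLock"),
    ("AccessExclusiveLock", "AccessExclusiveLock"),
}


def build_waits(replica_identity_full: bool) -> Dict[str, str]:
    """Return a wait-for map {waiter: process it waits on} for the scenario."""
    # TRUNCATE holds AccessExclusiveLock on the primary key index.
    held = {"pk_index": ("truncate_backend", "AccessExclusiveLock")}
    # Assumption: the synchronous commit waits until the walsender has
    # shipped the transaction and the subscriber has confirmed it.
    waits = {"truncate_backend": "walsender"}
    if not replica_identity_full:
        # The walsender asks for AccessShareLock on the index (frames #8-#11
        # of the backtrace) and blocks if the held mode conflicts.
        holder, mode = held["pk_index"]
        if (mode, "AccessShareLock") in CONFLICTS:
            waits["walsender"] = holder
    return waits


def find_cycle(waits_for: Dict[str, str]) -> Optional[List[str]]:
    """Return the processes that form a wait cycle, or None if there is none."""
    for start in waits_for:
        path = [start]
        node = waits_for.get(start)
        while node is not None and node not in path:
            path.append(node)
            node = waits_for.get(node)
        if node is not None:
            return path[path.index(node):]
    return None


if __name__ == "__main__":
    print(find_cycle(build_waits(replica_identity_full=False)))
    print(find_cycle(build_waits(replica_identity_full=True)))
```

In this model the default replica identity produces a cycle between the two processes, while REPLICA IDENTITY FULL leaves the walsender free to proceed, which matches the behaviour described in the report. One of the two edges is a replication wait rather than a lock wait, which would explain why the situation shows up as an indefinite wait instead of a reported deadlock; that reading is an interpretation, not something the report states.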