TRUNCATE is made to wait when logical synchronous replication - Mailing list pgsql-bugs

From Keisuke Kuroda
Subject TRUNCATE is made to wait when logical synchronous replication
Date
Msg-id CANDwggJ5Q2mFFGNfpyAxhci4qL87C4ch-S9+H68pLFfgB6aA6A@mail.gmail.com
List pgsql-bugs
Hi hackers.

I have found a problem where TRUNCATE is made to wait when synchronous
logical replication is in use.
It looks like a deadlock between the AccessExclusiveLock held by TRUNCATE
and the AccessShareLock requested by the walsender on the PRIMARY KEY index.
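
As a rough illustration of the suspected cycle, the following Python sketch
models the two processes as a wait-for graph. It is a toy model, not
PostgreSQL code: the process names and the find_cycle helper are hypothetical,
and it assumes the TRUNCATE backend keeps its locks while it waits for the
synchronous standby to acknowledge the commit.

```python
# Toy model of the suspected wait cycle; this is not PostgreSQL code.
# Process names and the edge list are illustrative assumptions.

def find_cycle(waits_for):
    """Return the nodes of a cycle in a wait-for graph, or None."""
    for start in waits_for:
        path = [start]
        seen = {start}
        node = waits_for.get(start)
        while node is not None:
            if node == start:
                return path
            if node in seen:
                break
            seen.add(node)
            path.append(node)
            node = waits_for.get(node)
    return None

# Each process waits for at most one other process in this model.
waits_for = {
    # The TRUNCATE backend holds AccessExclusiveLock on the table and its
    # primary key index, and waits at commit for the acknowledgement from
    # the synchronous subscriber, which has to pass through the walsender.
    "truncate_backend": "walsender",
    # The walsender decodes the TRUNCATE and requests AccessShareLock on
    # the primary key index, which the backend still holds.
    "walsender": "truncate_backend",
}

cycle = find_cycle(waits_for)
print(cycle)
```

One of the two edges is a synchronous-replication wait rather than a lock
wait, which would presumably explain why this shows up as an indefinite wait
and not as a reported deadlock error.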

# version

master-HEAD(8f9b6d40570bd8991f18a089a8445cc5275c1f49)
It occurs in PG11 or later.

# testcase

I have attached 010_truncate_sync.patch, which adds a test case
(TRUNCATE under synchronous replication).

# backtrace(walsender)

The following is the backtrace of the walsender process.
'send_relation_and_attrs' tried to read the schema information (primary key),
but TRUNCATE had already locked the primary key.

#0  0x00007f038bd96463 in __epoll_wait_nocancel () from /lib64/libc.so.6
#1  0x00000000008f9307 in WaitEventSetWaitBlock (set=0x1399f60,
cur_timeout=-1, occurred_events=0x7fff4eb89740, nevents=1) at
latch.c:1294
#2  0x00000000008f91e3 in WaitEventSetWait (set=0x1399f60, timeout=-1,
occurred_events=0x7fff4eb89740, nevents=1,
wait_event_info=50331648) at latch.c:1246
#3  0x00000000008f88b9 in WaitLatchOrSocket (latch=0x7f03856bc094,
wakeEvents=33, sock=-1, timeout=-1, wait_event_info=50331648) at
latch.c:428
#4  0x00000000008f872c in WaitLatch (latch=0x7f03856bc094,
wakeEvents=33, timeout=0, wait_event_info=50331648) at latch.c:368
#5  0x000000000091e655 in ProcSleep (locallock=0x12f9e38,
lockMethodTable=0xc657c0 <default_lockmethod>) at proc.c:1292
#6  0x000000000090d7bc in WaitOnLock (locallock=0x12f9e38,
owner=0x1307100) at lock.c:1859
#7  0x000000000090c3ff in LockAcquireExtended (locktag=0x7fff4eb89bb0,
lockmode=1, sessionLock=false, dontWait=false, reportMemoryError=true,
    locallockp=0x7fff4eb89ba8) at lock.c:1101
#8  0x0000000000909a26 in LockRelationOid (relid=16387, lockmode=1) at
lmgr.c:116
#9  0x0000000000491a00 in relation_open (relationId=16387, lockmode=1)
at relation.c:56
#10 0x000000000050e9bf in index_open (relationId=16387, lockmode=1) at
indexam.c:136
#11 0x0000000000a9cca8 in RelationGetIndexAttrBitmap
(relation=0x7f038cb62528, attrKind=INDEX_ATTR_BITMAP_IDENTITY_KEY) at
relcache.c:5018
#12 0x00000000008a03df in logicalrep_write_attrs (out=0x1385f80,
rel=0x7f038cb62528) at proto.c:580
#13 0x000000000089fcb2 in logicalrep_write_rel (out=0x1385f80,
rel=0x7f038cb62528) at proto.c:368
#14 0x00007f038500409e in send_relation_and_attrs
(relation=0x7f038cb62528, ctx=0x13003d0) at pgoutput.c:352
#15 0x00007f0385003f9a in maybe_send_schema (ctx=0x13003d0,
relation=0x7f038cb62528, relentry=0x139da20) at pgoutput.c:315
#16 0x00007f038500462f in pgoutput_truncate (ctx=0x13003d0,
txn=0x139fc08, nrelations=1, relations=0x13a9d98, change=0x13b0038) at
pgoutput.c:511
#17 0x000000000089b47e in truncate_cb_wrapper (cache=0x1391c90,
txn=0x139fc08, nrelations=1, relations=0x13a9d98, change=0x13b0038) at
logical.c:797
#18 0x00000000008a443d in ReorderBufferCommit (rb=0x1391c90, xid=519,
commit_lsn=23097744, end_lsn=23097984, commit_time=647075285768993,
origin_id=0, origin_lsn=0)
    at reorderbuffer.c:1751
#19 0x00000000008974d5 in DecodeCommit (ctx=0x13003d0,
buf=0x7fff4eb8a2d0, parsed=0x7fff4eb8a130, xid=519) at decode.c:637
#20 0x00000000008969c5 in DecodeXactOp (ctx=0x13003d0,
buf=0x7fff4eb8a2d0) at decode.c:245
#21 0x0000000000896697 in LogicalDecodingProcessRecord (ctx=0x13003d0,
record=0x13006d0) at decode.c:114
#22 0x00000000008c8d7c in XLogSendLogical () at walsender.c:2871
#23 0x00000000008c80b4 in WalSndLoop (send_data=0x8c8ce0
<XLogSendLogical>) at walsender.c:2298
#24 0x00000000008c6b22 in StartLogicalReplication (cmd=0x1351eb8) at
walsender.c:1214
#25 0x00000000008c73b5 in exec_replication_command (
    cmd_string=0x12db4f0 "START_REPLICATION SLOT \"sub1\" LOGICAL 0/0
(proto_version '1', publication_names '\"pub1\"')") at
walsender.c:1641
#26 0x000000000092c1ef in PostgresMain (argc=1, argv=0x1306548,
dbname=0x1306460 "postgres", username=0x12d80e8 "postgres") at
postgres.c:4311
#27 0x000000000087c6eb in BackendRun (port=0x12fe1f0) at postmaster.c:4523
#28 0x000000000087bedb in BackendStartup (port=0x12fe1f0) at postmaster.c:4215
#29 0x00000000008784e8 in ServerLoop () at postmaster.c:1727
#30 0x0000000000877dbf in PostmasterMain (argc=4, argv=0x12d5f90) at
postmaster.c:1400
#31 0x000000000077f26d in main (argc=4, argv=0x12d5f90) at main.c:209
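
To help read frames #7 to #10 above, the sketch below decodes the numeric
lockmode values and checks which modes conflict. The numbering follows
PostgreSQL's lockdefs.h and the conflict sets follow the table-level lock
conflict table in the documentation; the conflicts helper itself is a
hypothetical name used only for this illustration.

```python
# Lock mode numbers as defined in PostgreSQL's lockdefs.h.
LOCK_MODES = {
    1: "AccessShareLock",
    2: "RowShareLock",
    3: "RowExclusiveLock",
    4: "ShareUpdateExclusiveLock",
    5: "ShareLock",
    6: "ShareRowExclusiveLock",
    7: "ExclusiveLock",
    8: "AccessExclusiveLock",
}

# For each requested mode, the set of held modes it conflicts with,
# transcribed from the documented table-level lock conflict table.
CONFLICTS = {
    1: {8},
    2: {7, 8},
    3: {5, 6, 7, 8},
    4: {4, 5, 6, 7, 8},
    5: {3, 4, 6, 7, 8},
    6: {3, 4, 5, 6, 7, 8},
    7: {2, 3, 4, 5, 6, 7, 8},
    8: {1, 2, 3, 4, 5, 6, 7, 8},
}

def conflicts(requested, held):
    """Return True if a request for `requested` must wait behind `held`."""
    return held in CONFLICTS[requested]

requested = 1  # lockmode=1 in index_open(relationId=16387, ...)
held = 8       # AccessExclusiveLock, as taken by TRUNCATE

print(LOCK_MODES[requested], "vs", LOCK_MODES[held],
      "->", conflicts(requested, held))
```

AccessShareLock is the weakest mode and conflicts only with
AccessExclusiveLock, so the walsender's request on the index can be blocked
only by something like the TRUNCATE that is still waiting to commit.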



This problem does not occur when the publisher table is set to REPLICA IDENTITY FULL.
---
$node_publisher->safe_psql('postgres',
  "ALTER TABLE tab1 REPLICA IDENTITY FULL");
---
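
A possible explanation for this difference, sketched as a toy Python model:
in the backtrace, logicalrep_write_attrs reaches index_open through
RelationGetIndexAttrBitmap with INDEX_ATTR_BITMAP_IDENTITY_KEY. The sketch
assumes that this lookup is skipped when the replica identity is FULL, so the
walsender never requests a lock on the index. That assumption, the function
names, and the index name tab1_pkey are all hypothetical and are not confirmed
in this report.

```python
# Toy model, not PostgreSQL code. Assumption: the identity-key bitmap (and
# therefore the index open seen in frames #10 and #11) is only needed when
# the replica identity is not FULL.

def relations_locked_by_walsender(replica_identity, pkey_index="tab1_pkey"):
    """Return the indexes the walsender would open to describe the table.

    `pkey_index` is a hypothetical index name used only for illustration.
    """
    if replica_identity == "FULL":
        # Every column is part of the identity, so no index lookup is needed.
        return []
    # Otherwise the identity index is opened with AccessShareLock.
    return [pkey_index]

def truncate_blocks(replica_identity):
    """Return True if the walsender would wait behind TRUNCATE's lock."""
    # TRUNCATE holds AccessExclusiveLock on the index until its commit is
    # acknowledged, so any index the walsender must open makes it wait.
    return len(relations_locked_by_walsender(replica_identity)) > 0

for identity in ("DEFAULT", "FULL"):
    print(identity, "->", "waits" if truncate_blocks(identity) else "no wait")
```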

I think there is no need to send the full schema information for TRUNCATE.


-- 
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3862@gmail.com
