Thread: Duplicated LSN in ReorderBuffer
Hi hackers,
I believe we found a bug in logical decoding. It only occurs with
cassert enabled. It was originally discovered and reproduced by Murat
Kabilov and Ildus Kurbangaliev. Here is the stack trace we've got:
#0 0x00007facc66ef82f in raise () from /usr/lib/libc.so.6
#1 0x00007facc66da672 in abort () from /usr/lib/libc.so.6
#2 0x0000000000ac4ebf in ExceptionalCondition (
conditionName=0xccdea8 "!(prev_first_lsn < cur_txn->first_lsn)",
errorType=0xccdce4 "FailedAssertion", fileName=0xccdd38 "reorderbuffer.c",
lineNumber=680) at assert.c:54
#3 0x00000000008a9515 in AssertTXNLsnOrder (rb=0x25ca128) at reorderbuffer.c:680
#4 0x00000000008a900f in ReorderBufferTXNByXid (rb=0x25ca128, xid=65609, create=true,
is_new=0x0, lsn=211590864, create_as_top=true) at reorderbuffer.c:559
#5 0x00000000008abf0d in ReorderBufferAddNewTupleCids (rb=0x25ca128, xid=65609,
lsn=211590864, node=..., tid=..., cmin=0, cmax=4294967295, combocid=4294967295)
at reorderbuffer.c:2098
#6 0x00000000008b096b in SnapBuildProcessNewCid (builder=0x25d0158, xid=65610,
lsn=211590864, xlrec=0x25d60b8) at snapbuild.c:781
#7 0x000000000089d01c in DecodeHeap2Op (ctx=0x25ba0b8, buf=0x7ffd0e294da0) at decode.c:382
#8 0x000000000089c8ca in LogicalDecodingProcessRecord (ctx=0x25ba0b8, record=0x25ba378)
at decode.c:125
#9 0x00000000008a124c in DecodingContextFindStartpoint (ctx=0x25ba0b8) at logical.c:492
#10 0x00000000008b9c3d in CreateReplicationSlot (cmd=0x257be20) at walsender.c:957
#11 0x00000000008baa60 in exec_replication_command (
cmd_string=0x24f5b08 "CREATE_REPLICATION_SLOT temp_slot_name TEMPORARY LOGICAL pgoutput USE_SNAPSHOT") at walsender.c:1531
#12 0x0000000000937230 in PostgresMain (argc=1, argv=0x25233a8, dbname=0x2523380 "postgres",
username=0x24f23c8 "zilder") at postgres.c:4245
#13 0x0000000000881453 in BackendRun (port=0x251a900) at postmaster.c:4431
#14 0x0000000000880b4f in BackendStartup (port=0x251a900) at postmaster.c:4122
#15 0x000000000087cbbe in ServerLoop () at postmaster.c:1704
#16 0x000000000087c34a in PostmasterMain (argc=3, argv=0x24f0330) at postmaster.c:1377
#17 0x00000000007926b6 in main (argc=3, argv=0x24f0330) at main.c:228
After inspecting the core dump we see that
`prev_first_lsn == cur_txn->first_lsn`.
The problem seems to be that ReorderBuffer adds two ReorderBufferTXNs
with the same LSN, but different transaction ids: subxid and top-level
xid. See FIX part below.
STEPS TO REPRODUCE
------------------
We were able to reproduce it on 10, 11 and on the master branch. Postgres was
configured as:
./configure --enable-cassert CFLAGS='-ggdb3 -O0' --prefix=$HOME/pg12
Additional options in postgresql.conf:
wal_level='logical'
max_connections=1000
max_replication_slots=100
max_wal_senders=100
max_logical_replication_workers=100
pgbench scripts:
$ cat create_table.sql
BEGIN;
SAVEPOINT p1;
CREATE temp TABLE t_t (id INT) ON COMMIT DROP;
ROLLBACK TO SAVEPOINT p1;
ROLLBACK;
$ cat create_slot.sql
BEGIN ISOLATION LEVEL REPEATABLE READ READ ONLY;
SELECT pg_create_logical_replication_slot('test' || pg_backend_pid(), 'pgoutput', true);
SELECT pg_drop_replication_slot('test' || pg_backend_pid());
ROLLBACK;
Run in parallel terminals:
$HOME/pg12/bin/pgbench postgres -f create_table.sql -T1000 -c50 -j50
$HOME/pg12/bin/pgbench postgres -f create_slot.sql -T1000 -c50 -j50
It may take some time; on my local machine it breaks within a few seconds.
FIX?
----
I can't say that I have a deep understanding of what's going on in the
logical decoding code. But the one thing I've noticed is an inconsistency
in the xids used for the ReorderBufferTXNByXid() calls:
1. first, in DecodeHeap2Op(), ReorderBufferProcessXid() is called
with the subtransaction id; it actually creates a ReorderBufferTXN
and adds it to the reorder buffer's hash table and toplevel_by_lsn list;
2. second, ReorderBufferXidSetCatalogChanges() uses the same
subxid to look up the ReorderBufferTXN that was created before,
successfully;
3. now ReorderBufferAddNewTupleCids() instead uses the top-level
transaction id for the lookup; it cannot find that xid in the hash
table and tries to add a new entry with the same LSN. That fails,
since this LSN is already in the toplevel_by_lsn list.
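To make the three steps above concrete, here is a toy model of the bookkeeping involved. This is plain Python with made-up names, not the actual reorderbuffer.c logic: a buffer keyed by xid that also keeps a toplevel_by_lsn list with the strictly-increasing-LSN invariant that AssertTXNLsnOrder enforces.

```python
# Toy model of the ReorderBuffer bookkeeping described above.
# All names and data structures are illustrative, not PostgreSQL code.

class ToyReorderBuffer:
    def __init__(self):
        self.by_xid = {}            # xid -> first_lsn (stands in for the hash table)
        self.toplevel_by_lsn = []   # first_lsn of each "top-level" entry, in order

    def txn_by_xid(self, xid, lsn):
        if xid not in self.by_xid:
            # Invariant checked by AssertTXNLsnOrder: first_lsn must
            # strictly increase along toplevel_by_lsn.
            assert all(prev < lsn for prev in self.toplevel_by_lsn), \
                "duplicate first_lsn in toplevel_by_lsn"
            self.by_xid[xid] = lsn
            self.toplevel_by_lsn.append(lsn)
        return self.by_xid[xid]

rb = ToyReorderBuffer()
rb.txn_by_xid(65610, 211590864)      # step 1: entry created with the subxid
rb.txn_by_xid(65610, 211590864)      # step 2: lookup with the same subxid succeeds
try:
    rb.txn_by_xid(65609, 211590864)  # step 3: top-level xid, same LSN
    failed = False
except AssertionError:
    failed = True
print("assertion failed:", failed)   # prints: assertion failed: True
```

In this simplified model, step 3 trips exactly because the subxid's entry already claimed that LSN as a "top-level" first_lsn.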
Attached is a simple patch that uses the subxid instead of the top-level
xid in the ReorderBufferAddNewTupleCids() call. It seems to fix the bug,
but I'm not sure that this is a valid change. Can someone please verify
it or maybe suggest a better solution for the issue?
Best regards,
Ildar
Attachment
On Wed, Jun 26, 2019 at 2:46 AM Ildar Musin <ildar@adjust.com> wrote:
> Attached is a simple patch that uses subxid instead of top-level xid
> in ReorderBufferAddNewTupleCids() call. It seems to fix the bug, but
> i'm not sure that this is a valid change. Can someone please verify it
> or maybe suggest a better solution for the issue?

Hello Ildar,

I hope someone more familiar with this code than me can comment, but
while going through the Commitfest CI results I saw this segfault with
your patch:

https://travis-ci.org/postgresql-cfbot/postgresql/builds/555184304

At a glance, HistoricSnapshotGetTupleCids() returned NULL in
HeapTupleSatisfiesHistoricMVCC(), so ResolveCminCmaxDuringDecoding()
blew up.

--
Thomas Munro
https://enterprisedb.com
On Mon, Jul 8, 2019 at 9:00 AM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Wed, Jun 26, 2019 at 2:46 AM Ildar Musin <ildar@adjust.com> wrote:
> > Attached is a simple patch that uses subxid instead of top-level xid
> > in ReorderBufferAddNewTupleCids() call. It seems to fix the bug, but
> > i'm not sure that this is a valid change. Can someone please verify it
> > or maybe suggest a better solution for the issue?

I've reproduced this issue with the script Ildar provided. I haven't
found the root cause yet, and I'm not sure the patch takes a correct way
to fix this.

In my environment, I got the following pg_waldump output, and the
logical decoding failed at 0/019FA058 when processing NEW_CID.

rmgr: Transaction len (rec/tot): 38/38, tx: 1999, lsn: 0/019F9E80, prev 0/019F9E38,
  desc: ASSIGNMENT xtop 1998: subxacts: 1999
rmgr: Standby len (rec/tot): 405/405, tx: 0, lsn: 0/019F9EA8, prev 0/019F9E80,
  desc: RUNNING_XACTS nextXid 2000 latestCompletedXid 1949 oldestRunningXid 1836;
  48 xacts: 1990 1954 1978 1850 1944 1972 1940 1924 1906 1970 1985 1998 1966 1987
  1975 1858 1914 1982 1958 1840 1920 1926 1992 1962 1910 1950 1874 1928 1974 1968
  1946 1912 1918 1996 1922 1930 1964 1952 1994 1934 1980 1836 1984 1960 1956 1916
  1908 1938
rmgr: Heap2 len (rec/tot): 60/60, tx: 1999, lsn: 0/019FA058, prev 0/019F9EA8,
  desc: NEW_CID rel 1663/12678/2615; tid 11/59; cmin: 0, cmax: 4294967295,
  combo: 4294967295
rmgr: Heap len (rec/tot): 127/127, tx: 1999, lsn: 0/019FA098, prev 0/019FA058,
  desc: INSERT off 59 flags 0x00, blkref #0: rel 1663/12678/2615 blk 11

I thought that logical decoding doesn't create a ReorderBufferTXN for
xid=1999 when processing NEW_CID, since it decodes the ASSIGNMENT of
xid=1999 beforehand. But what actually happens is that it skips NEW_CID
while the state of the snapshot builder is still
SNAPBUILD_BUILDING_SNAPSHOT; the state then becomes
SNAPBUILD_FULL_SNAPSHOT when processing RUNNING_XACTS, and therefore it
creates two ReorderBufferTXN entries for xid=1999 and xid=1998 as
top-level transactions when processing NEW_CID
(ReorderBufferXidSetCatalogChanges creates xid=1999 and
ReorderBufferAddNewTupleCids creates xid=1998). And therefore it got the
assertion failure when adding the ReorderBufferTXN of xid=1998.

I'll look into this more deeply tomorrow.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
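The ordering effect described above (the assignment link skipped while the snapshot is still being built, NEW_CID processed only after RUNNING_XACTS promotes the state) can be sketched as a toy state machine. This is hypothetical Python with made-up names, not the real snapbuild.c/decode.c:

```python
# Toy replay of the WAL sequence above; states and handlers are
# simplified stand-ins for snapbuild.c / decode.c, not real code.
BUILDING = "SNAPBUILD_BUILDING_SNAPSHOT"
FULL = "SNAPBUILD_FULL_SNAPSHOT"

state = BUILDING
subxid_of = {}   # subxid -> top-level xid, filled by ASSIGNMENT
toplevel = {}    # xid -> first_lsn of entries created as "top-level"

def decode(rec, lsn, xid=None, top_xid=None):
    global state
    if rec == "ASSIGNMENT":
        if state != FULL:
            return            # skipped: the subxid -> top_xid link is lost
        subxid_of[xid] = top_xid
    elif rec == "RUNNING_XACTS":
        state = FULL          # the snapshot builder advances here
    elif rec == "NEW_CID" and state == FULL:
        # One lookup uses the record's xid (the subxid), the other the
        # top-level xid; without the earlier assignment, both entries
        # get created as if they were top-level transactions.
        toplevel.setdefault(xid, lsn)
        toplevel.setdefault(top_xid, lsn)

decode("ASSIGNMENT", 0x019F9E80, xid=1999, top_xid=1998)
decode("RUNNING_XACTS", 0x019F9EA8)
decode("NEW_CID", 0x019FA058, xid=1999, top_xid=1998)

# Two "top-level" entries now share one first_lsn -- the condition
# the AssertTXNLsnOrder assertion trips over.
assert toplevel[1999] == toplevel[1998]
assert subxid_of == {}   # the assignment was never recorded
```

The point of the sketch is only the ordering: the same three records with the assignment processed first would leave a single top-level entry.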
On Mon, Jul 8, 2019 at 11:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> [..snip..]
>
> I thought that logical decoding doesn't create a ReorderBufferTXN for
> xid=1999 when processing NEW_CID, since it decodes the ASSIGNMENT of
> xid=1999 beforehand. But what actually happens is that it skips NEW_CID
> while the state of the snapshot builder is still
> SNAPBUILD_BUILDING_SNAPSHOT; the state then becomes
> SNAPBUILD_FULL_SNAPSHOT when processing RUNNING_XACTS, and therefore it
> creates two ReorderBufferTXN entries for xid=1999 and xid=1998 as
> top-level transactions when processing NEW_CID
> (ReorderBufferXidSetCatalogChanges creates xid=1999 and
> ReorderBufferAddNewTupleCids creates xid=1998).

I think the cause of this bug would be that a ReorderBufferTXN entry of
a sub transaction is created as a top-level transaction. And this
happens because we skip decoding ASSIGNMENT while the state of the
snapshot builder is < SNAPBUILD_FULL.

@@ -778,7 +778,7 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid,
     */
    ReorderBufferXidSetCatalogChanges(builder->reorder, xid, lsn);

-   ReorderBufferAddNewTupleCids(builder->reorder, xlrec->top_xid, lsn,
+   ReorderBufferAddNewTupleCids(builder->reorder, xid, lsn,
                                 xlrec->target_node, xlrec->target_tid,
                                 xlrec->cmin, xlrec->cmax, xlrec->combocid);

The above change in the proposed patch changes SnapBuildProcessNewCid so
that it passes the sub transaction id instead of the top transaction id
to ReorderBufferAddNewTupleCids, which adds a (relfilenode, tid) ->
(cmin, cmax) mapping to the transaction. But I think the fix is not
correct, since as the comment of ReorderBufferTXN describes, the
mappings are always assigned to the top-level transaction.

In reorderbuffer.h:

/*
 * List of (relation, ctid) => (cmin, cmax) mappings for catalog tuples.
 * Those are always assigned to the toplevel transaction. (Keep track of
 * #entries to create a hash of the right size)
 */
dlist_head  tuplecids;
uint64      ntuplecids;

Instead, I wonder if we can decode ASSIGNMENT even when the state of the
snapshot builder is < SNAPBUILD_FULL_SNAPSHOT. That way, the
ReorderBufferTXN entries of both the top transaction and the sub
transaction are created properly before we decode NEW_CID. The attached
patch does that. In my environment the issue seems to be fixed, but I'm
still not confident that this is the right fix. Please review it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachment
On 2019-Jul-09, Masahiko Sawada wrote:

> I think the cause of this bug would be that a ReorderBufferTXN entry
> of sub transaction is created as top-level transaction. And this
> happens because we skip to decode ASSIGNMENT during the state of
> snapshot builder < SNAPBUILD_FULL.

That explanation seems to make sense.

> Instead, I wonder if we can decode ASSIGNMENT even when the state of
> snapshot builder < SNAPBUILD_FULL_SNAPSHOT. That way, the
> ReorderBufferTXN entries of both top transaction and sub transaction
> are created properly before we decode NEW_CID.

Yeah, that seems a sensible remediation to me. I would reduce the scope
a little bit -- only create the assignment in the BUILDING state, and
skip it in the START state. I'm not sure that it's possible to get
assignments while in START state that are significant (I'm still trying
to digest SnapBuildFindSnapshot). I would propose the attached.

Andres, do you have an opinion on this?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Hi,

Petr, Simon, see the potential issue related to fast forward at the
bottom.

On 2019-07-26 18:46:35 -0400, Alvaro Herrera wrote:
> On 2019-Jul-09, Masahiko Sawada wrote:
>
> > I think the cause of this bug would be that a ReorderBufferTXN entry
> > of sub transaction is created as top-level transaction. And this
> > happens because we skip to decode ASSIGNMENT during the state of
> > snapshot builder < SNAPBUILD_FULL.
>
> That explanation seems to make sense.

Yea. The comment that "in the assignment case we'll not decode those
xacts" is true, but it misses that we *do* currently process
XLOG_HEAP2_NEW_CID records for transactions that started before
reaching FULL_SNAPSHOT.

Thinking about it, it was not immediately clear to me that it is
necessary to process XLOG_HEAP2_NEW_CID at that stage. We only need the
cid mapping when decoding the content of the transaction that the
XLOG_HEAP2_NEW_CID record was about - which will not happen if it
started before SNAPBUILD_FULL. Except that these records also are
responsible for signalling that a transaction performed catalog
modifications (cf the ReorderBufferXidSetCatalogChanges() call), which
in turn is important for SnapBuildCommitTxn() to know whether that
transaction needs to be included in historic snapshots.

So unless I am missing something - which is entirely possible, I've had
this code thoroughly swapped out - that means that we only need to
process XLOG_HEAP2_NEW_CID < SNAPBUILD_FULL if there can be
transactions with relevant catalog changes that don't have any
invalidation messages. After thinking about it for a bit, that's not
guaranteed however. For one, even for system catalog tables, looking at
CacheInvalidateHeapTuple() et al there can be catalog modifications
that create neither a snapshot invalidation message, nor a catcache
one. There's also the remote scenario that we possibly might be only
modifying a toast relation. But more importantly, the only modified
table could be a user defined catalog table (i.e. WITH
(user_catalog_table = true)). Which in all likelihood won't cause
invalidation messages.

So I think currently it is required to process NEW_CID records -
although we don't need to actually execute the
ReorderBufferAddNewTupleCids() etc calls. Perhaps the right fix for the
future would actually be to not rely on NEW_CID for recognizing such
transactions, but instead have an xact.c marker that signals whether a
transaction performed catalog modifications. Hm, need to think more
about this.

> > Instead, I wonder if we can decode ASSIGNMENT even when the state of
> > snapshot builder < SNAPBUILD_FULL_SNAPSHOT. That way, the
> > ReorderBufferTXN entries of both top transaction and sub transaction
> > are created properly before we decode NEW_CID.
>
> Yeah, that seems a sensible remediation to me.

That does seem like a reasonable approach. I can see two alternatives:

1) Change SnapBuildProcessNewCid()'s
   ReorderBufferXidSetCatalogChanges() call to reference the toplevel
   xid. That has the disadvantage that if the subtransaction that
   performed DDL rolls back, the main transaction will still be treated
   as a catalog transaction - I have a hard time seeing that being
   common, however. That'd then also require SnapBuildProcessNewCid()
   in SNAPBUILD_FULL_SNAPSHOT to return before processing any data
   assigned to subtransactions. Which would be good, because that's
   currently unnecessarily stored in memory.

2) We could simply assign the subtransaction to the parent using
   ReorderBufferAssignChild() in SnapBuildProcessNewCid() or its
   caller. That ought to also fix the bug. It also has the advantage
   that we can save some memory in transactions that have some, but
   fewer than the ASSIGNMENT limit, subtransactions, because it allows
   us to avoid having a separate base snapshot for them
   (c.f. ReorderBufferTransferSnapToParent()). Like 1) that could be
   combined with adding an early return when < SNAPBUILD_FULL_SNAPSHOT,
   after ReorderBufferXidSetCatalogChanges(), but I don't think it'd be
   required for correctness in contrast to 1).

Both of these would have the advantage that we only would track
additional information for transactions that have modified the catalog,
whereas the proposal to process ASSIGNMENT earlier would mean that we
additionally track all transactions with more than 64 children. So
provided that I didn't mis-analyze here, I think both of my
alternatives are preferable? I think 2) is simpler?

> /*
> - * No point in doing anything yet, data could not be decoded anyway. It's
> - * ok not to call ReorderBufferProcessXid() in that case, except in the
> - * assignment case there'll not be any later records with the same xid;
> - * and in the assignment case we'll not decode those xacts.
> + * If the snapshot isn't yet fully built, we cannot decode anything, so
> + * bail out.
> + *
> + * However, it's critical to process XLOG_XACT_ASSIGNMENT records even
> + * when the snapshot is being built: it is possible to get later records
> + * that require subxids to be properly assigned.
> */

I think I would want this comment to be slightly more expansive. It's
not exactly obvious why such records would exist, at least to me. I
can't quite come up with something much shorter than the above
braindump right now however. I'll try to come up with something more
concise. Probably worthwhile to add somewhere, even if we go for one of
my alternative proposals.

This actually made me look at the nearby changes due to

commit 9c7d06d60680c7f00d931233873dee81fdb311c6
Author: Simon Riggs <simon@2ndQuadrant.com>
Date:   2018-01-17 11:38:34 +0000

    Ability to advance replication slots

and uhm, I'm not sure they're fully baked. Something like:

    /*
     * If we don't have snapshot or we are just fast-forwarding, there is no
     * point in decoding changes.
     */
    if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
        ctx->fast_forward)
        return;

    case XLOG_HEAP2_MULTI_INSERT:
        if (!ctx->fast_forward &&
            SnapBuildProcessChange(builder, xid, buf->origptr))
            DecodeMultiInsert(ctx, buf);
        break;

is not very suggestive of that (note the second check).

And related to the point of the theorizing above, I don't think
skipping XLOG_HEAP2_NEW_CID entirely when forwarding is correct. As a
NEW_CID record does not imply an invalidation message, as discussed
above, we'll afaict compute wrong snapshots when such transactions are
encountered during forwarding. And we'll then log those snapshots to
disk. Which then means the slot cannot safely be used for actual
decoding anymore - as we'll use that snapshot when starting up decoding
without fast forwarding.

Greetings,

Andres Freund
Hi,

On 27. 07. 19 3:15, Andres Freund wrote:
> Hi,
>
> Petr, Simon, see the potential issue related to fast forward at the
> bottom.
>
> [..snip..]
>
> This actually made me look at the nearby changes due to
>
> commit 9c7d06d60680c7f00d931233873dee81fdb311c6
> Author: Simon Riggs <simon@2ndQuadrant.com>
> Date:   2018-01-17 11:38:34 +0000
>
>     Ability to advance replication slots
>
> and uhm, I'm not sure they're fully baked. Something like:
>
>     /*
>      * If we don't have snapshot or we are just fast-forwarding, there is no
>      * point in decoding changes.
>      */
>     if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
>         ctx->fast_forward)
>         return;
>
>     case XLOG_HEAP2_MULTI_INSERT:
>         if (!ctx->fast_forward &&
>             SnapBuildProcessChange(builder, xid, buf->origptr))
>             DecodeMultiInsert(ctx, buf);
>         break;
>
> is not very suggestive of that (note the second check).

You mean that it's redundant, yeah... although given your next point,
see below.

> And related to the point of the theorizing above, I don't think
> skipping XLOG_HEAP2_NEW_CID entirely when forwarding is correct. As a
> NEW_CID record does not imply an invalidation message as discussed
> above, we'll afaict compute wrong snapshots when such transactions are
> encountered during forwarding. And we'll then log those snapshots to
> disk. Which then means the slot cannot safely be used for actual
> decoding anymore - as we'll use that snapshot when starting up
> decoding without fast forwarding.

Hmm, I guess that's true. I think I had convinced myself that the CID
does not matter outside of this transaction, but since we might
actually share the computed snapshot via file save/restore with other
slots, any non-fast-forwarding decoding that reads the same transaction
could miss the CID thanks to the shared snapshot which does not include
it.

Given that we don't process any other records in this function besides
XLOG_HEAP2_MULTI_INSERT and XLOG_HEAP2_NEW_CID, it seems like the
simplest fix is to just remove the first check for fast forward and
keep the one in XLOG_HEAP2_MULTI_INSERT.

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/
On Mon, 29 Jul 2019 at 00:09, Petr Jelinek <petr@2ndquadrant.com> wrote:
> Given that we don't process any other records in this function besides
> XLOG_HEAP2_MULTI_INSERT and XLOG_HEAP2_NEW_CID, it seems like simplest
> fix is to just remove the first check for fast forward and keep the one
> in XLOG_HEAP2_MULTI_INSERT.
Fix proposed by Petr, with comments as explained by Andres.
Attachment
On 2019-Jul-26, Andres Freund wrote:

> 2) We could simply assign the subtransaction to the parent using
>    ReorderBufferAssignChild() in SnapBuildProcessNewCid() or it's
>    caller. That ought to also fix the bug
>
>    I also has the advantage that we can save some memory in transactions
>    that have some, but fewer than the ASSIGNMENT limit subtransactions,
>    because it allows us to avoid having a separate base snapshot for
>    them (c.f. ReorderBufferTransferSnapToParent()).

I'm not sure I understood this suggestion correctly. I first tried with
this, which seems the simplest rendition:

--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -772,6 +772,12 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid,
 {
 	CommandId	cid;
 
+	if ((SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT) &&
+		(xlrec->top_xid != xid))
+	{
+		ReorderBufferAssignChild(builder->reorder, xlrec->top_xid, xid, lsn);
+	}
+
 	/*
 	 * we only log new_cid's if a catalog tuple was modified, so mark the
 	 * transaction as containing catalog modifications

test_decoding's tests pass with that, but if I try the example script
provided by Ildar, all pgbench clients die with this:

client 19 script 1 aborted in command 1 query 0: ERROR:  subtransaction logged without previous top-level txn record

I thought I would create the main txn before calling AssignChild in
snapbuild; however, ReorderBufferTXNByXid is static in reorderbuffer.c.
So that seems out.

My next try was to remove the elog() that was causing the failure ...
but that leads pretty quickly to a crash with this backtrace:

#2  0x00005653241fb823 in ExceptionalCondition (
    conditionName=conditionName@entry=0x5653243c1960 "!(prev_first_lsn < cur_txn->first_lsn)",
    errorType=errorType@entry=0x565324250596 "FailedAssertion",
    fileName=fileName@entry=0x5653243c18e8 "/pgsql/source/master/src/backend/replication/logical/reorderbuffer.c",
    lineNumber=lineNumber@entry=680)
    at /pgsql/source/master/src/backend/utils/error/assert.c:54
#3  0x0000565324062a84 in AssertTXNLsnOrder (rb=rb@entry=0x565326304fa8)
    at /pgsql/source/master/src/backend/replication/logical/reorderbuffer.c:680
#4  0x0000565324062e39 in ReorderBufferTXNByXid (rb=rb@entry=0x565326304fa8,
    xid=<optimized out>, xid@entry=185613, create=create@entry=true,
    is_new=is_new@entry=0x0, lsn=lsn@entry=2645271944,
    create_as_top=create_as_top@entry=true)
    at /pgsql/source/master/src/backend/replication/logical/reorderbuffer.c:559
#5  0x0000565324067365 in ReorderBufferAddNewTupleCids (rb=0x565326304fa8,
    xid=185613, lsn=lsn@entry=2645271944, node=..., tid=..., cmin=0,
    cmax=4294967295, combocid=4294967295)
    at /pgsql/source/master/src/backend/replication/logical/reorderbuffer.c:2100
#6  0x0000565324069451 in SnapBuildProcessNewCid (builder=0x56532630afd8,
    xid=185614, lsn=2645271944, xlrec=0x5653262efc78)
    at /pgsql/source/master/src/backend/replication/logical/snapbuild.c:787

Now this failure goes away if I relax the < to <= in the
complained-about line ... but at this point it's two sanity checks that
I've lobotomized in order to get this to run at all. Not really
comfortable with that.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Hi,

On 2019-08-07 16:19:13 -0400, Alvaro Herrera wrote:
> On 2019-Jul-26, Andres Freund wrote:
>
> > 2) We could simply assign the subtransaction to the parent using
> >    ReorderBufferAssignChild() in SnapBuildProcessNewCid() or it's
> >    caller. That ought to also fix the bug
>
> I'm not sure I understood this suggestion correctly. I first tried with
> this, which seems the simplest rendition:
>
> --- a/src/backend/replication/logical/snapbuild.c
> +++ b/src/backend/replication/logical/snapbuild.c
> @@ -772,6 +772,12 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid,
>  {
>  	CommandId	cid;
>
> +	if ((SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT) &&
> +		(xlrec->top_xid != xid))
> +	{
> +		ReorderBufferAssignChild(builder->reorder, xlrec->top_xid, xid, lsn);
> +	}
> +

I think we would need to do this for all values of
SnapBuildCurrentState() - after all, the problem occurs because we
*previously* didn't assign subxids to the toplevel xid. Compared to the
cost of catalog changes, ReorderBufferAssignChild() is really cheap. So
I don't think there's any problem just calling it unconditionally (when
top_xid <> xid, of course).

If the above is the only change, I think the body of the if should be
unreachable, DecodeHeap2Op guards against that:

    /*
     * If we don't have snapshot or we are just fast-forwarding, there is no
     * point in decoding changes.
     */
    if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
        ctx->fast_forward)
        return;

> I thought I would create the main txn before calling AssignChild in
> snapbuild; however, ReorderBufferTXNByXid is static in reorderbuffer.c.
> So that seems out.

There shouldn't be any need for doing that, ReorderBufferAssignChild
does it.

Greetings,

Andres Freund
On 2019-Aug-07, Andres Freund wrote:

> I think we would need to do this for all values of
> SnapBuildCurrentState() - after all the problem occurs because we
> *previously* didn't assign subxids to the toplevel xid. Compared to the
> cost of catalog changes, ReorderBufferAssignChild() is really cheap. So
> I don't think there's any problem just calling it unconditionally (when
> top_xid <> xid, of course).

BTW I wrote the code as suggested and it passes all the tests ... but I
then noticed that the unpatched code doesn't fail Ildar's original
pgbench-based test for me, either. So maybe my laptop is not powerful
enough to reproduce it, or maybe I'm doing something wrong.

I'm tempted to just push it, since it seems "obviously" more correct
than the original.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Aug 13, 2019 at 6:36 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2019-Aug-07, Andres Freund wrote:
>
> > I think we would need to do this for all values of
> > SnapBuildCurrentState() - after all the problem occurs because we
> > *previously* didn't assign subxids to the toplevel xid. Compared to the
> > cost of catalog changes, ReorderBufferAssignChild() is really cheap. So
> > I don't think there's any problem just calling it unconditionally (when
> > top_xid <> xid, of course).
>
> BTW I wrote the code as suggested and it passes all the tests ... but I
> then noticed that the unpatched code doesn't fail Ildar's original
> pgbench-based test for me, either. So maybe my laptop is not powerful
> enough to reproduce it, or maybe I'm doing something wrong.

If you share the patch fixing this issue I'll test it on my environment
where I could reproduce the original problem.

Regards,

-- 
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On 2019-Aug-19, Masahiko Sawada wrote:

> On Tue, Aug 13, 2019 at 6:36 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> > BTW I wrote the code as suggested and it passes all the tests ... but I
> > then noticed that the unpatched code doesn't fail Ildar's original
> > pgbench-based test for me, either. So maybe my laptop is not powerful
> > enough to reproduce it, or maybe I'm doing something wrong.
>
> If you share the patch fixing this issue I'll test it on my
> environment where I could reproduce the original problem.

Never mind. I was able to reproduce it later, and verify that Andres'
proposed strategy doesn't seem to fix the problem. I'm going to study
the problem again today.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,

On August 19, 2019 7:43:28 AM PDT, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> On 2019-Aug-19, Masahiko Sawada wrote:
>
> > On Tue, Aug 13, 2019 at 6:36 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> >
> > > BTW I wrote the code as suggested and it passes all the tests ... but I
> > > then noticed that the unpatched code doesn't fail Ildar's original
> > > pgbench-based test for me, either. So maybe my laptop is not powerful
> > > enough to reproduce it, or maybe I'm doing something wrong.
> >
> > If you share the patch fixing this issue I'll test it on my
> > environment where I could reproduce the original problem.
>
> Never mind. I was able to reproduce it later, and verify that Andres'
> proposed strategy doesn't seem to fix the problem. I'm going to study
> the problem again today.

Could you post the patch?

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Hi,

On 2019-08-19 08:51:43 -0700, Andres Freund wrote:
> On August 19, 2019 7:43:28 AM PDT, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > On 2019-Aug-19, Masahiko Sawada wrote:
> >
> > > On Tue, Aug 13, 2019 at 6:36 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > >
> > > > BTW I wrote the code as suggested and it passes all the tests ... but I
> > > > then noticed that the unpatched code doesn't fail Ildar's original
> > > > pgbench-based test for me, either. So maybe my laptop is not powerful
> > > > enough to reproduce it, or maybe I'm doing something wrong.
> > >
> > > If you share the patch fixing this issue I'll test it on my
> > > environment where I could reproduce the original problem.
> >
> > Never mind. I was able to reproduce it later, and verify that Andres'
> > proposed strategy doesn't seem to fix the problem. I'm going to study
> > the problem again today.
>
> Could you post the patch?

Ping?

Greetings,

Andres Freund
Hello,

On 2019-Sep-06, Andres Freund wrote:

> On 2019-08-19 08:51:43 -0700, Andres Freund wrote:
> > On August 19, 2019 7:43:28 AM PDT, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > > Never mind. I was able to reproduce it later, and verify that Andres'
> > > proposed strategy doesn't seem to fix the problem. I'm going to study
> > > the problem again today.
> >
> > Could you post the patch?

Here's a couple of patches.

always_decode_assignment.patch is Masahiko Sawada's patch, which has
been confirmed to fix the assertion failure.

assign-child.patch is what I understood you were proposing -- namely to
assign the subxid to the top-level xid on NEW_CID. In order for it to
work at all, I had to remove a different safety check; but the assertion
still hits when running Ildar's test case. So the patch doesn't
actually fix anything. And I think it makes sense that it fails, since
the first thing that's happening in this patch is that we create both
the top-level xact and the subxact with the same LSN value, which is
what triggers the assertion in the first place.

It's possible that I misunderstood what you were suggesting. If you
want to propose a different fix, be my guest, but failing that I'm
inclined to push always_decode_assignment.patch sometime before the end
of the week.

Thanks,

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 2019-Sep-10, Alvaro Herrera from 2ndQuadrant wrote:

> Here's a couple of patches.
>
> always_decode_assignment.patch is Masahiko Sawada's patch, which has
> been confirmed to fix the assertion failure.

I pushed this one to all branches. Thanks Ildar for reporting, and
Sawada-san for fixing, and reviewers.

If you (Andres) want to propose a different fix for this, be my guest.
We can always revert this one if you have a different, better fix.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,

On 2019-09-13 16:42:47 -0300, Alvaro Herrera wrote:
> On 2019-Sep-10, Alvaro Herrera from 2ndQuadrant wrote:
>
> > Here's a couple of patches.
> >
> > always_decode_assignment.patch is Masahiko Sawada's patch, which has
> > been confirmed to fix the assertion failure.
>
> I pushed this one to all branches. Thanks Ildar for reporting and
> Sawada-san for fixing, and reviewers.
>
> If you (Andres) want to propose a different fix for this, be my guest.
> We can always revert this one if you have a different better fix.

I'm a bit surprised to see this go in just now, after I asked for the
changes you were reporting as not working for three weeks, and you sent
them out three days ago (during which I was at the Linux Plumbers
Conference)...

Greetings,

Andres Freund
On 2019-Jul-26, Andres Freund wrote:

> Petr, Simon, see the potential issue related to fast forward at the
> bottom.

I think we neglected this bit. I looked at the patch Simon submitted
downthread, and while I vaguely understand that we need to process
NEW_CID records during fast-forwarding, I don't quite understand why we
can still skip XLOG_INVALIDATION messages. I *think* we should process
those too. Here's a patch that also contains that change; I also
reworded Simon's proposed comment. I appreciate reviews.

Thoughts? Relevant extracts from Andres' message below.

> Thinking about it, it was not immediately clear to me that it is
> necessary to process XLOG_HEAP2_NEW_CID at that stage. We only need the
> cid mapping when decoding the content of the transaction that the
> XLOG_HEAP2_NEW_CID record was about - which will not happen if it
> started before SNAPBUILD_FULL.
>
> Except that they also are responsible for signalling that a transaction
> performed catalog modifications (cf. the
> ReorderBufferXidSetCatalogChanges() call), which in turn is important
> for SnapBuildCommitTxn() to know whether that transaction needs to be
> included in historic snapshots.
>
> So unless I am missing something - which is entirely possible, I've had
> this code thoroughly swapped out - that means that we only need to
> process XLOG_HEAP2_NEW_CID < SNAPBUILD_FULL if there can be
> transactions with relevant catalog changes that don't have any
> invalidation messages.
>
> After thinking about it for a bit, that's not guaranteed however. For
> one, even for system catalog tables, looking at
> CacheInvalidateHeapTuple() et al, there can be catalog modifications
> that create neither a snapshot invalidation message nor a catcache one.
> There's also the remote scenario that we might possibly be modifying
> only a toast relation.
>
> But more importantly, the only modified table could be a user-defined
> catalog table (i.e. WITH (user_catalog_table = true)). Which in all
> likelihood won't cause invalidation messages. So I think currently it
> is required to process NEW_CID records - although we don't need to
> actually execute the ReorderBufferAddNewTupleCids() etc. calls.

[...]

> And related to the point of the theorizing above, I don't think
> skipping XLOG_HEAP2_NEW_CID entirely when forwarding is correct. As a
> NEW_CID record does not imply an invalidation message, as discussed
> above, we'll afaict compute wrong snapshots when such transactions are
> encountered during forwarding. And we'll then log those snapshots to
> disk. Which then means the slot cannot safely be used for actual
> decoding anymore - as we'll use that snapshot when starting up decoding
> without fast forwarding.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Fri, Jan 31, 2020 at 12:35 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2019-Jul-26, Andres Freund wrote:
>
> > Petr, Simon, see the potential issue related to fast forward at the
> > bottom.
>
> I think we neglected this bit. I looked at the patch Simon submitted
> downthread, and while I vaguely understand that we need to process
> NEW_CID records during fast-forwarding,

Right. IIUC, that is mainly to mark that the txn has catalog changes
(ReorderBufferXidSetCatalogChanges) so that later such a transaction
can be used to build historic snapshots (during SnapBuildCommitTxn).
Now, if that is true, then won't we need a similar change in
DecodeHeapOp for the XLOG_HEAP_INPLACE case as well? Also, I am not
sure whether SnapBuildProcessNewCid needs to create the Cid map in such
a case.

> I don't quite understand why we
> still can skip XLOG_INVALIDATION messages. I *think* we should process
> those too.

I also think so. If you are allowing invalidations to be executed
irrespective of fast_forward in DecodeStandbyOp, then why not do the
same in DecodeCommit, where we add ReorderBufferAddInvalidations?

> Here's a patch that also contains that change; I also
> reworded Simon's proposed comment. I appreciate reviews.

Even though, in theory, the changes look to be in the right direction,
it would be better if we could create a test case that reproduces the
problem. I am not sure, but I think we need to generate a few DDLs in
the transaction we fast-forward over; DMLs that depend on those DDLs'
WAL should then create some problem, because we will have skipped
executing invalidations and adding such a transaction to the historic
snapshot.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com