Home > mailing lists

Race condition in replication slot usage introduced by commit f41d846 - Mailing list pgsql-hackers

From	Nisha Moond
Subject	Race condition in replication slot usage introduced by commit f41d846
Date	February 26 15:10:00
Msg-id	CABdArM7J-LbGoMPGUPiFiLOyB_TZ5+YaZb=HMES0mQqzVTn8Gg@mail.gmail.com Whole thread Raw
Responses	Re: Race condition in replication slot usage introduced by commit f41d846
List	pgsql-hackers

Tree view

Hi,

Commit f41d846 [1] introduced a race condition that allows a process
to acquire and continue using an invalidated replication slot, leading
to unexpected behavior.

Example Scenario:
Suppose there is an orphaned logical slot that will be invalidated on
the next checkpoint. Consider the following sequence of events during
ReplicationSlotAcquire() -
 - A process (P1) attempts to acquire the slot. It checks the
invalidated flag and sees that the slot is still valid, so it proceeds
to set the s->active_pid to its own PID.
 - Before P1 can acquire the spinlock and update s->active_pid =
MyProcPid, the checkpointer process invalidates the slot.
 - Now, P1 proceeds to set s->active_pid = MyProcPid and starts using
an already invalidated slot, which should not happen.

Using an invalidated slot should be prevented in the first place. In
my test, this scenario triggered an assertion failure in
CreateDecodingContext() -

```
TRAP: failed Assert("slot->data.invalidated == RS_INVAL_NONE"), File:
"logical.c", Line: 546, PID: 1042884
postgres: nisha postgres [local]
SELECT(ExceptionalCondition+0xbb)[0x649ff7696c82]
postgres: nisha postgres [local]
SELECT(CreateDecodingContext+0x23b)[0x649ff73bf9d3]
postgres: nisha postgres [local] SELECT(+0x61fb71)[0x649ff73c3b71]
postgres: nisha postgres [local]
SELECT(pg_logical_slot_get_changes+0x26)[0x649ff73c3ed6]
...
```

The expected behavior was for ReplicationSlotAcquire() to detect the
invalid slot and return an error instead of allowing its usage.

Attached the patch v1 fixing this issue.

Issue reported by: Hou Zhijie.

[1] https://github.com/postgres/postgres/commit/f41d8468ddea34170fe19fdc17b5a247e7d3ac78

--
Thanks,
Nisha

Attachment

v1-0001-Fix-race-condition-in-ReplicationSlotAcquire-indr.patch

pgsql-hackers by date:

From: Alexander Korotkov
Date: 26 February, 14:48:47
Subject: Re: Get rid of WALBufMappingLock

From: Alexander Korotkov
Date: 26 February, 15:12:30
Subject: Re: Removing unneeded self joins

Race condition in replication slot usage introduced by commit f41d846 - Mailing list pgsql-hackers

Attachment

Previous

Next