Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering
Msg-id CAEze2WhL8uLMqynnnCu1LAPwxD5RKEo0nHV+eXGg_N6ELU88HQ@mail.gmail.com
List pgsql-hackers
Hi,

In Neon, we've had to modify the GIN fast insertion path as attached,
due to an unexpected XLOG insertion and buffer locking ordering issue.

The xlog readme [0] mentions that the common order of operations is 1)
pin and lock any buffers you need for the log record, then 2) start a
critical section, then 3) call XLogBeginInsert.
In Neon, we rely on this documented order of operations so that we can
WAL-log hint pages (freespace map, visibility map) whenever they are
written to disk (e.g. on cache eviction or by the checkpointer). In
general, this works fine, except that ginHeapTupleFastInsert calls
XLogBeginInsert() before the last of the buffers for the eventual
record has been read, thus creating a path where eviction is possible
in a `begininsert_called = true` context. That breaks our current code,
which is then unable to evict (WAL-log) the dirtied hint pages.
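
To illustrate why the ordering matters, here is a minimal toy model in C.
It is not PostgreSQL or Neon code: every toy_* name is a hypothetical
stand-in, and the only thing borrowed from the real code is the idea of a
`begininsert_called` flag that is set while a WAL record is under
construction. The sketch assumes that evicting a dirty hint page has to
start its own WAL record, and therefore cannot proceed while another
record is already being built.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all names below are hypothetical stand-ins. */

/* Set while a WAL record is under construction. */
static bool begininsert_called = false;

/* Stand-in for XLogBeginInsert(): start building a record. */
static void toy_xlog_begin_insert(void)
{
    assert(!begininsert_called);    /* records cannot be nested */
    begininsert_called = true;
}

/* Stand-in for XLogInsert(): finish the record and clear the flag. */
static void toy_xlog_insert(void)
{
    begininsert_called = false;
}

/*
 * Stand-in for evicting a dirty hint page. Assumption: the eviction must
 * WAL-log the page, which is impossible while a record is being built.
 */
static bool toy_evict_hint_page(void)
{
    return !begininsert_called;
}

/* Stand-in for reading a buffer; the read may have to evict a page first. */
static bool toy_read_buffer(bool needs_eviction)
{
    return needs_eviction ? toy_evict_hint_page() : true;
}

/* Problematic order: begin the record, then read the last buffer. */
bool toy_insert_begin_first(void)
{
    toy_xlog_begin_insert();
    bool ok = toy_read_buffer(true);    /* eviction hits the set flag */
    toy_xlog_insert();
    return ok;
}

/* Documented order: read and lock buffers first, then begin the record. */
bool toy_insert_documented_order(void)
{
    bool ok = toy_read_buffer(true);    /* eviction is still allowed here */
    /* a critical section would start here in the real code */
    toy_xlog_begin_insert();
    toy_xlog_insert();
    return ok;
}
```

In this model, toy_insert_begin_first() reports a failed eviction, while
toy_insert_documented_order() succeeds, because in the latter every buffer
read happens before the flag is set.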

PFA a patch that rectifies this issue by moving the XLogBeginInsert()
call down to where 1) we have all relevant buffers pinned and locked,
and 2) we're in a critical section, making that part of the code
consistent with the general scheme for XLog insertion.

Kind regards,

Matthias van de Meent

[0] access/transam/README, section "Write-Ahead Log Coding", lines 436-470

Attachment
