Re: Visibility bug in tuple lock - Mailing list pgsql-hackers
| From | Heikki Linnakangas |
|---|---|
| Subject | Re: Visibility bug in tuple lock |
| Date | |
| Msg-id | e2668d7d-425d-4818-8902-c66afb00e1d3@iki.fi Whole thread Raw |
| In response to | Re: Visibility bug in tuple lock (Jasper Smit <jbsmit@gmail.com>) |
| Responses |
Re: Visibility bug in tuple lock
|
| List | pgsql-hackers |
On 17/12/2025 16:51, Jasper Smit wrote: >> Thanks! That's not quite correct though. This is very subtle, but the >> 'tuple' argument to heap_lock_updated_tuple() points to the buffer >> holding the original tuple. So doing >> HeapTupleHeaderGetUpdateXid(tuple->t_data) reads the current xmax of the >> tuple, which can now be different than what it was earlier, i.e. >> different from the xwait that we waited for. It's important that the >> 'ctid' and 'priorXmax' that we use to follow the chain are read together. > > Oops, you are right, I was too quick, it has already been quite some > time since i was dealing with this problem initially. > > I see you moved the check ItemPointerEquals(&tuple->t_self, ctid) out > to heap_lock_tuple() > but isn't this redundant in the check `if (follow_updates && updated > && !ItemPointerEquals(&tuple->t_self, &t_ctid))` if `updated` is > already true in this condition? Hmm, I don't think so, HEAP_KEYS_UPDATED is also set on deleted tuples. I tried to add an assertion for that and ran the regression tests, but nothing hit it. I was a little surprised. I guess we just don't have test coverage for the deletion case? > Also in `heap_lock_updated_tuple_rec()` is the check for > `TransactionIdIsValid(priorXmax)` now redundant too? Yep. I'll replace it with an assertion. > / * Locking the initial tuple where those > * values came from is the caller's responsibility. */ > > I think this is still ambiguous, you can still interpret that as the > initial tuple needs to be locked before the function is called. Maybe > it is better to change it to something like: > "This function will not lock the initial tuple." Ok. > Unrelated question, I was wondering why these functions take a "const > ItemPointerData *" as an argument as opposed to just pass the ctid by > value? I hate it too :-). Mostly historical reasons, it's always been like that, and that's the convention we use almost everywhere. On 32-bit systems, passing by reference made more sense. I chatted about that with Andres Freund just the other day, and he said that compilers are not good at optimizing passing them by value. I'll take his word on that. And while we're at it, when we're passing around ItemPointerDatas, it would make sense to not store the hi and low bits of the block number separately, they just need to be reassembled into a 32-bit BlockNumber before every use. I think we should have a separate in-memory representation of ItemPointerData, packed into a uint64, and use that in all function signatures. It's a lot of code churn, and would affect extensions too, but I feel it probably would be worth it. - Heikki
pgsql-hackers by date: