During one of HOT stress tests, an asserition failed at tqual.c:1178
in HeapTupleSatisfiesVacuum(). The assertion failure looked really
strange because the assertion checks for HEAP_XMAX_COMMITTED
which we set just couple of lines above. I inspected the core dump
and found that the flag is *set* properly. That was even more strange.
I confirmed that we are holding a SHARE lock on the buffer as we
do at several other places while checking/setting the infomask bits.
We had a theory that somebody clears the flag after the asserting
process sets it and before it checks it. The other process also sets it
back before core dump is generated because core shows the flag
being set properly. The chances of this happening are very slim and
can further be ruled out because I carefully looked at the code and found
that the flag can only be cleared holding an exclusive lock on the buffer.
So we suspected an interaction between multiple processes each holding
a SHARE lock and setting/checking different bits in the infomask and
we could theoritically say that such interaction can potentially lead to
missing hint bit updates. I can think of the following:
Process P1 is setting bit 0 and process P2 setting bit 1 of an integer
'x' whose current value is say 0.
P1 P2
load x in register A load x in register B
A = A | 0x0001 B = B | 0x0002
Store A to x
Store B to x
At the end, P1's update is missing! If P1's further processing is based
on the bit-check, it would go completely wrong.
This easily explains the assertion and core dump analysis. We can
possibly remove that assertion and any other similar assertions
(unless someone can find a hole in the above analysis). But I am
more worried about other similar race conditions where hint bit updates
go missing and thus causing severe MVCC failures.
Btw, to validate the race condition I quickly wrote a simple C
program which attaches to a share memory. Each instance of the
process sets/clears and checks a separate bit. It clearly demonstrates
the danger. The code is attached. Compile and run with an integer
argument to tell which bit to set/reset.
Thanks,
Pavan
--
Pavan Deolasee
EnterpriseDB
http://www.enterprisedb.com