This patch is for a bug in backend/utils/time/combocid.c.
In HeapTupleHeaderAdjustCmax, the code makes the incorrect assumption
that the raw command id corresponds to cmin, when the raw command id can
in fact be a combo cid. It needs to be fixed to call
HeapTupleHeaderGetCmin to get cmin. That's all there is to my patch, but
read on for more details.
One thing that can happen is that multiple subtransactions may delete a
tuple and each abort. Each subtransaction (except the first one) will
store a combo cid that uses another combo cid as cmin. Then when cmin is
accessed by a later operation, HeapTupleHeaderGetCmin will return the
previous combo cid. In unlucky situations, the bogus cmin will be so
high that it looks like the tuple was inserted in the future.
I have pasted an example below to show how things can go wrong. The
second SELECT leaves out some tuples because their stored cmin is
actually a large combo cid. The third SELECT gives correct results
because the inserting transaction committed, so the command id is
ignored. The script file with these commands in in the attachment, and I
think it could make for a valuable regression test.
Side note: Another idea for a fix would be to have tqual.c change a
combocid to a simple cmin when it does a time qual check and sees that
the deleting subtransaction aborted. This is no good because it requires
a write lock to change two header fields atomically, namely the
HEAP_COMBOCID hint and t_cid. There are probably other issues, but this
seems like a show-stopper.
The one-line patch is in the attachment. It gives correct behavior on my
test case. This is my first patch; I hope I did it right!
Thanks!
Karl Schnaitter
test=# CREATE TABLE tab(a INT);
CREATE TABLE
test=# BEGIN;
BEGIN
test=# INSERT INTO tab VALUES(1);
INSERT 0 1
test=# INSERT INTO tab VALUES(2);
INSERT 0 1
test=# INSERT INTO tab VALUES(3);
INSERT 0 1
test=# SAVEPOINT s; DELETE FROM tab; ROLLBACK TO SAVEPOINT s;
SAVEPOINT
DELETE 3
ROLLBACK
test=# SELECT a FROM tab;
a
---
1
2
3
(3 rows)
test=# SAVEPOINT s; DELETE FROM tab; ROLLBACK TO SAVEPOINT s;
SAVEPOINT
DELETE 3
ROLLBACK
test=# SAVEPOINT s; DELETE FROM tab; ROLLBACK TO SAVEPOINT s;
SAVEPOINT
DELETE 3
ROLLBACK
test=# SAVEPOINT s; DELETE FROM tab; ROLLBACK TO SAVEPOINT s;
SAVEPOINT
DELETE 3
ROLLBACK
test=# SELECT a FROM tab;
a
---
1
(1 row)
test=# COMMIT;
COMMIT
test=# SELECT a FROM tab;
a
---
1
2
3
(3 rows)
test=# DROP TABLE tab;
DROP TABLE