Thread: Unexpected results from a query with UNION ALL
Hi everyone,
Recently I got unexpected results from a query that seems to be legit.
The setup is like this:
-- setup
CREATE TABLE parents
(
id uuid not null primary key,
revision integer not null
);
CREATE TABLE children
(
id uuid not null primary key,
parent_id uuid not null references parents
);
INSERT INTO parents (id, revision)
VALUES ('ec422e09-55bb-4465-a990-31f59859959d', 1);
INSERT INTO children (id, parent_id)
VALUES ('5cb82ceb-c5ef-4c59-a02e-f7b610470f8c', 'ec422e09-55bb-4465-a990-31f59859959d');
INSERT INTO children (id, parent_id)
VALUES ('ce5b22b0-c6c4-4c09-826c-7086c53ee9ec', 'ec422e09-55bb-4465-a990-31f59859959d');
The query is:
-- query
SELECT children_union.id AS child_id
FROM parents p
JOIN (SELECT id, parent_id
FROM children
UNION ALL
SELECT null::uuid, null::uuid
WHERE false) children_union ON children_union.parent_id = p.id
WHERE p.id = 'ec422e09-55bb-4465-a990-31f59859959d'
FOR UPDATE OF p;
It looks weird, but it's just a simplification of a much bigger query. The 'SELECT null::uuid, null::uuid WHERE false' part was actually more meaningful but I substituted it with a query that returns 0 rows after finding out that it's irrelevant.
If I just run this query I get something that I would expect to get:
-- result 1
child_id
--------------------------------------
5cb82ceb-c5ef-4c59-a02e-f7b610470f8c
ce5b22b0-c6c4-4c09-826c-7086c53ee9ec
(2 rows)
But if I lock the single row in the parents table:
-- concurrent query
BEGIN;
UPDATE parents
SET revision = revision + 1
WHERE id = 'ec422e09-55bb-4465-a990-31f59859959d';
and then run my query again in a separate session, then it's waiting for the lock to be released. Once I commit the concurrent query and release the lock, I get this:
-- result 2
child_id
--------------------------------------
5cb82ceb-c5ef-4c59-a02e-f7b610470f8c
5cb82ceb-c5ef-4c59-a02e-f7b610470f8c
(2 rows)
but I would expect to get the same result as previously. Is it a bug or am I doing something wrong here?
Thank you,
Andrii
Andrey <adnyre@gmail.com> writes: > Recently I got unexpected results from a query that seems to be legit. > ... > ... Once I commit the concurrent query and > release the lock, I get this: > -- result 2 > child_id > -------------------------------------- > 5cb82ceb-c5ef-4c59-a02e-f7b610470f8c > 5cb82ceb-c5ef-4c59-a02e-f7b610470f8c > (2 rows) > but I would expect to get the same result as previously. Is it a bug > or am I doing something wrong here? It's a surprising result for sure, but I believe it's explained by the algorithm for READ COMMITTED [1], specifically the bit about The search condition of the command (the WHERE clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation using the updated version of the row. Once the tuple lock is released, the join query fetches the new version of the parents row (with the new revision value, though that's not actually relevant to the result). It then effectively re-executes the join against the UNION construct, and that means it'll always find the first matching row in "children". The "updated version of the row" is taken to mean the entire join row, so it doesn't blink at the fact that it got a different child output than it had started with. Another way to look at this is that locking only "p" underspecifies the query result: there's more than one child row that could join to the "p" row, and the system doesn't promise that you get a result from any particular one of them. If you try to fix it by also locking the UNION result, or by adding FOR UPDATE to the UNION arm that selects from "children", you get ERROR: FOR UPDATE is not allowed with UNION/INTERSECT/EXCEPT This example makes me feel that we've missed some cases where we probably ought to throw that error. Or else work harder on making the combination be supported --- but it looks tricky to produce consistent results, and there have been few complaints about this omission so far. In the meantime, the most recommendable answer for you is probably to switch over to using SERIALIZABLE mode. That'd require adding application logic to retry after a serialization failure, but it would produce consistent results even for complex queries. regards, tom lane [1] https://www.postgresql.org/docs/current/transaction-iso.html#XACT-READ-COMMITTED
I wrote: > Andrey <adnyre@gmail.com> writes: >> ... but I would expect to get the same result as previously. Is it a bug >> or am I doing something wrong here? > It's a surprising result for sure, but I believe it's explained by > the algorithm for READ COMMITTED [1], specifically the bit about Actually, on further thought I believe this really is a bug, because if you change the UNION ALL to UNION it works fine. It probably used to work with UNION ALL too, but that was a few decades ago before we started adding optimizations of UNION ALL :-( I've been poking at this off and on for the last few days, and I've found three different things that will need to be changed to make it work again. At least one of them looks too invasive to consider for back-patch. So don't hold your breath for a proper fix, but perhaps you could use UNION as a workaround? regards, tom lane