Thread: Unexpected results from a query with UNION ALL

Unexpected results from a query with UNION ALL

From
Andrey
Date:
Hi everyone,

Recently I got unexpected results from a query that seems to be legit.
The setup is like this:

-- setup
CREATE TABLE parents
(
    id       uuid    not null primary key,
    revision integer not null
);
CREATE TABLE children
(
    id        uuid    not null primary key,
    parent_id uuid    not null references parents
);
INSERT INTO parents (id, revision)
VALUES ('ec422e09-55bb-4465-a990-31f59859959d', 1);
INSERT INTO children (id, parent_id)
VALUES ('5cb82ceb-c5ef-4c59-a02e-f7b610470f8c', 'ec422e09-55bb-4465-a990-31f59859959d');
INSERT INTO children (id, parent_id)
VALUES ('ce5b22b0-c6c4-4c09-826c-7086c53ee9ec', 'ec422e09-55bb-4465-a990-31f59859959d');

The query is:

-- query
SELECT children_union.id AS child_id
FROM parents p
         JOIN (SELECT id, parent_id
                    FROM children
                    UNION ALL
                    SELECT null::uuid, null::uuid
                    WHERE false) children_union ON children_union.parent_id = p.id
WHERE p.id = 'ec422e09-55bb-4465-a990-31f59859959d'
    FOR UPDATE OF p;

It looks weird, but it's just a simplification of a much bigger query. The 'SELECT null::uuid, null::uuid WHERE false' part was actually more meaningful but I substituted it with a query that returns 0 rows after finding out that it's irrelevant.

If I just run this query I get something that I would expect to get:

-- result 1
               child_id
--------------------------------------
 5cb82ceb-c5ef-4c59-a02e-f7b610470f8c
 ce5b22b0-c6c4-4c09-826c-7086c53ee9ec
(2 rows)

But if I lock the single row in the parents table:

-- concurrent query
BEGIN;
UPDATE parents
SET revision = revision + 1
WHERE id = 'ec422e09-55bb-4465-a990-31f59859959d';

and then run my query again in a separate session, then it's waiting for the lock to be released. Once I commit the concurrent query and release the lock, I get this:

-- result 2
               child_id
--------------------------------------
 5cb82ceb-c5ef-4c59-a02e-f7b610470f8c
 5cb82ceb-c5ef-4c59-a02e-f7b610470f8c
(2 rows)

but I would expect to get the same result as previously. Is it a bug or am I doing something wrong here?

Thank you,
Andrii

Re: Unexpected results from a query with UNION ALL

From
Tom Lane
Date:
Andrey <adnyre@gmail.com> writes:
> Recently I got unexpected results from a query that seems to be legit.
> ...
> ... Once I commit the concurrent query and
> release the lock, I get this:

> -- result 2
>                child_id
> --------------------------------------
>  5cb82ceb-c5ef-4c59-a02e-f7b610470f8c
>  5cb82ceb-c5ef-4c59-a02e-f7b610470f8c
> (2 rows)

> but I would expect to get the same result as previously. Is it a bug
> or am I doing something wrong here?

It's a surprising result for sure, but I believe it's explained by
the algorithm for READ COMMITTED [1], specifically the bit about

    The search condition of the command (the WHERE clause) is
    re-evaluated to see if the updated version of the row still matches the
    search condition.  If so, the second updater proceeds with its operation
    using the updated version of the row.

Once the tuple lock is released, the join query fetches the new
version of the parents row (with the new revision value, though that's
not actually relevant to the result).  It then effectively re-executes
the join against the UNION construct, and that means it'll always find
the first matching row in "children".  The "updated version of the
row" is taken to mean the entire join row, so it doesn't blink at the
fact that it got a different child output than it had started with.

Another way to look at this is that locking only "p" underspecifies
the query result: there's more than one child row that could join
to the "p" row, and the system doesn't promise that you get a result
from any particular one of them.

If you try to fix it by also locking the UNION result, or by adding
FOR UPDATE to the UNION arm that selects from "children", you get

ERROR:  FOR UPDATE is not allowed with UNION/INTERSECT/EXCEPT

This example makes me feel that we've missed some cases where we
probably ought to throw that error.  Or else work harder on making
the combination be supported --- but it looks tricky to produce
consistent results, and there have been few complaints about this
omission so far.

In the meantime, the most recommendable answer for you is probably
to switch over to using SERIALIZABLE mode.  That'd require adding
application logic to retry after a serialization failure, but it
would produce consistent results even for complex queries.

            regards, tom lane

[1] https://www.postgresql.org/docs/current/transaction-iso.html#XACT-READ-COMMITTED



Re: Unexpected results from a query with UNION ALL

From
Tom Lane
Date:
I wrote:
> Andrey <adnyre@gmail.com> writes:
>> ... but I would expect to get the same result as previously. Is it a bug
>> or am I doing something wrong here?

> It's a surprising result for sure, but I believe it's explained by
> the algorithm for READ COMMITTED [1], specifically the bit about

Actually, on further thought I believe this really is a bug,
because if you change the UNION ALL to UNION it works fine.
It probably used to work with UNION ALL too, but that was a few
decades ago before we started adding optimizations of UNION ALL :-(

I've been poking at this off and on for the last few days, and I've
found three different things that will need to be changed to make
it work again.  At least one of them looks too invasive to consider
for back-patch.  So don't hold your breath for a proper fix, but
perhaps you could use UNION as a workaround?

            regards, tom lane