Re: Weird behaviour after update from 12.2 to 12.3 version - Mailing list pgsql-bugs

From David G. Johnston
Subject Re: Weird behaviour after update from 12.2 to 12.3 version
Date
Msg-id CAKFQuwYuKKp5dMyZfy358pG-7ZivwCcuaYhL9x1ng0O+kdywEA@mail.gmail.com
Whole thread Raw
In response to Weird behaviour after update from 12.2 to 12.3 version  (Andrii Palko <palko1902@gmail.com>)
List pgsql-bugs
On Mon, Aug 17, 2020 at 9:26 AM Andrii Palko <palko1902@gmail.com> wrote:
Update to version 12.3 broke this query. In 12.2 it works as expected but in version 12.3 it works conversely. It updates the newer row by information from the old row.

As I understand such breaking changes shouldn't be in the minor version update, so it looks strange for me, also I didn't find anything about it in releases notes.

A minor version update doesn't guarantee that a incorrectly written query that happens to give the expected results will continue to do so if expected results are not due to some guarantee.

Looking at the query I don't see that the WHERE clause is guaranteeing that the product_attribute_options row being updated is the "old row".

What I believe is happening here is since both nextRow and product_attribute_options contain two rows the UPDATE statement join produces 4 different input tuple pair orderings:

(new, old), (new, new), (old, old), (old, new)

I'm beginning to disbelieve your claim that this works properly in 12.2...though I think it's because the UPDATE forces only two of the four possibilities to ever be considered.

The (new, new) and (old, old) pairs don't matter but I would expect that the net result of this would be to always swap your old and new details.  But even if the swapping doesn't occur whether (new, old) or (old, new) is evaluated first undefined.  Supposedly in 12.2 (old, new) was evaluated while in 12.3 (new, old) is.

I'm sure I missing some detail here to make this a truly accurate depiction of reality but in short your UPDATE join is non-deterministic and you need to fix your query because non-determinstic results can change for any number of reasons including minor release updates (though more commonly plan shape due to changing statistics is the more common cause).  I do not believe the lone ORDER BY in the CTE is sufficient to impose determinism.

It looks as if you only need one row to be produced by the CTE so a LIMIT 1 there is warranted and probably sufficient.  Adding an old.id <> new.id would make the update more efficient since it wouldn't attempt to update more than one row (assuming that only two rows are present in the table).

David J.

pgsql-bugs by date:

Previous
From: James Lucas
Date:
Subject: Re: BUG #16582: Logical index corruption leading to apparent index scan infinite loop
Next
From: Cyril Jouve
Date:
Subject: Re: BUG #16508: using multi-host connection string when the first host is starting fails