Thread: Fix optimization of foreign-key on update actions

Fix optimization of foreign-key on update actions

From

Peter Eisentraut

Date:

05 February 2019, 15:36:31

I came across an edge case in how our foreign key implementation works
that I think is not SQL conforming.  It has to do with how updates to
values that "look" different but compare as equal are cascaded.  A
simple case involves float -0 vs. 0, but relevant cases also arise with
citext and case-insensitive collations.

Consider this example:

create table pktable2 (a float8, b float8, primary key (a, b));
create table fktable2 (x float8, y float8,
    foreign key (x, y) references pktable2 (a, b) on update cascade);

insert into pktable2 values ('-0', '-0');
insert into fktable2 values ('-0', '-0');

update pktable2 set a = '0' where a = '-0';

What happens now?

select * from pktable2;
 a | b
---+----
 0 | -0
(1 row)

-- buggy: did not update fktable2.x
select * from fktable2;
 x  | y
----+----
 -0 | -0
(1 row)

This happens because ri_KeysEqual() compares the old and new rows and
decides that because they are "equal", the ON UPDATE actions don't need
to be run.

The SQL standard seems clear that ON CASCADE UPDATE means that an
analogous UPDATE should be run on matching rows in the foreign key
table, so the current behavior is wrong.

Moreover, if another column is also updated, like update pktable2 set a
= '0', b = '5', then the old and new rows are no longer equal, and so x
will get updated in fktable2.  So the context creates inconsistencies.

The fix is that in these cases we have ri_KeysEqual() use a more
low-level form of equality, like for example record_image_eq does.  In
fact, we can take the same code.  See attached patches.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Fix optimization of foreign-key on update actions

From

Andrew Gierth

Date:

05 February 2019, 16:02:08

>>>>> "Peter" == Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

 Peter> The SQL standard seems clear

ha

hahaha

HAHAHAHAHAHA

(since when has the SQL standard ever been clear?)

SQL2016, 15.17 Execution of referential actions

10) If a non-null value of a referenced column RC in the referenced
    table is updated to a value that is distinct from the current value
    of RC, then,

      [snip all the stuff about how ON UPDATE actions work]

does that "is distinct from" mean that IS DISTINCT FROM would be true,
or does it mean "is in some way distinguishable from"? Nothing I can see
in the spec suggests the latter.

-- 
Andrew (irc:RhodiumToad)

Re: Fix optimization of foreign-key on update actions

From

Tom Lane

Date:

05 February 2019, 16:20:52

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Peter" == Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
>  Peter> The SQL standard seems clear

> (since when has the SQL standard ever been clear?)

Point to Andrew ;-).  However, I kind of like Peter's idea anyway
on the grounds that byte-wise comparison is probably faster than
invoking the datatype comparators.  Also, language lawyering aside,
I think he may be right about what people would expect "should" happen.

What I *don't* like about the proposed patch is that it installs a
new, different comparison rule for the ON UPDATE CASCADE case only.
If we were to go in this direction, I'd think we should try to use
the same comparison rule for all FK row comparisons.

The inconsistencies get messier the more you think about it,
really.  If a referencing row was datatype-equal, but not byte-equal,
to the PK row to start with, why would an update changing the PK row
(perhaps to another datatype-equal value) result in forcing the
referencing row to become byte-equal?  How does this fit in with
the fact that our notion of what uniqueness means in the PK table
is no-datatype-equality, rather than no-byte-equality?

On the whole we might be better off leaving this alone.

            regards, tom lane

Re: Fix optimization of foreign-key on update actions

From

Laurenz Albe

Date:

06 February 2019, 09:11:13

Andrew Gierth wrote:
> SQL2016, 15.17 Execution of referential actions
> 
> 10) If a non-null value of a referenced column RC in the referenced
>     table is updated to a value that is distinct from the current value
>     of RC, then,
> 
>       [snip all the stuff about how ON UPDATE actions work]
> 
> does that "is distinct from" mean that IS DISTINCT FROM would be true,
> or does it mean "is in some way distinguishable from"? Nothing I can see
> in the spec suggests the latter.

My 2003 standard defines, and even condescends to be informal:

3.1.6.8 distinct (of a pair of comparable values): Capable of being distinguished within a given context.
Informally, not equal, not both null. A null value and a non-null value are distinct.

Yours,
Laurenz Albe

Re: Fix optimization of foreign-key on update actions

From

Andrew Gierth

Date:

06 February 2019, 11:23:58

>>>>> "Laurenz" == Laurenz Albe <laurenz.albe@cybertec.at> writes:

 Laurenz> Andrew Gierth wrote:
 >> SQL2016, 15.17 Execution of referential actions
 >> 
 >> 10) If a non-null value of a referenced column RC in the referenced
 >> table is updated to a value that is distinct from the current value
 >> of RC, then,
 >> 
 >> [snip all the stuff about how ON UPDATE actions work]
 >> 
 >> does that "is distinct from" mean that IS DISTINCT FROM would be true,
 >> or does it mean "is in some way distinguishable from"? Nothing I can see
 >> in the spec suggests the latter.

 Laurenz> My 2003 standard defines, and even condescends to be informal:

 Laurenz> 3.1.6.8 distinct (of a pair of comparable values): Capable of
 Laurenz> being distinguished within a given context. Informally, not
 Laurenz> equal, not both null. A null value and a non-null value are
 Laurenz> distinct.

Hrm. SQL2016 has similar language which I previously missed, but I don't
think it actually helps:

3.1.6.9  distinct (of a pair of comparable values)
         capable of being distinguished within a given context

               NOTE 8 -- Informally, two values are distinct if neither
               is null and the values are not equal. A null value and a
               non- null value are distinct. Two null values are not
               distinct. See Subclause 4.1.5, "Properties of distinct",
               and the General Rules of Subclause 8.15, "<distinct
               predicate>".

Two values which are sql-equal but not identical, such as two strings in
a case-insensitive collation that differ only by case, are
distinguishable in some contexts but not others, so what context
actually applies to the quoted rule?

I think the only reasonable interpretation is that it should use the
same kind of distinctness that is being used by the unique constraint
and the equality comparison that define whether the FK is satisfied.

-- 
Andrew.

Re: Fix optimization of foreign-key on update actions

From

Peter Eisentraut

Date:

06 February 2019, 22:04:21

On 06/02/2019 12:23, Andrew Gierth wrote:
> Two values which are sql-equal but not identical, such as two strings in
> a case-insensitive collation that differ only by case, are
> distinguishable in some contexts but not others, so what context
> actually applies to the quoted rule?
> 
> I think the only reasonable interpretation is that it should use the
> same kind of distinctness that is being used by the unique constraint
> and the equality comparison that define whether the FK is satisfied.

By that logic, a command such as

    UPDATE t1 SET x = '0' WHERE x = '-0';

could be optimized away as a noop, because in that world there is no
construct by which you can prove whether the update happened.

I think that would not be satisfactory.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Fix optimization of foreign-key on update actions

From

Peter Eisentraut

Date:

06 February 2019, 22:15:59

On 05/02/2019 17:20, Tom Lane wrote:
> What I *don't* like about the proposed patch is that it installs a
> new, different comparison rule for the ON UPDATE CASCADE case only.
> If we were to go in this direction, I'd think we should try to use
> the same comparison rule for all FK row comparisons.

That's easy to change.  I had it like that in earlier versions of the
patch.  I agree it would be better for consistency, but it would create
some cases where we do unnecessary extra work.

> The inconsistencies get messier the more you think about it,
> really.  If a referencing row was datatype-equal, but not byte-equal,
> to the PK row to start with, why would an update changing the PK row
> (perhaps to another datatype-equal value) result in forcing the
> referencing row to become byte-equal?  How does this fit in with
> the fact that our notion of what uniqueness means in the PK table
> is no-datatype-equality, rather than no-byte-equality?

This patch doesn't actually change the actual foreign key behavior.
Foreign keys already work like that.  The patch only touches an
optimization that checks whether it's worth running the full foreign key
behavior because the change was significant enough.  That shouldn't
affect the outcome.  It should be the case that if you replace
RI_FKey_pk_upd_check_required() by just "return true", then nothing
user-visible changes.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Fix optimization of foreign-key on update actions

From

Peter Eisentraut

Date:

11 March 2019, 11:57:49

On 2019-02-06 23:15, Peter Eisentraut wrote:
> On 05/02/2019 17:20, Tom Lane wrote:
>> What I *don't* like about the proposed patch is that it installs a
>> new, different comparison rule for the ON UPDATE CASCADE case only.
>> If we were to go in this direction, I'd think we should try to use
>> the same comparison rule for all FK row comparisons.
> 
> That's easy to change.  I had it like that in earlier versions of the
> patch.  I agree it would be better for consistency, but it would create
> some cases where we do unnecessary extra work.

Updated patch with this change made, and some conflicts resolved.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Fix optimization of foreign-key on update actions

From

Peter Eisentraut

Date:

18 March 2019, 17:02:52

On 2019-03-11 12:57, Peter Eisentraut wrote:
> On 2019-02-06 23:15, Peter Eisentraut wrote:
>> On 05/02/2019 17:20, Tom Lane wrote:
>>> What I *don't* like about the proposed patch is that it installs a
>>> new, different comparison rule for the ON UPDATE CASCADE case only.
>>> If we were to go in this direction, I'd think we should try to use
>>> the same comparison rule for all FK row comparisons.
>>
>> That's easy to change.  I had it like that in earlier versions of the
>> patch.  I agree it would be better for consistency, but it would create
>> some cases where we do unnecessary extra work.
> 
> Updated patch with this change made, and some conflicts resolved.

Committed like this.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services