Thread: BUG #6401: IS DISTINCT FROM improperly compares geomoetric datatypes
The following bug has been logged on the website: Bug reference: 6401 Logged by: Kenaniah Cerny Email address: kenaniah@gmail.com PostgreSQL version: 9.1.2 Operating system: Centos 6 x86_64 Description:=20=20=20=20=20=20=20=20 ---- SELECT point(2, 3) IS DISTINCT FROM point(2, 3); ---- ERROR: operator does not exist: point =3D point LINE 1: SELECT point(2, 3) IS DISTINCT FROM point(2, 3) ^ HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts. ---- According to the docs (http://www.postgresql.org/docs/current/static/functions-geometry.html#FUNC= TIONS-GEOMETRY), the "same as" operator is the ~=3D operator, not the =3D operator.=20 This problem cascades to the comparison of complex types, such as "OLD IS DISTINCT FROM NEW" in the context of plpgsql.=20
Re: BUG #6401: IS DISTINCT FROM improperly compares geomoetric datatypes
From
Heikki Linnakangas
Date:
On 18.01.2012 23:14, kenaniah@gmail.com wrote: > The following bug has been logged on the website: > > Bug reference: 6401 > Logged by: Kenaniah Cerny > Email address: kenaniah@gmail.com > PostgreSQL version: 9.1.2 > Operating system: Centos 6 x86_64 > Description: > > ---- > SELECT point(2, 3) IS DISTINCT FROM point(2, 3); > ---- > ERROR: operator does not exist: point = point > LINE 1: SELECT point(2, 3) IS DISTINCT FROM point(2, 3) > ^ > HINT: No operator matches the given name and argument type(s). You might > need to add explicit type casts. > ---- > > According to the docs > (http://www.postgresql.org/docs/current/static/functions-geometry.html#FUNCTIONS-GEOMETRY), > the "same as" operator is the ~= operator, not the = operator. Yeah, unfortunately the code to transform IS DISTINCT FROM into the corresponding operator doesn't know about that. In fact, you might expect it to complain about missing "<>" operator, as "IS DISTINCT FROM" is really more like inequality, but the planner in fact implements it using the = operator, negating the result. That's because it's more common for datatypes to have a = operator than <>. Curiously, that's not the case with point datatype - it has a <> operator but not =. Frankly that's such a rare corner case that I'm not very enthusiastic about fixing it. One idea would be to look up the type's b-tree sort operators, and pick the equality operator from there. But point datatype doesn't have b-tree sort operators, either, so it wouldn't help in this case. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Excerpts from Heikki Linnakangas's message of jue ene 19 07:25:36 -0300 201= 2: > Frankly that's such a rare corner case that I'm not very enthusiastic=20 > about fixing it. One idea would be to look up the type's b-tree sort=20 > operators, and pick the equality operator from there. But point datatype= =20 > doesn't have b-tree sort operators, either, so it wouldn't help in this= =20 > case. It doesn't have a hash opclass either, which could be used as a fallback in case there's no btree. Point cannot obviously have a btree opclass (no inequalities), but a hash one seems possible. I think the use case of IS NOT DISTINCT FROM for rowtypes in triggers is a valid one. --=20 =C3=81lvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: BUG #6401: IS DISTINCT FROM improperly compares geomoetric datatypes
From
Heikki Linnakangas
Date:
On 19.01.2012 15:30, Alvaro Herrera wrote: > > Excerpts from Heikki Linnakangas's message of jue ene 19 07:25:36 -0300 2012: > >> Frankly that's such a rare corner case that I'm not very enthusiastic >> about fixing it. One idea would be to look up the type's b-tree sort >> operators, and pick the equality operator from there. But point datatype >> doesn't have b-tree sort operators, either, so it wouldn't help in this >> case. > > It doesn't have a hash opclass either, which could be used as a fallback > in case there's no btree. Point cannot obviously have a btree opclass > (no inequalities), but a hash one seems possible. It wouldn't be difficult to define b-tree operators for point, by comparing x value first, then y, or something like that. If the index operations are only used for equality lookups, it doesn't matter how the < and > are defined as long as the system is self-consistent. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Alvaro Herrera <alvherre@commandprompt.com> writes: > Excerpts from Heikki Linnakangas's message of jue ene 19 07:25:36 -0300 2012: >> Frankly that's such a rare corner case that I'm not very enthusiastic >> about fixing it. One idea would be to look up the type's b-tree sort >> operators, and pick the equality operator from there. But point datatype >> doesn't have b-tree sort operators, either, so it wouldn't help in this >> case. > It doesn't have a hash opclass either, which could be used as a fallback > in case there's no btree. Point cannot obviously have a btree opclass > (no inequalities), but a hash one seems possible. > I think the use case of IS NOT DISTINCT FROM for rowtypes in triggers is > a valid one. Note that IS [NOT] DISTINCT is not the only place that assumes that it should use an operator named "=". There's also scalar IN, the simple form of CASE, and possibly some others that I forget at the moment. IMO, if we're going to change the semantics of any of these, we should do them all together. This is something I've kinda wanted to do for a long time, but never gotten around to. We've managed to clean up hard-wired assumptions about operator names in a lot of other places, but these syntactic constructs still do it by name. One argument against changing it is that arguably doing so would violate the letter of the SQL standard. For example, I observe that SQL defines the IN construct thus: The expression RVC IN IPV is equivalent to RVC = ANY IPV (SQL99 8.4 <in predicate> syntax rule 4). The word "equality" appears nowhere in the definition of IN. Thus, if we take "X IN (Y,Z,Q)" and implement it with some operator not named "=", we have not done what the spec clearly says to do. Now you can make the case that we'd be implementing the spirit rather than the letter of the spec, but that's a rather shaky case to have to make. The same is true for simple CASE: c) The <case specification> is equivalent to a <searched case> in which each <searched when clause> specifies a <search condition> of the form "CO=WO". with absolutely no hint that equality is what the "=" symbol is supposed to get you. And in 8.13 <distinct predicate> we have Case: i) "X IS DISTINCT FROM Y" is false if either: 1) X and Y are the null value, or 2) X = Y according to Subclause 8.2, "<comparison predicate>". ii) Otherwise, "X IS DISTINCT FROM Y" is true. which at least suggests that what's wanted is equality, but they're still defining it in terms of an operator named "=" (and AFAICS subclause 8.2 doesn't address the possibility that "X=Y" could mean something other than the common idea of equality). So on the whole, it might be better to just provide an operator named "=" for point, and not open up the can of worms about whether these constructs should use some other rule for deciding which operator to compare with. regards, tom lane
Is this a TODO? --------------------------------------------------------------------------- On Thu, Jan 19, 2012 at 10:39:42AM -0500, Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Excerpts from Heikki Linnakangas's message of jue ene 19 07:25:36 -0300 2012: > >> Frankly that's such a rare corner case that I'm not very enthusiastic > >> about fixing it. One idea would be to look up the type's b-tree sort > >> operators, and pick the equality operator from there. But point datatype > >> doesn't have b-tree sort operators, either, so it wouldn't help in this > >> case. > > > It doesn't have a hash opclass either, which could be used as a fallback > > in case there's no btree. Point cannot obviously have a btree opclass > > (no inequalities), but a hash one seems possible. > > > I think the use case of IS NOT DISTINCT FROM for rowtypes in triggers is > > a valid one. > > Note that IS [NOT] DISTINCT is not the only place that assumes that it > should use an operator named "=". There's also scalar IN, the simple > form of CASE, and possibly some others that I forget at the moment. > IMO, if we're going to change the semantics of any of these, we should > do them all together. > > This is something I've kinda wanted to do for a long time, but never > gotten around to. We've managed to clean up hard-wired assumptions > about operator names in a lot of other places, but these syntactic > constructs still do it by name. > > One argument against changing it is that arguably doing so would violate > the letter of the SQL standard. For example, I observe that SQL defines > the IN construct thus: > > The expression > > RVC IN IPV > > is equivalent to > > RVC = ANY IPV > > (SQL99 8.4 <in predicate> syntax rule 4). The word "equality" appears > nowhere in the definition of IN. Thus, if we take "X IN (Y,Z,Q)" and > implement it with some operator not named "=", we have not done what > the spec clearly says to do. Now you can make the case that we'd be > implementing the spirit rather than the letter of the spec, but that's > a rather shaky case to have to make. > > The same is true for simple CASE: > > c) The <case specification> is equivalent to a <searched case> > in which each <searched when clause> specifies a <search > condition> of the form "CO=WO". > > with absolutely no hint that equality is what the "=" symbol is supposed > to get you. And in 8.13 <distinct predicate> we have > > Case: > i) "X IS DISTINCT FROM Y" is false if either: > 1) X and Y are the null value, or > 2) X = Y according to Subclause 8.2, "<comparison > predicate>". > ii) Otherwise, "X IS DISTINCT FROM Y" is true. > > which at least suggests that what's wanted is equality, but they're > still defining it in terms of an operator named "=" (and AFAICS > subclause 8.2 doesn't address the possibility that "X=Y" could mean > something other than the common idea of equality). > > So on the whole, it might be better to just provide an operator named > "=" for point, and not open up the can of worms about whether these > constructs should use some other rule for deciding which operator to > compare with. > > regards, tom lane > > -- > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-bugs -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +