Thread: Comments on Exclusion Constraints and related datatypes
Exclusion constraints are good. There's a few annoyances around them, that are minor but feel should be addressed for 9.0. * Exclusion indexes are created with the suffix "_exclusion". That's a very long suffix and will overflow most defined reports/screens. It would be much better to use just "_excl", so that the number of characters for index suffixes is always 4 or less characters, whatever type of index they are. * Circles, Boxes and other geometric datatypes defined "overlaps" to include touching shapes. So SELECT circle '((0,0), 1)' && circle '((2,0),1)'; is true, which is fairly strange and makes those datatypes very counter intuitive. Considering they are instructional aids, this is bad. * inet datatypes don't have a commutative operator on which a unique index can be built. There is no "overlaps" equivalent, which again is a shame because that stops them being used with the new feature. That leaves me thinking that we're shipping Postgres 9.0 with a new feature that isn't real-world usable with built-in datatypes. It's not even easily usable for demo applications either with the noddy datatypes. I do appreciate that the main use case is with add-in datatypes but this is a shame for such a great feature. Also, if the only common sense usage of exclusion constraints is GIST, why does the syntax default to "btree"? This means that alter table boxes add exclude using gist (position with &&); is required, to avoid this ERROR alter table boxes add exclude (position with &&); ERROR: data type box has no default operator class for access method "btree" HINT: You must specify an operator class for the index or define a default operator class for the data type. Surely in the absence of a relevant btree operator class we should automatically check for GIST operator classes? -- Simon Riggs www.2ndQuadrant.com
On Mon, Mar 22, 2010 at 9:15 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Exclusion constraints are good. There's a few annoyances around them, > that are minor but feel should be addressed for 9.0. > > * Exclusion indexes are created with the suffix "_exclusion". That's a > very long suffix and will overflow most defined reports/screens. It > would be much better to use just "_excl", so that the number of > characters for index suffixes is always 4 or less characters, whatever > type of index they are. I'd be OK with that. > * Circles, Boxes and other geometric datatypes defined "overlaps" to > include touching shapes. So > SELECT circle '((0,0), 1)' && circle '((2,0),1)'; > is true, which is fairly strange and makes those datatypes very counter > intuitive. Considering they are instructional aids, this is bad. As a non-user of geometric datatypes, I have no opinion on this one way or the other. > * inet datatypes don't have a commutative operator on which a unique > index can be built. There is no "overlaps" equivalent, which again is a > shame because that stops them being used with the new feature. This would be a nice thing to fix, and I was thinking about doing it, but I just ran out of time. I think it can be left for 9.1. I have not infrequently wanted to build an IP allocation database, and this would be perfect for that. > That leaves me thinking that we're shipping Postgres 9.0 with a new > feature that isn't real-world usable with built-in datatypes. It's not > even easily usable for demo applications either with the noddy > datatypes. I do appreciate that the main use case is with add-in > datatypes but this is a shame for such a great feature. > > Also, if the only common sense usage of exclusion constraints is GIST, > why does the syntax default to "btree"? This means that > > alter table boxes add exclude using gist (position with &&); > > is required, to avoid this ERROR > > alter table boxes add exclude (position with &&); > ERROR: data type box has no default operator class for access method > "btree" > HINT: You must specify an operator class for the index or define a > default operator class for the data type. > > Surely in the absence of a relevant btree operator class we should > automatically check for GIST operator classes? That doesn't seem particularly consistent with what we've done elsewhere. ...Robert
Simon Riggs <simon@2ndQuadrant.com> writes: > * Exclusion indexes are created with the suffix "_exclusion". That's a > very long suffix and will overflow most defined reports/screens. It > would be much better to use just "_excl", No particular objection here. > * Circles, Boxes and other geometric datatypes defined "overlaps" to > include touching shapes. So > SELECT circle '((0,0), 1)' && circle '((2,0),1)'; > is true, which is fairly strange and makes those datatypes very counter > intuitive. Considering they are instructional aids, this is bad. You're approximately twenty years too late to propose changing that, even if it were clearly a good idea which I doubt. > Also, if the only common sense usage of exclusion constraints is GIST, > why does the syntax default to "btree"? Since your "if" isn't a correct statement, the complaint doesn't follow. regards, tom lane
On Mon, 2010-03-22 at 10:13 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: > > * Exclusion indexes are created with the suffix "_exclusion". That's a > > very long suffix and will overflow most defined reports/screens. It > > would be much better to use just "_excl", > > No particular objection here. OK, will change. > > * Circles, Boxes and other geometric datatypes defined "overlaps" to > > include touching shapes. So > > SELECT circle '((0,0), 1)' && circle '((2,0),1)'; > > is true, which is fairly strange and makes those datatypes very counter > > intuitive. Considering they are instructional aids, this is bad. > > You're approximately twenty years too late to propose changing that, > even if it were clearly a good idea which I doubt. Possibly. We should at least document that. > > Also, if the only common sense usage of exclusion constraints is GIST, > > why does the syntax default to "btree"? > > Since your "if" isn't a correct statement, the complaint doesn't follow. Docs say "The access method must support amgettuple (see Chapter 51); at present this means GIN cannot be used. Although it's allowed, there is little point in using btree or hash indexes with an exclusion constraint, because this does nothing that an ordinary unique constraint doesn't do better. So in practice the access method will always be GiST." Hence my comment. -- Simon Riggs www.2ndQuadrant.com
Simon Riggs <simon@2ndQuadrant.com> writes: > On Mon, 2010-03-22 at 10:13 -0400, Tom Lane wrote: >>> Also, if the only common sense usage of exclusion constraints is GIST, >>> why does the syntax default to "btree"? >> >> Since your "if" isn't a correct statement, the complaint doesn't follow. > Docs say > "The access method must support amgettuple (see Chapter 51); at present > this means GIN cannot be used. Although it's allowed, there is little > point in using btree or hash indexes with an exclusion constraint, > because this does nothing that an ordinary unique constraint doesn't do > better. So in practice the access method will always be GiST." Well, I would hope that the lack of GIN support will be cured someday. I see the above as a statement of what's true in 9.0, not what will always be true; so it's pretty weak as a justification for introducing a confusing default behavior. Actually the statement might be overly strong even now. If you want uniqueness checks with a hash index, exclusion is the only way to get that. I'm not sure that that's actually useful versus a standard btree unique constraint, but it's at least arguably another use case. regards, tom lane
On mån, 2010-03-22 at 13:15 +0000, Simon Riggs wrote: > * inet datatypes don't have a commutative operator on which a unique > index can be built. There is no "overlaps" equivalent, which again is a > shame because that stops them being used with the new feature. http://pgfoundry.org/projects/ip4r/ provides an overlap operator.
On Mon, 2010-03-22 at 16:40 +0200, Peter Eisentraut wrote: > On mån, 2010-03-22 at 13:15 +0000, Simon Riggs wrote: > > * inet datatypes don't have a commutative operator on which a unique > > index can be built. There is no "overlaps" equivalent, which again is a > > shame because that stops them being used with the new feature. > > http://pgfoundry.org/projects/ip4r/ provides an overlap operator. Like I said, nothing in the base server. -- Simon Riggs www.2ndQuadrant.com
Simon Riggs <simon@2ndQuadrant.com> wrote: > On Mon, 2010-03-22 at 10:13 -0400, Tom Lane wrote: >> Simon Riggs <simon@2ndQuadrant.com> writes: >> > * Circles, Boxes and other geometric datatypes defined >> > "overlaps" to include touching shapes. So >> > SELECT circle '((0,0), 1)' && circle '((2,0),1)'; >> > is true, which is fairly strange and makes those datatypes very >> > counter intuitive. Considering they are instructional aids, >> > this is bad. >> >> You're approximately twenty years too late to propose changing >> that, even if it were clearly a good idea which I doubt. > > Possibly. We should at least document that. Basically, what you feel is missing is documentation that if two shapes share one or more points they are considered to overlap; there is no requirement that they share an area? -Kevin
On Mon, 2010-03-22 at 10:32 -0500, Kevin Grittner wrote: > Simon Riggs <simon@2ndQuadrant.com> wrote: > > On Mon, 2010-03-22 at 10:13 -0400, Tom Lane wrote: > >> Simon Riggs <simon@2ndQuadrant.com> writes: > > >> > * Circles, Boxes and other geometric datatypes defined > >> > "overlaps" to include touching shapes. So > >> > SELECT circle '((0,0), 1)' && circle '((2,0),1)'; > >> > is true, which is fairly strange and makes those datatypes very > >> > counter intuitive. Considering they are instructional aids, > >> > this is bad. > >> > >> You're approximately twenty years too late to propose changing > >> that, even if it were clearly a good idea which I doubt. > > > > Possibly. We should at least document that. > > Basically, what you feel is missing is documentation that if two > shapes share one or more points they are considered to overlap; > there is no requirement that they share an area? Yes, for most people touching != overlap. So it just looks like a bug. -- Simon Riggs www.2ndQuadrant.com
On Mon, Mar 22, 2010 at 03:42:39PM +0000, Simon Riggs wrote: > On Mon, 2010-03-22 at 10:32 -0500, Kevin Grittner wrote: > > Simon Riggs <simon@2ndQuadrant.com> wrote: > > > On Mon, 2010-03-22 at 10:13 -0400, Tom Lane wrote: > > >> Simon Riggs <simon@2ndQuadrant.com> writes: > > > > >> > * Circles, Boxes and other geometric datatypes defined > > >> > "overlaps" to include touching shapes. So SELECT circle > > >> > '((0,0), 1)' && circle '((2,0),1)'; is true, which is fairly > > >> > strange and makes those datatypes very counter intuitive. > > >> > Considering they are instructional aids, this is bad. > > >> > > >> You're approximately twenty years too late to propose changing > > >> that, even if it were clearly a good idea which I doubt. > > > > > > Possibly. We should at least document that. > > > > Basically, what you feel is missing is documentation that if two > > shapes share one or more points they are considered to overlap; > > there is no requirement that they share an area? > > Yes, for most people touching != overlap. So it just looks like a > bug. I don't know which people you've surveyed, but at least in my math classes, one point in common was sufficient for an overlap. I'd be happy to write up something that makes this clear. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Simon Riggs <simon@2ndQuadrant.com> wrote: > Yes, for most people touching != overlap. So it just looks like a > bug. A quick search of the web turned up a definition of overlap in geometry as meaning that two polygons share at least one *internal* point, which would be consistent with your interpretation; but there is the issue of breaking existing code. Perhaps people are now accustomed to following the existing overlaps test with a test that the area of intersection is non-zero? Anyway, based on what I found, we should document the current behavior, as the term in PostgreSQL doesn't seem to match the conventional definition in geometry. -Kevin
Simon Riggs wrote: >> >> Basically, what you feel is missing is documentation that if two >> shapes share one or more points they are considered to overlap; >> there is no requirement that they share an area? >> > > Yes, for most people touching != overlap. So it just looks like a bug. > > I guess I must be strange, then, it doesn't strike me as particularly unnatural. By all means document it, though, if the docs are not clear on the point. cheers andrew
On Mon, 2010-03-22 at 09:00 -0700, David Fetter wrote: > > Yes, for most people touching != overlap. So it just looks like a > > bug. > > I don't know which people you've surveyed, but at least in my math > classes, one point in common was sufficient for an overlap. I'd be > happy to write up something that makes this clear. If you're happy to document it, good, thanks. -- Simon Riggs www.2ndQuadrant.com
On Mon, Mar 22, 2010 at 04:47:37PM +0000, Simon Riggs wrote: > On Mon, 2010-03-22 at 09:00 -0700, David Fetter wrote: > > > > Yes, for most people touching != overlap. So it just looks like a > > > bug. > > > > I don't know which people you've surveyed, but at least in my math > > classes, one point in common was sufficient for an overlap. I'd be > > happy to write up something that makes this clear. > > If you're happy to document it, good, thanks. I'm seeing a lot of places where this might be documented. Any ideas as to which ones are appropriate? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > I'm seeing a lot of places where this might be documented. Any ideas > as to which ones are appropriate? I would think "Geometric Functions and Operators" would be the most appropriate spot ... regards, tom lane
On Mon, Mar 22, 2010 at 01:46:51PM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > I'm seeing a lot of places where this might be documented. Any ideas > > as to which ones are appropriate? > > I would think "Geometric Functions and Operators" would be the most > appropriate spot ... Please find patch enclosed. :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
David Fetter wrote: > diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml > index 9881ff4..9313112 100644 > --- a/doc/src/sgml/func.sgml > +++ b/doc/src/sgml/func.sgml > @@ -7134,7 +7134,7 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple > </row> > <row> > <entry> <literal>&&</literal> </entry> > - <entry>Overlaps?</entry> > + <entry>Overlaps? One point in common makes this true.</entry> > <entry><literal>box '((0,0),(1,1))' && box '((0,0),(2,2))'</literal></entry> > </row> > <row> Hmm, how does this look in horizontal space? (The <row> makes me think it's a table.) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Mon, Mar 22, 2010 at 04:04:16PM -0300, Alvaro Herrera wrote: > David Fetter wrote: > > > diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml > > index 9881ff4..9313112 100644 > > --- a/doc/src/sgml/func.sgml > > +++ b/doc/src/sgml/func.sgml > > @@ -7134,7 +7134,7 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple > > </row> > > <row> > > <entry> <literal>&&</literal> </entry> > > - <entry>Overlaps?</entry> > > + <entry>Overlaps? One point in common makes this true.</entry> > > <entry><literal>box '((0,0),(1,1))' && box '((0,0),(2,2))'</literal></entry> > > </row> > > <row> > > Hmm, how does this look in horizontal space? (The <row> makes me think > it's a table.) Looks OK to me. The entry above, "Closest point to first operand on second operand" is actually wider. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Mon, Mar 22, 2010 at 1:15 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > > * Circles, Boxes and other geometric datatypes defined "overlaps" to > include touching shapes. So > > * inet datatypes don't have a commutative operator on which a unique > index can be built. There is no "overlaps" equivalent, which again is a > shame because that stops them being used with the new feature. I think our unusual data types are one of the strong points of Postgres but they're missing a lot of operators and opclasses to make them really useful. There's no reason we couldn't have separate overlaps and overlaps-internally operators just like we have <=,>= and <,>. And it would be nice to flesh out the network data type more fully, perhaps merging in as much of ip4r as makes sense. I remember when I tried to use geometric data types I was stymied by missing operators. In particular I was surprised that point <in> box wasn't a gist indexable method. I think that particular case has been addressed but I think there are many more like it. -- greg
On Mon, 2010-03-22 at 10:55 -0700, David Fetter wrote: > On Mon, Mar 22, 2010 at 01:46:51PM -0400, Tom Lane wrote: > > David Fetter <david@fetter.org> writes: > > > I'm seeing a lot of places where this might be documented. Any ideas > > > as to which ones are appropriate? > > > > I would think "Geometric Functions and Operators" would be the most > > appropriate spot ... > > Please find patch enclosed. :) I think you've missed my point. I don't suppose anybody would disagree that box '((0,0),(1,1))' && box '((0,0),(2,2))' is true. The fact that these two boxes share the point (0,0) has nothing to do with the obvious fact that the larger box completely contains the smaller one. What I was talking about was that box '((0,0),(1,1))' && box '((1,1),(2,2))' returns true, even though they touch at only a single point, and share zero area. Perhaps the end result here is the realisation that a normal search operator like overlaps (&&) might not be the right operator to use for exclusion constraints. For the latter, we might want something that defines the left and bottom edges of a box to belong to the box (>=), whereas the top edges are the boundary but do not belong to the box itself (<). In a timerange that might be easier to visualise: the start timestamp is inclusive, the end timestamp is exclusive, so that each timestamp exists in only one timerange when the timeranges are overlapping with no gaps. -- Simon Riggs www.2ndQuadrant.com
Simon Riggs <simon@2ndQuadrant.com> wrote: > I think you've missed my point. > What I was talking about was that box '((0,0),(1,1))' && box > '((1,1),(2,2))' returns true, even though they touch at only a > single point, and share zero area. FWIW, that's what I would take away from "one point in common" -Kevin
David Fetter wrote: > On Mon, Mar 22, 2010 at 04:04:16PM -0300, Alvaro Herrera wrote: > > David Fetter wrote: > > > > > diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml > > > index 9881ff4..9313112 100644 > > > --- a/doc/src/sgml/func.sgml > > > +++ b/doc/src/sgml/func.sgml > > > @@ -7134,7 +7134,7 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple > > > </row> > > > <row> > > > <entry> <literal>&&</literal> </entry> > > > - <entry>Overlaps?</entry> > > > + <entry>Overlaps? One point in common makes this true.</entry> > > > <entry><literal>box '((0,0),(1,1))' && box '((0,0),(2,2))'</literal></entry> > > > </row> > > > <row> > > > > Hmm, how does this look in horizontal space? (The <row> makes me think > > it's a table.) > > Looks OK to me. The entry above, "Closest point to first operand on > second operand" is actually wider. Patch applied. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. +
Greg Stark wrote: > On Mon, Mar 22, 2010 at 1:15 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > > > > * Circles, Boxes and other geometric datatypes defined "overlaps" to > > include touching shapes. So > > > > * inet datatypes don't have a commutative operator on which a unique > > index can be built. There is no "overlaps" equivalent, which again is a > > shame because that stops them being used with the new feature. > > I think our unusual data types are one of the strong points of > Postgres but they're missing a lot of operators and opclasses to make > them really useful. > > There's no reason we couldn't have separate overlaps and > overlaps-internally operators just like we have <=,>= and <,>. And it > would be nice to flesh out the network data type more fully, perhaps > merging in as much of ip4r as makes sense. Added to TODO: Add overlaps geometric operators that ignore point overlaps * http://archives.postgresql.org/pgsql-hackers/2010-03/msg00861.php -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. +
Robert Haas wrote: > > * inet datatypes don't have a commutative operator on which a unique > > index can be built. There is no "overlaps" equivalent, which again is a > > shame because that stops them being used with the new feature. > > This would be a nice thing to fix, and I was thinking about doing it, > but I just ran out of time. I think it can be left for 9.1. I have > not infrequently wanted to build an IP allocation database, and this > would be perfect for that. Added to TODO: Add INET overlaps operator, for use by exclusion constraints * http://archives.postgresql.org/pgsql-hackers/2010-03/msg00845.php -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. +