Thread: Singleton range constructors versus functional coercion notation
The singleton range constructors don't work terribly well.

regression=# select int4range(42); -- ok
 int4range
-----------
 [42,43)
(1 row)

regression=# select int4range(null); -- not so ok
 int4range
-----------

(1 row)

regression=# select int4range('42'); -- clearly not ok
ERROR:  malformed range literal: "42"
LINE 1: select int4range('42');
                         ^
DETAIL:  Missing left parenthesis or bracket.

The second of these might at first glance seem all right; until you remember that range_constructor1 is not strict and throws an error on null input. So it's not getting called.

What is actually happening in both cases 2 and 3 is that func_get_detail() is interpreting the syntax as equivalent to 'literal'::int4range. We do not have a whole lot of room to maneuver here, because that equivalence is of very long standing; and as mentioned in the comments in that function, we can't easily alter its priority relative to other interpretations.

I don't immediately see a solution that's better than dropping the single-argument range constructors. Even if you don't care that much about the NULL case, things like this are pretty fatal from a usability standpoint:

regression=# select daterange('2011-11-18');
ERROR:  malformed range literal: "2011-11-18"
LINE 1: select daterange('2011-11-18');
                         ^
DETAIL:  Missing left parenthesis or bracket.

I'm not sure that singleton ranges are so useful that we need to come up with a short-form input method for them. (Yeah, I know that this case could be fixed with an explicit cast, but if we leave it like this we'll get a constant stream of bug reports about it.)

For that matter, the zero-argument range constructors seem like mostly a waste of catalog space too ... what's wrong with writing 'empty'::int4range when you need that?

regards, tom lane
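The equivalence described in this message can be observed without any single-argument constructor. The following is a minimal sketch, assuming a PostgreSQL server that has the built-in int4range type; it only illustrates the functional coercion notation and is not taken from the thread.

```sql
-- Function-call syntax with one untyped literal is read as a cast,
-- so these two statements are interpreted the same way:
SELECT int4range('[42,43)');
SELECT '[42,43)'::int4range;

-- The two-argument constructor receives real integers, so no
-- literal-to-range coercion is involved here:
SELECT int4range(42, 43);

-- The empty range written as a literal, as suggested in the message:
SELECT 'empty'::int4range;
```

All four statements are expected to succeed; the first three denote the same range.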
On Sat, 2011-11-19 at 12:27 -0500, Tom Lane wrote:
> The singleton range constructors don't work terribly well.
...
> I don't immediately see a solution that's better than dropping the
> single-argument range constructors.

We could change the name, I suppose, but that seems awkward. I'm hesitant to remove them because the alternative is significantly more verbose:

numrange(1.0, 1.0, '[]');

But I don't have any particularly good ideas to save them, either.

Regarding the zero-argument (empty) constructors, I'd be fine removing them. They don't seem to cause problems, but the utility is also very minor.

Regards,
Jeff Davis
2011/11/19 Jeff Davis <pgsql@j-davis.com>:
> On Sat, 2011-11-19 at 12:27 -0500, Tom Lane wrote:
>> The singleton range constructors don't work terribly well.
> ...
>> I don't immediately see a solution that's better than dropping the
>> single-argument range constructors.
>
> We could change the name, I suppose, but that seems awkward. I'm
> hesitant to remove them because the alternative is significantly more
> verbose:
>
> numrange(1.0, 1.0, '[]');

A one-parameter range would be confusing. Single-parameter range constructors are useless.

Regards

Pavel
Jeff Davis <pgsql@j-davis.com> writes:
> On Sat, 2011-11-19 at 12:27 -0500, Tom Lane wrote:
>> I don't immediately see a solution that's better than dropping the
>> single-argument range constructors.

> We could change the name, I suppose, but that seems awkward.

Yeah, something like int4range_1(42) would work, but it seems rather ugly.

> I'm hesitant to remove them because the alternative is significantly
> more verbose:
> numrange(1.0, 1.0, '[]');

Right. The question is, does the case occur in practice often enough to justify a shorter notation? I'm not sure.

One thing I've been thinking for a bit is that for discrete ranges, I find the '[)' canonical form to be surprising/confusing. If we were to fix range_adjacent along the lines suggested by Florian, would it be practical to go over to '[]' as the canonical form? One good thing about that approach is that canonicalization wouldn't have to involve generating values that might overflow.

regards, tom lane
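The overflow concern in the last paragraph can be illustrated as follows. This is a sketch under the assumption that int4range canonicalizes to the '[)' form, as the thread describes; the exact error text for the last statement may vary between versions.

```sql
-- A closed discrete range is rewritten to the '[)' form, which
-- requires computing upper + 1:
SELECT int4range(1, 5, '[]');   -- displayed as [1,6)

-- At the top of the int4 domain there is no representable upper + 1,
-- so this statement is expected to fail with an out-of-range error:
SELECT int4range(2147483647, 2147483647, '[]');
```

With '[]' as the canonical form, the second statement would not need to generate a value past the largest integer.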
On Nov19, 2011, at 21:57 , Tom Lane wrote:
> One thing I've been thinking for a bit is that for discrete ranges,
> I find the '[)' canonical form to be surprising/confusing. If we
> were to fix range_adjacent along the lines suggested by Florian,
> would it be practical to go over to '[]' as the canonical form?
> One good thing about that approach is that canonicalization wouldn't
> have to involve generating values that might overflow.

I have argued against that in the past, mostly because

1) It's better to be consistent

2) While it seems intuitive for integer ranges to use [] and float ranges to use [), things are far less clear once you take other base types into account.

For example, we'd end up making ranges over DATE canonicalize to [], but ranges over TIMESTAMP to [). Which, at least IMHO, is quite far from intuitive. And if we start "fixing" these issues by making exceptions to the "discrete -> [], continuous -> [)" rule, we'll end up with essentially arbitrary behaviour pretty quickly. At which point (1) kicks in ;-)

And then there are also ranges over NUMERIC, which can be both discrete and continuous, depending on the typmod. Which I think is a good reason in itself to make as little behaviour as possible depend on the continuity or non-continuity of range types.

best regards,
Florian Pflug
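The DATE versus TIMESTAMP contrast can be seen with the built-in daterange and tsrange types. This sketch assumes the '[)' canonical form for discrete ranges discussed in the thread; the displayed values depend on the DateStyle setting.

```sql
-- date is discrete: a closed singleton is rewritten to the '[)' form.
SELECT daterange('2011-11-18', '2011-11-18', '[]');
-- with the default DateStyle: [2011-11-18,2011-11-19)

-- timestamp is treated as continuous: the bounds stay as written.
SELECT tsrange('2011-11-18 00:00', '2011-11-18 00:00', '[]');

-- upper_inc() reports whether the upper bound is inclusive.
SELECT upper_inc(daterange('2011-11-18', '2011-11-18', '[]')) AS date_upper_inc,
       upper_inc(tsrange('2011-11-18 00:00', '2011-11-18 00:00', '[]')) AS ts_upper_inc;
```

Under the proposal quoted above, the date range would keep its inclusive upper bound while the timestamp range would follow a different rule, which is the inconsistency this message objects to.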
On Sat, 2011-11-19 at 15:57 -0500, Tom Lane wrote:
> > I'm hesitant to remove them because the alternative is significantly
> > more verbose:
> > numrange(1.0, 1.0, '[]');
>
> Right. The question is, does the case occur in practice often enough
> to justify a shorter notation? I'm not sure.

Well, if there were a good shorter notation, then probably so. But it doesn't look like we have a good idea, so I'm fine with dropping it.

> One thing I've been thinking for a bit is that for discrete ranges,
> I find the '[)' canonical form to be surprising/confusing. If we
> were to fix range_adjacent along the lines suggested by Florian,
> would it be practical to go over to '[]' as the canonical form?
> One good thing about that approach is that canonicalization wouldn't
> have to involve generating values that might overflow.

I think we had that discussion before, and Florian brought up some good points (both then and in a reply now). Also, Robert wasn't convinced that '[]' is necessarily better for discrete ranges:

http://archives.postgresql.org/pgsql-hackers/2011-10/msg00598.php

Regards,
Jeff Davis
On Nov 20, 2011, at 10:24 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> On Sat, 2011-11-19 at 15:57 -0500, Tom Lane wrote:
>>> I'm hesitant to remove them because the alternative is significantly
>>> more verbose:
>>> numrange(1.0, 1.0, '[]');
>>
>> Right. The question is, does the case occur in practice often enough
>> to justify a shorter notation? I'm not sure.
>
> Well, if there were a good shorter notation, then probably so. But it
> doesn't look like we have a good idea, so I'm fine with dropping it.

We should also keep in mind that people who use range types can and likely will define their own convenience functions. If people use singletons, or open ranges, or closed ranges, or one-hour timestamp ranges frequently, they can make their own notational shorthand with a 3-line CREATE FUNCTION statement. We don't need to have it all in core.

...Robert
Robert Haas <robertmhaas@gmail.com> writes:
> On Nov 20, 2011, at 10:24 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>> Well, if there were a good shorter notation, then probably so. But it
>> doesn't look like we have a good idea, so I'm fine with dropping it.

> We should also keep in mind that people who use range types can and
> likely will define their own convenience functions. If people use
> singletons, or open ranges, or closed ranges, or one-hour timestamp
> ranges frequently, they can make their own notational shorthand with
> a 3-line CREATE FUNCTION statement. We don't need to have it all in
> core.

But if you believe that, what syntax do you think people are likely to try if they want a singleton range constructor? Leaving the user to discover the problem and try to invent a workaround is not better than doing it ourselves ...

regards, tom lane
On 21 November 2011 14:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Nov 20, 2011, at 10:24 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>>> Well, if there were a good shorter notation, then probably so. But it
>>> doesn't look like we have a good idea, so I'm fine with dropping it.
>
>> We should also keep in mind that people who use range types can and
>> likely will define their own convenience functions. If people use
>> singletons, or open ranges, or closed ranges, or one-hour timestamp
>> ranges frequently, they can make their own notational shorthand with
>> a 3-line CREATE FUNCTION statement. We don't need to have it all in
>> core.
>
> But if you believe that, what syntax do you think people are likely to
> try if they want a singleton range constructor? Leaving the user to
> discover the problem and try to invent a workaround is not better than
> doing it ourselves ...
>

In the field of mathematics, a standard shorthand notation for the degenerate interval [x,x] is {x} - the singleton set - so that's one possibility.

Dean
On Mon, Nov 21, 2011 at 9:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Nov 20, 2011, at 10:24 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>>> Well, if there were a good shorter notation, then probably so. But it
>>> doesn't look like we have a good idea, so I'm fine with dropping it.
>
>> We should also keep in mind that people who use range types can and
>> likely will define their own convenience functions. If people use
>> singletons, or open ranges, or closed ranges, or one-hour timestamp
>> ranges frequently, they can make their own notational shorthand with
>> a 3-line CREATE FUNCTION statement. We don't need to have it all in
>> core.
>
> But if you believe that, what syntax do you think people are likely to
> try if they want a singleton range constructor? Leaving the user to
> discover the problem and try to invent a workaround is not better than
> doing it ourselves ...

Wait, do I hear the great Tom Lane arguing for putting more than the minimal amount of stuff in core? :-)

I honestly don't know what function names people will pick, and I don't care. Someone might like singleton(x), which would be impractical as a built-in because there could be more than one range type over the same base type, but if the user defines the function they can pick what's convenient for them. If they use singletons exceedingly frequently they might even want something really short, like just(x) or s(x). Or they might say daterange1(x), along the lines you suggested earlier. The point is that by not defining more than necessary in core, we give the user the flexibility to do what they want. In cases where that amounts to handing them a loaded gun with the safety off, we shouldn't do it, but that doesn't seem to be the case here. It doesn't take a lot of programming acumen to write a function that passes two copies of its single argument to a built-in. The only mistake anyone's likely to make is forgetting to declare it non-VOLATILE.
But the real point is that I don't think we should assume that singleton ranges are unique in being things for which people will want shortcuts. We talked about having a behavior-changing GUC to control whether the bounds are [) or [] or () or (], but we didn't do it, because behavior-changing GUCs are a bad idea. But I fully expect that people who make heavy use of range types will (and we should encourage them to) define convenience functions with names of their own choosing that pass the bounds that are useful to them. If that isn't the very model of a use-case for inline-able SQL functions, I'm not sure what is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
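A user-defined shorthand of the kind described in this message might look like the sketch below. The function name singleton_date is hypothetical and chosen only for illustration; the body passes two copies of its argument to the built-in three-argument daterange constructor, and the function is declared IMMUTABLE in line with the non-VOLATILE remark.

```sql
-- Hypothetical convenience function; the name is an illustration only.
CREATE FUNCTION singleton_date(d date) RETURNS daterange
    LANGUAGE sql IMMUTABLE
    AS $$ SELECT daterange(d, d, '[]') $$;

-- No type is named singleton_date, so the untyped literal is resolved
-- as a date argument rather than as a range literal:
SELECT singleton_date('2011-11-18');
```

Because the name does not collide with a type name, the call is not subject to the functional coercion interpretation discussed earlier in the thread.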
On Tue, 2011-11-22 at 09:07 -0500, Robert Haas wrote:
> I honestly don't know what function names people will pick, and I
> don't care. Someone might like singleton(x), which would be
> impractical as a built-in because there could be more than one range
> type over the same base type, but if the user defines the function
> they can pick what's convenient for them. If they use singletons
> exceedingly frequently they might even want something really short,
> like just(x) or s(x). Or they might say daterange1(x), along the
> lines you suggested earlier.

For that matter, they might pick daterange(x), as I picked earlier, and run into the same problems.

It's a little strange that we allow people to define functions with one argument and the same name as a type if such functions are confusing.

This isn't intended as an argument in either direction, just an observation.

Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes:
> It's a little strange that we allow people to define functions with one
> argument and the same name as a type if such functions are confusing.

As long as your mental model is that such a function is a cast, everything is fine. The trouble with the range constructors is that they aren't really casts, as shown by the fact that when you read textrange('foo') you expect 'foo' to be text and not textrange.

regards, tom lane
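The textrange case can be reproduced with a user-defined range type. In this sketch the type definition is an assumption made for illustration, since textrange is not a built-in type; the exact error text of the last statement may differ between versions.

```sql
-- Assumed definition for illustration; textrange is not built in.
CREATE TYPE textrange AS RANGE (subtype = text);

-- With explicitly typed arguments, the generated three-argument
-- constructor receives ordinary text values:
SELECT textrange('foo'::text, 'foo'::text, '[]');

-- With a single untyped literal, the call is read as 'foo'::textrange,
-- so 'foo' is parsed as a range literal and is expected to be rejected:
SELECT textrange('foo');
```

The last statement shows the mismatch described above: a reader expects 'foo' to be a text bound, while the parser treats it as the text form of a whole range.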