Thread: Range Type constructors
There are two issues I'd like to discuss related to constructing range types from other values. 1. The obvious constructor would be: range(1, 10) But is that [1, 10), (1, 10], (1, 10), or [1, 10]? We need to support all 4, and it's not obvious how to do that easily. The solution that I came up with is not particularly clean, but is quite practical: range(1, 10) -> [1, 10) range__(1, 10) -> (1, 10) range_i(1, 10) -> (1, 10] rangei_(1, 10) -> [1, 10) rangeii(1, 10) ->[1, 10] The last two letters refer to the lower and upper bounds, respectively. A "i" means "inclusive" and an "_" means "exclusive". range() is an alias for rangei_(), because that's the most common representation to use. I realize this isn't a clean solution, and better ideas are welcome. This one actually is quite natural to use I think: short to type and easy to remember (for me at least ;). It gets a little stranger for trying to construct unbounded ranges from other values. Again, there are four possibilities: range_uinfi(5) -> [5, INF) range_uinf_(5) -> (5, INF) range_linfi(5)-> (-INF, 5] range_linf_(5) -> (-INF, 5) And again, not exactly clean, but they work. Constructing a singleton range is easy, fortunately, because only something like "[5,5]" makes sense, "[5,5)" doesn't. So there's just a single-argument version of range: range(5) -> [5,5] 2. The second issue is with the type system. In order for the polymorphic constructors to work, they need to be able to determine the data types of their inputs to construct the range. I am using get_fn_expr_argtype() to accomplish that, but it's not always guaranteed to work. That was the problem Erik ran into: the "range @> elem" operator was implicitly constructing a range on the right side based on the type of the right operand; but was being called in contexts where the types aren't known (like the selectivity estimator). The fix was easy: get the type from the range operand (which is actually stored with the range). But that fix won't work for the constructors above, because there is no range argument to start from. So: in what contexts are functions called that get_fn_expr_argtype() might fail; and are the above constructors at risk for that? Is there a better way? Regards,Jeff Davis
On Wed, Feb 9, 2011 at 14:50, Jeff Davis <pgsql@j-davis.com> wrote: > 1. > The obvious constructor would be: > range(1, 10) > But is that [1, 10), (1, 10], (1, 10), or [1, 10]? We need to support > all 4, and it's not obvious how to do that easily. here is the same issue in table partitioning. Also, We might use the syntax for our partitioning in the future. Just for reference, DB2 uses EXCLUSIVE and INCLUSIVE keywords to specify boundaries. CREATE TABLE ... PARTITION BY RANGE (...) (STARTING 0 EXCLUSIVE ENDING 100 INCLUSIVE) http://publib.boulder.ibm.com/infocenter/db2luw/v9r8/index.jsp?topic=/com.ibm.db2.luw.sql.ref.doc/doc/r0000927.html I'm not sure it is the best syntax, but at least it's easy to read for beginners and works with parentheses completion by text editors. -- Itagaki Takahiro
On Wed, 2011-02-09 at 15:39 +0900, Itagaki Takahiro wrote: > On Wed, Feb 9, 2011 at 14:50, Jeff Davis <pgsql@j-davis.com> wrote: > > 1. > > The obvious constructor would be: > > range(1, 10) > > But is that [1, 10), (1, 10], (1, 10), or [1, 10]? We need to support > > all 4, and it's not obvious how to do that easily. > > here is the same issue in table partitioning. Also, We might use the > syntax for our partitioning in the future. Just for reference, > DB2 uses EXCLUSIVE and INCLUSIVE keywords to specify boundaries. > > CREATE TABLE ... PARTITION BY RANGE (...) > (STARTING 0 EXCLUSIVE ENDING 100 INCLUSIVE) Interesting. It needs to be usable in normal expressions, however, so it may require some adaptation. That's how arrays do it: there's a special Expr node that represents an array expression. Maybe the same thing could be used for range types, but I fear that there may be some grammar conflicts. I doubt we'd want to fully reserve the keyword "range". Regards,Jeff Davis
On Wed, Feb 9, 2011 at 2:09 AM, Jeff Davis <pgsql@j-davis.com> wrote: > That's how arrays do it: there's a special Expr node that represents an > array expression. Maybe the same thing could be used for range types, > but I fear that there may be some grammar conflicts. I doubt we'd want > to fully reserve the keyword "range". According to our documentation[1], RANGE is reserved in SQL:2008 and SQL:2003, which makes it more imaginable to reserve it than it would be otherwise. I believe that in a previous email you mentioned that you were hoping to implement RANGE JOIN, and I will just note that the restrictions of the grammar require that any keyword that immediately follows the previous expression and precedes JOIN must be fully reserved. I'm not sure if you meant that a range join would literally use the syntax RANGE JOIN, but if so then you're going to have to argue for fully reserving RANGE anyway, in which case there'd be no special reason not to allow RANGE [1,10) to mean just that. On the other hand, if a RANGE JOIN just means a regular join on some funky operator, and there's no other reason to reserve range, I wouldn't do it just to get a nicer syntax here. Have you done investigation of what RANGE is used to mean in the SQL spec? Is what you're implementing (a) spec, (b) similar idea, but not the spec, or (c) something completely different? I'm guessing (c) but I have no idea what the spec is using it for. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company [1] http://www.postgresql.org/docs/current/static/sql-keywords-appendix.html
On Thu, 2011-02-10 at 13:07 -0500, Robert Haas wrote: > According to our documentation[1], RANGE is reserved in SQL:2008 and > SQL:2003, which makes it more imaginable to reserve it than it would > be otherwise. Oh, interesting. > I believe that in a previous email you mentioned that > you were hoping to implement RANGE JOIN, and I will just note that the > restrictions of the grammar require that any keyword that immediately > follows the previous expression and precedes JOIN must be fully > reserved. I'm not sure if you meant that a range join would literally > use the syntax RANGE JOIN, but if so then you're going to have to > argue for fully reserving RANGE anyway, in which case there'd be no > special reason not to allow RANGE [1,10) to mean just that. On the > other hand, if a RANGE JOIN just means a regular join on some funky > operator, and there's no other reason to reserve range, I wouldn't do > it just to get a nicer syntax here. It's mostly just a regular join on a funky operator. We may want that operator to allow a new plan (range merge join); but I think we can determine that it's a range join from the use of the operator. I'll have to look into that more. > Have you done investigation of what RANGE is used to mean in the SQL > spec? Is what you're implementing (a) spec, (b) similar idea, but not > the spec, or (c) something completely different? I'm guessing (c) but > I have no idea what the spec is using it for. (c) was my intention. I did take a brief look at the spec a while back, but I'll take a more detailed look. I think it only has to do with window specifications. This might solve the constructor problem nicely if we could do things like: RANGE[10,20) But I have a feeling that will either cause a bizarre problem with the grammar, or someone will think it's not very SQL-like. Regards,Jeff Davis
On Thu, Feb 10, 2011 at 1:41 PM, Jeff Davis <pgsql@j-davis.com> wrote: > This might solve the constructor problem nicely if we could do things > like: > RANGE[10,20) > But I have a feeling that will either cause a bizarre problem with the > grammar, or someone will think it's not very SQL-like. I think won't cause any problem at all if RANGE is fully reserved, but like you say we probably don't want to do that unless it's absolutely necessary, and if you don't actually need to be able to type in foo RANGE JOIN bar then it probably isn't. I think your proposed naming schema for constructors is pretty reasonable, except I might use "o" for open and "c" for closed rather than "i" and "_", i.e. range_oo(), range_oc(), range_co(), range_cc().If that'll get us by without fully reserving RANGEthen I'd certainly be in favor of doing it that way. I was just saying - if we were inevitably going to have to reserve RANGE, then we could try to squeeze a little more out of it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> This might solve the constructor problem nicely if we could do things > like: > RANGE[10,20) > But I have a feeling that will either cause a bizarre problem with the > grammar, or someone will think it's not very SQL-like. It will certainly mess up syntax highlighting and matching bracket detection in pretty much all text editors... best regards, Florian Pflug
Florian Pflug <fgp@phlo.org> writes: >> This might solve the constructor problem nicely if we could do things >> like: >> RANGE[10,20) >> But I have a feeling that will either cause a bizarre problem with the >> grammar, or someone will think it's not very SQL-like. > It will certainly mess up syntax highlighting and matching bracket detection > in pretty much all text editors... Yeah. It's a cute-looking notation but surely it will cause many more problems than it's worth. I agree with Robert's suggestion of plain functions named like range_co() etc. regards, tom lane
On Feb 10, 2011, at 10:41 AM, Jeff Davis wrote: > This might solve the constructor problem nicely if we could do things > like: > RANGE[10,20) > But I have a feeling that will either cause a bizarre problem with the > grammar, or someone will think it's not very SQL-like. I like it a lot better than the funkily-named functions you posted yesterday. Best, David
On Feb 10, 2011, at 11:21 AM, Tom Lane wrote: >> It will certainly mess up syntax highlighting and matching bracket detection >> in pretty much all text editors... > > Yeah. It's a cute-looking notation but surely it will cause many more > problems than it's worth. I agree with Robert's suggestion of plain > functions named like range_co() etc. I could see myself using ranges in ways similar to using arrays. It should would be nice to have the equivalent of ARRAY[]and/or ARRAY() to use for ranges… Best, David