Thread: Range Types and extensions
I'd like to take another look at Range Types and whether part of it should be an extension. Some of these issues relate to extensions in general, not just range types. First of all, what are the advantages to being in core? 1. ANYRANGE + CREATE TYPE ... AS RANGE -------------------------------------- This is the most compelling, in my opinion. People can define new range functions and new range types independently and each one gets the benefit of the other automatically. Without this, there will be an explosion of functions and a bunch of inconsistencies like functions that support most range types but not all (merely because the function author didn't know that the type existed). In the several talks that I've given, a common question is related to "multiranges" (ranges with holes). These get a little complex, and I don't have a complete answer. However, multiranges can be approximated with ordered arrays of non-overlapping, non-adjacent ranges. If someone wants to take it upon themselves to develop a set of operators here, that would be great -- but without ANYRANGE the operators would be unmanageable. 2. Documentation and Tests -------------------------- Let's say we take a minimalist view, and only have ANYRANGE and CREATE TYPE ... AS RANGE in core; and leave the rest as an extension. What exactly would the documentation say? I think it would be even more hypothetical and abstract than the documentation for Exclusion Constraints. So, there is a certain documentation advantage to having at least enough functionality to allow someone to try out the feature. And the tests for such a minimalist feature would be a significant challenge -- what do we do there? Get pg_regress to load the extension from PGXN? 3. Quality ---------- PostgreSQL has a great reputation for quality, and for good reason. But extensions don't follow the same quality-control standards; and even if some do, there is no visible stamp of approval. So, to ask someone to use an extension means that they have to evaluate the quality for themselves, which is a pretty high barrier. Since PGXN (thanks David Wheeler) and EXTENSIONs (thanks Dmitri) solve many of the other issues, quality control is one of the biggest ones remaining. I still get questions about when the temporal type will be "in core", and I think this is why. I don't think this is a good excuse to put it in core though. We need to solve this problem, and the best way to start is by getting well-reviewed, high-quality extensions out there. 4. Future work -- RANGE KEY, RANGE FOREIGN KEY, RANGE MERGE JOIN, etc. --------------------------------- There are a few aspects of range types that aren't in the first patch, but are fairly obvious follow-up additions. These will require some knowledge about ranges in the backend, like finding the "overlaps" operator for a range. The current patch provides this knowledge by providing a built-in overlaps operator for ANYRANGE. This would be a non-issue if we had a good type interface system (that works on polymorphic types) -- we could just have a built-in "range" interface, and the range extension could add "&&" as the range interface's overlaps operator for the type ANYRANGE. ================================= So, where on this spectrum should range types fall? I think the most minimalist would be to only support #1 (and the necessary type IO functions); and leave all other functions, operators, and opclasses to an extension. That has a lot of appeal, but I don't think we can ignore the challenges above. On the other hand, trying to make it a complete feature in core has challenges as well. For instance, even with Range Types, Exclusion Constraints aren't practical out-of-the-box unless we also have BTree-GiST in core. So there's a snowball effect. There might also be some middle ground, where its like the minimalist approach, but with a few very basic constructors and accessors. That would at least make it easier to test, but then to be actually useful (with index support, operators, fancy functions, etc.) you'd need the extension. Thoughts? Regards,Jeff Davis
Jeff Davis wrote: > I'd like to take another look at Range Types and whether part of it > should be an extension. Some of these issues relate to extensions in > general, not just range types. > > First of all, what are the advantages to being in core? I believe that ranges aka intervals are widely useful generic types, next after relations/tuples/arrays, and they *should* be supported in core, same as arrays are. In particular, the usefulness of ranges/intervals is often orthogonal to many other things, and for many types including numbers, strings, temporals. Now assuming that a range/interval value is generally defined in terms of a pair of endpoints of some ordered type (that is, a type for which ORDER BY or RANK or {<,>,<=,>=} etc or LIMIT makes sense), it will be essential that this value is capable of distinguishing open and closed intervals. For example, a range value can be represented by a tuple with 4 attributes, where two of those are the endpoint values, and two of those are booleans saying whether each of the endpoints is inside or outside the range/interval. Also, if Postgres has some concept of type-generic special values -Inf and +Inf (which always sort before or after any other value in the type system), those can be used as endpoints to indicate that the interval is unbounded. Unless you have some other syntax in mind, I suggest lifting the range literal syntax from Perl 6, where ".." is an infix operator building a range between its arguments, and a "^" on either side means that side is open, I think; so there are 4 variants: {..,^..,..^,^..^}. Now as to general usefulness of intervals ... Any operation that wants to deal with a range somehow, such as the BETWEEN syntax, could instead use a range/interval; for example, both of: foo in 1..10 foo between 1 and 10 ... would mean the same thing, but the 1..10 can be replaced by an arbitrary value expression or variable reference. Likewise with: date in start ..^ end date >= start and date < end ... mean the same thing. The LIMIT clause could take a range to specify take and skip count at once. Array slicing can be done using foo[first..last] or such. A random number generator that takes endpoints can take a range argument. An array or relation of these range can represent ranges with holes, and the general results of range union operations. -- Darren Duncan
2011/6/6 Darren Duncan <darren@darrenduncan.net>: > Jeff Davis wrote: >> >> I'd like to take another look at Range Types and whether part of it >> should be an extension. Some of these issues relate to extensions in >> general, not just range types. >> >> First of all, what are the advantages to being in core? it should be supported by FOREACH statement in PL/pgSQL Pavel
Pavel Stehule wrote: > 2011/6/6 Darren Duncan <darren@darrenduncan.net>: >> Jeff Davis wrote: >>> I'd like to take another look at Range Types and whether part of it >>> should be an extension. Some of these issues relate to extensions in >>> general, not just range types. >>> >>> First of all, what are the advantages to being in core? > > it should be supported by FOREACH statement in PL/pgSQL Yes, absolutely. I know this feature is loved in Perl. But this usage would only work for a more limited range of data types, namely those over which one can build a sequence generator, such as integers, because they have a next-value/prev-value function defined. In other words, while range types in general would work for any ordered type, FOREACH would only work for the subset of those that are ordinal types. -- Darren Duncan
Jeff Davis <pgsql@j-davis.com> writes: > I'd like to take another look at Range Types and whether part of it > should be an extension. Some of these issues relate to extensions in > general, not just range types. That's a good question :) I think the way things are going to be organised now is that we will have core-blessed extensions: don't mix the mechanism and the policy. > non-issue if we had a good type interface system (that works on > polymorphic types) -- we could just have a built-in "range" interface, > and the range extension could add "&&" as the range interface's overlaps > operator for the type ANYRANGE. That means that this is, IMHO, the right approach. Have core support that enables user defined RANGE types with indexing and planner support, etc, like we have OPERATOR CLASS and FAMILY and all the jazz. And the useful stuff you need to have to benefit from that core support would be an extension. It could be a core maintained extension, and it could even get installed by default, so that all the users would need to do is 'CREATE EXTENSION timeranges;', for example. So, I see us getting those different kinds of extensions in the future: a. core extensions, shipped by defaultb. contribs, not shipped by default, maintained by core hackersc. examples, includedin the source code only, maintained as contribsd. “trusted network” of extensions (pgdg, pgxn, debian, privates,etc)e. external independent extensions, just as now The other main difference between a core extension and a contrib will be where it's documented. Greg Smith proposed a patch wherein he moved some contribs to a new extension/ place, and had them auto installed. I think the consensus is to instead add a new chapter (maybe between current chapters 9. Functions and Operators and 10. Type Conversion) and host “core extensions” docs there. The source code organisation is controversial because technically not necessary. We have to keep the work Greg did to keep those contribs shipped by default. Oh, and that is on the 9.1 Open Items, right? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, 2011-06-06 at 06:56 +0200, Pavel Stehule wrote: > 2011/6/6 Darren Duncan <darren@darrenduncan.net>: > > Jeff Davis wrote: > >> > >> I'd like to take another look at Range Types and whether part of it > >> should be an extension. Some of these issues relate to extensions in > >> general, not just range types. > >> > >> First of all, what are the advantages to being in core? > > it should be supported by FOREACH statement in PL/pgSQL Oh, good idea. It would only work for discrete ranges though. However, I would need to somehow reintroduce the concept of "next", which has some hazards to it (as Tom pointed out, we don't want someone to define the "next" for a float to be "+1.0"). I'll have to think about this. Regards,Jeff Davis
On Sun, 2011-06-05 at 21:51 -0700, Darren Duncan wrote: > Jeff Davis wrote: > > I'd like to take another look at Range Types and whether part of it > > should be an extension. Some of these issues relate to extensions in > > general, not just range types. > > > > First of all, what are the advantages to being in core? > > I believe that ranges aka intervals are widely useful generic types, next after > relations/tuples/arrays, and they *should* be supported in core, same as arrays are. I think we all agree that ranges are important. I am not suggesting that we sacrifice on the semantics to make it an extension; I'm just trying to see if involving extensions for some of the approximately 5000 lines would be a good idea. > Now assuming that a range/interval value is generally defined in terms of a pair > of endpoints of some ordered type (that is, a type for which ORDER BY or RANK or > {<,>,<=,>=} etc or LIMIT makes sense), it will be essential that this value is > capable of distinguishing open and closed intervals. Right, it already does that explicitly. I'd appreciate your input on some of the previous discussion though. > Also, if Postgres has some concept of type-generic special values -Inf and +Inf > (which always sort before or after any other value in the type system), those > can be used as endpoints to indicate that the interval is unbounded. I already introduced +/- infinity to range types. They are not generic outside of ranges, however -- therefore you can't select the upper bound of an upper-infinite range. > Unless you have some other syntax in mind, I suggest lifting the range literal > syntax from Perl 6, where ".." is an infix operator building a range between its > arguments, and a "^" on either side means that side is open, I think; so there > are 4 variants: {..,^..,..^,^..^}. Oh, interesting syntax. That might make a good operator version of a constructor. Unfortunately, "." is not valid in an operator name in PG. Maybe I can use tilde or dash? > Any operation that wants to deal with a range somehow, such as the BETWEEN > syntax, could instead use a range/interval; for example, both of: > > foo in 1..10 I don't know if it's reasonable to introduce syntax like "in" here. Maybe we could just still use "between" and it would recognize that the RHS is a range? > The LIMIT clause could take a range to specify take and skip count at once. Interesting idea. > Array slicing can be done using foo[first..last] or such. I like that, but we already have foo[3:7], so it might be better not to introduce redundancy. Too bad I can't use ":" as an operator. > A random number generator that takes endpoints can take a range argument. Sounds useful because it would make it more explicit whether the endpoints are possible results. > An array or relation of these range can represent ranges with holes, and the > general results of range union operations. Right, that's been brought up before as well. In particular, Scott Bailey has done some thinking/writing on this topic. Regards,Jeff Davis
2011/6/6 Jeff Davis <pgsql@j-davis.com>: > On Mon, 2011-06-06 at 06:56 +0200, Pavel Stehule wrote: >> 2011/6/6 Darren Duncan <darren@darrenduncan.net>: >> > Jeff Davis wrote: >> >> >> >> I'd like to take another look at Range Types and whether part of it >> >> should be an extension. Some of these issues relate to extensions in >> >> general, not just range types. >> >> >> >> First of all, what are the advantages to being in core? >> >> it should be supported by FOREACH statement in PL/pgSQL > > Oh, good idea. It would only work for discrete ranges though. > > However, I would need to somehow reintroduce the concept of "next", > which has some hazards to it (as Tom pointed out, we don't want someone > to define the "next" for a float to be "+1.0"). I'll have to think about > this. we can define a step FOREACH x IN RANGE ..... BY .... LOOP END LOOP Regards Pavel > > Regards, > Jeff Davis > >
On Mon, 2011-06-06 at 14:42 +0200, Dimitri Fontaine wrote: > I think the way things are going to be organised now is that we will > have core-blessed extensions: don't mix the mechanism and the policy. I like that idea. > > non-issue if we had a good type interface system (that works on > > polymorphic types) -- we could just have a built-in "range" interface, > > and the range extension could add "&&" as the range interface's overlaps > > operator for the type ANYRANGE. > > That means that this is, IMHO, the right approach. Have core support > that enables user defined RANGE types with indexing and planner support, > etc, like we have OPERATOR CLASS and FAMILY and all the jazz. If we take the minimal approach, the index support would be the first to be moved to an extension. In order to have index support in core, we need quite a few functions and a significant amount of code. Minimal would be: * CREATE TYPE ... AS RANGE * ANYRANGE * The IO functions * Possibly the constructors and accessors ( range(), range_oc(), lower(), upper(), etc.) Regarding the type interfaces, the only thing that really worries me there is that my future work will depend on them existing, and I haven't really thought through the details. For instance, it just occurred to me recently that it would need to support polymorphic types, which might be a little bit more complicated than a simple lookup. I suppose it's easier to put a few functions in core later if we get stuck than to rip them out later. > And the useful stuff you need to have to benefit from that core support > would be an extension. It could be a core maintained extension, and it > could even get installed by default, so that all the users would need to > do is 'CREATE EXTENSION timeranges;', for example. Sounds good to me. However, would the extension be available in pg_regress? If not, I will need to include those constructors/accessors to be able to test anything. > I think the consensus is to instead add a new chapter (maybe between > current chapters 9. Functions and Operators and 10. Type Conversion) and > host “core extensions” docs there. The source code organisation is > controversial because technically not necessary. We have to keep the > work Greg did to keep those contribs shipped by default. Oh, and that > is on the 9.1 Open Items, right? OK, so there are still a few things to be decided around documentation and tests. Both of those things can take a significant amount of time to rework, so I think I'll leave it alone until we have more of a consensus. We still have time before 9.2 to break some of the code out into an extension when we do have the doc/test issues resolved. Regards,Jeff Davis
On Sun, Jun 5, 2011 at 6:59 PM, Jeff Davis <pgsql@j-davis.com> wrote: > There might also be some middle ground, where its like the minimalist > approach, but with a few very basic constructors and accessors. That > would at least make it easier to test, but then to be actually useful > (with index support, operators, fancy functions, etc.) you'd need the > extension. > > Thoughts? I can see merit to having parts of RANGE implemented in core, along with some of the "usage parts" implemented as extensions, so that if I'm not actually using (say) INET ranges, then the database isn't cluttered up with all the functions and operators for INET ranges. How to slice it apart into an appropriate admixture of core and extensions is a good question, though it seems pretty likely that having an extension for each data type that is to be mixed into a range is a reasonable way to go. I think this also can make some would-be arguments against RANGE go away... "I hate that this RANGE extension means we have to draw 5000 lines of code into every database, and draws in 275 operator functions" evaporates if the "base" part is entirely smaller, and if you only draw in all the functions and operators if you request loading of each of the 17 extensions. Per-type extensions offers a pretty natural partitioning of the code for each type, which seems pretty good. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
On Mon, 2011-06-06 at 18:28 +0200, Pavel Stehule wrote: > we can define a step > > FOREACH x IN RANGE ..... BY .... That wouldn't need any of the range infrastructure at all -- it would be purely syntactic, right? Regards,Jeff Davis
On Mon, 2011-06-06 at 16:45 +0000, Christopher Browne wrote: > How to slice it apart into an appropriate admixture of core and > extensions is a good question, though it seems pretty likely that > having an extension for each data type that is to be mixed into a > range is a reasonable way to go. ... > Per-type extensions offers a pretty natural partitioning of the code > for each type, which seems pretty good. Ideally, most range types can be created with a simple: CREATE TYPE foorange AS RANGE (subtype=foo); There might be a few subtype-specific functions, like the canonical function, but overall it should be a small amount of code per range. However, I'd say just bundle a bunch of rangetypes together in one extension. There's not really much cost -- if you are using one range type, you'll use a few more. Regards,Jeff Davis
On Sun, Jun 5, 2011 at 2:59 PM, Jeff Davis <pgsql@j-davis.com> wrote: > So, where on this spectrum should range types fall? I think the most > minimalist would be to only support #1 (and the necessary type IO > functions); and leave all other functions, operators, and opclasses to > an extension. That has a lot of appeal, but I don't think we can ignore > the challenges above. > > On the other hand, trying to make it a complete feature in core has > challenges as well. For instance, even with Range Types, Exclusion > Constraints aren't practical out-of-the-box unless we also have > BTree-GiST in core. So there's a snowball effect. > > There might also be some middle ground, where its like the minimalist > approach, but with a few very basic constructors and accessors. That > would at least make it easier to test, but then to be actually useful > (with index support, operators, fancy functions, etc.) you'd need the > extension. I don't have clear feeling on this question in general, but if we're going to break this up into pieces, it's important that they be logical pieces. Putting half the feature in core and half into an extension just because we can will simplify complicate code maintenance to no good end. The snowball effect is something to avoid, and we need to watch out for that, but if the upshot of putting part of it in core is that the core code can no longer be understood or maintained because it depends heavily on a bunch of non-core code, that's not helpful. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
2011/6/6 Jeff Davis <pgsql@j-davis.com>: > On Mon, 2011-06-06 at 18:28 +0200, Pavel Stehule wrote: >> we can define a step >> >> FOREACH x IN RANGE ..... BY .... > > That wouldn't need any of the range infrastructure at all -- it would be > purely syntactic, right? > I don't think. For lot of types the specification of a step is necessary - for date, for float. Pavel > Regards, > Jeff Davis > >
On Mon, Jun 06, 2011 at 12:53:49PM -0400, Robert Haas wrote: > > I don't have clear feeling on this question in general, but if we're > going to break this up into pieces, it's important that they be > logical pieces. Putting half the feature in core and half into an > extension just because we can will simplify complicate code > maintenance to no good end. The snowball effect is something to > avoid, and we need to watch out for that, but if the upshot of putting > part of it in core is that the core code can no longer be understood > or maintained because it depends heavily on a bunch of non-core code, > that's not helpful. > And concretely, code paths that cannot be exercised easily from core-only code will not get regression tested, and will therefore rot. Ross -- Ross Reedstrom, Ph.D. reedstrm@rice.edu Systems Engineer & Admin, Research Scientist phone: 713-348-6166 Connexions http://cnx.org fax: 713-348-3665 Rice University MS-375, Houston, TX 77005 GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE
* Dimitri Fontaine (dimitri@2ndQuadrant.fr) wrote: > That means that this is, IMHO, the right approach. Have core support > that enables user defined RANGE types with indexing and planner support, > etc, like we have OPERATOR CLASS and FAMILY and all the jazz. Yes, we do, however.. > And the useful stuff you need to have to benefit from that core support > would be an extension. It could be a core maintained extension, and it > could even get installed by default, so that all the users would need to > do is 'CREATE EXTENSION timeranges;', for example. I don't like the idea of having a capability which is not utilized in core. We should make it so extensions can *also* have access to define their own, but we should have the basics covered in core. > a. core extensions, shipped by default Having it as a core extension might work, but I'm not really 'sold' on it. Thanks, Stephen
On Sun, Jun 5, 2011 at 1:59 PM, Jeff Davis <pgsql@j-davis.com> wrote: > In the several talks that I've given, a common question is related to > "multiranges" (ranges with holes). These get a little complex, and I > don't have a complete answer. However, multiranges can be approximated > with ordered arrays of non-overlapping, non-adjacent ranges. If someone > wants to take it upon themselves to develop a set of operators here, > that would be great -- but without ANYRANGE the operators would be > unmanageable. > > 2. Documentation and Tests > -------------------------- > Let's say we take a minimalist view, and only have ANYRANGE and CREATE > TYPE ... AS RANGE in core; and leave the rest as an extension. > > What exactly would the documentation say? I think it would be even more > hypothetical and abstract than the documentation for Exclusion > Constraints. So, there is a certain documentation advantage to having at > least enough functionality to allow someone to try out the feature. > > And the tests for such a minimalist feature would be a significant > challenge -- what do we do there? Get pg_regress to load the extension > from PGXN? > > > 3. Quality > ---------- > PostgreSQL has a great reputation for quality, and for good reason. But > extensions don't follow the same quality-control standards; and even if > some do, there is no visible stamp of approval. So, to ask someone to > use an extension means that they have to evaluate the quality for > themselves, which is a pretty high barrier. > > Since PGXN (thanks David Wheeler) and EXTENSIONs (thanks Dmitri) solve > many of the other issues, quality control is one of the biggest ones > remaining. I still get questions about when the temporal type will be > "in core", and I think this is why. > > I don't think this is a good excuse to put it in core though. We need to > solve this problem, and the best way to start is by getting > well-reviewed, high-quality extensions out there. > > > 4. Future work -- RANGE KEY, RANGE FOREIGN KEY, RANGE MERGE JOIN, etc. > --------------------------------- > There are a few aspects of range types that aren't in the first patch, > but are fairly obvious follow-up additions. These will require some > knowledge about ranges in the backend, like finding the "overlaps" > operator for a range. The current patch provides this knowledge by > providing a built-in overlaps operator for ANYRANGE. This would be a > non-issue if we had a good type interface system (that works on > polymorphic types) -- we could just have a built-in "range" interface, > and the range extension could add "&&" as the range interface's overlaps > operator for the type ANYRANGE. > > ================================= > > So, where on this spectrum should range types fall? I think the most > minimalist would be to only support #1 (and the necessary type IO > functions); and leave all other functions, operators, and opclasses to > an extension. That has a lot of appeal, but I don't think we can ignore > the challenges above. > > On the other hand, trying to make it a complete feature in core has > challenges as well. For instance, even with Range Types, Exclusion > Constraints aren't practical out-of-the-box unless we also have > BTree-GiST in core. So there's a snowball effect. > > There might also be some middle ground, where its like the minimalist > approach, but with a few very basic constructors and accessors. That > would at least make it easier to test, but then to be actually useful > (with index support, operators, fancy functions, etc.) you'd need the > extension. > > Thoughts? ISTM (I haven't followed all the lead up so apologies if this is already covered) a range is a 3rd pseudo 'container' type (the other two being composites and arrays). Do you see: *) being able to make arrays of ranges/ranges of arrays? *) range of composites? I vote for at minimum the type itself and ANYRANGE to be in core. From there you could make it like arrays where the range type is automatically generated for each POD type. I would consider that for sure on basis of simplicity in user-land unless all the extra types and operators are a performance hit. A clean and highly usable implementation in the type system in the spirit of arrays would be fantastic. I'm particularly interested in hypothetical constructor/destructor and in/out mechanics...an 'unnest' like function, a range(a,b,c) that does as row(a,b,c) does, etc, especially if you can work it out so that everything is not hammered through textual processing. merlin
Stephen Frost <sfrost@snowman.net> writes: > I don't like the idea of having a capability which is not utilized > in core. We should make it so extensions can *also* have access to > define their own, but we should have the basics covered in core. Well if another part of core depends on the feature set, then of course you don't have a choice to make it an extension any more. I think that's where I would draw the line. > Having it as a core extension might work, but I'm not really 'sold' on > it. Well, core extension means built by default, part of default regression tests and all. The regression test simply begins with the create extension stanza, that's about it. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Jeff Davis wrote: > On Sun, 2011-06-05 at 21:51 -0700, Darren Duncan wrote: >> Jeff Davis wrote: >>> I'd like to take another look at Range Types and whether part of it >>> should be an extension. Some of these issues relate to extensions in >>> general, not just range types. >>> >>> First of all, what are the advantages to being in core? >> I believe that ranges aka intervals are widely useful generic types, next after >> relations/tuples/arrays, and they *should* be supported in core, same as arrays are. > > I think we all agree that ranges are important. I am not suggesting that > we sacrifice on the semantics to make it an extension; I'm just trying > to see if involving extensions for some of the approximately 5000 lines > would be a good idea. Generally speaking, the best way to go about this is to define the *generic* data type in the core, and leave most operators to extensions. So, in core, we need to have the way to select a range value over ANYTYPE either completely as a value literal or in terms of endpoint values from arbitrary expressions or variables, store the range value in a database, retrieve it, and access its component attributes (endpoints, open/closed) in user-defined constraint and operator definitions. The fundamental value of ranges is the fact that they're a concise way to store and express an interval over an ordered type, and to either compare such intervals or test whether individual values or sets of values are in intervals. And people do that a *lot* (such as withdates), so I see having this range type, which is generic and orthogonal to other types in the same way as arrays or tables are, in core just makes the most sense, and as previously illustrated, ranges are useful in places one might not always think about. Ranges are also much more flexible than BETWEEN for what it does, because AFAIK you can't indicate open or closed with BETWEEN. You should not need to define separate range types or operators for each ordered type, same as you should not have to do so for arrays, or where such functionality is defined should be similar; whatever functionality for arrays you do or don't define in core, do corresponding things for ranges. >> Now assuming that a range/interval value is generally defined in terms of a pair >> of endpoints of some ordered type (that is, a type for which ORDER BY or RANK or >> {<,>,<=,>=} etc or LIMIT makes sense), it will be essential that this value is >> capable of distinguishing open and closed intervals. > > Right, it already does that explicitly. I'd appreciate your input on > some of the previous discussion though. On this note, here's a *big* thing that needs discussion ... Citing this whole FOREACH talk, we need to recognize that this talk about ranges is actually being overloaded for 2 very distinct concepts, which are probably best dealt with separately, possibly as distinct types. This discussion came up in the development of Perl 6 too, and that discussion is probably worth looking into. Ranges/intervals in the general sense can *not* be used to enumerate a list of values in a standard type-sensical manner, such as FOREACH requires. Ranges/intervals are about *comparison*, meaning combinations of tests of how 2 arbitrary values of an ordered type sort relative to each other, and that's it. This usage works for integers, other numbers,strings, dates, and so on, all in a natural manner. Value enumeration, such as in a FOREACH, is a *separate* concept. The comparison and enumeration tasks have distinct sets of operators and are used in distinct contexts. Enumeration requires next/prev-value operators, while ranges/intervals in general do not. Enumeration requires discrete types (or the faking of such) like integers while ranges work for continuous types. Moreover, in practice, one probably wants enumerations to be more flexible than just monotonic increases. With enumerations you'd probably want to start go top-down or bottom-up, you might want to increase geometrically or by some other formula rather than incrementally. I totally agree with sharing syntax and using ranges/intervals to define sequence generators, but a range value should be considered immutable like a number or string while a sequence generator may mutate. For syntax, one could use "x..y" to define an interval while "x...y" for a sequence generator, or that's what Perl 6 does. See also http://perlcabal.org/syn/S03.html#Range_and_RangeIter_semantics that talks about how Perl 6 does ranges. >> Also, if Postgres has some concept of type-generic special values -Inf and +Inf >> (which always sort before or after any other value in the type system), those >> can be used as endpoints to indicate that the interval is unbounded. > > I already introduced +/- infinity to range types. They are not generic > outside of ranges, however -- therefore you can't select the upper bound > of an upper-infinite range. Well, what you have is the least one would want. >> Unless you have some other syntax in mind, I suggest lifting the range literal >> syntax from Perl 6, where ".." is an infix operator building a range between its >> arguments, and a "^" on either side means that side is open, I think; so there >> are 4 variants: {..,^..,..^,^..^}. > > Oh, interesting syntax. That might make a good operator version of a > constructor. Unfortunately, "." is not valid in an operator name in PG. > Maybe I can use tilde or dash? Can Pg be changed to support "." in operator names as long as they don't just appear by themselves? What would this break to do so? >> Any operation that wants to deal with a range somehow, such as the BETWEEN >> syntax, could instead use a range/interval; for example, both of: >> >> foo in 1..10 > > I don't know if it's reasonable to introduce syntax like "in" here. > Maybe we could just still use "between" and it would recognize that the > RHS is a range? I believe it is quite reasonable to treat ranges like sets, in an abstract sense, and so using set membership syntax like "in" is valid. Same as one should be able to use "in" to test whether a value is in an array. I would expect "in" to be a polymorphic infix operator same as "<" or "=" etc are, aren't they? This shouldn't conflict with testing tuples in relations as they are different types, same as you can use the same "<" for numbers and strings, can't you? We could add parenthesis if that helps: foo in (1..10) >> The LIMIT clause could take a range to specify take and skip count at once. > > Interesting idea. > >> Array slicing can be done using foo[first..last] or such. > > I like that, but we already have foo[3:7], so it might be better not to > introduce redundancy. Too bad I can't use ":" as an operator. On that note, some languages use ":" for defining intervals rather than "..". Some languages also use round parenthesis or curly braces to define intervals, but I really don't like that and we shouldn't use it. >> A random number generator that takes endpoints can take a range argument. > > Sounds useful because it would make it more explicit whether the > endpoints are possible results. Exactly. >> An array or relation of these range can represent ranges with holes, and the >> general results of range union operations. > > Right, that's been brought up before as well. In particular, Scott > Bailey has done some thinking/writing on this topic. I also see these as considerably less important and useful in practice than the continuous intervals. Facilities for discontinuous intervals could more easily be left to extensions than those for continuous ones. I see the continuous as more fundamental, at least in the same manner as seeing integers as more fundamental than rationals (you can define the latter with the former), though one could define things in the opposite manner too. > Regards, > Jeff Davis -- Darren Duncan
Merlin Moncure <mmoncure@gmail.com> writes: > I vote for at minimum the type itself and ANYRANGE to be in core. > From there you could make it like arrays where the range type is > automatically generated for each POD type. I would consider that for > sure on basis of simplicity in user-land unless all the extra types > and operators are a performance hit. Auto-generation of range types isn't going to happen, simply because the range type needs more information than is provided by the base type declaration. (First, you need a btree opclass, and second, you need a "next" function if it's a discrete type.) By my count there are only about 20 datatypes in core for which it looks sensible to provide a range type (ie, it's a non-deprecated, non-composite type with a standard default btree opclass). For that many, we might as well just build 'em in. regards, tom lane
On Mon, Jun 6, 2011 at 6:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Merlin Moncure <mmoncure@gmail.com> writes: >> I vote for at minimum the type itself and ANYRANGE to be in core. >> From there you could make it like arrays where the range type is >> automatically generated for each POD type. I would consider that for >> sure on basis of simplicity in user-land unless all the extra types >> and operators are a performance hit. > > Auto-generation of range types isn't going to happen, simply because the > range type needs more information than is provided by the base type > declaration. (First, you need a btree opclass, and second, you need a > "next" function if it's a discrete type.) > > By my count there are only about 20 datatypes in core for which it looks > sensible to provide a range type (ie, it's a non-deprecated, > non-composite type with a standard default btree opclass). For that > many, we might as well just build 'em in. right. hm -- can you have multiple range type definitions for a particular type? I was thinking about a type reduction for casting like we have for arrays: select '[1,3)'::int{}. but maybe that isn't specific enough? merlin
Merlin Moncure <mmoncure@gmail.com> writes: > On Mon, Jun 6, 2011 at 6:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> By my count there are only about 20 datatypes in core for which it looks >> sensible to provide a range type (ie, it's a non-deprecated, >> non-composite type with a standard default btree opclass). �For that >> many, we might as well just build 'em in. > right. hm -- can you have multiple range type definitions for a > particular type? In principle, sure, if the type has multiple useful sort orderings. I don't immediately see any core types for which we'd bother. (In particular I don't see a use case for range types corresponding to the *_pattern_ops btree opclasses, especially now that COLLATE "C" has rendered them sorta obsolete.) BTW, Jeff, have you worked out the implications of collations for textual range types? I confess to not having paid much attention to range types lately. regards, tom lane
On Tue, 2011-06-07 at 11:15 -0400, Tom Lane wrote: > Merlin Moncure <mmoncure@gmail.com> writes: > > right. hm -- can you have multiple range type definitions for a > > particular type? > > In principle, sure, if the type has multiple useful sort orderings. Right. Additionally, you might want to use different "canonical" functions for the same subtype. > I don't immediately see any core types for which we'd bother. Agreed. > BTW, Jeff, have you worked out the implications of collations for > textual range types? Well, "it seems to work" is about as far as I've gotten. As far as the implications, I'll need to do a little more research and thinking. But I don't immediately see anything too worrisome. Regards,Jeff Davis
On Mon, 2011-06-06 at 14:42 -0700, Darren Duncan wrote: > On this note, here's a *big* thing that needs discussion ... [ refering to the concept of "discrete" versus "continuous" ranges ] Yes, there has been much discussion on this topic already. The solution right now is that they both behave like continuous ranges for most operations. But each time a value is produced, a discrete range has a "canonicalize" function that aligns it to the proper boundaries and chooses a convention from [], [), (], (). For discrete ranges that's only a convention, because multiple representations are equal in value, but that's not so for continuous ranges. Another approach would be to offer "next" and "prev" functions instead of "canonical", or a "plus(thetype, integer)" and "minus(thetype, integer)". > Can Pg be changed to support "." in operator names as long as they don't just > appear by themselves? What would this break to do so? Someone else would have to comment on that. My feeling is that it might create problems with qualified names, and also with PG's "arg.function" call syntax. > >> foo in 1..10 > I believe it is quite reasonable to treat ranges like sets, in an abstract > sense, and so using set membership syntax like "in" is valid. OK, I think I agree with this now. I'll think about it some more. > I also see these as considerably less important and useful in practice than the > continuous intervals. [ multiranges ] Agreed. I've left those alone for now, because it's a separate concept. Regards,Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > On Mon, 2011-06-06 at 14:42 -0700, Darren Duncan wrote: >> Can Pg be changed to support "." in operator names as long as they don't just >> appear by themselves? What would this break to do so? > Someone else would have to comment on that. DOT_DOT is already a token in plpgsql; trying to make it be also an operator name would break a lot of existing plpgsql code. regards, tom lane
Jeff Davis wrote: > On Tue, 2011-06-07 at 11:15 -0400, Tom Lane wrote: >> Merlin Moncure <mmoncure@gmail.com> writes: >>> right. hm -- can you have multiple range type definitions for a >>> particular type? >> In principle, sure, if the type has multiple useful sort orderings. > > Right. Additionally, you might want to use different "canonical" > functions for the same subtype. > >> I don't immediately see any core types for which we'd bother. > > Agreed. > >> BTW, Jeff, have you worked out the implications of collations for >> textual range types? > > Well, "it seems to work" is about as far as I've gotten. > > As far as the implications, I'll need to do a little more research and > thinking. But I don't immediately see anything too worrisome. I would expect ranges to have exactly the same semantics as ORDER BY or "<" etc with respect to collations for textual range types. If collation is an attribute of a textual type, meaning that the textual type or its values have a sense of their collation built-in, then ranges for those textual types should "just work" without any extra range-specific syntax, same as you could say ORDER BY without any further qualifiers. If collation is not an attribute of a textual type, meaning that you normally have to qualify the desired collation for each order-sensitive operation using it (even if that can be defined by a session/etc setting which still just ultimately works at the operator rather than type level), or if a textual type can have it built in but it is overridable per operator, then either ranges should have an extra attribute saying what collation (or other type-specific order-determining function) to use, or all range operators take the optional collation parameter like with ORDER BY. Personally, I think it is a more elegant programming language design for an ordered type to have its own sense of a one true canonical ordering of its values, and where one could conceptually have multiple orderings, there would be a separate data type for each one. That is, while you probably only need a single type with respect to ordering for any real numeric type, for textual types you could have a separate textual type for each collation. In particular, I say separate type because a collation can sometimes affect differently what text values compare as "same", as far as I know. On a tangent, I believe that various insensitive comparisons or sortings are very reasonably expressed as collations rather than some other mechanism, eg if you wanted sortings that compare different letter case as same or not, or with or without accents as same or not. So under this "elegant" system, there is no need to ever specify collation at the operator level (which could become quite verbose and unweildy), but instead you can cast data types if you want to change their sense of canonical ordering. Now if the various text-specific operators are polymorphic across these text type variants, users don't generally have to know the difference except when it matters. On a tangent, I believe that the best definition of "equal" or "same" in a type system is global substitutability. Ignoring implementation details, if a program ever finds that 2 operands to the generic "=" (equality test) operator result in TRUE, then the program should feel free to replace all occurrences of one operand in the program with occurrences of the other, for optimization, because generic "=" returning TRUE means one is just as good as the other. This assumes generally that we're dealing with immutable value types. -- Darren Duncan
Jeff Davis wrote: > On Mon, 2011-06-06 at 14:42 -0700, Darren Duncan wrote: >> Can Pg be changed to support "." in operator names as long as they don't just >> appear by themselves? What would this break to do so? > > Someone else would have to comment on that. My feeling is that it might > create problems with qualified names, and also with PG's "arg.function" > call syntax. With respect to qualified names or "arg.function", then unless the "function" can be symbolic, I considered your examples to be the "appear by themselves", hence "." by itself wouldn't be a new operator, and I generally assumed here that any multi-character operators with "." to be symbolic. In any event, I also saw Tom's reply about DOT_DOT being a token already. -- Darren Duncan
On Tue, 2011-06-07 at 10:20 -0700, Jeff Davis wrote: > > BTW, Jeff, have you worked out the implications of collations for > > textual range types? > > Well, "it seems to work" is about as far as I've gotten. > > As far as the implications, I'll need to do a little more research and > thinking. But I don't immediately see anything too worrisome. I take that back :( It looks like the type input function may be a problem, because it doesn't look like it knows what the collation is yet. In other words, PG_GET_COLLATION() is zero for the type input function. But I need to do a comparison to find out if the range is valid or not. For instance: '[a, Z)'::textrange is valid in "en_US" but not "C". The range constructor: range('a', 'Z') is fine though. Not sure what to do here. Regards,Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > It looks like the type input function may be a problem, because it > doesn't look like it knows what the collation is yet. In other words, > PG_GET_COLLATION() is zero for the type input function. Yeah, we've assumed that I/O functions do not need to know collation. regards, tom lane
On Jun8, 2011, at 17:46 , Jeff Davis wrote: > It looks like the type input function may be a problem, because it > doesn't look like it knows what the collation is yet. In other words, > PG_GET_COLLATION() is zero for the type input function. > > But I need to do a comparison to find out if the range is valid or not. > For instance: > '[a, Z)'::textrange > is valid in "en_US" but not "C". Maybe that check should just be removed? If one views the range '[L, U)' as a concise way of expressing "L <= x AND x < U" for some x, then allowing the case L > U seems quite natural. There won't be any such x of course, but the range is still valid, just empty. Actually, thinking for this a bit, I believe this is the only way text ranges can support collations. If the validity of a range depends on the collation, then changing the collation after creation seems weird, since it can make previous valid ranges invalid and vice versa. There could be a function RANGE_EMPTY() which people can put into their CHECK constraints if they don't want such ranges to sneak into their tables... best regards, Florian Pflug
On Thu, Jun 9, 2011 at 6:26 PM, Florian Pflug <fgp@phlo.org> wrote: > On Jun8, 2011, at 17:46 , Jeff Davis wrote: >> It looks like the type input function may be a problem, because it >> doesn't look like it knows what the collation is yet. In other words, >> PG_GET_COLLATION() is zero for the type input function. >> >> But I need to do a comparison to find out if the range is valid or not. >> For instance: >> '[a, Z)'::textrange >> is valid in "en_US" but not "C". > > Maybe that check should just be removed? If one views the range > '[L, U)' as a concise way of expressing "L <= x AND x < U" for some > x, then allowing the case L > U seems quite natural. There won't > be any such x of course, but the range is still valid, just empty. > > Actually, thinking for this a bit, I believe this is the only > way text ranges can support collations. If the validity of a range > depends on the collation, then changing the collation after creation > seems weird, since it can make previous valid ranges invalid and > vice versa. > > There could be a function RANGE_EMPTY() which people can put into > their CHECK constraints if they don't want such ranges to sneak > into their tables... I think the collation is going to have to be baked into the type definition, no? You can't just up and change the collation of the column as you could for a straight text column, if that might cause the contents of some rows to be viewed as invalid. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Jun12, 2011, at 04:37 , Robert Haas wrote: > On Thu, Jun 9, 2011 at 6:26 PM, Florian Pflug <fgp@phlo.org> wrote: >> On Jun8, 2011, at 17:46 , Jeff Davis wrote: >>> It looks like the type input function may be a problem, because it >>> doesn't look like it knows what the collation is yet. In other words, >>> PG_GET_COLLATION() is zero for the type input function. >>> >>> But I need to do a comparison to find out if the range is valid or not. >>> For instance: >>> '[a, Z)'::textrange >>> is valid in "en_US" but not "C". >> >> Maybe that check should just be removed? If one views the range >> '[L, U)' as a concise way of expressing "L <= x AND x < U" for some >> x, then allowing the case L > U seems quite natural. There won't >> be any such x of course, but the range is still valid, just empty. >> >> Actually, thinking for this a bit, I believe this is the only >> way text ranges can support collations. If the validity of a range >> depends on the collation, then changing the collation after creation >> seems weird, since it can make previous valid ranges invalid and >> vice versa. >> >> There could be a function RANGE_EMPTY() which people can put into >> their CHECK constraints if they don't want such ranges to sneak >> into their tables... > > I think the collation is going to have to be baked into the type > definition, no? You can't just up and change the collation of the > column as you could for a straight text column, if that might cause > the contents of some rows to be viewed as invalid. Now you've lost me. If a text range is simply a pair of strings, as I suggested, and collations are applied only during comparison and RANGE_EMPTY(), why would the collation have to be baked into the type? If you're referring to the case (1) Create table with text-range column and collation C1 (2) Add check constraint containingRANGE_EMPTY() (3) Add data (4) Alter column to have collation C2, possibly changing the result of RANGE_EMPTY()for existing ranges. then that points to a problem with ALTER COLUMN. best regards, Florian Pflug
On Sun, Jun 12, 2011 at 7:53 AM, Florian Pflug <fgp@phlo.org> wrote: >> I think the collation is going to have to be baked into the type >> definition, no? You can't just up and change the collation of the >> column as you could for a straight text column, if that might cause >> the contents of some rows to be viewed as invalid. > > Now you've lost me. If a text range is simply a pair of strings, > as I suggested, and collations are applied only during comparison > and RANGE_EMPTY(), why would the collation have to be baked into > the type? > > If you're referring to the case > (1) Create table with text-range column and collation C1 > (2) Add check constraint containing RANGE_EMPTY() > (3) Add data > (4) Alter column to have collation C2, possibly changing > the result of RANGE_EMPTY() for existing ranges. > then that points to a problem with ALTER COLUMN. No, I'm saying that you might have a column containing '[a, Z)', and someone might change the collation of the column from en_US to C. When the collation was en_US, the column could legally contain that value, but now that the collation is C, it can't. ALTER TABLE isn't going to recheck the validity of the data when someone changes the collation: that's only supposed to affect the sort order, not the definition of what is a legal value. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas wrote: > On Sun, Jun 12, 2011 at 7:53 AM, Florian Pflug <fgp@phlo.org> wrote: >>> I think the collation is going to have to be baked into the type >>> definition, no? You can't just up and change the collation of the >>> column as you could for a straight text column, if that might cause >>> the contents of some rows to be viewed as invalid. >> Now you've lost me. If a text range is simply a pair of strings, >> as I suggested, and collations are applied only during comparison >> and RANGE_EMPTY(), why would the collation have to be baked into >> the type? >> >> If you're referring to the case >> (1) Create table with text-range column and collation C1 >> (2) Add check constraint containing RANGE_EMPTY() >> (3) Add data >> (4) Alter column to have collation C2, possibly changing >> the result of RANGE_EMPTY() for existing ranges. >> then that points to a problem with ALTER COLUMN. > > No, I'm saying that you might have a column containing '[a, Z)', and > someone might change the collation of the column from en_US to C. > When the collation was en_US, the column could legally contain that > value, but now that the collation is C, it can't. ALTER TABLE isn't > going to recheck the validity of the data when someone changes the > collation: that's only supposed to affect the sort order, not the > definition of what is a legal value. You can have the same collation problem even without range types. Consider the following: (1) Create table with the 2 text columns {L,R} and both columns have the collation en_US. (2) Add check constraint requiring "L <= R". (3) Add a record with the value 'a' for L and 'Z' for R. (4)Alter the columns to have the collation C. Good language design principles demand that the semantics for this simplified case and the semantics for replacing {L,R} with a single range-of-text-typed column be the same, including what happens with CHECK and ALTER TABLE. Likewise, anything that affects ORDER BY should affect {<,>,<=,>=} and friends the same way and vice-versa and likewise should affect range validity. It makes sense for collation to be considered part of text data types, and changing collation is casting from one text type to another. Generally speaking, any inherent or applied aspect of a text or other value (such as collation) that affects the results of any deterministic operations on those values (such as sorting) should be considered part of the data type of those values. -- Darren Duncan
On Mon, Jun 13, 2011 at 12:47 AM, Darren Duncan <darren@darrenduncan.net> wrote: >>> If you're referring to the case >>> (1) Create table with text-range column and collation C1 >>> (2) Add check constraint containing RANGE_EMPTY() >>> (3) Add data >>> (4) Alter column to have collation C2, possibly changing >>> the result of RANGE_EMPTY() for existing ranges. >>> then that points to a problem with ALTER COLUMN. >> >> No, I'm saying that you might have a column containing '[a, Z)', and >> someone might change the collation of the column from en_US to C. >> When the collation was en_US, the column could legally contain that >> value, but now that the collation is C, it can't. ALTER TABLE isn't >> going to recheck the validity of the data when someone changes the >> collation: that's only supposed to affect the sort order, not the >> definition of what is a legal value. > > You can have the same collation problem even without range types. > > Consider the following: > (1) Create table with the 2 text columns {L,R} and both columns have the > collation en_US. > (2) Add check constraint requiring "L <= R". > (3) Add a record with the value 'a' for L and 'Z' for R. > (4) Alter the columns to have the collation C. Oh, good point. rhaas=# create table sample (t text collate "en_US", check (t < 'Z')); CREATE TABLE rhaas=# insert into sample values ('a'); INSERT 0 1 rhaas=# alter table sample alter column t type text collate "C"; ERROR: check constraint "sample_t_check" is violated by some row But interestingly, my Mac has a different notion of how this collation works: it thinks 'a' > 'Z' even in en_US. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, 2011-06-10 at 00:26 +0200, Florian Pflug wrote: > Maybe that check should just be removed? If one views the range > '[L, U)' as a concise way of expressing "L <= x AND x < U" for some > x, then allowing the case L > U seems quite natural. There won't > be any such x of course, but the range is still valid, just empty. [ Please excuse the late reply, I was on vacation. ] That's an interesting perspective, but I don't think it's a good idea. Up to this point, I've considered a range value to be a set of contiguous values, and the endpoints just happen to be a way to represent that set. If changing the collation changes a set of positive cardinality into an empty set, clearly it's a different value. We don't want the COLLATE clause to change the value, because things that do change the value (like a typecast) should offer the opportunity to call a function so that you can verify that it's valid or change it to some canonical form. So, I believe that you are proposing to change the concept of a range value from "a contiguous set of values" to "a pair of bounds". There are numerous implications, one of which is that I don't think that we can maintain the equality of all empty ranges. Consider these expressions, where x is a non-empty range with collation "A", but is empty in collation "B" (and "*" means "range intersection"): (x COLLATE "B") COLLATE "A" ((x COLLATE "B") * '(-Inf, Inf)') COLLATE "A" ('-'::textrange * '(-Inf, Inf)') COLLATE "A" All of those expressions should be equal (according to global substitutibility, as Darren mentioned). But they can't be, because the last expression is always an empty range, whereas the first one is not (because merely changing the collation back and forth offers no opportunity to even notice that you have an empty range at one point). So, I believe that we'd be stuck with non-equal empty ranges, as well as many other possibly non-intuitive implications. So, I lean strongly toward the interpretation that a range is a contiguous set of values, and changing the collation should not change the value. Things that do change the value (like a typecast) should offer the opportunity to handle cases like this with a function call, but changing collation does not. This leaves making the collation a part of the range type itself (as Robert suggested). Comments? Regards,Jeff Davis
On Jun18, 2011, at 10:10 , Jeff Davis wrote: > On Fri, 2011-06-10 at 00:26 +0200, Florian Pflug wrote: > So, I believe that you are proposing to change the concept of a range > value from "a contiguous set of values" to "a pair of bounds". Yeah. Mostly though because I figured that'd make defining their semantics easier, not necessarily because that interpretation is better, though. > There are > numerous implications, one of which is that I don't think that we can > maintain the equality of all empty ranges. Consider these expressions, > where x is a non-empty range with collation "A", but is empty in > collation "B" (and "*" means "range intersection"): > > (x COLLATE "B") COLLATE "A" > ((x COLLATE "B") * '(-Inf, Inf)') COLLATE "A" > ('-'::textrange * '(-Inf, Inf)') COLLATE "A" > > All of those expressions should be equal (according to global > substitutibility, as Darren mentioned). But they can't be, because the > last expression is always an empty range, whereas the first one is not > (because merely changing the collation back and forth offers no > opportunity to even notice that you have an empty range at one point). > So, I believe that we'd be stuck with non-equal empty ranges, as well as > many other possibly non-intuitive implications. Yeah. Once you give up the idea that range is a set, extensionality (i.e. the axiom "there's only one empty range" or more precisely "there only one range which no object is a member of") has to go too. > So, I lean strongly toward the interpretation that a range is a > contiguous set of values, Yeah, I agree now, mainly because defining them as a set give rise to richer semantics than defining them to be a pair. If someone needs just a pair of values and maybe a BETWEEN operator, that is easily done with CREATE TYPE and a few SQL or PLPGSQL functions. > and changing the collation should not change > the value. Things that do change the value (like a typecast) should > offer the opportunity to handle cases like this with a function call, > but changing collation does not. > > This leaves making the collation a part of the range type itself (as > Robert suggested). Yes, that seems necessary for consistency. That leaves the question of what to do if someone tries to modify a textrange's collation with a COLLATE clause. For example, For example, whats the result of 'Ä' in '[A,Z']::textrange_german COLLATE 'C' where 'Ä' is a german Umlaut-A which sorts after 'A' but before 'B' in locale 'de_DE' but sorts after 'Z' in locale 'C'. (I'm assuming that textrange_german was defined with collation 'de_DE'). With the set-based definition of ranges, the only sensible thing is to simply ignore the COLLATE clause I think. best regards, Florian Pflug
On Sat, 2011-06-18 at 22:19 +0200, Florian Pflug wrote: > Yes, that seems necessary for consistency. That leaves the question > of what to do if someone tries to modify a textrange's collation with > a COLLATE clause. For example, > > For example, whats the result of > 'Ä' in '[A,Z']::textrange_german COLLATE 'C' > where 'Ä' is a german Umlaut-A which sorts after 'A' but before 'B' > in locale 'de_DE' but sorts after 'Z' in locale 'C'. (I'm assuming > that textrange_german was defined with collation 'de_DE'). > > With the set-based definition of ranges, the only sensible thing > is to simply ignore the COLLATE clause I think. I think rejecting it makes more sense, so a range would not be a collatable type; it just happens to use collations of the subtype internally. Regards,Jeff Davis
On Jun19, 2011, at 00:23 , Jeff Davis wrote: > On Sat, 2011-06-18 at 22:19 +0200, Florian Pflug wrote: >> Yes, that seems necessary for consistency. That leaves the question >> of what to do if someone tries to modify a textrange's collation with >> a COLLATE clause. For example, >> >> For example, whats the result of >> 'Ä' in '[A,Z']::textrange_german COLLATE 'C' >> where 'Ä' is a german Umlaut-A which sorts after 'A' but before 'B' >> in locale 'de_DE' but sorts after 'Z' in locale 'C'. (I'm assuming >> that textrange_german was defined with collation 'de_DE'). >> >> With the set-based definition of ranges, the only sensible thing >> is to simply ignore the COLLATE clause I think. > > I think rejecting it makes more sense, so a range would not be a > collatable type; it just happens to use collations of the subtype > internally. Ah, crap, I put the COLLATE in the wrong place. What I actually had in mind was ('Ä' COLLATE 'C') in '[A,Z]'::textrange_german I was afraid that the "in" operator cannot distinguish this case from field in '[A,Z]'::textrange_german where "field" is declared with "COLLATE 'C'". In the seconds case, throwing an error seems a bit harsh.... There's also this fun little case field in '[A,Z]' (note lack of an explicit cast). Here the input function would probably need to verify that there's a range type corresponding to the field's type *and* that the range type's collation matches the field's collation. I wonder if that's possible - Tom said somewhere that input function don't receive collation information, though I don't know if that restriction applies in this case. best regards, Florian Pflug
On Sun, Jun 19, 2011 at 11:21:28AM +0200, Florian Pflug wrote: > > I think rejecting it makes more sense, so a range would not be a > > collatable type; it just happens to use collations of the subtype > > internally. > > Ah, crap, I put the COLLATE in the wrong place. What I actually > had in mind was > ('Ä' COLLATE 'C') in '[A,Z]'::textrange_german Operators don't have to be collation sensetive. If they're not then the COLLATE in the above statement is redundant. You can decide that an interval needs an implicit collation and you can just use that. > I was afraid that the "in" operator cannot distinguish this case > from > field in '[A,Z]'::textrange_german > where "field" is declared with "COLLATE 'C'". It should be able to, after all in the first case the collation is explicit, in the latter implicit. > There's also this fun little case > field in '[A,Z]' > (note lack of an explicit cast). Here the input function would > probably need to verify that there's a range type corresponding > to the field's type *and* that the range type's collation matches > the field's collation. I wonder if that's possible - Tom said > somewhere that input function don't receive collation information, > though I don't know if that restriction applies in this case. Collation checking is generally done by the planner. I don't see why the input function should check, the result of an input function is by definition DEFAULT. It's up to the 'in' operator to check. Note that the whole idea of collation is not really supposed to be assigned to object for storage. How that can be resolved I'm not sure. Mvg, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patriotism is when love of your own people comes first; nationalism, > when hate for people other than your own comes first. > - Charles de Gaulle
On Sun, 2011-06-19 at 12:24 +0200, Martijn van Oosterhout wrote: > Collation checking is generally done by the planner. I don't see why > the input function should check, the result of an input function is by > definition DEFAULT. It's up to the 'in' operator to check. > > Note that the whole idea of collation is not really supposed to be > assigned to object for storage. How that can be resolved I'm not sure. I think if we just say that it's a property of the range type definition, then that's OK. It's similar to specifying a non-default btree opclass for the range type -- it just changes which total order the range type adheres to. If you meant that the collation shouldn't be stored along with the value itself, then I agree. Regards,Jeff Davis
On Jun19, 2011, at 20:08 , Jeff Davis wrote: > On Sun, 2011-06-19 at 12:24 +0200, Martijn van Oosterhout wrote: >> Collation checking is generally done by the planner. I don't see why >> the input function should check, the result of an input function is by >> definition DEFAULT. It's up to the 'in' operator to check. >> >> Note that the whole idea of collation is not really supposed to be >> assigned to object for storage. How that can be resolved I'm not sure. > > I think if we just say that it's a property of the range type > definition, then that's OK. It's similar to specifying a non-default > btree opclass for the range type -- it just changes which total order > the range type adheres to. In fact, it's exactly the same, because what we *actually* need to specify is not an opclass but a comparison operator. Which is only well-defined if you know *both* an opclass *and* a collation. That reminds me - the conclusion there was that we cannot have two range types with the same base type but different opclasses, wasn't it? AFAIR precisely because otherwise there's no sensible way to handle 'text' in '[lower,upper]' If I'm not mistaken about this, that would imply that we also cannot have two range types with the same base type, the same opclass, but different collations. Which seems rather unfortunate... In fact, if that's true, maybe restricing range types to the database collation would be best... best regards, Florian Pflug
On Sun, 2011-06-19 at 21:29 +0200, Florian Pflug wrote: > If I'm not mistaken about this, that would imply that we also cannot > have two range types with the same base type, the same opclass, > but different collations. Which seems rather unfortunate... In fact, > if that's true, maybe restricing range types to the database collation > would be best... Yes, we cannot have two range types with the same base type. That is a consequence of the polymorphic type system, which needs to be able to determine the range type given the base type. A workaround is to use domains. That is effective, but awkward. For instance, given: CREATE DOMAIN textdomain AS text; CREATE TYPE textdomainrange AS RANGE (subtype=textdomain); then: '[a,z)'::textdomainrange @> 'b'::textdomain would work, but: '[a,z)'::textdomainrange @> 'b' would not, which would be annoying. I don't see a way around this. It's not a collation problem, but a general "multiple range types with the same subtype" problem. I don't think there's much benefit in restricting it to the DB collation. If someone really needs a different collation (or opclass, for that matter), it might as well be allowed, even if you have to do extra type annotations. Regards,Jeff Davis
On Mon, Jun 20, 2011 at 2:33 AM, Jeff Davis <pgsql@j-davis.com> wrote: > On Sun, 2011-06-19 at 21:29 +0200, Florian Pflug wrote: >> If I'm not mistaken about this, that would imply that we also cannot >> have two range types with the same base type, the same opclass, >> but different collations. Which seems rather unfortunate... In fact, >> if that's true, maybe restricing range types to the database collation >> would be best... > > Yes, we cannot have two range types with the same base type. That is a > consequence of the polymorphic type system, which needs to be able to > determine the range type given the base type. Boy, that's an unfortunate limitation. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sun, Jun 19, 2011 at 11:33:02PM -0700, Jeff Davis wrote: > On Sun, 2011-06-19 at 21:29 +0200, Florian Pflug wrote: > > If I'm not mistaken about this, that would imply that we also > > cannot have two range types with the same base type, the same > > opclass, but different collations. Which seems rather > > unfortunate... In fact, if that's true, maybe restricing range > > types to the database collation would be best... > > Yes, we cannot have two range types with the same base type. That is > a consequence of the polymorphic type system, which needs to be able > to determine the range type given the base type. > > A workaround is to use domains. That is effective, but awkward. For > instance, given: > CREATE DOMAIN textdomain AS text; > CREATE TYPE textdomainrange AS RANGE (subtype=textdomain); > then: > '[a,z)'::textdomainrange @> 'b'::textdomain > would work, but: > '[a,z)'::textdomainrange @> 'b' > would not, which would be annoying. > > I don't see a way around this. It's not a collation problem, but a > general "multiple range types with the same subtype" problem. How might you address that problem, assuming you had the needed resources to do it? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Jun20, 2011, at 15:19 , Robert Haas wrote: > On Mon, Jun 20, 2011 at 2:33 AM, Jeff Davis <pgsql@j-davis.com> wrote: >> On Sun, 2011-06-19 at 21:29 +0200, Florian Pflug wrote: >>> If I'm not mistaken about this, that would imply that we also cannot >>> have two range types with the same base type, the same opclass, >>> but different collations. Which seems rather unfortunate... In fact, >>> if that's true, maybe restricing range types to the database collation >>> would be best... >> >> Yes, we cannot have two range types with the same base type. That is a >> consequence of the polymorphic type system, which needs to be able to >> determine the range type given the base type. > > Boy, that's an unfortunate limitation. :-( Hm, I'm starting to wonder if there isn't a way around that. It seems that this restriction comes from the desire to allow functions with the polymorphic signature (ANYELEMENT, ANYELEMENT) -> ANYRANGE. The only such function I can currently come up with is the generic range constructor. Is having that worth the restriction to one range type per base type? Another option might be to extend polymorphic argument matching to allow functions with the signature (<non-polymorphic args>) -> <polymorphic type> but to require the concrete output type to be specified with a cast at the call site. For the generic range constructor, you'd then have to write RANGE(lower, upper)::range_type (If we had that, we could also (finally) provide functions to set and get fields of composite types by name. As it stands, doing that cleanly is hard because the desired signature of the get function, namely (record, fieldname text) -> anyelement is not supported.) A third approach might be to first define a PAIR type and then define ranges on top of that. Since PAIR types wouldn't include a comparison operators, the restriction to one PAIR type per base type wouldn't matter. Instead of a generic RANGE constructor you'd then use the generic PAIR constructor and cast the resulting PAIR to whatever range you desire, i.e. write PAIR(lower, upper)::range_type. best regards, Florian Pflug
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Jun 20, 2011 at 2:33 AM, Jeff Davis <pgsql@j-davis.com> wrote: >> Yes, we cannot have two range types with the same base type. That is a >> consequence of the polymorphic type system, which needs to be able to >> determine the range type given the base type. > Boy, that's an unfortunate limitation. :-( Given the need to deal with multiple collations for collatable types, I'd say it's not so much "unfortunate" as "utterly unworkable". At least unless you give up the notion of binding the collation into the type definition ... which has other issues, per discussion a few days ago. Even ignoring collations, I really think we want to allow multiple range types for base types that have multiple btree sort orderings. regards, tom lane
On Mon, Jun 20, 2011 at 3:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Given the need to deal with multiple collations for collatable types, > I'd say it's not so much "unfortunate" as "utterly unworkable". At > least unless you give up the notion of binding the collation into the > type definition ... which has other issues, per discussion a few days > ago. Even ignoring collations, I really think we want to allow multiple > range types for base types that have multiple btree sort orderings. I was imagining it would be not part of the type but part of the internal data in the range type. The dumped representation would look something like ['bar','baz',''en_US'] and input forms like ['bar','baz'] would just default to the database default collation or the session's default collation or whatever. The most disturbing thing about this is that it would make unrestorable dumps if any of those collation names change or are not installed before the data is loaded. It's kind of like having your table names embedded in a text column in your tables. It could make things awkward to manage later. -- greg
Greg Stark <stark@mit.edu> writes: > On Mon, Jun 20, 2011 at 3:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Given the need to deal with multiple collations for collatable types, >> I'd say it's not so much "unfortunate" as "utterly unworkable". �At >> least unless you give up the notion of binding the collation into the >> type definition ... which has other issues, per discussion a few days >> ago. �Even ignoring collations, I really think we want to allow multiple >> range types for base types that have multiple btree sort orderings. > I was imagining it would be not part of the type but part of the > internal data in the range type. The dumped representation would look > something like ['bar','baz',''en_US'] and input forms like > ['bar','baz'] would just default to the database default collation or > the session's default collation or whatever. > The most disturbing thing about this is that it would make > unrestorable dumps if any of those collation names change or are not > installed before the data is loaded. It's kind of like having your > table names embedded in a text column in your tables. It could make > things awkward to manage later. Yeah. In particular this would cause issues for pg_upgrade, which would have to somehow ensure that collation OIDs didn't change between old and new installations, which is just about impossible given the current method for assigning them. I think we need to avoid that, really. regards, tom lane
On Mon, 2011-06-20 at 16:01 +0200, Florian Pflug wrote: > Hm, I'm starting to wonder if there isn't a way around that. It seems that > this restriction comes from the desire to allow functions with the > polymorphic signature > (ANYELEMENT, ANYELEMENT) -> ANYRANGE. > > The only such function I can currently come up with is the generic > range constructor. Is having that worth the restriction to one > range type per base type? Good point. Having constructors is obviously important, but perhaps they don't have to be generic. We could generate catalog entries for each constructor for each range type, and name them after the range type itself. So, instead of: range(1, 10) you'd write: int4range(1,10) That actually might be better anyway, because relying on the polymorphic version is not perfect now anyway. For instance, if you want an int8range using the generic range() constructor, you need a cast. We'd still need to get the polymorphic type system to work the way we want in this case. I'll look into that. > Another option might be to extend polymorphic argument matching > to allow functions with the signature > (<non-polymorphic args>) -> <polymorphic type> > but to require the concrete output type to be specified with a cast > at the call site. For the generic range constructor, you'd then > have to write > RANGE(lower, upper)::range_type Interesting idea. > A third approach might be to first define a PAIR type and then > define ranges on top of that. Since PAIR types wouldn't include > a comparison operators, the restriction to one PAIR type per > base type wouldn't matter. Instead of a generic RANGE constructor > you'd then use the generic PAIR constructor and cast the resulting > PAIR to whatever range you desire, i.e. write > PAIR(lower, upper)::range_type. Another interesting idea. A little awkward though, and doesn't offer much opportunity to specify inclusivity/exclusivity of the bounds. Regards,Jeff Davis
On Mon, Jun 20, 2011 at 11:21 AM, Jeff Davis <pgsql@j-davis.com> wrote: > On Mon, 2011-06-20 at 16:01 +0200, Florian Pflug wrote: >> Hm, I'm starting to wonder if there isn't a way around that. It seems that >> this restriction comes from the desire to allow functions with the >> polymorphic signature >> (ANYELEMENT, ANYELEMENT) -> ANYRANGE. >> >> The only such function I can currently come up with is the generic >> range constructor. Is having that worth the restriction to one >> range type per base type? > > Good point. > > Having constructors is obviously important, but perhaps they don't have > to be generic. We could generate catalog entries for each constructor > for each range type, and name them after the range type itself. So, > instead of: > range(1, 10) > you'd write: > int4range(1,10) > > That actually might be better anyway, because relying on the polymorphic > version is not perfect now anyway. For instance, if you want an > int8range using the generic range() constructor, you need a cast. > > We'd still need to get the polymorphic type system to work the way we > want in this case. I'll look into that. hm, what if there *was( only one range type per base type, but in the various contexts where specific ordering and collation was important you could optionally pass them in? Meaning, the specific ordering was not bound rigidly to the type, but to the operation? merlin
On Jun20, 2011, at 19:16 , Merlin Moncure wrote: > On Mon, Jun 20, 2011 at 11:21 AM, Jeff Davis <pgsql@j-davis.com> wrote: > hm, what if there *was( only one range type per base type, but in the > various contexts where specific ordering and collation was important > you could optionally pass them in? Meaning, the specific ordering was > not bound rigidly to the type, but to the operation? I suggested that previously here http://archives.postgresql.org/pgsql-hackers/2011-06/msg00846.php In the ensuing discussion, however, it became clear that by doing so range types become little more than a pair of values. More specifically, a range then *doesn't* represent a set of values, because whether or not a value is "in" the range depends on a specific sort order. Actually, you'd probably even loose the possibility of having a normalization function for discrete base types (which makes sure that we know that "[1,2]" is the same as "[1,3)"), because it's hard to image one normalization function that works sensibly for two different orderings. So by doing that, you effectively turn a RANGE into a quadruple (lower type, lower_included bool, upper type, upper_included bool). best regards, Florian Pflug
Florian Pflug <fgp@phlo.org> writes: > On Jun20, 2011, at 19:16 , Merlin Moncure wrote: >> On Mon, Jun 20, 2011 at 11:21 AM, Jeff Davis <pgsql@j-davis.com> wrote: >> hm, what if there *was( only one range type per base type, but in the >> various contexts where specific ordering and collation was important >> you could optionally pass them in? Meaning, the specific ordering was >> not bound rigidly to the type, but to the operation? > I suggested that previously here > http://archives.postgresql.org/pgsql-hackers/2011-06/msg00846.php > In the ensuing discussion, however, it became clear that by doing so > range types become little more than a pair of values. More specifically, > a range then *doesn't* represent a set of values, because whether or > not a value is "in" the range depends on a specific sort order. Yeah, that doesn't seem like the way to go. If a range value doesn't represent a well-defined set of base-type values, we lose a lot of the mathematical underpinnings for range operations. So ... just how awful would it be if we hard-wired range types to always use their base type's default btree sort ordering and the database's default collation? In principle that sucks, but I'm not sure how wide the use-cases actually will be for other choices. The other viable alternative seems to be to require those two properties (btree opclass and collation) to be part of a specific range type definition. The complaint about that seemed to be that we couldn't infer an ANYRANGE type given only ANYELEMENT, but could we alleviate that by identifying one range type as the default for the base type, and then using that one in cases where we have no ANYRANGE input? regards, tom lane
Tom Lane wrote: > Florian Pflug <fgp@phlo.org> writes: >> On Jun20, 2011, at 19:16 , Merlin Moncure wrote: >>> On Mon, Jun 20, 2011 at 11:21 AM, Jeff Davis <pgsql@j-davis.com> wrote: >>> hm, what if there *was( only one range type per base type, but in the >>> various contexts where specific ordering and collation was important >>> you could optionally pass them in? Meaning, the specific ordering was >>> not bound rigidly to the type, but to the operation? > >> I suggested that previously here >> http://archives.postgresql.org/pgsql-hackers/2011-06/msg00846.php > >> In the ensuing discussion, however, it became clear that by doing so >> range types become little more than a pair of values. More specifically, >> a range then *doesn't* represent a set of values, because whether or >> not a value is "in" the range depends on a specific sort order. > > Yeah, that doesn't seem like the way to go. If a range value doesn't > represent a well-defined set of base-type values, we lose a lot of the > mathematical underpinnings for range operations. > > So ... just how awful would it be if we hard-wired range types to always > use their base type's default btree sort ordering and the database's > default collation? In principle that sucks, but I'm not sure how wide > the use-cases actually will be for other choices. > > The other viable alternative seems to be to require those two properties > (btree opclass and collation) to be part of a specific range type > definition. The complaint about that seemed to be that we couldn't > infer an ANYRANGE type given only ANYELEMENT, but could we alleviate > that by identifying one range type as the default for the base type, > and then using that one in cases where we have no ANYRANGE input? > > regards, tom lane I still think that the most elegant solution is for stuff like collation to just be built-in to the base types that the range is ranging over, meaning we have a separate text base type for each text collation, and the text operators are polymorphic over all those base types. Having collations and stuff as something off to the side not built-in to text/etc types is the root of the problem. The range-specific stuff can remain ANYELEMENT and no special-casing is required. Also, besides range constructors, a generic membership test like "value in range" is polymorphic. -- Darren Duncan
Darren Duncan <darren@darrenduncan.net> writes: > I still think that the most elegant solution is for stuff like collation to just > be built-in to the base types that the range is ranging over, meaning we have a > separate text base type for each text collation, and the text operators are > polymorphic over all those base types. Having collations and stuff as something > off to the side not built-in to text/etc types is the root of the > problem. I tend to agree that this aspect of the SQL standard isn't terribly well designed, but it's the standard and we're stuck with it. We're not going to support two parallel methods of dealing with collations. regards, tom lane
On Jun20, 2011, at 20:58 , Tom Lane wrote: > Darren Duncan <darren@darrenduncan.net> writes: >> I still think that the most elegant solution is for stuff like collation to just >> be built-in to the base types that the range is ranging over, meaning we have a >> separate text base type for each text collation, and the text operators are >> polymorphic over all those base types. Having collations and stuff as something >> off to the side not built-in to text/etc types is the root of the >> problem. > > I tend to agree that this aspect of the SQL standard isn't terribly well > designed, but it's the standard and we're stuck with it. We're not > going to support two parallel methods of dealing with collations. Plus, you can always define a DOMAIN for every collation you intent to use, and stay clear of COLLATE clauses except as part of these domain definitions. Most interestingly, this is also the workaround Jeff Davis suggested for those who absolutely need two range types over the same base type (i.e. define one of the ranges over a domain). best regards, Florian Pflug
Florian Pflug wrote: > On Jun20, 2011, at 20:58 , Tom Lane wrote: >> Darren Duncan <darren@darrenduncan.net> writes: >>> I still think that the most elegant solution is for stuff like collation to just >>> be built-in to the base types that the range is ranging over, meaning we have a >>> separate text base type for each text collation, and the text operators are >>> polymorphic over all those base types. Having collations and stuff as something >>> off to the side not built-in to text/etc types is the root of the >>> problem. >> I tend to agree that this aspect of the SQL standard isn't terribly well >> designed, but it's the standard and we're stuck with it. We're not >> going to support two parallel methods of dealing with collations. > > Plus, you can always define a DOMAIN for every collation you intent to use, > and stay clear of COLLATE clauses except as part of these domain definitions. > > Most interestingly, this is also the workaround Jeff Davis suggested for > those who absolutely need two range types over the same base type (i.e. > define one of the ranges over a domain). > > best regards, > Florian Pflug That DOMAIN-based solution ostensibly sounds like a good one then, under the circumstances. What I *don't* want to see is for things like ranges to have their own collations and the like. From the perspective of all range-specific things, the types over which they're defined like text should just have their own native ordering, which defines the range's sense of "before" and "after". If DOMAIN effectively does that for text types, then that is the way to go. -- Darren Duncan
On Mon, 2011-06-20 at 12:54 -0700, Darren Duncan wrote: > That DOMAIN-based solution ostensibly sounds like a good one then, under the > circumstances. It's not bad from a theoretical standpoint, but it does require some extra type annotation, which is not really the "SQL way". > What I *don't* want to see is for things like ranges to have > their own collations and the like. I'm not 100% sure what you mean here. If you mean that you don't want range types to pay attention to COLLATE clauses, etc., then I agree. I would also agree if you mean that range values should not carry the collation with them. However, it looks like we might try to make the opclass/collation pair a property of the range type definition. That seems nice, because it allows us to keep the nice properties of ranges as well as the type inference and polymorphism for everything except the constructors. Regards,Jeff Davis
On Mon, 2011-06-20 at 13:43 -0400, Tom Lane wrote: > The other viable alternative seems to be to require those two properties > (btree opclass and collation) to be part of a specific range type > definition. The complaint about that seemed to be that we couldn't > infer an ANYRANGE type given only ANYELEMENT, but could we alleviate > that by identifying one range type as the default for the base type, > and then using that one in cases where we have no ANYRANGE input? Yes, that sounds similar to Florian's suggestion, and I think there may be a solution down this path. However, if we're going to have range types with non-default orderings, then we need a way to construct them. I suggested that, if constructors are the primary problem case, then just generate non-polymorphic constructors at range type definition time, named after the range type name. I'll look into that approach. Regards,Jeff Davis