Re: Range Types and extensions - Mailing list pgsql-hackers
From | Darren Duncan |
---|---|
Subject | Re: Range Types and extensions |
Date | |
Msg-id | 4DEE6DE3.4090308@darrenduncan.net |
In response to | Re: Range Types and extensions (Jeff Davis <pgsql@j-davis.com>) |
List | pgsql-hackers |
Jeff Davis wrote:
> On Tue, 2011-06-07 at 11:15 -0400, Tom Lane wrote:
>> Merlin Moncure <mmoncure@gmail.com> writes:
>>> right. hm -- can you have multiple range type definitions for a
>>> particular type?
>> In principle, sure, if the type has multiple useful sort orderings.
>
> Right. Additionally, you might want to use different "canonical"
> functions for the same subtype.
>
>> I don't immediately see any core types for which we'd bother.
>
> Agreed.
>
>> BTW, Jeff, have you worked out the implications of collations for
>> textual range types?
>
> Well, "it seems to work" is about as far as I've gotten.
>
> As far as the implications, I'll need to do a little more research and
> thinking. But I don't immediately see anything too worrisome.

I would expect ranges to have exactly the same semantics as ORDER BY or "<" etc with respect to collations for textual range types.

If collation is an attribute of a textual type, meaning that the textual type or its values have a sense of their collation built-in, then ranges for those textual types should "just work" without any extra range-specific syntax, same as you could say ORDER BY without any further qualifiers.

If collation is not an attribute of a textual type, meaning that you normally have to qualify the desired collation for each order-sensitive operation using it (even if that can be defined by a session/etc setting which still just ultimately works at the operator rather than type level), or if a textual type can have it built in but it is overridable per operator, then either ranges should have an extra attribute saying what collation (or other type-specific order-determining function) to use, or all range operators take the optional collation parameter like with ORDER BY.
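
To make the two alternatives in the last paragraph concrete, here is a minimal Python sketch. It is only an illustration, not PostgreSQL syntax or anything from the range types patch: `TextRange`, `c_collation`, and `ci_collation` are hypothetical names, and a collation is modelled simply as a function that maps a string to its sort key.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: a "collation" is modelled as a key function over strings.
Collation = Callable[[str], str]


def c_collation(s: str) -> str:
    """Plain code-point ordering."""
    return s


def ci_collation(s: str) -> str:
    """Case-insensitive ordering."""
    return s.casefold()


@dataclass(frozen=True)
class TextRange:
    """Hypothetical half-open text range [lower, upper)."""

    lower: str
    upper: str
    # Alternative 1: the range itself carries the collation to use.
    collation: Collation = c_collation

    def contains(self, value: str, collation: Optional[Collation] = None) -> bool:
        # Alternative 2: the operator takes an optional collation that
        # overrides the one stored on the range, much like ORDER BY.
        key = collation if collation is not None else self.collation
        return key(self.lower) <= key(value) < key(self.upper)


default_range = TextRange("a", "n")
ci_range = TextRange("a", "n", collation=ci_collation)

in_default = default_range.contains("B")                            # code-point order
in_ci_range = ci_range.contains("B")                                # range attribute
in_override = default_range.contains("B", collation=ci_collation)   # per-operator
```

Under code-point ordering "B" sorts before "a", so it falls outside the range; under the case-insensitive ordering it falls inside, whether that ordering comes from the range or from the operator call.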
Personally, I think it is a more elegant programming language design for an ordered type to have its own sense of a one true canonical ordering of its values, and where one could conceptually have multiple orderings, there would be a separate data type for each one. That is, while you probably only need a single type with respect to ordering for any real numeric type, for textual types you could have a separate textual type for each collation. In particular, I say separate type because, as far as I know, a collation can sometimes change which text values compare as "same".

On a tangent, I believe that various insensitive comparisons or sortings are very reasonably expressed as collations rather than some other mechanism, e.g., if you wanted sortings that treat different letter case as same or not, or letters with or without accents as same or not.

So under this "elegant" system, there is no need to ever specify collation at the operator level (which could become quite verbose and unwieldy), but instead you can cast data types if you want to change their sense of canonical ordering. Now if the various text-specific operators are polymorphic across these text type variants, users don't generally have to know the difference except when it matters.

On a tangent, I believe that the best definition of "equal" or "same" in a type system is global substitutability. Ignoring implementation details, if a program ever finds that two operands to the generic "=" (equality test) operator result in TRUE, then the program should feel free to replace all occurrences of one operand in the program with occurrences of the other, for optimization, because generic "=" returning TRUE means one is just as good as the other. This assumes generally that we're dealing with immutable value types.

-- Darren Duncan
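
A rough Python sketch of the type-per-collation idea discussed above, under stated assumptions: `CIText` is a hypothetical case-insensitive text type invented for illustration, and it keeps only the case-folded form of its input so that values comparing equal are interchangeable in every later operation. That last choice is an assumption of this sketch, made to show substitutability; it is not a claim about how any real PostgreSQL type behaves.

```python
from functools import total_ordering


@total_ordering
class CIText:
    """Hypothetical text type whose one canonical ordering ignores case."""

    def __init__(self, raw: str) -> None:
        # Assumption: store only the case-folded form, so two values that
        # compare equal cannot be told apart by any later operation.
        self.value = raw.casefold()

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, CIText):
            return NotImplemented
        return self.value == other.value

    def __lt__(self, other: "CIText") -> bool:
        if not isinstance(other, CIText):
            return NotImplemented
        return self.value < other.value

    def __hash__(self) -> int:
        return hash(self.value)

    def __repr__(self) -> str:
        return f"CIText({self.value!r})"


words = ["banana", "apple", "Cherry"]

# Plain str has code-point ordering built in: uppercase sorts first.
plain_sorted = sorted(words)

# "Casting" to CIText changes the ordering; no per-operator collation needed.
ci_sorted = sorted(CIText(w) for w in words)

# Equality is also decided by the type, not by the operator call.
same_as_str = "Hello" == "hELLO"
same_as_citext = CIText("Hello") == CIText("hELLO")
```

Here `sorted` and `==` are written the same way for both types, which is the polymorphism point: the caller picks the ordering once, by choosing the type, and every order-sensitive operation follows from that.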