Re: Range Types and extensions - Mailing list pgsql-hackers

From Darren Duncan
Subject Re: Range Types and extensions
Date
Msg-id 4DEE6DE3.4090308@darrenduncan.net
In response to Re: Range Types and extensions  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
Jeff Davis wrote:
> On Tue, 2011-06-07 at 11:15 -0400, Tom Lane wrote:
>> Merlin Moncure <mmoncure@gmail.com> writes:
>>> right. hm -- can you have multiple range type definitions for a
>>> particular type?
>> In principle, sure, if the type has multiple useful sort orderings.
> 
> Right. Additionally, you might want to use different "canonical"
> functions for the same subtype.
> 
>> I don't immediately see any core types for which we'd bother.
> 
> Agreed.
> 
>> BTW, Jeff, have you worked out the implications of collations for
>> textual range types?
> 
> Well, "it seems to work" is about as far as I've gotten.
> 
> As far as the implications, I'll need to do a little more research and
> thinking. But I don't immediately see anything too worrisome.

I would expect ranges to have exactly the same semantics as ORDER BY or "<" etc 
with respect to collations for textual range types.

If collation is an attribute of a textual type, meaning that the textual type or 
its values have a sense of their collation built-in, then ranges for those 
textual types should "just work" without any extra range-specific syntax, same 
as you could say ORDER BY without any further qualifiers.

If collation is not an attribute of a textual type, meaning that you normally 
have to qualify the desired collation for each order-sensitive operation using 
it (even if that can be defined by a session/etc setting which still just 
ultimately works at the operator rather than type level), or if a textual type 
can have it built in but it is overridable per operator, then either ranges 
should have an extra attribute saying what collation (or other type-specific 
order-determining function) to use, or all range operators take the optional 
collation parameter like with ORDER BY.
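
To make the two options in that paragraph concrete, here is a minimal Python sketch, not PostgreSQL code. It assumes a collation can be modelled as a function from a string to a sortable key; the names `TextRange`, `binary`, and the `collation` parameters are hypothetical and exist only for illustration.

```python
from typing import Callable, Optional

# Assumption: a "collation" is modelled as a function mapping text to a sort key.
Collation = Callable[[str], object]


def binary(s: str) -> str:
    """Plain code-point ordering, used when no collation is given."""
    return s


class TextRange:
    """Hypothetical half-open range [lower, upper) over plain strings."""

    def __init__(self, lower: str, upper: str,
                 collation: Optional[Collation] = None):
        self.lower = lower
        self.upper = upper
        # Option 1: the range itself carries the collation as an attribute.
        self.collation = collation or binary

    def contains(self, value: str,
                 collation: Optional[Collation] = None) -> bool:
        # Option 2: the operator takes an optional collation, which
        # overrides the range's own attribute for this one call.
        key = collation or self.collation
        return key(self.lower) <= key(value) < key(self.upper)


plain = TextRange("a", "m")
folded = TextRange("a", "m", collation=str.casefold)

in_plain = plain.contains("B")                            # code-point order
in_folded = folded.contains("B")                          # range-level collation
in_override = plain.contains("B", collation=str.casefold)  # per-call collation
```

With code-point ordering "B" sorts before "a", so only the two case-folded checks find "B" inside the range. Either way, every caller has to know that a collation choice exists.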

Personally, I think it is a more elegant programming language design for an 
ordered type to have its own sense of a one true canonical ordering of its 
values, and where one could conceptually have multiple orderings, there would be 
a separate data type for each one.  That is, while you probably only need a 
single type with respect to ordering for any real numeric type, for textual 
types you could have a separate textual type for each collation.

In particular, I say separate type because, as far as I know, a collation can 
sometimes change which text values compare as "same".

On a tangent, I believe that the various case- or accent-insensitive comparisons 
and sortings are very reasonably expressed as collations rather than through some 
other mechanism, e.g. if you wanted sortings that treat different letter case as 
the same or not, or letters with or without accents as the same or not.

So under this "elegant" system, there is no need to ever specify collation at 
the operator level (which could become quite verbose and unwieldy), but instead 
you can cast data types if you want to change their sense of canonical ordering.

Now if the various text-specific operators are polymorphic across these text 
type variants, users don't generally have to know the difference except when it 
matters.
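
The one-type-per-collation design described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `CollatedText`, `BinaryText`, `CaseInsensitiveText`, and `Range` are hypothetical names, and `str.casefold` stands in for a real case-insensitive collation.

```python
from functools import total_ordering


@total_ordering
class CollatedText:
    """Hypothetical base: a text value whose type fixes its one ordering."""

    def __init__(self, value: str):
        self.value = value

    @staticmethod
    def sort_key(s: str):
        raise NotImplementedError

    def _key(self):
        return type(self).sort_key(self.value)

    def __eq__(self, other):
        # Values of different text types are not comparable at all.
        if type(other) is not type(self):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        return hash((type(self).__name__, self._key()))


class BinaryText(CollatedText):
    @staticmethod
    def sort_key(s: str):
        return s  # code-point order


class CaseInsensitiveText(CollatedText):
    @staticmethod
    def sort_key(s: str):
        return s.casefold()  # stand-in for a case-insensitive collation


class Range:
    """Half-open range [lower, upper); polymorphic over any ordered type."""

    def __init__(self, lower, upper):
        if type(lower) is not type(upper):
            raise TypeError("range bounds must have the same type")
        self.lower, self.upper = lower, upper

    def contains(self, x) -> bool:
        # No collation argument: the element type already decides the order.
        return self.lower <= x < self.upper


b = BinaryText("B")
same_binary = BinaryText("Abc") == BinaryText("abc")
same_folded = CaseInsensitiveText("Abc") == CaseInsensitiveText("abc")
in_binary = Range(BinaryText("a"), BinaryText("m")).contains(b)
# "Casting" to another text type changes the sense of ordering and equality.
in_folded = Range(CaseInsensitiveText("a"),
                  CaseInsensitiveText("m")).contains(CaseInsensitiveText(b.value))
```

Here `Range.contains` never mentions a collation, and the cast on the last line is the only place where the change of ordering is visible. The two equality checks also give different answers, which is the reason for treating each collation as its own type.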

On a tangent, I believe that the best definition of "equal" or "same" in a type 
system is global substitutability.  Ignoring implementation details, if a 
program ever finds that 2 operands to the generic "=" (equality test) operator 
result in TRUE, then the program should feel free to replace all occurrences of 
one operand in the program with occurrences of the other, for optimization, 
because generic "=" returning TRUE means one is just as good as the other.  This 
assumes generally that we're dealing with immutable value types.
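
A small Python sketch of that substitutability test follows. It assumes a finite list of "observer" functions as a stand-in for "every operation a program could apply", which is a simplification; `ci_equal` and `substitutable` are hypothetical helper names.

```python
def ci_equal(a: str, b: str) -> bool:
    """A case-insensitive "=" applied to ordinary strings."""
    return a.casefold() == b.casefold()


def substitutable(a, b, observers) -> bool:
    """True if no listed observer can tell a and b apart."""
    return all(f(a) == f(b) for f in observers)


x, y = "Abc", "abc"

# Operations that only look at the case-folded form cannot distinguish x and y.
fold_only = [len, str.casefold]
# An operation that exposes the raw first character can.
raw_access = [lambda s: s[0]]

equal = ci_equal(x, y)
safe_when_folded = substitutable(x, y, fold_only)
safe_with_raw = substitutable(x, y, raw_access)
```

Under this test the case-insensitive "=" is a sound generic equality only for a type whose operations never expose letter case. For plain strings, swapping `x` for `y` would change what `s[0]` returns even though "=" reported TRUE.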

-- Darren Duncan


