Re: Range Types and extensions - Mailing list pgsql-hackers

From Florian Pflug
Subject Re: Range Types and extensions
Date
Msg-id B593F1CC-C33B-4525-9130-7A56C20454C9@phlo.org
In response to Re: Range Types and extensions  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Range Types and extensions
List pgsql-hackers
On Jun18, 2011, at 10:10 , Jeff Davis wrote:
> On Fri, 2011-06-10 at 00:26 +0200, Florian Pflug wrote:
> So, I believe that you are proposing to change the concept of a range
> value from "a contiguous set of values" to "a pair of bounds".

Yeah, though mostly because I figured that'd make defining their
semantics easier, not necessarily because that interpretation is
better.

> There are
> numerous implications, one of which is that I don't think that we can
> maintain the equality of all empty ranges. Consider these expressions,
> where x is a non-empty range with collation "A", but is empty in
> collation "B" (and "*" means "range intersection"):
>
>  (x COLLATE "B") COLLATE "A"
>  ((x COLLATE "B") * '(-Inf, Inf)') COLLATE "A"
>  ('-'::textrange * '(-Inf, Inf)') COLLATE "A"
>
> All of those expressions should be equal (according to global
> substitutibility, as Darren mentioned). But they can't be, because the
> last expression is always an empty range, whereas the first one is not
> (because merely changing the collation back and forth offers no
> opportunity to even notice that you have an empty range at one point).
> So, I believe that we'd be stuck with non-equal empty ranges, as well as
> many other possibly non-intuitive implications.

Yeah. Once you give up the idea that a range is a set, extensionality
(i.e. the axiom "there's only one empty range" or, more precisely,
"there's only one range which no object is a member of") has to go too.
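
To make the extensionality point concrete, here is a minimal Python
sketch, not anything from the range types patch. The names BoundsPair
and SetRange are hypothetical, and integer half-open ranges [lo, hi)
are assumed purely to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BoundsPair:
    """Hypothetical model: a range is nothing more than its two bounds."""
    lo: int
    hi: int

    def is_empty(self) -> bool:
        # Half-open [lo, hi): empty whenever lo is not below hi.
        return self.lo >= self.hi


@dataclass(frozen=True)
class SetRange:
    """Hypothetical model: a range is the set of values it contains."""
    lo: Optional[int]
    hi: Optional[int]

    @classmethod
    def make(cls, lo: int, hi: int) -> "SetRange":
        # Every empty range collapses to one canonical value, so there
        # is exactly one range that no object is a member of.
        if lo >= hi:
            return cls(None, None)
        return cls(lo, hi)

    def is_empty(self) -> bool:
        return self.lo is None


# Two empty ranges built from different bounds.
pair_a, pair_b = BoundsPair(5, 5), BoundsPair(9, 3)
set_a, set_b = SetRange.make(5, 5), SetRange.make(9, 3)

print(pair_a == pair_b)  # pair model: the bounds differ
print(set_a == set_b)    # set model: both are the one empty range
```

In the pair model both values are empty yet compare unequal, because
the bounds themselves are the identity. In the set model the bounds of
an empty range carry no information, so they are discarded.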

> So, I lean strongly toward the interpretation that a range is a
> contiguous set of values,

Yeah, I agree now, mainly because defining them as a set gives rise
to richer semantics than defining them to be a pair. If someone
needs just a pair of values and maybe a BETWEEN operator, that is
easily done with CREATE TYPE and a few SQL or PL/pgSQL functions.

> and changing the collation should not change
> the value. Things that do change the value (like a typecast) should
> offer the opportunity to handle cases like this with a function call,
> but changing collation does not.
>
> This leaves making the collation a part of the range type itself (as
> Robert suggested).

Yes, that seems necessary for consistency. That leaves the question
of what to do if someone tries to modify a textrange's collation with
a COLLATE clause.

For example, what's the result of

  'Ä' in '[A,Z]'::textrange_german COLLATE "C"

where 'Ä' is a German Umlaut-A which sorts after 'A' but before 'B'
in locale 'de_DE' but sorts after 'Z' in locale 'C'? (I'm assuming
that textrange_german was defined with collation 'de_DE'.)

With the set-based definition of ranges, the only sensible thing
is to simply ignore the COLLATE clause, I think.
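
As a rough illustration of "collation is part of the range type, and
COLLATE is ignored", here is a toy Python model. Everything in it is an
assumption made for the example: TextRange, key_c and key_german are
hypothetical names, key_german is only a crude stand-in for the real
'de_DE' collation (it sorts an umlaut right after its base letter), and
key_c compares by code point as the 'C' locale does for these
characters.

```python
from typing import Callable, Optional, Tuple

# Toy mapping of umlauts to their base letters (assumption, not de_DE).
UMLAUT_BASE = {"Ä": "A", "Ö": "O", "Ü": "U", "ä": "a", "ö": "o", "ü": "u"}


def key_c(s: str) -> str:
    """'C'-like ordering: Python compares str values by code point."""
    return s


def key_german(s: str) -> Tuple[str, Tuple[int, ...]]:
    """German-like ordering: base letters first, umlaut as tie-breaker."""
    primary = "".join(UMLAUT_BASE.get(ch, ch) for ch in s)
    secondary = tuple(int(ch in UMLAUT_BASE) for ch in s)
    return (primary, secondary)


class TextRange:
    """Closed text range whose ordering is fixed when it is created."""

    def __init__(self, lo: str, hi: str, key: Callable[[str], object]):
        self.lo, self.hi, self._key = lo, hi, key

    def contains(self, value: str,
                 collate: Optional[Callable[[str], object]] = None) -> bool:
        # 'collate' plays the role of a COLLATE clause. It is accepted
        # but deliberately ignored: the range's own ordering decides.
        k = self._key
        return k(self.lo) <= k(value) <= k(self.hi)


german_range = TextRange("A", "Z", key_german)

in_german = german_range.contains("Ä")
in_german_with_c = german_range.contains("Ä", collate=key_c)
in_c_range = TextRange("A", "Z", key_c).contains("Ä")

print(in_german, in_german_with_c, in_c_range)
```

Under this model the membership test gives the same answer with or
without the override, so the set of values in the range never changes.
A range built with the 'C'-like ordering is simply a different range.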

best regards,
Florian Pflug


