Re: Range Types and extensions - Mailing list pgsql-hackers

From Darren Duncan
Subject Re: Range Types and extensions
Date
Msg-id 4DF59660.8080206@darrenduncan.net
Whole thread Raw
In response to Re: Range Types and extensions  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Range Types and extensions
List pgsql-hackers
Robert Haas wrote:
> On Sun, Jun 12, 2011 at 7:53 AM, Florian Pflug <fgp@phlo.org> wrote:
>>> I think the collation is going to have to be baked into the type
>>> definition, no?  You can't just up and change the collation of the
>>> column as you could for a straight text column, if that might cause
>>> the contents of some rows to be viewed as invalid.
>> Now you've lost me. If a text range is simply a pair of strings,
>> as I suggested, and collations are applied only during comparison
>> and RANGE_EMPTY(), why would the collation have to be baked into
>> the type?
>>
>> If you're referring to the case
>>  (1) Create table with text-range column and collation C1
>>  (2) Add check constraint containing RANGE_EMPTY()
>>  (3) Add data
>>  (4) Alter column to have collation C2, possibly changing
>>      the result of RANGE_EMPTY() for existing ranges.
>> then that points to a problem with ALTER COLUMN.
> 
> No, I'm saying that you might have a column containing  '[a, Z)', and
> someone might change the collation of the column from en_US to C.
> When the collation was en_US, the column could legally contain that
> value, but now that the collation is C, it can't.  ALTER TABLE isn't
> going to recheck the validity of the data when someone changes the
> collation: that's only supposed to affect the sort order, not the
> definition of what is a legal value.

You can have the same collation problem even without range types.

Consider the following: (1) Create table with the 2 text columns {L,R} and both columns have the 
collation en_US. (2) Add check constraint requiring "L <= R". (3) Add a record with the value 'a' for L and 'Z' for R.
(4)Alter the columns to have the collation C.
 

Good language design principles demand that the semantics for this simplified 
case and the semantics for replacing {L,R} with a single range-of-text-typed 
column be the same, including what happens with CHECK and ALTER TABLE.

Likewise, anything that affects ORDER BY should affect {<,>,<=,>=} and friends 
the same way and vice-versa and likewise should affect range validity.

It makes sense for collation to be considered part of text data types, and 
changing collation is casting from one text type to another.  Generally 
speaking, any inherent or applied aspect of a text or other value (such as 
collation) that affects the results of any deterministic operations on those 
values (such as sorting) should be considered part of the data type of those values.

-- Darren Duncan


pgsql-hackers by date:

Previous
From: Greg Smith
Date:
Subject: Re: pgbench--new transaction type
Next
From: Robert Haas
Date:
Subject: Re: Range Types and extensions