Re: TODO: Fix CREATE CAST on DOMAINs - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: TODO: Fix CREATE CAST on DOMAINs
Date
Msg-id 451180B8.4020507@markdilger.com
In response to Re: TODO: Fix CREATE CAST on DOMAINs  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: TODO: Fix CREATE CAST on DOMAINs  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: TODO: Fix CREATE CAST on DOMAINs  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-hackers
Tom Lane wrote:
> Mark Dilger <pgsql@markdilger.com> writes:
>> Mark Dilger wrote:
>>> Casts from int2 -> int4, int2 -> int8, and int4 -> int8 would all be 
>>> SAFE, I think, because they are not lossy.  But perhaps I have not 
>>> thought enough about this and these should be IMPLICIT rather than SAFE.
> 
>> I have thought about this some more.  I think these are indeed SAFE.  The 
>> distinction between SAFE and IMPLICIT should not, I think, be whether the 
>> storage type is identical, but rather whether there is any possible loss of 
>> precision, range, accuracy, etc., or whether there is any change in the 
>> fundamental interpretation of the data when cast from the source to destination 
>> type.
> 
> You are going in exactly the wrong direction --- this line of thought is
> aiming to make *more* casts possible by default, which is not what we
> need, at least not among the collection of base types.
> 

If I understand correctly, you are worried about two issues:  ambiguity and 
performance.  You don't want the system to be slower from the extra searching 
needed to find possible multiple step casts, and you don't want any new 
ambiguity where the system can't deterministically decide which choice of 
cast(s) should be used.  Is that right?

If the system chooses cast chains based on a breadth-first search, then the 
existing int2 -> int8 cast would be chosen over an int2 -> int4 -> int8 chain, 
or an int2 -> int3 -> int4 -> int8 chain, or in fact any chain at all, because 
the int2 -> int8 cast is the shortest.

So the chain-searching code would be invoked only in what is currently an 
*error condition*: the SQL requests a cast that cannot be resolved without 
chaining.

Since the chaining code would be new, and the rules for it would be new, we can 
still design them however we like (within reason).  I would propose:

1) Shorter chains trump longer chains.

2) When comparing two equal length chains, one made entirely of SAFE casts 
trumps one which contains an IMPLICIT cast.

3) When two or more chains remain that cannot be resolved under the above two 
rules, the SQL is considered ambiguous and an error condition is raised.
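
To make the three rules concrete, here is a rough sketch in Python rather than backend C.  It is only an illustration under stated assumptions: the cast catalog is modelled as a plain dict, and the names resolve_cast, pick_chain and AmbiguousCastError are hypothetical, not anything that exists in PostgreSQL.

```python
SAFE, IMPLICIT = "SAFE", "IMPLICIT"


class AmbiguousCastError(Exception):
    """Rule 3: several equally good chains remain (hypothetical name)."""


def pick_chain(chains, casts):
    """Apply rules 2 and 3 to a set of equal-length chains."""
    def all_safe(chain):
        return all(casts[(a, b)] == SAFE for a, b in zip(chain, chain[1:]))

    # Rule 2: chains made entirely of SAFE casts beat the rest.
    safe_chains = [c for c in chains if all_safe(c)]
    candidates = safe_chains or chains
    # Rule 3: anything still tied is an error.
    if len(candidates) > 1:
        raise AmbiguousCastError(candidates)
    return candidates[0]


def resolve_cast(casts, source, target):
    """casts maps (src, dst) -> SAFE or IMPLICIT (an assumed toy catalog).

    Returns the chosen chain as a list of type names.
    """
    # A direct cast always wins, so existing SQL never reaches the search.
    if (source, target) in casts:
        return [source, target]

    neighbours = {}
    for src, dst in casts:
        neighbours.setdefault(src, []).append(dst)

    # Rule 1: breadth-first search, one level at a time, so that every
    # chain of the shortest length is collected before choosing.
    frontier = [[source]]
    visited = {source}
    while frontier:
        found, next_frontier, reached = [], [], set()
        for chain in frontier:
            for nxt in neighbours.get(chain[-1], []):
                if nxt in visited:
                    continue
                if nxt == target:
                    found.append(chain + [nxt])
                else:
                    next_frontier.append(chain + [nxt])
                    reached.add(nxt)
        if found:
            return pick_chain(found, casts)
        visited |= reached
        frontier = next_frontier
    raise LookupError(f"no cast chain from {source} to {target}")
```

With a catalog holding int2 -> int4, int4 -> int8 and int2 -> int8, resolve_cast returns the direct int2 -> int8 cast without searching; for a domain that only has a cast to int4, it returns the two-step chain through int4.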

I don't see how this would break any existing valid SQL.  But it seems like it 
would solve both the DOMAIN problem you mentioned and the oft-lamented problem 
that adding a new datatype requires adding quadratically many casts to the 
system.
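
As a back-of-the-envelope illustration of the cast-count point, the arithmetic can be sketched as below.  The function names are hypothetical, and the chaining case assumes the best case where the new type only needs one cast to and one cast from a single existing type, with chains covering the rest.

```python
def direct_casts_total(n_types):
    # Without chaining, every ordered pair of distinct types needs its own cast.
    return n_types * (n_types - 1)


def casts_for_new_type(n_existing, chaining):
    # Assumption: with chaining, one cast in each direction to a single
    # existing type is enough; without it, one pair per existing type.
    return 2 if chaining else 2 * n_existing


print(direct_casts_total(10))         # 90 casts to fully connect 10 types
print(casts_for_new_type(10, False))  # 20 new casts for an 11th type
print(casts_for_new_type(10, True))   # 2 new casts under the assumption above
```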

mark

