Re: TODO: Fix CREATE CAST on DOMAINs - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: TODO: Fix CREATE CAST on DOMAINs
Date
Msg-id 4512BFB5.9010501@markdilger.com
Whole thread Raw
In response to Re: TODO: Fix CREATE CAST on DOMAINs  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-hackers
Martijn van Oosterhout wrote:
> On Wed, Sep 20, 2006 at 10:56:08AM -0700, Mark Dilger wrote:
>> If the system chooses cast chains based on a breadth-first search, then the 
>> existing int2 -> int8 cast would be chosen over an int2 -> int4 -> int8 
>> chain, or an int2 -> int3 -> int4 -> int8 chain, or in fact any chain at 
>> all, because the int2 -> int8 cast is the shortest.
> 
> But we're not talking about a search here, we don't always know where
> the endpoint is. Imagine you have the following three functions:
> 
> abs(int8)
> abs(float4)
> abs(numeric)
> 
> And you have an int2. Which is the best cast to use? What's the answer
> if you have a float8? What if it's an unknown type text string?
> 
> Now, consider that functions can have up to 32 arguments and that this
> resolution might have to be applied to each argument and you find that
> searching is going to get very expensive very quickly.
> 
> The current system of requiring only a single step is at least
> predictable. If you have the choice between:
> 
> - first argument matches, second needs three "safe" conversions, and
> - first argument need one "unsafe" conversion, second matches exactly
> 
> Which is cheaper?
> 
> To make this manageable you have to keep the number of types you can
> cast to small, or you'll get lost in the possibilites. Adding just a
> single step domain to base type conversion seems pretty safe, but
> anything more is going to be hard.
> 
> Have a nice day,

The searching never needs to be done at runtime.  It should be computable at 
cast creation time.  A new cast creates a potential bridge between any two types 
in the system.  Using a shortest path algorithm, the best chain (if any exists) 
from one type to another can be computed and pre-compiled, right?

So, assume the following already exists:

Types A,B,C, fully connected with casts A->B, B->A, A->C, C->A, B->C, C->B, with 
some marked IMPLICIT, some marked EXPLICIT, and some marked SAFE.

Types X,Y,Z, also fully connected with casts, as above.

Then assume someone comes along and creates a new type M with conversions A->M, 
M->A, X->M, and M->X.  At the time that type and those casts are added to the 
system, the system could calculate any additional casts to/from B, C, Y, and Z.  A simple implementation (but maybe not
optimal)would be for the system to 
 
autogenerate code like:

CREATE FUNCTION cast_M_Y (arg M) RETURNS Y AS $$SELECT arg::X::Y;
$$ LANGUAGE SQL;
CREATE CAST (M AS Y) WITH FUNCTION cast_M_Y(M) [ AS ASSIGNMENT | AS IMPLICIT ]

And then load that function and cast.  The only real trick seems to be 
determining the rules for which cast chain gets used within that autogenerated 
function, and whether the generated cast is IMPLICIT, EXPLICIT, or ASSIGNMENT.



Looking over what I have just written, another idea pops up.  To avoid having 
the system decide which casts are reasonable, you could extend the syntax and 
allow an easy shorthand for the user.  Something like:

CREATE CAST (M AS A)WITH FUNCTION cast_M_AAS ASSIGNMENTPROPOGATES TO B AS ASSIGNMENT,PROPOGATES TO C AS ASSIGNMENT;

CREATE CAST (A AS M)WITH FUNCTION cast_A_MAS ASSIGNMENTPROPOGATES FROM B,PROPOGATES FROM C;

And then the casts from M->B, M->C, B->M, and C->M would all be added to the system.

Thoughts?

mark



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PATCHES] large object regression tests
Next
From: Tom Lane
Date:
Subject: Cause of moving-target FSM space-needed reports