Re: Type Categories for User-Defined Types - Mailing list pgsql-hackers

From David E. Wheeler
Subject Re: Type Categories for User-Defined Types
Date
Msg-id 055450C5-1386-45F3-B2D3-8FD72E781B0A@kineticode.com
In response to Re: Type Categories for User-Defined Types  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Type Categories for User-Defined Types  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Jul 29, 2008, at 14:00, Tom Lane wrote:

> Well, a rough estimate of the places where implicit coercion to text
> might be relevant to resolving ambiguity is
>
> select proname from pg_proc
>  where 'text'::regtype = any(proargtypes)
>  group by proname having count(*)>1;
>
> select oprname from pg_operator
>  where oprleft='text'::regtype or oprright='text'::regtype
>  group by oprname having count(*)> 1;
>
> I count 37 functions and 10 operators as of CVS HEAD.  Perhaps not all
> would need to be fixed in practical use, but if you wanted seamless
> integration of citext it's quite possible that you'd need alias
> functions/operators (maybe more than one) in each of those cases.

Well, there are already citext aliases for all of those operators, for
this very reason. There are citext aliases for a bunch of the
functions, too (ltrim(), substring(), etc.), so I wouldn't worry about
adding more. I've added more of them since I last sent a patch, mainly
for the regexp functions, replace(), strpos(), etc. I'd guess that I'm
about halfway there already, and there are probably a few I wouldn't
bother with (like timezone()).

Anyway, would this issue then go away once type categories were added
and citext was specified as TYPE = 'S'?

> [ squint... ]  Actually, this is an underestimate since these queries
> aren't finding cases like quote_literal, where there is ambiguity but
> only one of the alternatives takes 'text'.  I'm too lazy to work out a
> better query though.

Thanks.

>> Perhaps tangential: What does it mean for a type to be "preferred"?
>
> See the ambiguous-function resolution rules in chapter 10 of the fine
> manual ...

I see this:

> C. Run through all candidates and keep those that accept preferred  
> types (of the input data type's type category) at the most positions  
> where type conversion will be required. Keep all candidates if none  
> accept preferred types. If only one candidate remains, use it; else  
> continue to the next step.

That doesn't exactly explain what "preferred" means, just that
candidates accepting preferred types get some priority when a function
call is resolved. Which, I guess, is the point.
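
To make rule C above concrete, here is a minimal Python sketch of that one step. It is not PostgreSQL's implementation: the type table is a small illustrative stand-in for the catalog, and the names TYPE_INFO and keep_preferred are hypothetical.

```python
# Illustrative stand-in for the catalog: type name -> (category, is_preferred).
# 'S' = string, 'N' = numeric. Only a few types are listed.
TYPE_INFO = {
    "text": ("S", True),
    "varchar": ("S", False),
    "bpchar": ("S", False),
    "int4": ("N", False),
    "float8": ("N", True),
}


def keep_preferred(candidates, arg_types):
    """Keep the candidates that accept preferred types (of the input type's
    category) at the most positions where a conversion is required.
    If no candidate accepts a preferred type anywhere, keep them all."""

    def score(params):
        count = 0
        for param, arg in zip(params, arg_types):
            if param == arg:
                continue  # exact match: no conversion required here
            param_cat, param_pref = TYPE_INFO[param]
            arg_cat, _ = TYPE_INFO[arg]
            if param_pref and param_cat == arg_cat:
                count += 1
        return count

    scores = [score(c) for c in candidates]
    best = max(scores, default=0)
    if best == 0:
        return list(candidates)  # none accept preferred types: keep all
    return [c for c, s in zip(candidates, scores) if s == best]
```

With a varchar argument and candidates f(text) and f(bpchar), only f(text) survives, because text is the preferred type of the string category. With a text argument and candidates f(varchar) and f(bpchar), both survive and resolution would continue to the next step.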

>> Wouldn't this then limit them to 52 possible categories?
>
> It'd be either 94 - 26 or 94 - 26 - 26 depending on what the policy is
> about lower-case letters (and assuming they wanted to stay away from
> control characters, which seems like a good idea).  Considering the
> world supply of categories up to now has been about ten, it's hard
> to imagine that this is really a limitation.

Okay.
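
For reference, the arithmetic above can be checked with a few lines of Python. This assumes the 94 refers to the printable, non-space ASCII characters and that each 26 is one case of the alphabet set aside; the variable names are just for illustration.

```python
import string

# Printable ASCII excluding space and control characters: codes 33..126.
printable = [chr(code) for code in range(33, 127)]

# Assumption: one case of the alphabet is reserved (94 - 26) ...
without_upper = [ch for ch in printable if ch not in string.ascii_uppercase]

# ... or both cases are reserved (94 - 26 - 26).
without_letters = [ch for ch in without_upper if ch not in string.ascii_lowercase]

print(len(printable), len(without_upper), len(without_letters))
```

My earlier figure of 52 was simply the two cases of the alphabet together.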

>> Does that matter? Given your suggestion, I'm assuming that a single
>> character is somehow more efficient than an enum, yes?
>
> Marginally so; but an enum wouldn't help anyway unless we are prepared
> to invent ALTER ENUM.  We'd have to go to an actual new system catalog
> if we wanted something noticeably better than the poor-mans-enum
> approach, and as I mentioned earlier, that just seems like overkill.
> (Besides, we could always add it later if there's suddenly a gold rush
> for categories.  The only thing we'd be locking ourselves into, if
> we view this as a stopgap implementation, is the need to accept
> single-character abbreviations in future, even after the system knows
> actual names for categories.)

Makes sense.

Thanks,

David


