Re: Type Categories for User-Defined Types - Mailing list pgsql-hackers
From | David E. Wheeler |
---|---|
Subject | Re: Type Categories for User-Defined Types |
Date | |
Msg-id | 055450C5-1386-45F3-B2D3-8FD72E781B0A@kineticode.com |
In response to | Re: Type Categories for User-Defined Types (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Type Categories for User-Defined Types (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Jul 29, 2008, at 14:00, Tom Lane wrote:

> Well, a rough estimate of the places where implicit coercion to text
> might be relevant to resolving ambiguity is
>
> select proname from pg_proc
> where 'text'::regtype = any(proargtypes)
> group by proname having count(*)>1;
>
> select oprname from pg_operator
> where oprleft='text'::regtype or oprright='text'::regtype
> group by oprname having count(*)> 1;
>
> I count 37 functions and 10 operators as of CVS HEAD. Perhaps not all
> would need to be fixed in practical use, but if you wanted seamless
> integration of citext it's quite possible that you'd need alias
> functions/operators (maybe more than one) in each of those cases.

Well, there are already citext aliases for all of those operators, for
this very reason. There are citext aliases for a bunch of the
functions, too (ltrim(), substring(), etc.), so I wouldn't worry about
adding more. I've added more of them since I last sent a patch, mainly
for the regexp functions, replace(), strpos(), etc. I'd guess that I'm
about half-way there already, and there probably are a few I wouldn't
bother with (like timezone()).

Anyway, would this issue then go away once the type stuff was added
and citext was specified as TYPE = 'S'?

> [ squint... ] Actually, this is an underestimate since these queries
> aren't finding cases like quote_literal, where there is ambiguity but
> only one of the alternatives takes 'text'. I'm too lazy to work out a
> better query though.

Thanks.

>> Perhaps tangential: What does it mean for a type to be "preferred"?
>
> See the ambiguous-function resolution rules in chapter 10 of the fine
> manual ...

I see this:

> C. Run through all candidates and keep those that accept preferred
> types (of the input data type's type category) at the most positions
> where type conversion will be required. Keep all candidates if none
> accept preferred types. If only one candidate remains, use it; else
> continue to the next step.

That doesn't exactly explain what "preferred" means, just that it
seems to prioritize the resolution of a function a bit. Which, I
guess, is the point.

>> Wouldn't this then limit them to 52 possible categories?
>
> It'd be either 94 - 26 or 94 - 26 - 26 depending on what the policy is
> about lower-case letters (and assuming they wanted to stay away from
> control characters, which seems like a good idea). Considering the
> world supply of categories up to now has been about ten, it's hard
> to imagine that this is really a limitation.

Okay.

>> Does that matter? Given your suggestion, I'm assuming that a single
>> character is somehow more efficient than an enum, yes?
>
> Marginally so; but an enum wouldn't help anyway unless we are prepared
> to invent ALTER ENUM. We'd have to go to an actual new system catalog
> if we wanted something noticeably better than the poor-mans-enum
> approach, and as I mentioned earlier, that just seems like overkill.
> (Besides, we could always add it later if there's suddenly a gold rush
> for categories. The only thing we'd be locking ourselves into, if
> we view this as a stopgap implementation, is the need to accept
> single-character abbreviations in future, even after the system knows
> actual names for categories.)

Makes sense.

Thanks,

David
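
As a rough check of the "94 - 26 or 94 - 26 - 26" arithmetic quoted above: the 94 appears to be the number of printable, non-space ASCII characters (codes 33 through 126). The sketch below is not from the thread. It assumes the first 26 set aside are the upper-case letters and the second 26 the lower-case ones, which is only inferred from the wording about lower-case policy, and the column aliases are made up for illustration.

```sql
-- Illustrative only: count candidate single-character category codes.
-- Assumption: candidates are ASCII codes 33..126 (printable, no space,
-- no control characters); 65..90 are 'A'..'Z', 97..122 are 'a'..'z'.
SELECT count(*) AS printable_chars,
       sum(CASE WHEN i BETWEEN 65 AND 90 THEN 0 ELSE 1 END)
           AS without_upper_case,
       sum(CASE WHEN i BETWEEN 65 AND 90
                  OR i BETWEEN 97 AND 122 THEN 0 ELSE 1 END)
           AS without_any_letters
FROM generate_series(33, 126) AS g(i);
```

On a PostgreSQL server this should return 94, 68, and 42, matching 94, 94 - 26, and 94 - 26 - 26. Comparing code points rather than the characters themselves keeps the result independent of the database collation.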