Re: Type Categories for User-Defined Types - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Type Categories for User-Defined Types
Date
Msg-id 15217.1217362329@sss.pgh.pa.us
In response to Type Categories for User-Defined Types  (David E. Wheeler <david@kineticode.com>)
Responses Re: Type Categories for User-Defined Types  ("David E. Wheeler" <david@kineticode.com>)
Re: Type Categories for User-Defined Types  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-hackers
"David E. Wheeler" <david@kineticode.com> writes:
> On Jul 29, 2008, at 11:41, Tom Lane wrote:
>> and I notice that cases like
>> contrib_regression=# select 'a'::text || 'b'::citext;
>> ERROR:  operator is not unique: text || citext
>> still don't work even though you put in an alias || operator.

> Damn, I didn't even notice that! Can that be fixed?

Given the present infrastructure I think the only way would be with
two more alias operators, text||citext and citext||text.  But that way
madness lies.

>> Obviously the solution should involve a new column in pg_type and
>> a new type property in CREATE TYPE, but what should the representation
>> be?  A full-on approach would make the type categories be real SQL
>> objects with their own system catalog and reference them by OID,
>> but I can't help thinking that that's overkill.

> It kinda sounds that way, yeah. What happens with DOMAINs, BTW? Do  
> they need to write hacky functions like the above, or are they aware  
> of their types because of the types from which they inherit?

Domains are treated as their base types in general.  Elein has been
complaining about that for years ;-) ... but I think improving it
is unrelated to this issue.

>> Anyway, debating that is probably material for a separate thread ...

> Here you go! ;-)

After a quick look to verify my recollection: the only two things
that the system does with type categories are

    extern CATEGORY TypeCategory(Oid type);

Returns the category a type belongs to.

    extern bool IsPreferredType(CATEGORY category, Oid type);

Detects whether a type is a preferred type in its category (there can
be more than one preferred type in a category, and in fact the
traditional setup is that *every* user-defined type is a preferred
type in the USER_TYPE category).

The categories themselves are pretty much opaque values, except that
parse_func.c has special behavior to prefer STRING_TYPE when in doubt.

So this can fairly obviously be replaced by two new pg_type columns,
say "typcategory" and "typpreferred", where the latter is a bool.
Since the list of categories is pretty short and there's no obvious
reason to extend it a lot, I propose that we just represent typcategory
as a "char", using a mapping along the lines of

    BITSTRING_TYPE      b
    BOOLEAN_TYPE        B
    DATETIME_TYPE       D
    GENERIC_TYPE        P  (think "pseudotype")
    GEOMETRIC_TYPE      G
    INVALID_TYPE        \0 (not allowed in catalog anyway)
    NETWORK_TYPE        n
    NUMERIC_TYPE        N
    STRING_TYPE         S
    TIMESPAN_TYPE       T
    UNKNOWN_TYPE        u
    USER_TYPE           U

Users would be allowed to select any single ASCII character as the
"category" of a user-defined type, should they have a need to make their
own new category.  Of course CREATE TYPE's default is category = U and
preferred = true for backward compatibility reasons.  We could put down
a rule that system-defined categories are always upper or lower case
letters (or even always upper, if we wanted to strain some of the
assignments a bit) so that it's clear what can be used for a
user-defined category.
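
A minimal C sketch of how the proposed "char" codes and the letters-are-reserved rule might look. The TYPCATEGORY_* names and both helper functions are hypothetical, invented here for illustration; only the code letters come from the mapping above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* One-character category codes, taken from the proposed mapping.
 * The TYPCATEGORY_* names are hypothetical. */
enum
{
    TYPCATEGORY_BITSTRING = 'b',
    TYPCATEGORY_BOOLEAN   = 'B',
    TYPCATEGORY_DATETIME  = 'D',
    TYPCATEGORY_GENERIC   = 'P',   /* think "pseudotype" */
    TYPCATEGORY_GEOMETRIC = 'G',
    TYPCATEGORY_INVALID   = '\0',  /* not allowed in catalog */
    TYPCATEGORY_NETWORK   = 'n',
    TYPCATEGORY_NUMERIC   = 'N',
    TYPCATEGORY_STRING    = 'S',
    TYPCATEGORY_TIMESPAN  = 'T',
    TYPCATEGORY_UNKNOWN   = 'u',
    TYPCATEGORY_USER      = 'U'
};

/* Hypothetical check: a category code must be a single nonzero
 * ASCII character. */
bool
category_code_is_valid(char c)
{
    unsigned char uc = (unsigned char) c;

    return uc != 0 && uc < 128;
}

/* Hypothetical check, assuming the rule that all upper and lower case
 * letters are reserved for system-defined categories: a user who wants
 * a brand-new category would then pick a non-letter code. */
bool
category_code_is_free_for_users(char c)
{
    assert(category_code_is_valid(c));
    return !isalpha((unsigned char) c);
}
```

Under these assumptions '#' would be available for a new user-defined category, while 'S' and 'U' would not, since both are letters already assigned above.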

It might possibly be worth making new categories for arrays, composites,
and enums; they're currently effectively USER_TYPE but that doesn't seem
quite right.  Also, the rules for domains should likely be "same
category as base type, never a preferred type" instead of the current
behavior where they're user types.  (I think the latter doesn't really
matter now, because we always smash a domain to its base type before
inquiring about categories anyway.  But it might give Elein a bit more
room to maneuver with the functions-on-domains issue.)
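
The suggested domain rule ("same category as base type, never a preferred type") could be sketched roughly as follows. TypeRow is a hypothetical, heavily simplified stand-in for a pg_type row, and the function names are made up for this example; none of this is existing backend code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a pg_type row. */
typedef struct TypeRow
{
    const char *name;
    char        typcategory;        /* proposed column */
    bool        typpreferred;       /* proposed column */
    const struct TypeRow *basetype; /* non-NULL only for a domain */
} TypeRow;

/* A domain reports the category of its base type; domains over
 * domains are followed down to the underlying base type. */
char
type_category(const TypeRow *t)
{
    assert(t != NULL);
    while (t->basetype != NULL)
        t = t->basetype;
    return t->typcategory;
}

/* A domain is never a preferred type, whatever its own row says. */
bool
type_is_preferred(const TypeRow *t)
{
    assert(t != NULL);
    if (t->basetype != NULL)
        return false;
    return t->typpreferred;
}
```

With this rule a domain over text would report category S but would not compete with text itself for preferred status.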

A possible objection is that this will make TypeCategory and
IsPreferredType slower than before, since they'll involve a syscache
lookup instead of a simple switch statement.  I don't think this will
be too bad though; all the paths they are used in are full of catalog
lookups anyway, so it's hard to credit that there would be much
percentage slowdown.
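
To make the cost argument concrete, here is a rough sketch of the two functions driven by catalog data instead of a switch statement. The small array and lookup_type() are hypothetical stand-ins for pg_type and a syscache lookup, and the rows are illustrative only; the real backend code would look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
typedef char CATEGORY;      /* category is now just a "char" code */

/* Hypothetical mock of the two proposed pg_type columns. */
typedef struct
{
    Oid         oid;
    CATEGORY    typcategory;
    bool        typpreferred;
} MockTypeRow;

/* Illustrative rows: bool, int4, text, float8. */
static const MockTypeRow mock_pg_type[] = {
    {16, 'B', true},
    {23, 'N', false},
    {25, 'S', true},
    {701, 'N', true},
};

/* Stand-in for a syscache lookup: linear scan of the mock table. */
static const MockTypeRow *
lookup_type(Oid type)
{
    size_t      n = sizeof(mock_pg_type) / sizeof(mock_pg_type[0]);

    for (size_t i = 0; i < n; i++)
        if (mock_pg_type[i].oid == type)
            return &mock_pg_type[i];
    return NULL;
}

CATEGORY
TypeCategory(Oid type)
{
    const MockTypeRow *row = lookup_type(type);

    /* This sketch simply asserts that the lookup succeeded. */
    assert(row != NULL);
    return row->typcategory;
}

bool
IsPreferredType(CATEGORY category, Oid type)
{
    const MockTypeRow *row = lookup_type(type);

    assert(row != NULL);
    return row->typcategory == category && row->typpreferred;
}
```

Each call costs one lookup where the switch-based version cost none, which is the overhead being weighed above.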

Thoughts?
        regards, tom lane

