Re: VARIANT / ANYTYPE datatype - Mailing list pgsql-hackers

From Joseph Adams
Subject Re: VARIANT / ANYTYPE datatype
Msg-id BANLkTikxw73=Cy63pQSFe58cF0Gttq=WUw@mail.gmail.com
In response to Re: VARIANT / ANYTYPE datatype  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: VARIANT / ANYTYPE datatype
List pgsql-hackers
On Wed, May 11, 2011 at 7:53 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> That's likely to be how it gets implemented, but you seem to have
> missed the point of some of the discussion upthread: the big problem
> with that is that someone might type "DROP TYPE foo", and when they
> do, you need an efficient way to figure out whether foo is in use
> inside an instance of the variant type anywhere in the system.  The
> devil is in the details...

Sorry, I missed that.  That in mind, I think I would lean more toward
the union proposal as well.  Can anyone think of a case where VARIANT
would be more useful?

As for using one or two bytes to store the type of a UNION, that
creates a problem when you want to extend the union in the future,
at least if a UNION is simply a collection of the possible types that
values of the UNION type can hold.

If UNION is implemented more like a tagged union:
   CREATE TYPE token AS TAGGED UNION (
      identifier TEXT, keyword TEXT, number INT);

Then the problem of altering it is much like the problem of altering an ENUM.
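
For comparison, a tagged union of this shape corresponds closely to a
Haskell algebraic data type.  The following is only a sketch: the
constructor names mirror the hypothetical "token" type above, and
"describe" is a made-up helper, not anything proposed in this thread.

```haskell
-- Hypothetical Haskell counterpart of the "token" tagged union above.
data Token
  = Identifier String
  | Keyword String
  | Number Int
  deriving (Show, Eq)

-- Every consumer dispatches on the tag, so adding a constructor later
-- means revisiting each such function, much like adding an ENUM label.
describe :: Token -> String
describe (Identifier s) = "identifier " ++ s
describe (Keyword s)    = "keyword " ++ s
describe (Number n)     = "number " ++ show n
```

Note that "identifier" and "keyword" both carry TEXT; the tag, not the
payload type, is what tells them apart.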

On Tue, May 10, 2011 at 5:19 PM, Darren Duncan <darren@darrenduncan.net> wrote:
> Examples of open union types could be number, which all the numeric types
> compose, and so you can now say that you can use the generic numeric
> operators on values you have simply if their types compose the number union
> type, and it still works if more numeric types appear later.  Likewise, the
> string open union could include both text and blob, as both support
> catenation and substring matches or extraction, for example.
>
> This would aid to operator overloading in a generic way, letting you use the
> same syntax for different types, but allowing types to mix is optional; eg,
> you could support "add(int,int)" and "add(real,real)" without supporting
> "add(int,real)" etc but the syntax "add(x,y)" is shared, and you do this
> while still having a strong type system; allowing the mixing is optional
> case-by-case.

Coming from a Haskell perspective, this is a great idea, but I don't
think the "union" feature should be used to implement it.  Closed
unions correspond to algebraic data types in Haskell, e.g.:
   data Ordering = LT | EQ | GT

while open unions are better-suited to type classes:
   (+) :: (Num a) => a -> a -> a

I, for one, would like to see PostgreSQL steal some features from
Haskell's type system.  PostgreSQL seems to implement a subset of
Haskell's system, without type classes and where functions can have
only one type variable (anyelement).

To express the (+) example in PostgreSQL, it would be tempting to simply say:
   add(real, real) returns real

However, what if each real is a different type (e.g. INT and FLOAT)?
Is that allowed?  In the Haskell example above, (+) constrains both
of its arguments to the same type.  In ad-hoc syntax, it would look
like this in PostgreSQL:
   real anyelement => add(anyelement, anyelement) returns anyelement
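
The same-type constraint can be seen in a small Haskell sketch.  The
names "add", "sumInts", "sumDoubles" and "mixed" are illustrative only;
the point is that a single type variable ties both arguments and the
result together.

```haskell
-- The single type variable 'a' forces both arguments and the result
-- to share one type, as in Haskell's own (+).
add :: Num a => a -> a -> a
add x y = x + y

sumInts :: Int
sumInts = add 2 3

sumDoubles :: Double
sumDoubles = add 1.5 2.25

-- Mixing types is rejected at compile time unless the caller converts:
--   add (2 :: Int) (1.5 :: Double)   -- type error
mixed :: Double
mixed = add (fromIntegral (2 :: Int)) 1.5
```

So the analogue of add(INT, FLOAT) is not allowed implicitly; the
caller has to convert one argument explicitly first.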

Another thing to consider: attempting to use a type class as a column
type, e.g.:
   CREATE TABLE foo (n real);

Normally in Haskell, type information is passed implicitly as
parameters (hence the term "parametric polymorphism"), rather than
carried alongside values (like in object-oriented languages).  In the
case above, the type information would have to be carried with each
value.  Haskell actually supports this, but under a somewhat-weird
extension called "Existential types" (see
http://www.haskell.org/haskellwiki/Existential_type#Examples for an
example).  It isn't terribly useful in Haskell, and I don't think it
will be in PostgreSQL either.
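
As a rough illustration of what "carrying the type information with
each value" means, here is a sketch using GHC's
ExistentialQuantification extension.  "AnyNum", "column", "double" and
"render" are hypothetical names chosen for this example.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- A value packaged together with its Num and Show instances, so the
-- type information travels with the value instead of being fixed by
-- the column's declared type.
data AnyNum = forall a. (Num a, Show a) => AnyNum a

-- A hypothetical "column" holding numbers of different types.
column :: [AnyNum]
column = [AnyNum (1 :: Int), AnyNum (2.5 :: Double)]

-- Operations that stay within one wrapped value are fine...
double :: AnyNum -> AnyNum
double (AnyNum x) = AnyNum (x + x)

render :: AnyNum -> String
render (AnyNum x) = show x

-- ...but there is no way to write
--   addAny (AnyNum x) (AnyNum y) = AnyNum (x + y)
-- because x and y may have different hidden types.
```

The last comment shows the limitation: once the concrete type is
hidden, two values from the column can no longer be combined with a
same-type operator like (+).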


Joey Adams

