Re: SQL99 ARRAY support proposal - Mailing list pgsql-hackers

From Tom Lane
Subject Re: SQL99 ARRAY support proposal
Date
Msg-id 20239.1047236328@sss.pgh.pa.us
Whole thread Raw
In response to Re: SQL99 ARRAY support proposal  (Joe Conway <mail@joeconway.com>)
Responses Re: SQL99 ARRAY support proposal  (Joe Conway <mail@joeconway.com>)
List pgsql-hackers
Joe Conway <mail@joeconway.com> writes:
> Tom Lane wrote:
>> The array type is determined how?  I'd like this syntax better if there
>> were a way to force the choice of array type...

> What about:
> select integer ARRAY[1,2,3];
>     result '{1,2,3}'::integer[]

By analogy to the "type 'literal'" syntax?  I'd prefer not to go that
route, because that syntax for literals is a horrid kluge --- to keep
bison from spitting up, we've had to put a bunch of nasty restrictions
on the type names that can appear in such constructs.  All those
restrictions would have to apply here, too.

It's possible that we could use the other cast syntaxes:ARRAY[1,2,3]::integer[]CAST(ARRAY[1,2,3] AS integer[])
It would take some hackery to propagate the destination type down into
the ARRAY[] before the latter's type resolution is done, but at least
it'd be a quite localized hack.

>> Puh-leez tell me that's not in the spec.  How is one supposed to
>> distinguish this usage from the scalar-subselect case?

> but SQL200x has this:

> <array value constructor> ::=
>    <array value constructor by enumeration> |
>    <array value constructor by query>
> <array value constructor by enumeration> ::=
>    ARRAY <left bracket or trigraph>
>          <array element list>
>          <right bracket or trigraph>
> <array value constructor by query> ::=
>    ARRAY <left paren>
>          <query expression> [ <order by clause> ]
>          <right paren>

This I could live with --- note the difference in punctuation.  There
would be a clear syntactic and semantic difference between
ARRAY(SELECT ...) and ARRAY[(SELECT ...)].

>> How many variants of the || operator do you plan to offer?

> One for each builtin datatype[]/datatype pair (e.g. integer[]/integer), 
> and another for each datatype[] (e.g. integer[]/integer[])

That does not seem workable at all.  It'll be a maintenance nightmare
even worse than what we already have for array equality; and I suspect
you'll find that the parser will have trouble resolving which || to pick.

I have been toying with the idea of replacing all the array-equality
operators with a single pg_proc/pg_operator entry for "ANYARRAY = ANYARRAY".
Now that arrays carry their element type in the header, a single
function could implement all these variants.  Type mismatch (eg,
int[] = float[]) would become a runtime error instead of a parse error,
but that seems acceptable.

I see how the array || array cases could be handled similarly, but I
don't see how to make array || scalar work that way.  If we declared an
operator ANYARRAY || ANY then there'd be no way for it to check at
runtime that the right-hand operand matched the lefthand element type.
(Also, it'd create resolution conflicts with ANYARRAY || ANYARRAY.)

It'd be nice to have a general-purpose solution for this problem.
If we did, the stuff in contrib/array would benefit too, and perhaps
have (at last) a shot at becoming mainstream.

Speculating wildly: what if we invent a new pseudotype ANYARRAYELEMENT,
which would be allowed only in operator/function declarations that also
use ANYARRAY, egANYARRAY || ANYARRAYELEMENT
and would match only the element type of the array in the ANYARRAY
position.  I think it'd be possible to hack the parser to make such a
check in parse_coerce.  There might be a cleaner way to do this, but
I do feel that something along this line is a better approach than
creating N different pg_operator entries for the same operation.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: regression failure in CVS HEAD
Next
From: Tom Lane
Date:
Subject: DECLARE CURSOR to become utility statement