Thread: Arrays versus 'type constant' syntax

Arrays versus 'type constant' syntax

From: Tom Lane
I spent some time today trying to persuade the grammar to accept
unadorned array subscripting, ie
    SELECT arraycolname[2] FROM table;
rather than what you have to do in 6.5:
    SELECT table.arraycolname[2] FROM table;

It's easy enough to add "opt_indirection" to the rules that use ColId,
but I find one ends up with a bunch of reduce/reduce conflicts.

The basic problem is that at the start of an expression, the input
    ident [
could be the beginning of a Typename with subscripts, or it could be
a column name with subscripts.  The way the grammar is constructed,
the parser has to reduce the ident to either ColId or a typename
nonterminal before it can shift the '[' ... and there's no way to
decide which.
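
As a rough illustration of the conflict, here is a toy Python model. It is
not the real yacc grammar: the tokenizer, the token kind names, and the
function `parser_view` are all made up for this sketch. It only shows that a
subscripted column reference and an array-typed constant look identical at
the point where the parser must pick a reduction for the leading ident.

```python
import re

# Toy tokenizer (hypothetical, not the PostgreSQL scanner): identifiers,
# integer constants, single-quoted strings, and square brackets.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<ident>[A-Za-z_]\w*)"
    r"|(?P<iconst>\d+)"
    r"|(?P<sconst>'[^']*')"
    r"|(?P<punct>[\[\]]))"
)

KIND_NAMES = {"ident": "ident", "iconst": "Iconst", "sconst": "Sconst"}


def tokenize(text):
    """Return the list of token kinds, e.g. ['ident', '[', 'Iconst', ']']."""
    text = text.rstrip()
    kinds = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        if m.lastgroup == "punct":
            kinds.append(m.group("punct"))
        else:
            kinds.append(KIND_NAMES[m.lastgroup])
        pos = m.end()
    return kinds


def parser_view(text, lookahead=1):
    """Tokens visible when the leading ident must be reduced:
    the ident itself plus `lookahead` tokens after it."""
    return tuple(tokenize(text)[: 1 + lookahead])


column_ref = "arraycolname[2]"      # ColId with a subscript
typed_const = "int4[3] '{1,2,3}'"   # Typename with a subscript, then Sconst

# One token of lookahead: both inputs look like ('ident', '[').
print(parser_view(column_ref, 1) == parser_view(typed_const, 1))  # True
# Only after the closing ']' do the two inputs differ.
print(parser_view(column_ref, 4) == parser_view(typed_const, 4))  # False
```

With a single token of lookahead both inputs appear as `ident [`, which is
why neither reduction can be preferred over the other.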

Now how did Typename get into the picture?  There is one rule that
is the culprit, namely "AexprConst ::= Typename Sconst".  Without
that rule, a type name never appears at the start of an expression
so there is no conflict.

I can see three ways to proceed:

1. Forget about making arrays easier to use.

2. Remove "AexprConst ::= Typename Sconst" from the grammar.  I do
not believe this rule is in SQL92.  However, we've recommended
constructions like "default text 'now'" often enough that we might
not be able to get away with that.

3. Simplify the AexprConst rule to only allow a subset of Typename
--- it looks like forbidding array types in this context is enough.
(You could still write a cast using :: or AS, of course, instead of
"int4[3] '{1,2,3}'".  The latter has never worked anyway.)

I'm leaning to choice #3, but I wonder if anyone has a better idea.
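
A minimal sketch of why choice #3 removes the conflict, again as a toy
Python model rather than the actual grammar change. The function name
`classify_expr_start` and the token kind strings are assumptions made for
illustration. Once a type name in front of a string constant can no longer
carry subscripts, the single token after the identifier is enough to decide.

```python
def classify_expr_start(kinds):
    """Classify the start of an expression from its token kinds, assuming
    array type names are not allowed in the 'Typename Sconst' form."""
    if not kinds or kinds[0] != "ident":
        raise ValueError("expected an identifier at the start")
    # One token of lookahead after the identifier.
    nxt = kinds[1] if len(kinds) > 1 else None
    if nxt == "Sconst":
        return "typed constant"        # e.g.  text 'now'
    if nxt == "[":
        return "subscripted column"    # e.g.  arraycolname[2]
    return "plain column"              # e.g.  arraycolname


print(classify_expr_start(["ident", "Sconst"]))
print(classify_expr_start(["ident", "[", "Iconst", "]"]))
print(classify_expr_start(["ident"]))
```

Under this assumption an input such as `int4[3] '{1,2,3}'` would be read as
a subscripted column followed by a stray string constant, so a real parser
would reject it; the cast forms using :: or AS remain available for that
case.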
        regards, tom lane


Re: [HACKERS] Arrays versus 'type constant' syntax

From: Bruce Momjian
> I spent some time today trying to persuade the grammar to accept
> unadorned array subscripting, ie
>     SELECT arraycolname[2] FROM table;
> rather than what you have to do in 6.5:
>     SELECT table.arraycolname[2] FROM table;
> 
> It's easy enough to add "opt_indirection" to the rules that use ColId,
> but I find one ends up with a bunch of reduce/reduce conflicts.

You know, that has been on the TODO list for a long time, so I should
have guessed it was some tricky problem.

> The basic problem is that at the start of an expression, the input
>     ident [
> could be the beginning of a Typename with subscripts, or it could be
> a column name with subscripts.  The way the grammar is constructed,
> the parser has to reduce the ident to either ColId or a typename
> nonterminal before it can shift the '[' ... and there's no way to
> decide which.

This reminds me of C grammar, where the scanner has to be able to ask
the grammar if a token is a type or not, because typedef can create its
own types.  This is why C grammar/scanning is not totally simple.  We
have avoided that complexity so far.

> Now how did Typename get into the picture?  There is one rule that
> is the culprit, namely "AexprConst ::= Typename Sconst".  Without
> that rule, a type name never appears at the start of an expression
> so there is no conflict.

That is quite interesting.

> I can see three ways to proceed:
> 
> 1. Forget about making arrays easier to use.
> 
> 2. Remove "AexprConst ::= Typename Sconst" from the grammar.  I do
> not believe this rule is in SQL92.  However, we've recommended
> constructions like "default text 'now'" often enough that we might
> not be able to get away with that.
> 
> 3. Simplify the AexprConst rule to only allow a subset of Typename
> --- it looks like forbidding array types in this context is enough.
> (You could still write a cast using :: or AS, of course, instead of
> "int4[3] '{1,2,3}'".  The latter has never worked anyway.)
> 
> I'm leaning to choice #3, but I wonder if anyone has a better idea.

Yes, if it is easy, #3 sounds good.  This is a very rarely used area of
the grammar, so any restriction on arrays and casting will probably
never be hit by a user, though there are so many users, I am sure
someone will find it soon enough.


-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: Arrays versus 'type constant' syntax

From: Thomas Lockhart
> I can see three ways to proceed:
> 1. Forget about making arrays easier to use.
> 2. Remove "AexprConst ::= Typename Sconst" from the grammar.  I do
> not believe this rule is in SQL92.  However, we've recommended
> constructions like "default text 'now'" often enough that we might
> not be able to get away with that.

Sorry, this *is* SQL92 syntax. The older Postgres syntax using
"::typename" is also supported, but is not standard anything, so I've
been trying to move examples, etc. to the standard syntax when I can.

> 3. Simplify the AexprConst rule to only allow a subset of Typename
> --- it looks like forbidding array types in this context is enough.
> (You could still write a cast using :: or AS, of course, instead of
> "int4[3] '{1,2,3}'".  The latter has never worked anyway.)
> I'm leaning to choice #3, but I wonder if anyone has a better idea.

I don't have a strong opinion about what #3 would introduce as far as
future constraints.
                      - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: Arrays versus 'type constant' syntax

From: Tom Lane
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> 2. Remove "AexprConst ::= Typename Sconst" from the grammar.  I do
>> not believe this rule is in SQL92.

> Sorry, this *is* SQL92 syntax.

I've just grepped the SQL92 spec in some detail, and I see noplace
that allows "typename stringconstant".  "::" is indeed not standard,
but the only type conversion syntax I see in the spec is
        CAST (value AS type)

If I'm missing something, please cite chapter and verse.

>> 3. Simplify the AexprConst rule to only allow a subset of Typename
>> --- it looks like forbidding array types in this context is enough.
>> (You could still write a cast using :: or AS, of course, instead of
>> "int4[3] '{1,2,3}'".  The latter has never worked anyway.)
>> I'm leaning to choice #3, but I wonder if anyone has a better idea.

> I don't have a strong opinion about what #3 would introduce as far as
> future constraints.

If "typename stringconstant" actually is standard then we have a
problem, because I would not like to forbid array types in a standard
construct.  But the grammar is not LALR(1) in the presence of array
types, so we may not have much choice...
        regards, tom lane


Re: Arrays versus 'type constant' syntax

From: Thomas Lockhart
> >> 2. Remove "AexprConst ::= Typename Sconst" from the grammar.  I do
> >> not believe this rule is in SQL92.
> > Sorry, this *is* SQL92 syntax.
> I've just grepped the SQL92 spec in some detail, and I see noplace
> that allows "typename stringconstant".  "::" is indeed not standard,
> but the only type conversion syntax I see in the spec is
>         CAST (value AS type)
> If I'm missing something, please cite chapter and verse.

Well, ahem, er...

It isn't an explicit general construct in SQL92, since there are only
a few data types defined in the language, and since type extensibility
is not supported.

However, the language does define syntax for specifying date/time
literals (the only string-like literal which is not a string type) and
that would seem to suggest the general solution. 

Allowed in SQL92 (according to my 2 reference books, and I may have
missed more info):

'Bastille Day' -- string literal
DATE '7/14/1999' -- date literal
TIMESTAMP '7/14/1999 09:47' -- date/time literal
TIME '09:47' -- time literal

SQL3 should have more to say on the subject, and does, but I've got
old versions of draft docs and have (so far) only found brief mention
of ADTs etc. Perhaps they intend the CAST construct to cover this, but
istm that it isn't a natural extension of the older forms mentioned
above.
                      - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Re: Arrays versus 'type constant' syntax

From: Thomas Lockhart
btw, in a different context the "type string" form is allowed since
 _charset 'literal'

specifies the character set for a literal string; the leading
underscore is required by SQL92 in this context so isn't exactly
equivalent to the general case Postgres currently allows.
                   - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Re: Arrays versus 'type constant' syntax

From: Tom Lane
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> Well, ahem, er...
> It isn't an explicit general construct in SQL92, since there are only
> a few data types defined in the language, and since type extensibility
> is not supported.
> However, the language does define syntax for specifying date/time
> literals (the only string-like literal which is not a string type) and
> that would seem to suggest the general solution. 

Hmm.  OK, then, we're stuck with a tradeoff that (fortunately) only
affects arrays.  Is it better to force subscripted column names to be
fully qualified "table.column[subscripts]" (the current situation),
or to allow bare column names to be subscripted at the cost of requiring
casts from string constants to array types to use the long-winded CAST
notation (or nonstandard :: notation)?

I would guess that the cast issue comes up *far* less frequently than
subscripting, so we'd be better off changing the behavior.  But the
floor is open for discussion.

I have this change implemented and tested here, btw, but I won't check
it in until I see if there are objections...
        regards, tom lane