Thread: Arrays versus 'type constant' syntax
I spent some time today trying to persuade the grammar to accept unadorned array subscripting, ie
    SELECT arraycolname[2] FROM table;
rather than what you have to do in 6.5:
    SELECT table.arraycolname[2] FROM table;

It's easy enough to add "opt_indirection" to the rules that use ColId, but I find one ends up with a bunch of reduce/reduce conflicts.

The basic problem is that at the start of an expression, the input
    ident [
could be the beginning of a Typename with subscripts, or it could be a column name with subscripts. The way the grammar is constructed, the parser has to reduce the ident to either ColId or a typename nonterminal before it can shift the '[' ... and there's no way to decide which.

Now how did Typename get into the picture? There is one rule that is the culprit, namely "AexprConst ::= Typename Sconst". Without that rule, a type name never appears at the start of an expression so there is no conflict.

I can see three ways to proceed:

1. Forget about making arrays easier to use.

2. Remove "AexprConst ::= Typename Sconst" from the grammar. I do not believe this rule is in SQL92. However, we've recommended constructions like "default text 'now'" often enough that we might not be able to get away with that.

3. Simplify the AexprConst rule to only allow a subset of Typename --- it looks like forbidding array types in this context is enough. (You could still write a cast using :: or AS, of course, instead of "int4[3] '{1,2,3}'". The latter has never worked anyway.)

I'm leaning to choice #3, but I wonder if anyone has a better idea.

regards, tom lane
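To make the lookahead problem described above concrete, here is a toy sketch in Python. It is not PostgreSQL's grammar or parser: the token kinds ("ident", "[", "]", "num", "sconst") and the function name classify are invented for illustration. It only reports how many tokens have to be examined before the two readings of a leading identifier can be told apart, whereas an LALR(1) parser gets the identifier plus a single token of lookahead.

```python
# Toy model only; the token kinds and names are hypothetical and
# do not come from PostgreSQL's scanner or grammar.

def classify(tokens):
    """Classify an expression that begins with an identifier.

    tokens is a list of (kind, text) pairs.
    Returns (reading, tokens_examined).
    """
    assert tokens and tokens[0][0] == "ident"
    i = 1
    # Skip any bracketed groups: array bounds or subscripts, "[ ... ]".
    while i < len(tokens) and tokens[i][0] == "[":
        while i < len(tokens) and tokens[i][0] != "]":
            i += 1
        i += 1  # step past the closing "]"
    # Only the token after the brackets separates the two readings.
    if i < len(tokens) and tokens[i][0] == "sconst":
        return "typed constant", i + 1
    return "column reference", min(i + 1, len(tokens))


# text 'now'
plain_typed = [("ident", "text"), ("sconst", "now")]
# int4[3] '{1,2,3}'
array_typed = [("ident", "int4"), ("[", "["), ("num", "3"), ("]", "]"),
               ("sconst", "{1,2,3}")]
# arraycolname[2]
array_column = [("ident", "arraycolname"), ("[", "["), ("num", "2"),
                ("]", "]")]

for example in (plain_typed, array_typed, array_column):
    print(example[0][1], classify(example))
```

In this toy, the bracket-free case is settled after two tokens (the identifier and one token of lookahead), which is consistent with "text 'now'" causing no conflict. Once brackets follow the identifier, the deciding token sits past the closing bracket, beyond what one token of lookahead can see.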
> I spent some time today trying to persuade the grammar to accept unadorned array subscripting, ie
>     SELECT arraycolname[2] FROM table;
> rather than what you have to do in 6.5:
>     SELECT table.arraycolname[2] FROM table;
>
> It's easy enough to add "opt_indirection" to the rules that use ColId, but I find one ends up with a bunch of reduce/reduce conflicts.

You know, that has been on the TODO list for a long time, so I should have guessed it was some tricky problem.

> The basic problem is that at the start of an expression, the input
>     ident [
> could be the beginning of a Typename with subscripts, or it could be a column name with subscripts. The way the grammar is constructed, the parser has to reduce the ident to either ColId or a typename nonterminal before it can shift the '[' ... and there's no way to decide which.

This reminds me of C grammar, where the scanner has to be able to ask the grammar if a token is a type or not, because typedef can create its own types. This is why C grammar/scanning is not totally simple. We have avoided that complexity so far.

> Now how did Typename get into the picture? There is one rule that is the culprit, namely "AexprConst ::= Typename Sconst". Without that rule, a type name never appears at the start of an expression so there is no conflict.

That is quite interesting.

> I can see three ways to proceed:
>
> 1. Forget about making arrays easier to use.
>
> 2. Remove "AexprConst ::= Typename Sconst" from the grammar. I do not believe this rule is in SQL92. However, we've recommended constructions like "default text 'now'" often enough that we might not be able to get away with that.
>
> 3. Simplify the AexprConst rule to only allow a subset of Typename --- it looks like forbidding array types in this context is enough. (You could still write a cast using :: or AS, of course, instead of "int4[3] '{1,2,3}'". The latter has never worked anyway.)
>
> I'm leaning to choice #3, but I wonder if anyone has a better idea.

Yes, if it is easy, #3 sounds good. This is a very rarely used area of the grammar, so any restriction on Arrays and Casting will probably never be hit by a user, though there are so many users, I am sure someone will find it soon enough.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> I can see three ways to proceed:
> 1. Forget about making arrays easier to use.
> 2. Remove "AexprConst ::= Typename Sconst" from the grammar. I do not believe this rule is in SQL92. However, we've recommended constructions like "default text 'now'" often enough that we might not be able to get away with that.

Sorry, this *is* SQL92 syntax. The older Postgres syntax using "::typename" is also supported, but is not standard anything, so I've been trying to move examples, etc. to the standard syntax when I can.

> 3. Simplify the AexprConst rule to only allow a subset of Typename --- it looks like forbidding array types in this context is enough. (You could still write a cast using :: or AS, of course, instead of "int4[3] '{1,2,3}'". The latter has never worked anyway.)
> I'm leaning to choice #3, but I wonder if anyone has a better idea.

I don't have a strong opinion about what #3 would introduce as far as future constraints.

- Thomas

--
Thomas Lockhart    lockhart@alumni.caltech.edu
South Pasadena, California
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> 2. Remove "AexprConst ::= Typename Sconst" from the grammar. I do not believe this rule is in SQL92.

> Sorry, this *is* SQL92 syntax.

I've just grepped the SQL92 spec in some detail, and I see noplace that allows "typename stringconstant". "::" is indeed not standard, but the only type conversion syntax I see in the spec is
    CAST (value AS type)
If I'm missing something, please cite chapter and verse.

>> 3. Simplify the AexprConst rule to only allow a subset of Typename --- it looks like forbidding array types in this context is enough. (You could still write a cast using :: or AS, of course, instead of "int4[3] '{1,2,3}'". The latter has never worked anyway.)
>> I'm leaning to choice #3, but I wonder if anyone has a better idea.

> I don't have a strong opinion about what #3 would introduce as far as future constraints.

If "typename stringconstant" actually is standard then we have a problem, because I would not like to forbid array types in a standard construct. But the grammar is not LALR(1) in the presence of array types, so we may not have much choice...

regards, tom lane
> >> 2. Remove "AexprConst ::= Typename Sconst" from the grammar. I do not believe this rule is in SQL92.
> > Sorry, this *is* SQL92 syntax.
> I've just grepped the SQL92 spec in some detail, and I see noplace that allows "typename stringconstant". "::" is indeed not standard, but the only type conversion syntax I see in the spec is
>     CAST (value AS type)
> If I'm missing something, please cite chapter and verse.

Well, ahem, er...

It isn't an explicit general construct in SQL92, since there are only a few data types defined in the language, and since type extensibility is not supported.

However, the language does define syntax for specifying date/time literals (the only string-like literal which is not a string type) and that would seem to suggest the general solution. Allowed in SQL92 (according to my 2 reference books, and I may have missed more info):

    'Bastille Day'                 -- string literal
    DATE '7/14/1999'               -- date literal
    TIMESTAMP '7/14/1999 09:47'    -- date/time literal
    TIME '09:47'                   -- time literal

SQL3 should have more to say on the subject, and does, but I've got old versions of draft docs and have (so far) only found brief mention of ADTs etc. Perhaps they intend the CAST construct to cover this, but istm that it isn't a natural extension of the older forms mentioned above.

- Thomas

--
Thomas Lockhart    lockhart@alumni.caltech.edu
South Pasadena, California
btw, in a different context the "type string" form is allowed, since
    _charset 'literal'
specifies the character set for a literal string; the leading underscore is required by SQL92 in this context, so it isn't exactly equivalent to the general case Postgres currently allows.

- Thomas

--
Thomas Lockhart    lockhart@alumni.caltech.edu
South Pasadena, California
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> Well, ahem, er...
> It isn't an explicit general construct in SQL92, since there are only a few data types defined in the language, and since type extensibility is not supported.
> However, the language does define syntax for specifying date/time literals (the only string-like literal which is not a string type) and that would seem to suggest the general solution.

Hmm. OK, then, we're stuck with a tradeoff that (fortunately) only affects arrays. Is it better to force subscripted column names to be fully qualified "table.column[subscripts]" (the current situation), or to allow bare column names to be subscripted at the cost of requiring casts from string constants to array types to use the long-winded CAST notation (or nonstandard :: notation)?

I would guess that the cast issue comes up *far* less frequently than subscripting, so we'd be better off changing the behavior. But the floor is open for discussion.

I have this change implemented and tested here, btw, but I won't check it in until I see if there are objections...

regards, tom lane