Keyword classifications - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Keyword classifications |
Date | |
Msg-id | 20197.1451592838@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: Keyword classifications
Re: Keyword classifications |
List | pgsql-hackers |
I did a bit of initial poking at the problem complained of in bug #13840, namely that if you use "none" as the value of a reloption, you get a syntax error when trying to reload a dump containing the table or index declaration. The submitter blames this on pg_dump but it is surely not pg_dump's fault; it's just printing what pg_get_indexdef() gave it. And what pg_get_indexdef() prints is just verbatim what is in each array element of pg_class.reloptions. Now, one line of thought here is that flatten_reloptions() is out of its mind to not be worrying about quoting the reloption values. And perhaps it is, but I think if we go that direction, we may be fighting similar fires for awhile to come. psql's describe.c, for example, doesn't worry about quoting anything when printing reloptions, and there's likely similar code in third-party clients. Also, a solution like this would do nothing for existing dump files. The other line of thought is that we're already making an effort to allow any keyword to appear as the value of a def_arg, and maybe we should try to make that work 100% instead of only 90%. The existing code in gram.y is: def_arg: func_type { $$ = (Node *)$1; } | reserved_keyword { $$ = (Node *)makeString(pstrdup($1));} | qual_all_Op { $$ = (Node *)$1; } | NumericOnly { $$ = (Node *)$1; } | Sconst { $$ = (Node *)makeString($1); } ; so we already allow any keyword that can be a type name, as well as all fully reserved keywords. The only thing that's missing is col_name_keyword (which, not coincidentally, includes NONE). Now, if you just try to add a production for col_name_keyword like the one for reserved_keyword, you get a bunch of reduce-reduce errors because col_name_keyword includes several words that can also be type names, and Bison can't decide whether those should go through the func_type or col_name_keyword paths. But some experimentation suggests that we could fix that by subdividing col_name_keyword into two categories, one being the keywords that can also be type names (BIGINT, BIT, etc) and one being those that can't (BETWEEN, COALESCE, etc). Everywhere col_name_keyword is used now, just mention both sub-categories. And in def_arg, add a production for only the second sub-category. I think such a categorization would actually be cleaner than what we have now; if you read the comments for col_name_keyword, they're pretty squishy about whether these keywords can be type or function names or not, and this subdivision would make that a little clearer. Interestingly, it also seems that the grammar tables become slightly smaller, suggesting that Bison also finds this more regular. However, we have a problem, which is that in the distant past somebody had the bright idea of exposing the keyword categories to userspace via pg_get_keywords(). That means we have to either add a new category code to pg_get_keywords' output, or have it print the same code for both of these new categories, neither of which seems exactly ideal. Another question is whether we'd risk back-patching such a change. It'd be relatively self-contained in terms of our own code, but I'm wondering whether there's any third-party code with dependencies on what keyword categories exist. Thoughts? regards, tom lane
pgsql-hackers by date: