Thread: Re: [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords
Re: [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords
From
Gregory Stark
Date:
"Tom Lane" <tgl@postgresql.org> writes: > For the moment, lie about WITH's status in the table so it will still get > quoted --- this is because of the expectation that WITH will become reserved > when the SQL recursive-queries patch gets done. Out of curiosity I'm checking what consequences some other future grammer changes might have for us. Today I checked out full spec compliant GROUP BY syntax including ROLLUP, CUBE, and GROUPING SETS. There are two conclusions of note: 1) ROLLUP and CUBE would have to be col_name_keyword keywords. That could be an annoyance for the cube contrib package because it defines a few constructors like cube(float8[]). Youcould still have a type named "cube" but the functions would have to be renamed. Personally I always found "cube" astrange name anyways, I think of this data type more as a n-space vector than a cube anyways. quote_identifier would start quoting "cube" and "rollup" everywhere. My first inclination was that it's probably not necessaryto start preemptively quoting them this release because people are more likely to use them as column names thanfunction names anyways. But perhaps that's not true given the contrib module. 2) Assuming we keep our extension of allowing arbitrary expressions in GROUP BY lists then there is a conflict between ourundecorated row constructor '(' expr_list ')' and the spec which allows parenthesized sublists in the grouping list. I'm not sure this is a real problem though. As near as I can tell the semantics of grouping by a ROW(a,b) and groupingby columns (a,b) as a grouping set element are basically the same anyways. So I think we can just accept any arbitraryexpression including row constructors as what the spec calls an "ordinary grouping set". For what it's worth here's the grammar I get by basically copying the grammar straight out of the spec and then cleaning up the conflicts including making ordinary_grouping_set a straight expr_list as described above: opt_group_set_clause: DISTINCT | ALL | /*EMPTY*/ ; group_clause: GROUP_P BY opt_group_set_clause grouping_element_list | /*EMPTY*/ ; grouping_element_list: grouping_element | grouping_element_list ',' grouping_element ; grouping_element: a_expr | rollup_list | cube_list | grouping_sets_specification | empty_grouping_set ; rollup_list: ROLLUP '(' expr_list ')' ; cube_list: CUBE '(' expr_list ')' ; grouping_sets_specification: GROUPING SETS '(' grouping_set_list ')' ; grouping_set_list: grouping_set | grouping_set_list ',' grouping_set ; grouping_set: a_expr | rollup_list | cube_list | grouping_sets_specification | empty_grouping_set ; empty_grouping_set: '(' ')' ; -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords
From
Gregory Stark
Date:
"Gregory Stark" <stark@enterprisedb.com> writes: > "Tom Lane" <tgl@postgresql.org> writes: > >> For the moment, lie about WITH's status in the table so it will still get >> quoted --- this is because of the expectation that WITH will become reserved >> when the SQL recursive-queries patch gets done. > > Out of curiosity I'm checking what consequences some other future grammer > changes might have for us. And checking OLAP window functions it at first glance at least it seems we would have to make ROWS, WINDOW, RANGE, and PARTITION reserved keywords. Again, I just sucked in the spec's grammar and fixed it up to fit in our grammar. I did make PARTITION BY and ORDER BY take arbitrary expressions as is our general approach. It's possible there may be clever ways around these conflicts, in particular it seems like it might be possible to make PARTITION not a keyword given the nearby BY. However it appears in parentheses which looks like it makes things a bit tricky. Here's the relevant grammar snippet, and I attached the diff for gram.y to parse both window clause and rollup/cube/grouping sets. window_clause: WINDOW window_definition_list | /* EMPTY */ ; window_definition_list: window_definition | window_definition ',' window_definition ; window_definition: ColId AS window_specification ; window_specification: '(' window_specification_details ')' ; window_specification_details: opt_window_name window_partition_clause window_order_clause window_frame_clause ; opt_window_name: ColId | /*EMPTY*/ ; ; window_partition_clause: PARTITION BY expr_list | /* EMPTY */ ; window_order_clause: ORDER BY sortby_list | /* EMPTY */ ; window_frame_clause: window_frame_units window_frame_extent opt_window_frame_exclusion | /* EMPTY */ ; window_frame_units: ROWS | RANGE ; window_frame_extent: window_frame_start | window_frame_between ; window_frame_start: UNBOUNDED PRECEDING | window_frame_preceding | CURRENT_P ROW ; window_frame_preceding: Iconst PRECEDING ; window_frame_between: BETWEEN window_frame_bound AND window_frame_bound ; window_frame_bound: window_frame_start | UNBOUNDED FOLLOWING | window_frame_following ; window_frame_following: Iconst FOLLOWING ; opt_window_frame_exclusion: /*EMPTY*/ | EXCLUDE CURRENT_P ROW | EXCLUDE GROUP_P | EXCLUDE TIES | EXCLUDE NO OTHERS ; -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Attachment
Re: [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords
From
Gregory Stark
Date:
"Gregory Stark" <stark@enterprisedb.com> writes: > "Tom Lane" <tgl@postgresql.org> writes: > >> For the moment, lie about WITH's status in the table so it will still get >> quoted --- this is because of the expectation that WITH will become reserved >> when the SQL recursive-queries patch gets done. > > Out of curiosity I'm checking what consequences some other future grammer > changes might have for us. Oh, and I forgot OVER (assuming we want to actually use the windows, not just define them). It looks like OVER and also PARTITION can get by as type_func_name_keyword. So that gives us: col_name_keyword: CUBE ROLLUP type_func_name_keyword: OVER PARTITION reserved_keyword: ROWS WINDOW RANGE unreserved_keyword: GROUPING SETS PRECEDING FOLLOWING OTHERS UNBOUNDED TIES EXCLUDE I'm thinking it may make sense to lie about all of these in quote_identifier so that someone who upgrades from 8.2 to 8.3 can then upgrade to 8.4. If we add reserved words in one step then there's no way to upgrade except to rename things before dumping. Any comments? Does anyone else have any pet features from the spec which introduce new syntax they would like to see in Postgres soon? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords
From
Andrew Dunstan
Date:
Gregory Stark wrote: > I'm thinking it may make sense to lie about all of these in quote_identifier > so that someone who upgrades from 8.2 to 8.3 can then upgrade to 8.4. If we > add reserved words in one step then there's no way to upgrade except to rename > things before dumping. Any comments? > > > I'm confused. Won't pg_dump quote the keywords it knows its postgres version will need quoted? We can't expect it to have knowledge of future requirements for quoting, unless someone really does invent time travel (in which case someone could just go to the future and bring back version 107.6 and save us all the trouble). cheers andrew
Re: [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords
From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes: > I'm confused. Won't pg_dump quote the keywords it knows its postgres > version will need quoted? We can't expect it to have knowledge of future > requirements for quoting, unless someone really does invent time travel > (in which case someone could just go to the future and bring back > version 107.6 and save us all the trouble). Yeah. I'm disinclined to pre-emptively quote things for pie-in-the-sky patches. WITH is already a grammar keyword, so it's not a big deal to tweak things to quote it, but adding a dozen keywords that have zero functionality in the grammar is another thing entirely. Also, the fact that this particular form of the grammar requires reserving the keywords does not prove that there is no way to have the features without that. I'd want to see us try a little harder first. regards, tom lane
Re: [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords
From
Gregory Stark
Date:
"Andrew Dunstan" <andrew@dunslane.net> writes: > Gregory Stark wrote: >> I'm thinking it may make sense to lie about all of these in quote_identifier >> so that someone who upgrades from 8.2 to 8.3 can then upgrade to 8.4. If we >> add reserved words in one step then there's no way to upgrade except to rename >> things before dumping. Any comments? > > I'm confused. Won't pg_dump quote the keywords it knows its postgres version > will need quoted? We can't expect it to have knowledge of future requirements > for quoting Well as far as standard syntax from the spec we can. But you're right, as Heikki pointed out offline, the approved upgrade path is to use the pg_dump from the target version to dump the old database. It may still be worth telling one version of Postgres about anticipated keywords prior to a release which adds them so that people who don't follow instructions and try a dump generated with an old pg_dump have a fighting chance. Besides, what's a person to do if they have a dump lying around? Restore it in an old database so they can re-dump it with the new pg_dump? That said, this would only bite someone who is using columns named "cube" or "rows" *and* uses the old pg_dump or tries to restore a dump they already have without re-dumping with the new pg_dump. It may just not be worth worrying about. > unless someone really does invent time travel (in which case someone could > just go to the future and bring back version 107.6 and save us all the > trouble). That would revolutionize the IP law field.... -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords
From
Andrew Dunstan
Date:
Gregory Stark wrote: > It may still be worth telling one version of Postgres about anticipated > keywords prior to a release which adds them so that people who don't follow > instructions and try a dump generated with an old pg_dump have a fighting > chance. Besides, what's a person to do if they have a dump lying around? > Restore it in an old database so they can re-dump it with the new pg_dump? > > That is in fact the only thing we fairly much guarantee to work. cheers andrew
Re: [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords
From
Gregory Stark
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Yeah. I'm disinclined to pre-emptively quote things for pie-in-the-sky > patches. WITH is already a grammar keyword, so it's not a big deal to > tweak things to quote it, but adding a dozen keywords that have zero > functionality in the grammar is another thing entirely. Sure. > Also, the fact that this particular form of the grammar requires > reserving the keywords does not prove that there is no way to have the > features without that. I'd want to see us try a little harder first. At least some of them are unavoidable conflicts: select a,b,count(*) from tab GROUP BY cube(a,b) select a,b,count(*) from tab GROUP BY rollup(a,b) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords
From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes: > "Tom Lane" <tgl@sss.pgh.pa.us> writes: >> Also, the fact that this particular form of the grammar requires >> reserving the keywords does not prove that there is no way to have the >> features without that. I'd want to see us try a little harder first. > At least some of them are unavoidable conflicts: > select a,b,count(*) from tab GROUP BY cube(a,b) > select a,b,count(*) from tab GROUP BY rollup(a,b) In principle this case could be handled by having parse analysis treat a FuncCall node at the top level of GROUP BY specially when the function name is CUBE etc. I'm not saying that might not be uglier than having the grammar recognize it, just pointing out that there's usually more than one way to skin a cat. regards, tom lane