Thread: generate documentation keywords table automatically
The SQL keywords table in the documentation had until now been generated by me every year by some ad hoc scripting outside the source tree once for each major release. This patch changes it to an automated process.

We have the PostgreSQL keywords available in a parseable format in parser/kwlist.h[*]. For the relevant SQL standard versions, keep the keyword lists in new text files. A new script generate-keywords-table.pl pulls it all together and produces a DocBook table.

The final output in the documentation should be identical after this change. (Updates for SQL:2016 to come.)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
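To make the data flow described above concrete, here is a minimal sketch in Python (not the Perl script from the patch). It assumes that entries in parser/kwlist.h look like `PG_KEYWORD("abort", ABORT_P, UNRESERVED_KEYWORD)`; the function names, the row layout, and the sample inputs are hypothetical and only illustrate how the PostgreSQL list and the per-standard lists could be merged into table rows.

```python
import re

# Assumed entry shape: PG_KEYWORD("abort", ABORT_P, UNRESERVED_KEYWORD)
# Any further macro arguments are ignored by this pattern.
PG_KEYWORD_RE = re.compile(r'PG_KEYWORD\(\s*"(\w+)"\s*,\s*\w+\s*,\s*(\w+)')


def parse_kwlist(text):
    """Return {KEYWORD: category} for every PG_KEYWORD entry found in text."""
    return {m.group(1).upper(): m.group(2) for m in PG_KEYWORD_RE.finditer(text)}


def table_rows(pg_keywords, standards):
    """Yield (keyword, pg_category, [status per standard]) sorted by keyword.

    standards maps a column label to {KEYWORD: status}; a keyword that is
    missing from a source gets an empty string in that column.
    """
    all_words = set(pg_keywords)
    for statuses in standards.values():
        all_words.update(statuses)
    for word in sorted(all_words):
        yield (
            word,
            pg_keywords.get(word, ""),
            [standards[label].get(word, "") for label in standards],
        )


# Illustrative inputs only, not the real lists.
sample_header = '''
PG_KEYWORD("abort", ABORT_P, UNRESERVED_KEYWORD)
PG_KEYWORD("select", SELECT, RESERVED_KEYWORD)
'''
pg = parse_kwlist(sample_header)
rows = list(table_rows(pg, {"SQL-92": {"SELECT": "reserved"}}))
for row in rows:
    print(row)
```

A real generator would additionally render each row as DocBook markup; that step is omitted here.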
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> The SQL keywords table in the documentation had until now been generated by me every year by some ad hoc scripting outside the source tree once for each major release. This patch changes it to an automated process.

Didn't test this, but +1 for the concept.

Would it make more sense to have just one source file per SQL standard version, and distinguish the keyword types by labels within the file? The extreme version of that would be to format the standards-info files just like parser/kwlist.h, which perhaps would even save a bit of parsing code in the Perl script. I don't insist you have to go that far, but lists of keywords-and-categories seem to make sense.

The thing in the back of my mind here is that at some point the SQL standard might have more than two keyword categories. What you've got here would take some effort to handle that, whereas it'd be an entirely trivial data change in the scheme I'm thinking of.

A policy issue, independent of this mechanism, is how many different SQL spec versions we want to show in the table. HEAD currently shows just three (2011, 2008, SQL92), and it doesn't look to me like the table can accommodate more than one or at most two more columns without getting too wide for most output formats. We could buy back some space by making the "cannot be X" annotations for PG keywords more compact, but I fear that'd still not be enough for the seven spec versions you propose to show in this patch. (And, presumably, the committee's not done.) Can we pick a different table layout?

regards, tom lane
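The one-file-per-version idea can be sketched as follows. The file format here (one `KEYWORD category` pair per line, `#` comments) is an assumption made up for illustration, as are the function name, the sample keywords, and the category labels; nothing in the thread fixes a concrete format. The point is that the parser never enumerates the categories, so a third category would only be a data change.

```python
def parse_labeled_keywords(text):
    """Parse lines of the form 'KEYWORD category' into {KEYWORD: category}.

    Blank lines and '#' comments are skipped. The set of categories is not
    hard-coded, so new categories need no code change.
    """
    keywords = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected 'KEYWORD category'")
        word, category = parts
        keywords[word.upper()] = category
    return keywords


# Made-up sample data; the keywords and the third category are placeholders.
sample = """
# hypothetical per-standard keyword file
EXAMPLE_A reserved
EXAMPLE_B non-reserved
EXAMPLE_C some-future-category
"""
labeled = parse_labeled_keywords(sample)
print(labeled)
```

The trade-off raised later in the thread still applies: a labeled file is easier for tooling but harder to compare by eye against the lists printed in the standard.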
On 2019-04-27 17:25, Tom Lane wrote:
> Would it make more sense to have just one source file per SQL standard version, and distinguish the keyword types by labels within the file?

The way I have written it, the lists can be compared directly with the relevant standards by a human. Otherwise we'd need another level of tooling to compose and verify those lists.

> A policy issue, independent of this mechanism, is how many different SQL spec versions we want to show in the table.

We had previously established that we want to show 92 and the latest two. I don't propose to change that.

--
Peter Eisentraut
On 4/29/19 2:45 PM, Peter Eisentraut wrote:
>> A policy issue, independent of this mechanism, is how many different SQL spec versions we want to show in the table.
>
> We had previously established that we want to show 92 and the latest two. I don't propose to change that.

An annoying API requirement imposed by the JDBC spec applies to its method DatabaseMetaData.getSQLKeywords(): It is required to return a list of the keywords supported by the DBMS that are NOT also SQL:2003 keywords. [1]

Why? I have no idea. Were the JDBC spec authors afraid of infringing ISO copyright if they specified a method that just returns all the keywords?

So instead they implicitly require every JDBC developer to know just what all the SQL:2003 keywords are, to make any practical use of the JDBC method that returns only the keywords that aren't those.

To make it even goofier, the requirement in the JDBC spec has changed (once, that I know of). It has been /all the keywords not in SQL:2003/ since JDBC 4 / Java SE 6 [2], but before that, it (the same method!) was spec'd to return /all the keywords not in SQL92/. [3]

So the ideal JDBC developer will know (a) exactly what keywords are SQL92, (b) exactly what keywords are SQL:2003, and (c) which JDBC version the driver in use is implementing (though, mercifully, drivers from pre-4.0 should be rare by now).

If the reorganization happening in this thread were to make possible run-time-enumerable keyword lists that could be filtered for SQL92ness or SQL:2003ness, that might relieve an implementation headache that, at present, both PgJDBC and PL/Java have to deal with.

Regards,
-Chap

[1] https://docs.oracle.com/en/java/javase/12/docs/api/java.sql/java/sql/DatabaseMetaData.html#getSQLKeywords()
[2] https://docs.oracle.com/javase/6/docs/api/index.html?overview-summary.html
[3] https://docs.oracle.com/javase/1.5.0/docs/api/index.html?overview-summary.html
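The filtering a driver would have to do can be sketched as a set difference. This is a Python illustration of the logic only, not PgJDBC or PL/Java code; the function name and the tiny sample lists are hypothetical, and the reference set stands in for whichever standard version (SQL92 or SQL:2003) the JDBC level in use calls for.

```python
def keywords_not_in_standard(dbms_keywords, standard_keywords):
    """Return a comma-separated string of DBMS keywords absent from the standard.

    The comparison is case-insensitive; the result is lowercase and sorted so
    that the output is deterministic.
    """
    standard = {word.upper() for word in standard_keywords}
    extra = {word.lower() for word in dbms_keywords if word.upper() not in standard}
    return ",".join(sorted(extra))


# Illustrative subsets only, not the real keyword lists.
dbms_sample = ["abort", "analyze", "select"]
standard_sample = ["SELECT"]
result = keywords_not_in_standard(dbms_sample, standard_sample)
print(result)
```

With run-time-enumerable lists per standard version, the second argument could be chosen according to the JDBC version the driver implements.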
On 2019-04-29 21:19, Chapman Flack wrote:
> If the reorganization happening in this thread were to make possible run-time-enumerable keyword lists that could be filtered for SQL92ness or SQL:2003ness, that might relieve an implementation headache that, at present, both PgJDBC and PL/Java have to deal with.

Good information, but probably too big of a change at this point.

--
Peter Eisentraut