Thread: Exposing keywords to clients
Hi, The attached patch implements a new function, pg_get_keywords(), which returns a set of records describing the keywords recognised by the server. This allows clients such as pgAdmin to get quoting rules correct, and helps with other tasks such as syntax highlighting where we need to support multiple server versions. Example output (edited of course): postgres=# select * from pg_get_keywords(); word | category -------------------+----------------------- all | Reserved binary | Type or function name xmlserialize | Column name zone | Unreserved (372 rows) I wasn't sure about the best way to describe the categories - obviously they need to be non-translatable (for client software to interpret), but human readable is also nice. I'm happy to hear alternate suggestions. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
Attachment
Dave Page wrote: > Hi, > > The attached patch implements a new function, pg_get_keywords(), which > returns a set of records describing the keywords recognised by the > server. This allows clients such as pgAdmin to get quoting rules > correct, and helps with other tasks such as syntax highlighting where > we need to support multiple server versions. FWIW pg_dump has fmtId() which does something related. I think it's a bit bogus to be using the list as compiled client-side, precisely due to the theoretical chance that it could change from one server version to the next, but it's probably not very likely that we ever remove a keyword from the server grammar. And highlighting a keyword that's not really a keyword is unlikely to be a problem in practice -- in fact it makes it obvious that the user is likely to be in trouble later when they upgrade. > postgres=# select * from pg_get_keywords(); > word | category > -------------------+----------------------- > all | Reserved > binary | Type or function name > xmlserialize | Column name > zone | Unreserved > (372 rows) > > I wasn't sure about the best way to describe the categories - > obviously they need to be non-translatable (for client software to > interpret), but human readable is also nice. I'm happy to hear > alternate suggestions. Perhaps use a separate string for machine parse (say R, T, C, U), and let the string be translatable. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > FWIW pg_dump has fmtId() which does something related. > I think it's a bit bogus to be using the list as compiled client-side, > precisely due to the theoretical chance that it could change from one > server version to the next, but it's probably not very likely that we > ever remove a keyword from the server grammar. Actually, it's 100% intentional that pg_dump does it that way --- I would not support modifying it to use this function (even if it existed in the back branches). The reason is exactly that pg_dump wants to generate output that is correct for its own PG version, not that of the server it's dumping from. The tradeoffs are probably different for pgAdmin, but it is important to realize that either way might be the best thing for a particular case. regards, tom lane
On Sat, May 3, 2008 at 12:24 AM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Dave Page wrote: > > Hi, > > > > The attached patch implements a new function, pg_get_keywords(), which > > returns a set of records describing the keywords recognised by the > > server. This allows clients such as pgAdmin to get quoting rules > > correct, and helps with other tasks such as syntax highlighting where > > we need to support multiple server versions. > > FWIW pg_dump has fmtId() which does something related. > > I think it's a bit bogus to be using the list as compiled client-side, > precisely due to the theoretical chance that it could change from one > server version to the next, but it's probably not very likely that we > ever remove a keyword from the server grammar. And highlighting a > keyword that's not really a keyword is unlikely to be a problem in > practice -- in fact it makes it obvious that the user is likely to be in > trouble later when they upgrade. Yeah, we currently lift a copy of keywords.c into the pgAdmin source and use that in our own qtIdent() function, but it's clearly only correct for the version of Postgres we pull it from, whilst pgAdmin supports back to 7.3 in current releases. It's also a pain because we try to support some of the derivative servers like those offered by Greenplum and EnterpriseDB which may have additional keywords (though that obviously is not a problem for this community). > > postgres=# select * from pg_get_keywords(); > > word | category > > -------------------+----------------------- > > all | Reserved > > binary | Type or function name > > xmlserialize | Column name > > zone | Unreserved > > (372 rows) > > > > I wasn't sure about the best way to describe the categories - > > obviously they need to be non-translatable (for client software to > > interpret), but human readable is also nice. I'm happy to hear > > alternate suggestions. > > Perhaps use a separate string for machine parse (say R, T, C, U), and > let the string be translatable. I considered that, but wasn't sure if folks would like the redundancy. It's trivial to do of course - any objections? -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
Dave Page wrote: > > Perhaps use a separate string for machine parse (say R, T, C, U), and > > let the string be translatable. > > I considered that, but wasn't sure if folks would like the redundancy. > It's trivial to do of course - any objections? Is there anything useful you would do with this information? Or would you just quote all listed words anyway?
Peter Eisentraut <peter_e@gmx.net> writes: > Dave Page wrote: >>> Perhaps use a separate string for machine parse (say R, T, C, U), and >>> let the string be translatable. >> >> I considered that, but wasn't sure if folks would like the redundancy. >> It's trivial to do of course - any objections? > Is there anything useful you would do with this information? Or would you > just quote all listed words anyway? I think the practical application would be to avoid quoting unreserved keywords, as pg_dump for instance already does. I doubt anyone would bother distinguishing the different types of partially/wholly reserved words. So maybe a boolean would be sufficient --- but I have nothing against the R/T/C/U suggestion. A more radical alternative is just to omit unreserved words from the view altogether. regards, tom lane
On Sat, May 3, 2008 at 6:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Peter Eisentraut <peter_e@gmx.net> writes: > > Is there anything useful you would do with this information? Or would you > > just quote all listed words anyway? Currently, yes, we just quote all listed words. > I think the practical application would be to avoid quoting unreserved > keywords, as pg_dump for instance already does. I doubt anyone would > bother distinguishing the different types of partially/wholly reserved > words. So maybe a boolean would be sufficient --- but I have nothing > against the R/T/C/U suggestion. > > A more radical alternative is just to omit unreserved words from the > view altogether. Well my thinking is that it costs nothing extra bar a dozen lines of code to include the info now in case it's useful in the future, if not for pgAdmin, then maybe for Lightning Admin or one of the other tools. If a need for it arises in the future and we haven't included it now we'll either want to add a second function or break compatibility both of which strike me as a lot more objectionable than doing it all now. Attached is an updated patch, giving the following output. The catdesc column can be translated. postgres=# select * from pg_get_keywords(); word | catcode | catdesc -------------------+---------+----------------------- abort | U | Unreserved all | R | Reserved bigint | C | Column name binary | T | Type or function name -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
Attachment
"Dave Page" <dpage@pgadmin.org> writes: > Attached is an updated patch, giving the following output. The catdesc > column can be translated. Documentation has got a couple of problems: > + contains the keyword, the <structfield>catcode</> column contains a > + category code of 'U' for unknown, 'C' for column name, 'T' for type or > + function name or 'U' for unreserved. The <structfield>catdesc</> > + column contains a localised stribg describing the category. I wouldn't document the "unknown" case at all, much less document it incorrectly, and you forgot the "reserved" case, and "stribg"? regards, tom lane
"Dave Page" <dpage@pgadmin.org> writes: > Attached is an updated patch, giving the following output. Oh, one other thing: dropping externs into random modules unrelated to their source module is completely awful programming style, because there is nothing preventing incompatible declarations. Put those externs in keywords.h instead. I suspect you have ignored a compiler warning about not declaring pg_get_keywords itself, too --- it should be extern'd in builtins.h. regards, tom lane
On Sat, May 3, 2008 at 9:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Dave Page" <dpage@pgadmin.org> writes: > > Attached is an updated patch, giving the following output. > > Oh, one other thing: dropping externs into random modules unrelated to > their source module is completely awful programming style, because there > is nothing preventing incompatible declarations. Put those externs in > keywords.h instead. OK. > I suspect you have ignored a compiler warning > about not declaring pg_get_keywords itself, too --- it should be > extern'd in builtins.h. No, no warning (I'm using VC++ today) - but fixed anyway. Update attached, including corrected docs. Note to self - proof read docs *after* putting the kids to bed in future. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
Attachment
Hi,
Here are some comments from me:
* doc/src/sgml/func.sgml
a) Changed "localised" to "localized" to be consistent with the references elsewhere in the same file.
* src/backend/utils/adt/misc.c
b) I wonder if we need the default case in the switch statement at all, since we are scanning the statically populated ScanKeywords array with proper category values for each entry.
c) There was a warning during compilation since we were assigning a const pointer to a char pointer
values[0] = ScanKeywords[funcctx->call_cntr].name;
* src/include/catalog/pg_proc.h
d) oid 2700 has been claimed by another function in the meanwhile. Modified it to 2701.
DATA(insert OID = 2701 ( pg_get_keywords PGNSP PGUID 12 10 400 f f t t s 0 2249
e) I was wondering why pronargs is set to 0 above. But I see other functions doing the same, so its ok I guess for such kinds of usages.
PFA, version 4 of this patch with a,c and d taken care of.
Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com
> > Attached is an updated patch, giving the following output.OK.
>
> Oh, one other thing: dropping externs into random modules unrelated to
> their source module is completely awful programming style, because there
> is nothing preventing incompatible declarations. Put those externs in
> keywords.h instead.No, no warning (I'm using VC++ today) - but fixed anyway.
> I suspect you have ignored a compiler warning
> about not declaring pg_get_keywords itself, too --- it should be
> extern'd in builtins.h.
Update attached, including corrected docs. Note to self - proof read
docs *after* putting the kids to bed in future.
Here are some comments from me:
* doc/src/sgml/func.sgml
a) Changed "localised" to "localized" to be consistent with the references elsewhere in the same file.
* src/backend/utils/adt/misc.c
b) I wonder if we need the default case in the switch statement at all, since we are scanning the statically populated ScanKeywords array with proper category values for each entry.
c) There was a warning during compilation since we were assigning a const pointer to a char pointer
values[0] = ScanKeywords[funcctx->call_cntr].name;
* src/include/catalog/pg_proc.h
d) oid 2700 has been claimed by another function in the meanwhile. Modified it to 2701.
DATA(insert OID = 2701 ( pg_get_keywords PGNSP PGUID 12 10 400 f f t t s 0 2249
e) I was wondering why pronargs is set to 0 above. But I see other functions doing the same, so its ok I guess for such kinds of usages.
PFA, version 4 of this patch with a,c and d taken care of.
Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com
Attachment
Nikhils <nikkhils@gmail.com> writes: >>> Attached is an updated patch, giving the following output. Applied with minor revisions. > Here are some comments from me: > a) Changed "localised" to "localized" to be consistent with the references > elsewhere in the same file. What I didn't like about the docs was the classification of the function as a "session information function". There's certainly nothing session-dependent about it. I ended up putting it under "System Catalog Information Functions", which isn't really quite right either but it's closer than the other one. > b) I wonder if we need the default case in the switch statement at all, > since we are scanning the statically populated ScanKeywords array with > proper category values for each entry. I left it in for safety, but made it just return nulls --- there's no point in wasting more code space on it than necessary, and certainly no point in asking translators to translate a useless string. > c) There was a warning during compilation since we were assigning a const > pointer to a char pointer > values[0] = ScanKeywords[funcctx->call_cntr].name; Yeah, I couldn't see a good way around that either --- we could pstrdup but that seems overkill, and adjusting the API of BuildTupleFromCStrings to take const char ** doesn't look appetizing either. > d) oid 2700 has been claimed by another function in the meanwhile. Modified > it to 2701. > DATA(insert OID = 2701 ( pg_get_keywords PGNSP PGUID 12 10 400 f f t t s > 0 2249 2701 was claimed too. You need to use the unused_oids script to find unique free OIDs. regards, tom lane