Thread: genbki.pl not quoting keywords in postgres.bki output

genbki.pl not quoting keywords in postgres.bki output

From
Mark Dilger
Date:
Hackers,

There are not yet any examples in the postgres sources where this
oversight causes problems, but in genbki.pl, where it makes the
decision whether to quote a token:

        # Quote value if needed.  We need not quote values that satisfy
        # the "id" pattern in bootscanner.l, currently "[-A-Za-z0-9_]+".
        $bki_value = sprintf(qq'"%s"', $bki_value)
          if length($bki_value) == 0
          or $bki_value =~ /[^-A-Za-z0-9_]/;

it should also quote anything that is a keyword, such as "open", as
otherwise you get a syntax error during initdb.

It might be nice to quote the keywords as a defense against this in
future catalog changes in the public sources. As an example, try adding
the following:

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 66c6c224a8..d48aeb4fd3 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -10202,4 +10202,8 @@
   proisstrict => 'f', prorettype => 'bool', proargtypes => 'oid int4 int4 any',
   proargmodes => '{i,i,i,v}', prosrc => 'satisfies_hash_partition' },

+{ oid => '9999',
+  proname => 'open', prolang => '14', prorettype => 'bool',
+  proargtypes => 'text',
+  prosrc => 'select true' }, # prosrc doesn't matter for this example of genbki behavior
 ]


Perhaps you don't think that is a big enough problem to be worth
guarding against.

This patch is not that complicated, but it does create a new coding
requirement to keep bootparse.y and genbki.pl from getting out of sync.
It might be simpler to just change genbki.pl to quote everything rather
than applying this patch.  I don't have an opinion on that.

mark


Attachment

Re: genbki.pl not quoting keywords in postgres.bki output

From
Tom Lane
Date:
Mark Dilger <hornschnorter@gmail.com> writes:
> There are not yet any examples in the postgres sources where this
> oversight causes problems, but in genbki.pl, where it makes the
> decision whether to quote a token: ...
> it should also quote anything that is a keyword, such as "open", as
> otherwise you get a syntax error during initdb.

Good point.  Up to now, I'd have written that off as one of the many
undocumented gotchas involved in construction of DATA lines.  However,
I think we've made a conscious effort in the bootstrap conversion work
to eliminate undocumented special cases, so we oughta do something to
make such cases work without extra hacking.

> This patch is not that complicated, but it does create a new coding
> requirement to keep bootparse.y and genbki.pl from getting out of sync.
> It might be simpler to just change genbki.pl to quote everything rather
> than applying this patch.  I don't have an opinion on that.

I don't like adding a lot of unnecessary quoting to the .bki file.

The patch as you have it isn't that awful from a maintenance perspective;
I don't think we've added new keywords to bootparse.y very often, so we
could surely cope with keeping genbki.pl in sync.

However, really this ought to be solved in bootparse.y itself, I think:
it should be possible to have it treat keywords like identifiers in INSERT
commands, just as (most) keywords aren't reserved in the main grammar.
Let me go poke at that idea.  If it doesn't work nicely I'll use your
patch.

            regards, tom lane