Thread: BUG #13840: pg_dump generates unloadable SQL when third party string type index option is used

The following bug has been logged on the website:

Bug reference:      13840
Logged by:          Kouhei Sutou
Email address:      kou@clear-code.com
PostgreSQL version: 9.4.5
Operating system:   Debian GNU/Linux sid
Description:

PostgreSQL supports amindex extension. But pg_dump doesn't support
string type custom index options for amindex extension.

Amindex extension can support custom index options that can be used by
"CREATE INDEX ... WITH (option_name = option_value)" syntax. Amindex
extension uses add_string_reloption() to add a string type option.

PostgreSQL requires quotation for option_value such as:

  CREATE INDEX pgroonga_index ON t
    USING pgroonga (c)
    WITH (normalizer = 'none');

PGroonga(*) is used in the above example.
PGroonga is an amindex extension that adds "normalizer" option.
(*) http://pgroonga.github.io/

If we specify option value without quotation, PostgreSQL reports an
error:

  CREATE TABLE t (c text);
  CREATE INDEX pgroonga_index ON t USING pgroonga (c) WITH (normalizer =
none);
  -- ERROR:  syntax error at or near "none"
  -- LINE 1: ...onga_index ON t USING pgroonga (c) WITH (normalizer =
none);
  --                                                                  ^

pg_dump generates "CREATE INDEX ... WITH ..." without quotation for
option value:

  % pg_dump -d option_test
  ...
  CREATE TABLE t (
      c text
  );
  ...
  CREATE INDEX pgroonga_index ON t USING pgroonga (c) WITH
(normalizer=none);
  ...

"normalizer=none" causes error on restoring the dump.

pg_dump should generates "normalizer='none'" as option to avoid the
error.

FYI1: This problem isn't occurred for built-in options such as
"buffering" option for gist. Because values for "buffering" option are
registered as keywords. PostgreSQL can parse keyword-ed values without
quotation such as "buffering = on".

It's implemented in src/backend/parser/gram.y:

  def_arg:    func_type                        { $$ = (Node *)$1; }
                          | reserved_keyword                { $$ = (Node
*)makeString(pstrdup($1)); }
                          | qual_all_Op                    { $$ = (Node *)$1; }
                          | NumericOnly                    { $$ = (Node *)$1; }
                          | Sconst                        { $$ = (Node *)makeString($1); }

"reserved_keyword" doesn't require quotation.

It means that using PGroonga is easy-to-reproduce this problem. Here
is an install document of PGroonga: http://pgroonga.github.io/install/
Hi,

In <20151231153522.1117.56276@wrigleys.postgresql.org>
  "[BUGS] BUG #13840: pg_dump generates unloadable SQL when third party string type index option is used" on Thu, 31
Dec2015 15:35:22 +0000, 
  kou@clear-code.com wrote:

> Bug reference:      13840
> Logged by:          Kouhei Sutou
> Email address:      kou@clear-code.com
> PostgreSQL version: 9.4.5
> Operating system:   Debian GNU/Linux sid
> Description:
...
> pg_dump generates "CREATE INDEX ... WITH ..." without quotation for
> option value:
>
>   % pg_dump -d option_test
>   ...
>   CREATE TABLE t (
>       c text
>   );
>   ...
>   CREATE INDEX pgroonga_index ON t USING pgroonga (c) WITH (normalizer=none);
>   ...
>
> "normalizer=none" causes error on restoring the dump.
>
> pg_dump should generates "normalizer='none'" as option to avoid the
> error.

I attach a patch.
With this patch, pg_dump generates string index option value
with quote like the following:

  CREATE INDEX pgrn_index ON memos USING pgroonga (title) WITH (normalizer='none');


Thanks,
--
kou
Kouhei Sutou <kou@clear-code.com> writes:
>> pg_dump should generates "normalizer='none'" as option to avoid the
>> error.

> I attach a patch.

Pushed with some adjustments (notably, I thought the quoting rule was
too complicated and not necessarily 100% correct).

Thanks for the report and patch!

            regards, tom lane
Hi,

In <15194.1451680235@sss.pgh.pa.us>
  "Re: [BUGS] BUG #13840: pg_dump generates unloadable SQL when third party string type index option is used" on Fri,
01Jan 2016 15:30:35 -0500, 
  Tom Lane <tgl@sss.pgh.pa.us> wrote:

>>> pg_dump should generates "normalizer='none'" as option to avoid the
>>> error.
>
>> I attach a patch.
>
> Pushed with some adjustments (notably, I thought the quoting rule was
> too complicated and not necessarily 100% correct).

Thanks for merging my patch and backporting to released
series.

I didn't know quote_identifier() and simple_quote_literal()
functions. I learned from your adjustments.


Thanks,
--
kou
Kouhei Sutou <kou@clear-code.com> writes:
>   Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Pushed with some adjustments (notably, I thought the quoting rule was
>> too complicated and not necessarily 100% correct).

> Thanks for merging my patch and backporting to released
> series.

I looked into pg_dump and realized that this fixes only a few of the
problems in this area.  While pg_dump does rely on ruleutils.c to
print reloptions of simple indexes, it does not do that for reloptions
of tables or views, nor for reloptions of indexes that are constraints.
So eventually that's going to bite us on the rear, though I'm not sure
if we have a live problem today.

One could imagine exporting flatten_reloptions via a separate SQL
function, but that could only exist in future releases, so I'm afraid
we're going to have to duplicate the functionality inside pg_dump.

            regards, tom lane
On Sun, Jan 3, 2016 at 2:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Kouhei Sutou <kou@clear-code.com> writes:
> >   Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Pushed with some adjustments (notably, I thought the quoting rule was
> >> too complicated and not necessarily 100% correct).
>
> > Thanks for merging my patch and backporting to released
> > series.
>
> I looked into pg_dump and realized that this fixes only a few of the
> problems in this area.  While pg_dump does rely on ruleutils.c to
> print reloptions of simple indexes, it does not do that for reloptions
> of tables or views, nor for reloptions of indexes that are constraints.
> So eventually that's going to bite us on the rear, though I'm not sure
> if we have a live problem today.

That's actually something that has been discussed some months ago
after trying to add units to those parameters (e23014f3 that got
reverted). At some point I guess we should add those quotes within
pg_class/reloptions directly in each element of its array, the new
per-relation log_autovacuum_min_duration being an extra argument in
favor of it.

> One could imagine exporting flatten_reloptions via a separate SQL
> function, but that could only exist in future releases, so I'm afraid
> we're going to have to duplicate the functionality inside pg_dump.

Yep, that's another option, though it seems to me that we had better
tackle this at its root, this routine doing some manual parsing of
each element.
--
Michael