Thread: BUG #16502: EXPLAIN JSON format adds extra quotes around index names

BUG #16502: EXPLAIN JSON format adds extra quotes around index names

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16502
Logged by:          Maciek Sakrejda
Email address:      m.sakrejda@gmail.com
PostgreSQL version: 12.3
Operating system:   Ubuntu 19.10
Description:

When working with a table that needs quotes around its name (e.g., with
spaces in the name), EXPLAIN is inconsistent in handling this between
relation names and index names. A relation name is presented as-is in the
"Relation Name" field, but the "Index Name" field includes the quotes inside
the field value.

Here's an example:

maciek=# create table "needs quotes"(a int);
CREATE TABLE
maciek=# explain (costs off, format json) select * from "needs quotes";
               QUERY PLAN               
----------------------------------------
 [                                     +
   {                                   +
     "Plan": {                         +
       "Node Type": "Seq Scan",        +
       "Parallel Aware": false,        +
       "Relation Name": "needs quotes",+
       "Alias": "needs quotes"         +
     }                                 +
   }                                   +
 ]
(1 row)

maciek=# create index on "needs quotes"(a);
CREATE INDEX
maciek=# set enable_seqscan = off;
SET
maciek=# explain (costs off, format json) select * from "needs quotes";
                    QUERY PLAN                    
--------------------------------------------------
 [                                               +
   {                                             +
     "Plan": {                                   +
       "Node Type": "Bitmap Heap Scan",          +
       "Parallel Aware": false,                  +
       "Relation Name": "needs quotes",          +
       "Alias": "needs quotes",                  +
       "Plans": [                                +
         {                                       +
           "Node Type": "Bitmap Index Scan",     +
           "Parent Relationship": "Outer",       +
           "Parallel Aware": false,              +
           "Index Name": "\"needs quotes_a_idx\""+
         }                                       +
       ]                                         +
     }                                           +
   }                                             +
 ]
(1 row)

Given that there is no ambiguity here because the index or relation name is
the full value of the field, the quotes don't really add anything and I'd
argue they should be dropped.

I ran into this in 12.3, but it looks like it's still the case in HEAD. It
looks like this is because `explain_get_index_name` in `explain.c` always
quotes the index name, regardless of the format being used. I'd send a
patch, but I'm not sure how `explain_get_index_name_hook` should fit into
this.

Thanks,
Maciek


Re: BUG #16502: EXPLAIN JSON format adds extra quotes around index names

From
Euler Taveira
Date:
On Thu, 18 Jun 2020 at 16:50, PG Bug reporting form <noreply@postgresql.org> wrote:

I ran into this in 12.3, but it looks like it's still the case in HEAD. It
looks like this is because `explain_get_index_name` in `explain.c` always
quotes the index name, regardless of the format being used. I'd send a
patch, but I'm not sure how `explain_get_index_name_hook` should fit into
this.


Indeed, relation names should return the same. The fact that explain_get_index_name always calls quote_identifier is the culprit; it should call quote_identifier only when the format is TEXT. Patch is attached. I'm not sure if it should be backpatched to released versions because there should be applications that rely on this format. However, it would be good to backpatch it to v13 too.


--
Euler Taveira                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Euler Taveira <euler.taveira@2ndquadrant.com> writes:
> Indeed, relation names should return the same. The fact that
> explain_get_index_name always calls quote_identifier is the culprit; it
> should call quote_identifier only when the format is TEXT. Patch is
> attached.

Agreed as to the bug, but I think we ought to fix it by redefining
explain_get_index_name's API as "return the bare index name always",
and let the callers apply quoting.  The callers seem to all have
separate code paths for text format already.  Furthermore, if
explain_get_index_name needs to have different behavior for text
format, that requirement propagates to explain_get_index_name_hook
functions --- and I don't think we want to change the signature
for that hook.

            regards, tom lane



I wrote:
> Agreed as to the bug, but I think we ought to fix it by redefining
> explain_get_index_name's API as "return the bare index name always",
> and let the callers apply quoting.

Concretely, as attached.

I am unsure about back-patching this, but am leaning to doing so.

(1) From the perspective of end users, this makes no difference for
text output and fixes a pretty clear bug in non-text formats.  If
we don't fix it then users may be tempted to try to dequote in
application code, which won't work very reliably and will do
completely the wrong thing after the bug is fixed.  You might argue
that some apps might already be applying such dequoting, but
I doubt it; wouldn't they have reported the bug?

(2) From the perspective of extensions using explain_get_index_name_hook,
this is a silent API change: before they were supposed to quote,
now they are not.  That's not great, but I don't think there is any
fix that doesn't involve an API change for hook users.

There's certainly a case to be made that it'd be better if the API
change happened only in v13 and not in minor releases.  But the
consequences of not updating a hook-using extension immediately
wouldn't be that severe --- non-text output is just as it was, and
the double-double-quoting in text output would only be visible if the
extension chooses quote-requiring names for hypothetical indexes.
Even if it were visible it probably would just be ugly, and not
fundamentally break anything.

In any case, I think there are few enough people using index-advisor
extensions that we should not let consideration (2) outweigh
consideration (1).

            regards, tom lane

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 67bdcb2b27..a131d15ac0 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1456,7 +1456,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
                 explain_get_index_name(bitmapindexscan->indexid);

                 if (es->format == EXPLAIN_FORMAT_TEXT)
-                    appendStringInfo(es->str, " on %s", indexname);
+                    appendStringInfo(es->str, " on %s",
+                                     quote_identifier(indexname));
                 else
                     ExplainPropertyText("Index Name", indexname, es);
             }
@@ -3267,6 +3268,10 @@ show_eval_params(Bitmapset *bms_params, ExplainState *es)
  *
  * We allow plugins to get control here so that plans involving hypothetical
  * indexes can be explained.
+ *
+ * Note: names returned by this function should be "raw"; the caller will
+ * apply quoting if needed.  Formerly the convention was to do quoting here,
+ * but we don't want that in non-text output formats.
  */
 static const char *
 explain_get_index_name(Oid indexId)
@@ -3279,11 +3284,10 @@ explain_get_index_name(Oid indexId)
         result = NULL;
     if (result == NULL)
     {
-        /* default behavior: look in the catalogs and quote it */
+        /* default behavior: look it up in the catalogs */
         result = get_rel_name(indexId);
         if (result == NULL)
             elog(ERROR, "cache lookup failed for index %u", indexId);
-        result = quote_identifier(result);
     }
     return result;
 }
@@ -3463,7 +3467,7 @@ ExplainIndexScanDetails(Oid indexid, ScanDirection indexorderdir,
     {
         if (ScanDirectionIsBackward(indexorderdir))
             appendStringInfoString(es->str, " Backward");
-        appendStringInfo(es->str, " using %s", indexname);
+        appendStringInfo(es->str, " using %s", quote_identifier(indexname));
     }
     else
     {

Re: BUG #16502: EXPLAIN JSON format adds extra quotes around index names

From
Euler Taveira
Date:
On Sat, 20 Jun 2020 at 19:11, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> Agreed as to the bug, but I think we ought to fix it by redefining
> explain_get_index_name's API as "return the bare index name always",
> and let the callers apply quoting.

Concretely, as attached.


I thought about that but decided to go the other way. I agree that mixing
layers (get and format) is not a good idea.

I am unsure about back-patching this, but am leaning to doing so.


Extension authors can always add an ifdef to handle those cases if (s)he
cares about it. Since this bug report did not come from an index-advisor
extension user and that the code is like this for a decade, I bet that is
safe to backpatch to all supported versions.

BTW, your patch looks good to me.


--
Euler Taveira                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: BUG #16502: EXPLAIN JSON format adds extra quotes around index names

From
Maciek Sakrejda
Date:
Thanks for the quick reply. That fix (and reasoning) makes sense to
me. For what it's worth, the patch looks good to me as well.



Re: BUG #16502: EXPLAIN JSON format adds extra quotes around index names

From
Julien Rouhaud
Date:
On Sun, Jun 21, 2020 at 12:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
> > Agreed as to the bug, but I think we ought to fix it by redefining
> > explain_get_index_name's API as "return the bare index name always",
> > and let the callers apply quoting.
>
> Concretely, as attached.
>
> I am unsure about back-patching this, but am leaning to doing so.
>
> (1) From the perspective of end users, this makes no difference for
> text output and fixes a pretty clear bug in non-text formats.  If
> we don't fix it then users may be tempted to try to dequote in
> application code, which won't work very reliably and will do
> completely the wrong thing after the bug is fixed.  You might argue
> that some apps might already be applying such dequoting, but
> I doubt it; wouldn't they have reported the bug?
>
> (2) From the perspective of extensions using explain_get_index_name_hook,
> this is a silent API change: before they were supposed to quote,
> now they are not.  That's not great, but I don't think there is any
> fix that doesn't involve an API change for hook users.
>
> There's certainly a case to be made that it'd be better if the API
> change happened only in v13 and not in minor releases.  But the
> consequences of not updating a hook-using extension immediately
> wouldn't be that severe --- non-text output is just as it was, and
> the double-double-quoting in text output would only be visible if the
> extension chooses quote-requiring names for hypothetical indexes.
> Even if it were visible it probably would just be ugly, and not
> fundamentally break anything.
>
> In any case, I think there are few enough people using index-advisor
> extensions that we should not let consideration (2) outweigh
> consideration (1).

It turns out that in hypopg I totally missed that
explain_get_index_name_hook should take care of quoting the indexname.
People only want to know whether the hypothetical index is used or
not, so the lack of correct quoting didn't prevent that (the
hypothetical index name is automatically generated and includes its
oid).  The patch looks good to me, and +1 for backpatch!



Julien Rouhaud <rjuju123@gmail.com> writes:
> It turns out that in hypopg I totally missed that
> explain_get_index_name_hook should take care of quoting the indexname.
> People only want to know whether the hypothetical index is used or
> not, so the lack of correct quoting didn't prevent that (the
> hypothetical index name is automatically generated and includes its
> oid).  The patch looks good to me, and +1 for backpatch!

Ah, I'd wondered whether there might be hook users that "should" be
doing quoting and were not.  So the impact on hypopg would be:

1. Non-text EXPLAIN formats are already properly quoted, and will
remain so.

2. Text-format EXPLAIN output will grow double-quotes around the
hypothetical index names, which are not there at present (I believe,
but didn't test it).  This shouldn't bother human users particularly,
but conceivably it might break hypopg's regression tests?

According to codesearch.debian.net and a Google search, hypopg is the
only active external user of explain_get_index_name_hook.  (I should
have thought to do that search yesterday, but did not.)

            regards, tom lane



Re: BUG #16502: EXPLAIN JSON format adds extra quotes around index names

From
Julien Rouhaud
Date:
On Sun, Jun 21, 2020 at 4:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Julien Rouhaud <rjuju123@gmail.com> writes:
> > It turns out that in hypopg I totally missed that
> > explain_get_index_name_hook should take care of quoting the indexname.
> > People only want to know whether the hypothetical index is used or
> > not, so the lack of correct quoting didn't prevent that (the
> > hypothetical index name is automatically generated and includes its
> > oid).  The patch looks good to me, and +1 for backpatch!
>
> Ah, I'd wondered whether there might be hook users that "should" be
> doing quoting and were not.  So the impact on hypopg would be:
>
> 1. Non-text EXPLAIN formats are already properly quoted, and will
> remain so.

Correct.  Simple test using hypothetical index on a "T1" (id integer) table:

=# select * from hypopg_create_index('create index on "T1"(id)');
 indexrelid |      indexname
------------+---------------------
     576402 | <576402>btree_T1_id
(1 row)

Before:

=# explain (format yaml) select * from "T1" where id = 1;
              QUERY PLAN
---------------------------------------
 - Plan:                              +
     Node Type: "Index Only Scan"     +
     Parallel Aware: false            +
     Scan Direction: "Forward"        +
     Index Name: "<576402>btree_T1_id"+
     Relation Name: "T1"              +
     Alias: "T1"                      +
     Startup Cost: 0.04               +
     Total Cost: 8.06                 +
     Plan Rows: 1                     +
     Plan Width: 4                    +
     Index Cond: "(id = 1)"
(1 row)

After:

=# explain (format yaml)select * from "T1" where id = 1;
              QUERY PLAN
--------------------------------------
 - Plan:                             +
     Node Type: "Index Only Scan"    +
     Parallel Aware: false           +
     Scan Direction: "Forward"       +
     Index Name: "<16442>btree_T1_id"+
     Relation Name: "T1"             +
     Alias: "T1"                     +
     Startup Cost: 0.04              +
     Total Cost: 4.06                +
     Plan Rows: 1                    +
     Plan Width: 4                   +
     Index Cond: "(id = 1)"
(1 row)

> 2. Text-format EXPLAIN output will grow double-quotes around the
> hypothetical index names, which are not there at present (I believe,
> but didn't test it).  This shouldn't bother human users particularly,
> but conceivably it might break hypopg's regression tests?

Yes, the patch is working as expected.  Before:

=# explain select * from "T1" where id = 1;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Index Only Scan using <576402>btree_T1_id on "T1"  (cost=0.04..8.06
rows=1 width=4)
   Index Cond: (id = 1)
(2 rows)

Patched:

=# explain select * from "T1" where id = 1;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Index Only Scan using "<16442>btree_T1_id" on "T1"  (cost=0.04..4.06
rows=1 width=4)
   Index Cond: (id = 1)
(2 rows)

All of the regression tests except the one I added when the fix for
brin hypothetical indexes was applied are still working, as they're
looking for patterns like "Index.*<\d+>${amname}_${relname}" in the
textual explain output.  There are a few fixes not released yet, so
I'll update the broken brin regression test and publish a release soon
to make sure that the packages don't get broken when the new minor
versions are released.



Julien Rouhaud <rjuju123@gmail.com> writes:
> On Sun, Jun 21, 2020 at 4:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> 2. Text-format EXPLAIN output will grow double-quotes around the
>> hypothetical index names, which are not there at present (I believe,
>> but didn't test it).  This shouldn't bother human users particularly,
>> but conceivably it might break hypopg's regression tests?

> Yes, the patch is working as expected.  Before:
> ...
> All of the regression tests except the one I added when the fix for
> brin hypothetical indexes was applied are still working, as they're
> looking for patterns like "Index.*<\d+>${amname}_${relname}" in the
> textual explain output.  There are a few fixes not released yet, so
> I'll update the broken brin regression test and publish a release soon
> to make sure that the packages don't get broken when the new minor
> versions are released.

Doesn't sound too painful then.  I pushed the patch to all branches;
it should appear in 13beta2 this week, and in the August quarterly
updates.

            regards, tom lane



Re: BUG #16502: EXPLAIN JSON format adds extra quotes around index names

From
Julien Rouhaud
Date:
On Mon, Jun 22, 2020 at 5:49 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Julien Rouhaud <rjuju123@gmail.com> writes:
> > On Sun, Jun 21, 2020 at 4:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> 2. Text-format EXPLAIN output will grow double-quotes around the
> >> hypothetical index names, which are not there at present (I believe,
> >> but didn't test it).  This shouldn't bother human users particularly,
> >> but conceivably it might break hypopg's regression tests?
>
> > Yes, the patch is working as expected.  Before:
> > ...
> > All of the regression tests except the one I added when the fix for
> > brin hypothetical indexes was applied are still working, as they're
> > looking for patterns like "Index.*<\d+>${amname}_${relname}" in the
> > textual explain output.  There are a few fixes not released yet, so
> > I'll update the broken brin regression test and publish a release soon
> > to make sure that the packages don't get broken when the new minor
> > versions are released.
>
> Doesn't sound too painful then.  I pushed the patch to all branches;
> it should appear in 13beta2 this week, and in the August quarterly
> updates.

Great, thanks!  I already pushed a fix for hypopg regression test, and
I confirm that everything works as intended, both with and without the
commit.