Home > mailing lists
Re: row_to_json bug with index only scans: empty keys! - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: row_to_json bug with index only scans: empty keys!
Date	November 8, 2014 07:28:03
Msg-id	31531.1415420876@sss.pgh.pa.us Whole thread Raw
In response to	Re: row_to_json bug with index only scans: empty keys! (Andrew Dunstan <andrew@dunslane.net>)
Responses	Re: row_to_json bug with index only scans: empty keys! (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers
Tree view
Andrew Dunstan <andrew@dunslane.net> writes:
> On 11/07/2014 04:59 PM, Tom Lane wrote:
>> I think this is probably a variant of bug #11210, in which the problem is
>> that tupledescs bubbled up from inheritance children never get column
>> names assigned to them.  I've been speculating about ways to fix that
>> but I've not thought of anything that's not kinda painful.

> Yeah, I've been running this down a bit with a debugger, and it looked
> kinda like that.

After chewing on this for awhile, it occurred to me that the problem could
be solved in ExecEvalWholeRowVar, which is really the only place that
produces composite datums "from scratch".  We can look up the RTE that
the whole-row Var references and scrape column aliases from it.  The
attached draft patch does that, and appears to fix both Ross' example
and the one in bug #11210.

Although this patch doesn't change any existing regression test outputs
as far as I can find, it's not hard to think of cases where it will change
the results.  In particular:

regression=# create table foo (f1 int, f2 int);
CREATE TABLE
regression=# insert into foo values(1,2);
INSERT 0 1
regression=# select row_to_json(f) from foo f;
   row_to_json
-----------------
 {"f1":1,"f2":2}
(1 row)

regression=# select row_to_json(f) from foo f(x,y);
  row_to_json
---------------
 {"x":1,"y":2}
(1 row)

Without this patch, you get the same result from the second query as
from the first.  I argue that using the column aliases is clearly saner
behavior; I think there have been complaints about that in the past,
but am too lazy to trawl the archives right now.  However, it's also
clear that people might have applications that are expecting the old
behavior and don't realize that it might be thought a bug.

I'm hesitant to back-patch this anyway, because I'm not sure that there
aren't any other unexpected side-effects.  Note in particular the
assertion I had to weaken in ExecEvalConvertRowtype because a whole-row
Var is now capable of returning tuple datums that are marked with a
different rowtype than the parser marked the Var with.  That's a little
bit scary ...

On the positive side, this lets us remove some kluges, like the place in
nodeFunctionscan.c where we were hacking the output tupledesc's column
names in pretty much the same way this patch has ExecEvalWholeRowVar do.
I thought that was a crock when it went in, because no other plan node
is doing anything like that, so it's good to be able to remove it.

Thoughts?

            regards, tom lane

diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 7cfa63f..3ab460c 100644
*** a/src/backend/executor/execQual.c
--- b/src/backend/executor/execQual.c
***************
*** 50,55 ****
--- 50,56 ----
  #include "nodes/nodeFuncs.h"
  #include "optimizer/planner.h"
  #include "parser/parse_coerce.h"
+ #include "parser/parsetree.h"
  #include "pgstat.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
*************** ExecEvalWholeRowVar(WholeRowVarExprState
*** 712,717 ****
--- 713,720 ----
  {
      Var           *variable = (Var *) wrvstate->xprstate.expr;
      TupleTableSlot *slot;
+     TupleDesc    output_tupdesc;
+     MemoryContext oldcontext;
      bool        needslow = false;

      if (isDone)
*************** ExecEvalWholeRowVar(WholeRowVarExprState
*** 787,794 ****
              /* If so, build the junkfilter in the query memory context */
              if (junk_filter_needed)
              {
-                 MemoryContext oldcontext;
-
                  oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
                  wrvstate->wrv_junkFilter =
                      ExecInitJunkFilter(subplan->plan->targetlist,
--- 790,795 ----
*************** ExecEvalWholeRowVar(WholeRowVarExprState
*** 860,869 ****
                  needslow = true;    /* need runtime check for null */
          }

          ReleaseTupleDesc(var_tupdesc);
      }

!     /* Skip the checking on future executions of node */
      if (needslow)
          wrvstate->xprstate.evalfunc = (ExprStateEvalFunc) ExecEvalWholeRowSlow;
      else
--- 861,946 ----
                  needslow = true;    /* need runtime check for null */
          }

+         /*
+          * Use the variable's declared rowtype as the descriptor for the
+          * output values, modulo possibly assigning new column names below. In
+          * particular, we *must* absorb any attisdropped markings.
+          */
+         oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
+         output_tupdesc = CreateTupleDescCopy(var_tupdesc);
+         MemoryContextSwitchTo(oldcontext);
+
          ReleaseTupleDesc(var_tupdesc);
      }
+     else
+     {
+         /*
+          * In the RECORD case, we use the input slot's rowtype as the
+          * descriptor for the output values, modulo possibly assigning new
+          * column names below.
+          */
+         oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);
+         output_tupdesc = CreateTupleDescCopy(slot->tts_tupleDescriptor);
+         MemoryContextSwitchTo(oldcontext);
+     }

!     /*
!      * Construct a tuple descriptor for the composite values we'll produce,
!      * and make sure its record type is "blessed".  The main reason to do this
!      * is to be sure that operations such as row_to_json() will see the
!      * desired column names when they look up the descriptor from the type
!      * information embedded in the composite values.
!      *
!      * We already got the correct physical datatype info above, but now we
!      * should try to find the source RTE and adopt its column aliases, in case
!      * they are different from the original rowtype's names.  For example, in
!      * "SELECT foo(t) FROM tab t(x,y)", the first two columns in the composite
!      * output should be named "x" and "y" regardless of tab's column names.
!      *
!      * If we can't locate the RTE, assume the column names we've got are OK.
!      * (As of this writing, the only cases where we can't locate the RTE are
!      * in execution of trigger WHEN clauses, and then the Var will have the
!      * trigger's relation's rowtype, so its names are fine.)
!      */
!     if (econtext->ecxt_estate &&
!         variable->varno <= list_length(econtext->ecxt_estate->es_range_table))
!     {
!         RangeTblEntry *rte = rt_fetch(variable->varno,
!                                       econtext->ecxt_estate->es_range_table);
!         bool        modified = false;
!         int            colno = 0;
!         ListCell   *lc;
!
!         Assert(output_tupdesc->natts >= list_length(rte->eref->colnames));
!         foreach(lc, rte->eref->colnames)
!         {
!             char       *cname = strVal(lfirst(lc));
!             Form_pg_attribute attr = output_tupdesc->attrs[colno++];
!
!             /* Ignore empty aliases (these must be for dropped columns) */
!             if (cname[0] == '\0')
!                 continue;
!             /* Change tupdesc only if alias is actually different */
!             if (strcmp(cname, NameStr(attr->attname)) != 0)
!             {
!                 namestrcpy(&(attr->attname), cname);
!                 modified = true;
!             }
!         }
!         /* If we modified the tupdesc, it's now a new record type */
!         if (modified)
!         {
!             output_tupdesc->tdtypeid = RECORDOID;
!             output_tupdesc->tdtypmod = -1;
!         }
!     }
!
!     /* Bless the tupdesc if needed, and save it in the execution state */
!     if (output_tupdesc->tdtypeid == RECORDOID && output_tupdesc->tdtypmod < 0)
!         assign_record_type_typmod(output_tupdesc);
!     wrvstate->wrv_tupdesc = output_tupdesc;
!
!     /* Skip all the above on future executions of node */
      if (needslow)
          wrvstate->xprstate.evalfunc = (ExprStateEvalFunc) ExecEvalWholeRowSlow;
      else
*************** ExecEvalWholeRowFast(WholeRowVarExprStat
*** 886,892 ****
  {
      Var           *variable = (Var *) wrvstate->xprstate.expr;
      TupleTableSlot *slot;
-     TupleDesc    slot_tupdesc;
      HeapTupleHeader dtuple;

      if (isDone)
--- 963,968 ----
*************** ExecEvalWholeRowFast(WholeRowVarExprStat
*** 917,949 ****
          slot = ExecFilterJunk(wrvstate->wrv_junkFilter, slot);

      /*
-      * If it's a RECORD Var, we'll use the slot's type ID info.  It's likely
-      * that the slot's type is also RECORD; if so, make sure it's been
-      * "blessed", so that the Datum can be interpreted later.  (Note: we must
-      * do this here, not in ExecEvalWholeRowVar, because some plan trees may
-      * return different slots at different times.  We have to be ready to
-      * bless additional slots during the run.)
-      */
-     slot_tupdesc = slot->tts_tupleDescriptor;
-     if (variable->vartype == RECORDOID &&
-         slot_tupdesc->tdtypeid == RECORDOID &&
-         slot_tupdesc->tdtypmod < 0)
-         assign_record_type_typmod(slot_tupdesc);
-
-     /*
       * Copy the slot tuple and make sure any toasted fields get detoasted.
       */
      dtuple = DatumGetHeapTupleHeader(ExecFetchSlotTupleDatum(slot));

      /*
!      * If the Var identifies a named composite type, label the datum with that
!      * type; otherwise we'll use the slot's info.
       */
!     if (variable->vartype != RECORDOID)
!     {
!         HeapTupleHeaderSetTypeId(dtuple, variable->vartype);
!         HeapTupleHeaderSetTypMod(dtuple, variable->vartypmod);
!     }

      return PointerGetDatum(dtuple);
  }
--- 993,1007 ----
          slot = ExecFilterJunk(wrvstate->wrv_junkFilter, slot);

      /*
       * Copy the slot tuple and make sure any toasted fields get detoasted.
       */
      dtuple = DatumGetHeapTupleHeader(ExecFetchSlotTupleDatum(slot));

      /*
!      * Label the datum with the composite type info we identified before.
       */
!     HeapTupleHeaderSetTypeId(dtuple, wrvstate->wrv_tupdesc->tdtypeid);
!     HeapTupleHeaderSetTypMod(dtuple, wrvstate->wrv_tupdesc->tdtypmod);

      return PointerGetDatum(dtuple);
  }
*************** ExecEvalWholeRowSlow(WholeRowVarExprStat
*** 997,1004 ****
      tuple = ExecFetchSlotTuple(slot);
      tupleDesc = slot->tts_tupleDescriptor;

      Assert(variable->vartype != RECORDOID);
!     var_tupdesc = lookup_rowtype_tupdesc(variable->vartype, -1);

      /* Check to see if any dropped attributes are non-null */
      for (i = 0; i < var_tupdesc->natts; i++)
--- 1055,1063 ----
      tuple = ExecFetchSlotTuple(slot);
      tupleDesc = slot->tts_tupleDescriptor;

+     /* wrv_tupdesc is a good enough representation of the Var's rowtype */
      Assert(variable->vartype != RECORDOID);
!     var_tupdesc = wrvstate->wrv_tupdesc;

      /* Check to see if any dropped attributes are non-null */
      for (i = 0; i < var_tupdesc->natts; i++)
*************** ExecEvalWholeRowSlow(WholeRowVarExprStat
*** 1025,1036 ****
      dtuple = DatumGetHeapTupleHeader(ExecFetchSlotTupleDatum(slot));

      /*
!      * Reset datum's type ID fields to match the Var.
       */
!     HeapTupleHeaderSetTypeId(dtuple, variable->vartype);
!     HeapTupleHeaderSetTypMod(dtuple, variable->vartypmod);
!
!     ReleaseTupleDesc(var_tupdesc);

      return PointerGetDatum(dtuple);
  }
--- 1084,1093 ----
      dtuple = DatumGetHeapTupleHeader(ExecFetchSlotTupleDatum(slot));

      /*
!      * Label the datum with the composite type info we identified before.
       */
!     HeapTupleHeaderSetTypeId(dtuple, wrvstate->wrv_tupdesc->tdtypeid);
!     HeapTupleHeaderSetTypMod(dtuple, wrvstate->wrv_tupdesc->tdtypmod);

      return PointerGetDatum(dtuple);
  }
*************** ExecEvalConvertRowtype(ConvertRowtypeExp
*** 2850,2857 ****
          cstate->initialized = false;
      }

!     Assert(HeapTupleHeaderGetTypeId(tuple) == cstate->indesc->tdtypeid);
!     Assert(HeapTupleHeaderGetTypMod(tuple) == cstate->indesc->tdtypmod);

      /* if first time through, initialize conversion map */
      if (!cstate->initialized)
--- 2907,2920 ----
          cstate->initialized = false;
      }

!     /*
!      * We used to be able to assert that incoming tuples are marked with
!      * exactly the rowtype of cstate->indesc.  However, now that
!      * ExecEvalWholeRowVar might change the tuples' marking to plain RECORD
!      * due to inserting aliases, we can only make this weak test:
!      */
!     Assert(HeapTupleHeaderGetTypeId(tuple) == cstate->indesc->tdtypeid ||
!            HeapTupleHeaderGetTypeId(tuple) == RECORDOID);

      /* if first time through, initialize conversion map */
      if (!cstate->initialized)
*************** ExecInitExpr(Expr *node, PlanState *pare
*** 4375,4380 ****
--- 4438,4444 ----
                  WholeRowVarExprState *wstate = makeNode(WholeRowVarExprState);

                  wstate->parent = parent;
+                 wstate->wrv_tupdesc = NULL;
                  wstate->wrv_junkFilter = NULL;
                  state = (ExprState *) wstate;
                  state->evalfunc = (ExprStateEvalFunc) ExecEvalWholeRowVar;
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 945a414..3a7f0ae 100644
*** a/src/backend/executor/nodeFunctionscan.c
--- b/src/backend/executor/nodeFunctionscan.c
*************** FunctionScanState *
*** 279,286 ****
  ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
  {
      FunctionScanState *scanstate;
-     RangeTblEntry *rte = rt_fetch(node->scan.scanrelid,
-                                   estate->es_range_table);
      int            nfuncs = list_length(node->functions);
      TupleDesc    scan_tupdesc;
      int            i,
--- 279,284 ----
*************** ExecInitFunctionScan(FunctionScan *node,
*** 494,515 ****
          Assert(attno == natts);
      }

-     /*
-      * Make sure the scan result tupdesc has the column names the query
-      * expects.  This affects the output of constructs like row_to_json which
-      * read the column names from the passed-in tupdesc.
-      */
-     i = 0;
-     foreach(lc, rte->eref->colnames)
-     {
-         char       *attname = strVal(lfirst(lc));
-
-         if (i >= scan_tupdesc->natts)
-             break;                /* shouldn't happen, but just in case */
-         namestrcpy(&(scan_tupdesc->attrs[i]->attname), attname);
-         i++;
-     }
-
      ExecAssignScanType(&scanstate->ss, scan_tupdesc);

      /*
--- 492,497 ----
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 39d2c10..6d8cb93 100644
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
*************** typedef struct WholeRowVarExprState
*** 577,582 ****
--- 577,583 ----
  {
      ExprState    xprstate;
      struct PlanState *parent;    /* parent PlanState, or NULL if none */
+     TupleDesc    wrv_tupdesc;    /* descriptor for resulting tuples */
      JunkFilter *wrv_junkFilter; /* JunkFilter to remove resjunk cols */
  } WholeRowVarExprState;
pgsql-hackers by date:
From: Alvaro Herrera
Date: 08 November 2014, 06:27:24
Subject: Re: BRIN indexes - TRAP: BadArgument
From: Michael Paquier
Date: 08 November 2014, 08:01:59
Subject: Re: remove pg_standby?
Re: row_to_json bug with index only scans: empty keys! - Mailing list pgsql-hackers

Previous

Next