Re: Badly broken logic in plpython composite-type handling - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Badly broken logic in plpython composite-type handling
Date
Msg-id 28643.1440029148@sss.pgh.pa.us
Whole thread Raw
In response to Badly broken logic in plpython composite-type handling  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Badly broken logic in plpython composite-type handling  ("Joshua D. Drake" <jd@commandprompt.com>)
List pgsql-hackers
I wrote:
> I looked into the crash reported in bug #13579.  The proximate cause
> of the crash is that PLyString_ToComposite does this:
> ...
> I'm inclined to think that maybe PLyString_ToComposite needs to be doing
> something more like what PLyObject_ToComposite does, ie doing its own
> lookup in a private descriptor; but I'm not sure if that's right, and
> anyway it would just be doubling down on a bad design.  Not being able
> to cache these I/O function lookups is really horrid.

I wrote a draft patch that does it that way.  It indeed stops the crash,
and even better, makes PLyString_ToComposite actually work for RECORD,
which AFAICT it's never done before.  We'd conveniently omitted to test
the case in the plpython regression tests, thus masking the fact that
it would crash horribly if you invoked such a case more than once.

To make it work, I had to modify record_in to allow inputting a value
of type RECORD as long as it's given a correct typmod.  While that would
not happen in ordinary SQL, it's possible in this and probably other
usages, and I can't really see any downside.  The emitted record will
be properly marked with type RECORDOID and typmod whatever the internal
anonymous-type identifier is, and it's no more or less type-safe than
any other such value.

This version of PLyString_ToComposite is no better than
PLyObject_ToComposite as far as leaking memory goes.  We could probably
fix that in a crude fashion by using plpython's "scratch context" for the
transient type-lookup data, but I'd rather see a proper fix wherein the
lookup results stay cached.  In any case, that's a separate and less
critical matter.

Barring objections, I propose to commit and back-patch this.  The crash
can be demonstrated back to 9.1.

            regards, tom lane

diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c
index a65e18d..ec4f71e 100644
*** a/src/backend/utils/adt/rowtypes.c
--- b/src/backend/utils/adt/rowtypes.c
*************** record_in(PG_FUNCTION_ARGS)
*** 73,84 ****
  {
      char       *string = PG_GETARG_CSTRING(0);
      Oid            tupType = PG_GETARG_OID(1);
!
! #ifdef NOT_USED
!     int32        typmod = PG_GETARG_INT32(2);
! #endif
      HeapTupleHeader result;
-     int32        tupTypmod;
      TupleDesc    tupdesc;
      HeapTuple    tuple;
      RecordIOData *my_extra;
--- 73,80 ----
  {
      char       *string = PG_GETARG_CSTRING(0);
      Oid            tupType = PG_GETARG_OID(1);
!     int32        tupTypmod = PG_GETARG_INT32(2);
      HeapTupleHeader result;
      TupleDesc    tupdesc;
      HeapTuple    tuple;
      RecordIOData *my_extra;
*************** record_in(PG_FUNCTION_ARGS)
*** 91,106 ****
      StringInfoData buf;

      /*
!      * Use the passed type unless it's RECORD; we can't support input of
!      * anonymous types, mainly because there's no good way to figure out which
!      * anonymous type is wanted.  Note that for RECORD, what we'll probably
!      * actually get is RECORD's typelem, ie, zero.
       */
!     if (tupType == InvalidOid || tupType == RECORDOID)
          ereport(ERROR,
                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
             errmsg("input of anonymous composite types is not implemented")));
-     tupTypmod = -1;                /* for all non-anonymous types */

      /*
       * This comes from the composite type's pg_type.oid and stores system oids
--- 87,103 ----
      StringInfoData buf;

      /*
!      * Give a friendly error message if we did not get enough info to identify
!      * the target record type.  (lookup_rowtype_tupdesc would fail anyway, but
!      * with a non-user-facing message.)  Note that for RECORD, what we'll
!      * usually actually get is RECORD's typelem, ie, zero.  However there are
!      * cases where we do get a valid typmod and can do something useful.
       */
!     if (tupType == InvalidOid ||
!         (tupType == RECORDOID && tupTypmod < 0))
          ereport(ERROR,
                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
             errmsg("input of anonymous composite types is not implemented")));

      /*
       * This comes from the composite type's pg_type.oid and stores system oids
*************** record_recv(PG_FUNCTION_ARGS)
*** 449,460 ****
  {
      StringInfo    buf = (StringInfo) PG_GETARG_POINTER(0);
      Oid            tupType = PG_GETARG_OID(1);
!
! #ifdef NOT_USED
!     int32        typmod = PG_GETARG_INT32(2);
! #endif
      HeapTupleHeader result;
-     int32        tupTypmod;
      TupleDesc    tupdesc;
      HeapTuple    tuple;
      RecordIOData *my_extra;
--- 446,453 ----
  {
      StringInfo    buf = (StringInfo) PG_GETARG_POINTER(0);
      Oid            tupType = PG_GETARG_OID(1);
!     int32        tupTypmod = PG_GETARG_INT32(2);
      HeapTupleHeader result;
      TupleDesc    tupdesc;
      HeapTuple    tuple;
      RecordIOData *my_extra;
*************** record_recv(PG_FUNCTION_ARGS)
*** 466,481 ****
      bool       *nulls;

      /*
!      * Use the passed type unless it's RECORD; we can't support input of
!      * anonymous types, mainly because there's no good way to figure out which
!      * anonymous type is wanted.  Note that for RECORD, what we'll probably
!      * actually get is RECORD's typelem, ie, zero.
       */
!     if (tupType == InvalidOid || tupType == RECORDOID)
          ereport(ERROR,
                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
             errmsg("input of anonymous composite types is not implemented")));
!     tupTypmod = -1;                /* for all non-anonymous types */
      tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod);
      ncolumns = tupdesc->natts;

--- 459,476 ----
      bool       *nulls;

      /*
!      * Give a friendly error message if we did not get enough info to identify
!      * the target record type.  (lookup_rowtype_tupdesc would fail anyway, but
!      * with a non-user-facing message.)  Note that for RECORD, what we'll
!      * usually actually get is RECORD's typelem, ie, zero.  However there are
!      * cases where we do get a valid typmod and can do something useful.
       */
!     if (tupType == InvalidOid ||
!         (tupType == RECORDOID && tupTypmod < 0))
          ereport(ERROR,
                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
             errmsg("input of anonymous composite types is not implemented")));
!
      tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod);
      ncolumns = tupdesc->natts;

diff --git a/src/pl/plpython/expected/plpython_composite.out b/src/pl/plpython/expected/plpython_composite.out
index ad454f3..0ef0c21 100644
*** a/src/pl/plpython/expected/plpython_composite.out
--- b/src/pl/plpython/expected/plpython_composite.out
*************** elif typ == 'obj':
*** 65,70 ****
--- 65,72 ----
      type_record.first = first
      type_record.second = second
      return type_record
+ elif typ == 'str':
+     return "('%s',%r)" % (first, second)
  $$ LANGUAGE plpythonu;
  SELECT * FROM multiout_record_as('dict', 'foo', 1, 'f');
   first | second
*************** SELECT multiout_record_as('dict', 'foo',
*** 78,83 ****
--- 80,217 ----
   (foo,1)
  (1 row)

+ SELECT * FROM multiout_record_as('dict', null, null, false);
+  first | second
+ -------+--------
+        |
+ (1 row)
+
+ SELECT * FROM multiout_record_as('dict', 'one', null, false);
+  first | second
+ -------+--------
+  one   |
+ (1 row)
+
+ SELECT * FROM multiout_record_as('dict', null, 2, false);
+  first | second
+ -------+--------
+        |      2
+ (1 row)
+
+ SELECT * FROM multiout_record_as('dict', 'three', 3, false);
+  first | second
+ -------+--------
+  three |      3
+ (1 row)
+
+ SELECT * FROM multiout_record_as('dict', null, null, true);
+  first | second
+ -------+--------
+        |
+ (1 row)
+
+ SELECT * FROM multiout_record_as('tuple', null, null, false);
+  first | second
+ -------+--------
+        |
+ (1 row)
+
+ SELECT * FROM multiout_record_as('tuple', 'one', null, false);
+  first | second
+ -------+--------
+  one   |
+ (1 row)
+
+ SELECT * FROM multiout_record_as('tuple', null, 2, false);
+  first | second
+ -------+--------
+        |      2
+ (1 row)
+
+ SELECT * FROM multiout_record_as('tuple', 'three', 3, false);
+  first | second
+ -------+--------
+  three |      3
+ (1 row)
+
+ SELECT * FROM multiout_record_as('tuple', null, null, true);
+  first | second
+ -------+--------
+        |
+ (1 row)
+
+ SELECT * FROM multiout_record_as('list', null, null, false);
+  first | second
+ -------+--------
+        |
+ (1 row)
+
+ SELECT * FROM multiout_record_as('list', 'one', null, false);
+  first | second
+ -------+--------
+  one   |
+ (1 row)
+
+ SELECT * FROM multiout_record_as('list', null, 2, false);
+  first | second
+ -------+--------
+        |      2
+ (1 row)
+
+ SELECT * FROM multiout_record_as('list', 'three', 3, false);
+  first | second
+ -------+--------
+  three |      3
+ (1 row)
+
+ SELECT * FROM multiout_record_as('list', null, null, true);
+  first | second
+ -------+--------
+        |
+ (1 row)
+
+ SELECT * FROM multiout_record_as('obj', null, null, false);
+  first | second
+ -------+--------
+        |
+ (1 row)
+
+ SELECT * FROM multiout_record_as('obj', 'one', null, false);
+  first | second
+ -------+--------
+  one   |
+ (1 row)
+
+ SELECT * FROM multiout_record_as('obj', null, 2, false);
+  first | second
+ -------+--------
+        |      2
+ (1 row)
+
+ SELECT * FROM multiout_record_as('obj', 'three', 3, false);
+  first | second
+ -------+--------
+  three |      3
+ (1 row)
+
+ SELECT * FROM multiout_record_as('obj', null, null, true);
+  first | second
+ -------+--------
+        |
+ (1 row)
+
+ SELECT * FROM multiout_record_as('str', 'one', 1, false);
+  first | second
+ -------+--------
+  'one' |      1
+ (1 row)
+
+ SELECT * FROM multiout_record_as('str', 'one', 2, false);
+  first | second
+ -------+--------
+  'one' |      2
+ (1 row)
+
  SELECT *, s IS NULL AS snull FROM multiout_record_as('tuple', 'xxx', NULL, 'f') AS f(f, s);
    f  | s | snull
  -----+---+-------
*************** elif typ == 'dict':
*** 179,188 ****
--- 313,326 ----
      d = {'first': n * 2, 'second': n * 3, 'extra': 'not important'}
  elif typ == 'tuple':
      d = (n * 2, n * 3)
+ elif typ == 'list':
+     d = [ n * 2, n * 3 ]
  elif typ == 'obj':
      class d: pass
      d.first = n * 2
      d.second = n * 3
+ elif typ == 'str':
+     d = "(%r,%r)" % (n * 2, n * 3)
  for i in range(n):
      yield (i, d)
  $$ LANGUAGE plpythonu;
*************** SELECT * FROM multiout_table_type_setof(
*** 200,205 ****
--- 338,355 ----
   2 | (6,9)
  (3 rows)

+ SELECT * FROM multiout_table_type_setof('dict', 'f', 7);
+  n | column2
+ ---+---------
+  0 | (14,21)
+  1 | (14,21)
+  2 | (14,21)
+  3 | (14,21)
+  4 | (14,21)
+  5 | (14,21)
+  6 | (14,21)
+ (7 rows)
+
  SELECT * FROM multiout_table_type_setof('tuple', 'f', 2);
   n | column2
  ---+---------
*************** SELECT * FROM multiout_table_type_setof(
*** 207,212 ****
--- 357,385 ----
   1 | (4,6)
  (2 rows)

+ SELECT * FROM multiout_table_type_setof('tuple', 'f', 3);
+  n | column2
+ ---+---------
+  0 | (6,9)
+  1 | (6,9)
+  2 | (6,9)
+ (3 rows)
+
+ SELECT * FROM multiout_table_type_setof('list', 'f', 2);
+  n | column2
+ ---+---------
+  0 | (4,6)
+  1 | (4,6)
+ (2 rows)
+
+ SELECT * FROM multiout_table_type_setof('list', 'f', 3);
+  n | column2
+ ---+---------
+  0 | (6,9)
+  1 | (6,9)
+  2 | (6,9)
+ (3 rows)
+
  SELECT * FROM multiout_table_type_setof('obj', 'f', 4);
   n | column2
  ---+---------
*************** SELECT * FROM multiout_table_type_setof(
*** 216,221 ****
--- 389,427 ----
   3 | (8,12)
  (4 rows)

+ SELECT * FROM multiout_table_type_setof('obj', 'f', 5);
+  n | column2
+ ---+---------
+  0 | (10,15)
+  1 | (10,15)
+  2 | (10,15)
+  3 | (10,15)
+  4 | (10,15)
+ (5 rows)
+
+ SELECT * FROM multiout_table_type_setof('str', 'f', 6);
+  n | column2
+ ---+---------
+  0 | (12,18)
+  1 | (12,18)
+  2 | (12,18)
+  3 | (12,18)
+  4 | (12,18)
+  5 | (12,18)
+ (6 rows)
+
+ SELECT * FROM multiout_table_type_setof('str', 'f', 7);
+  n | column2
+ ---+---------
+  0 | (14,21)
+  1 | (14,21)
+  2 | (14,21)
+  3 | (14,21)
+  4 | (14,21)
+  5 | (14,21)
+  6 | (14,21)
+ (7 rows)
+
  SELECT * FROM multiout_table_type_setof('dict', 't', 3);
   n | column2
  ---+---------
diff --git a/src/pl/plpython/plpy_typeio.c b/src/pl/plpython/plpy_typeio.c
index 7a45b22..05add6e 100644
*** a/src/pl/plpython/plpy_typeio.c
--- b/src/pl/plpython/plpy_typeio.c
*************** PLySequence_ToArray(PLyObToDatum *arg, i
*** 912,931 ****
  static Datum
  PLyString_ToComposite(PLyTypeInfo *info, TupleDesc desc, PyObject *string)
  {
      HeapTuple    typeTup;
      PLyExecutionContext *exec_ctx = PLy_current_execution_context();

      typeTup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(desc->tdtypeid));
      if (!HeapTupleIsValid(typeTup))
          elog(ERROR, "cache lookup failed for type %u", desc->tdtypeid);

!     PLy_output_datum_func2(&info->out.d, typeTup,
                             exec_ctx->curr_proc->langid,
                             exec_ctx->curr_proc->trftypes);

      ReleaseSysCache(typeTup);

!     return PLyObject_ToDatum(&info->out.d, info->out.d.typmod, string);
  }


--- 912,941 ----
  static Datum
  PLyString_ToComposite(PLyTypeInfo *info, TupleDesc desc, PyObject *string)
  {
+     Datum        result;
      HeapTuple    typeTup;
+     PLyTypeInfo locinfo;
      PLyExecutionContext *exec_ctx = PLy_current_execution_context();

+     /* Create a dummy PLyTypeInfo */
+     MemSet(&locinfo, 0, sizeof(PLyTypeInfo));
+     PLy_typeinfo_init(&locinfo);
+
      typeTup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(desc->tdtypeid));
      if (!HeapTupleIsValid(typeTup))
          elog(ERROR, "cache lookup failed for type %u", desc->tdtypeid);

!     PLy_output_datum_func2(&locinfo.out.d, typeTup,
                             exec_ctx->curr_proc->langid,
                             exec_ctx->curr_proc->trftypes);

      ReleaseSysCache(typeTup);

!     result = PLyObject_ToDatum(&locinfo.out.d, desc->tdtypmod, string);
!
!     PLy_typeinfo_dealloc(&locinfo);
!
!     return result;
  }


diff --git a/src/pl/plpython/sql/plpython_composite.sql b/src/pl/plpython/sql/plpython_composite.sql
index cb5fffe..473342c 100644
*** a/src/pl/plpython/sql/plpython_composite.sql
--- b/src/pl/plpython/sql/plpython_composite.sql
*************** elif typ == 'obj':
*** 32,41 ****
--- 32,71 ----
      type_record.first = first
      type_record.second = second
      return type_record
+ elif typ == 'str':
+     return "('%s',%r)" % (first, second)
  $$ LANGUAGE plpythonu;

  SELECT * FROM multiout_record_as('dict', 'foo', 1, 'f');
  SELECT multiout_record_as('dict', 'foo', 1, 'f');
+
+ SELECT * FROM multiout_record_as('dict', null, null, false);
+ SELECT * FROM multiout_record_as('dict', 'one', null, false);
+ SELECT * FROM multiout_record_as('dict', null, 2, false);
+ SELECT * FROM multiout_record_as('dict', 'three', 3, false);
+ SELECT * FROM multiout_record_as('dict', null, null, true);
+
+ SELECT * FROM multiout_record_as('tuple', null, null, false);
+ SELECT * FROM multiout_record_as('tuple', 'one', null, false);
+ SELECT * FROM multiout_record_as('tuple', null, 2, false);
+ SELECT * FROM multiout_record_as('tuple', 'three', 3, false);
+ SELECT * FROM multiout_record_as('tuple', null, null, true);
+
+ SELECT * FROM multiout_record_as('list', null, null, false);
+ SELECT * FROM multiout_record_as('list', 'one', null, false);
+ SELECT * FROM multiout_record_as('list', null, 2, false);
+ SELECT * FROM multiout_record_as('list', 'three', 3, false);
+ SELECT * FROM multiout_record_as('list', null, null, true);
+
+ SELECT * FROM multiout_record_as('obj', null, null, false);
+ SELECT * FROM multiout_record_as('obj', 'one', null, false);
+ SELECT * FROM multiout_record_as('obj', null, 2, false);
+ SELECT * FROM multiout_record_as('obj', 'three', 3, false);
+ SELECT * FROM multiout_record_as('obj', null, null, true);
+
+ SELECT * FROM multiout_record_as('str', 'one', 1, false);
+ SELECT * FROM multiout_record_as('str', 'one', 2, false);
+
  SELECT *, s IS NULL AS snull FROM multiout_record_as('tuple', 'xxx', NULL, 'f') AS f(f, s);
  SELECT *, f IS NULL AS fnull, s IS NULL AS snull FROM multiout_record_as('tuple', 'xxx', 1, 't') AS f(f, s);
  SELECT * FROM multiout_record_as('obj', NULL, 10, 'f');
*************** elif typ == 'dict':
*** 92,109 ****
--- 122,150 ----
      d = {'first': n * 2, 'second': n * 3, 'extra': 'not important'}
  elif typ == 'tuple':
      d = (n * 2, n * 3)
+ elif typ == 'list':
+     d = [ n * 2, n * 3 ]
  elif typ == 'obj':
      class d: pass
      d.first = n * 2
      d.second = n * 3
+ elif typ == 'str':
+     d = "(%r,%r)" % (n * 2, n * 3)
  for i in range(n):
      yield (i, d)
  $$ LANGUAGE plpythonu;

  SELECT * FROM multiout_composite(2);
  SELECT * FROM multiout_table_type_setof('dict', 'f', 3);
+ SELECT * FROM multiout_table_type_setof('dict', 'f', 7);
  SELECT * FROM multiout_table_type_setof('tuple', 'f', 2);
+ SELECT * FROM multiout_table_type_setof('tuple', 'f', 3);
+ SELECT * FROM multiout_table_type_setof('list', 'f', 2);
+ SELECT * FROM multiout_table_type_setof('list', 'f', 3);
  SELECT * FROM multiout_table_type_setof('obj', 'f', 4);
+ SELECT * FROM multiout_table_type_setof('obj', 'f', 5);
+ SELECT * FROM multiout_table_type_setof('str', 'f', 6);
+ SELECT * FROM multiout_table_type_setof('str', 'f', 7);
  SELECT * FROM multiout_table_type_setof('dict', 't', 3);

  -- check what happens if a type changes under us

pgsql-hackers by date:

Previous
From: Jim Nasby
Date:
Subject: Re: proposal: function parse_ident
Next
From: "Joshua D. Drake"
Date:
Subject: Re: Badly broken logic in plpython composite-type handling