Re: Composite Datums containing toasted fields are a bad idea(?) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Composite Datums containing toasted fields are a bad idea(?)
Date
Msg-id 1918.1397860474@sss.pgh.pa.us
Whole thread Raw
In response to Re: Composite Datums containing toasted fields are a bad idea(?)  (Noah Misch <noah@leadboat.com>)
Responses Re: Composite Datums containing toasted fields are a bad idea(?)
List pgsql-hackers
Noah Misch <noah@leadboat.com> writes:
> Interesting bug.
> On Fri, Mar 28, 2014 at 04:34:41PM -0400, Tom Lane wrote:
>> I think we might be better off to get rid of toast_flatten_tuple_attribute
>> and instead insist that composite Datums never contain any external toast
>> pointers in the first place.

> Performance is the dominant issue here; the hacker-centric considerations you
> outlined vanish next to it.

I'm not exactly convinced by that line of argument, especially when it's
entirely unsupported by any evidence as to which implementation approach
will actually perform worse.

However, I have done some investigation into what it'd take to keep the
current approach and teach the array code to detoast composite-type array
elements properly.  The attached draft patch only fixes arrays; ranges
need work too, and I'm not sure what else might need adjustment.  But this
patch does fix the test case provided by Jan Pecek.

The main problem with this patch, as I see it, is that it'll introduce
extra syscache lookup overhead even when there are no toasted fields
anywhere.  I've not really tried to quantify how much, since that would
require first agreeing on a benchmark case --- anybody have a thought
about what that ought to be?  But in at least some of these functions,
we'll be paying a cache lookup or two per array element, which doesn't
sound promising.

This could be alleviated by caching the lookup results over longer
intervals, but I don't see any way to do that without invasive changes
to the APIs of a number of exported array functions, which doesn't seem
like a good thing for a bug fix that we need to back-patch.  I think the
back-branch fix would have to be pretty much what you see here, even if
we then make changes to reduce the costs in HEAD.

Comments?

            regards, tom lane

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index aea9d40..48b09b8 100644
*** a/src/backend/access/common/heaptuple.c
--- b/src/backend/access/common/heaptuple.c
*************** heap_form_tuple(TupleDesc tupleDescripto
*** 649,674 ****
       * Check for nulls and embedded tuples; expand any toasted attributes in
       * embedded tuples.  This preserves the invariant that toasting can only
       * go one level deep.
-      *
-      * We can skip calling toast_flatten_tuple_attribute() if the attribute
-      * couldn't possibly be of composite type.  All composite datums are
-      * varlena and have alignment 'd'; furthermore they aren't arrays. Also,
-      * if an attribute is already toasted, it must have been sent to disk
-      * already and so cannot contain toasted attributes.
       */
      for (i = 0; i < numberOfAttributes; i++)
      {
          if (isnull[i])
              hasnull = true;
!         else if (att[i]->attlen == -1 &&
!                  att[i]->attalign == 'd' &&
!                  att[i]->attndims == 0 &&
!                  !VARATT_IS_EXTENDED(DatumGetPointer(values[i])))
!         {
!             values[i] = toast_flatten_tuple_attribute(values[i],
!                                                       att[i]->atttypid,
!                                                       att[i]->atttypmod);
!         }
      }

      /*
--- 649,661 ----
       * Check for nulls and embedded tuples; expand any toasted attributes in
       * embedded tuples.  This preserves the invariant that toasting can only
       * go one level deep.
       */
      for (i = 0; i < numberOfAttributes; i++)
      {
          if (isnull[i])
              hasnull = true;
!         else
!             values[i] = toast_flatten_tuple_attribute(values[i], att[i]);
      }

      /*
*************** heap_form_minimal_tuple(TupleDesc tupleD
*** 1403,1428 ****
       * Check for nulls and embedded tuples; expand any toasted attributes in
       * embedded tuples.  This preserves the invariant that toasting can only
       * go one level deep.
-      *
-      * We can skip calling toast_flatten_tuple_attribute() if the attribute
-      * couldn't possibly be of composite type.  All composite datums are
-      * varlena and have alignment 'd'; furthermore they aren't arrays. Also,
-      * if an attribute is already toasted, it must have been sent to disk
-      * already and so cannot contain toasted attributes.
       */
      for (i = 0; i < numberOfAttributes; i++)
      {
          if (isnull[i])
              hasnull = true;
!         else if (att[i]->attlen == -1 &&
!                  att[i]->attalign == 'd' &&
!                  att[i]->attndims == 0 &&
!                  !VARATT_IS_EXTENDED(values[i]))
!         {
!             values[i] = toast_flatten_tuple_attribute(values[i],
!                                                       att[i]->atttypid,
!                                                       att[i]->atttypmod);
!         }
      }

      /*
--- 1390,1402 ----
       * Check for nulls and embedded tuples; expand any toasted attributes in
       * embedded tuples.  This preserves the invariant that toasting can only
       * go one level deep.
       */
      for (i = 0; i < numberOfAttributes; i++)
      {
          if (isnull[i])
              hasnull = true;
!         else
!             values[i] = toast_flatten_tuple_attribute(values[i], att[i]);
      }

      /*
diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index 9a821d3..2d1a922 100644
*** a/src/backend/access/heap/tuptoaster.c
--- b/src/backend/access/heap/tuptoaster.c
*************** toast_insert_or_update(Relation rel, Hea
*** 991,996 ****
--- 991,999 ----
   *
   *    "Flatten" a tuple to contain no out-of-line toasted fields.
   *    (This does not eliminate compressed or short-header datums.)
+  *
+  *    Note: we expect the caller already checked HeapTupleHasExternal(tup),
+  *    so there is no need for a short-circuit path.
   * ----------
   */
  HeapTuple
*************** toast_flatten_tuple(HeapTuple tup, Tuple
*** 1068,1097 ****


  /* ----------
!  * toast_flatten_tuple_attribute -
   *
!  *    If a Datum is of composite type, "flatten" it to contain no toasted fields.
!  *    This must be invoked on any potentially-composite field that is to be
!  *    inserted into a tuple.    Doing this preserves the invariant that toasting
!  *    goes only one level deep in a tuple.
   *
!  *    Note that flattening does not mean expansion of short-header varlenas,
!  *    so in one sense toasting is allowed within composite datums.
   * ----------
   */
  Datum
! toast_flatten_tuple_attribute(Datum value,
!                               Oid typeId, int32 typeMod)
  {
-     TupleDesc    tupleDesc;
      HeapTupleHeader olddata;
      HeapTupleHeader new_data;
      int32        new_header_len;
      int32        new_data_len;
      int32        new_tuple_len;
      HeapTupleData tmptup;
!     Form_pg_attribute *att;
!     int            numAttrs;
      int            i;
      bool        need_change = false;
      bool        has_nulls = false;
--- 1071,1110 ----


  /* ----------
!  * toast_flatten_tuple_datum -
   *
!  *    "Flatten" a composite Datum to contain no toasted fields.
!  *    This must be invoked on any composite value that is to be inserted into
!  *    a tuple, array, range, etc.  Doing this preserves the invariant that
!  *    toasting goes only one level deep in a tuple.
   *
!  *    Unlike toast_flatten_tuple, this does expand compressed fields.  That's
!  *    not necessary for correctness, but is a policy decision based on the
!  *    assumption that compression will be more effective if applied to the
!  *    whole tuple not individual fields.
!  *
!  *    On the other hand, in-line short-header varlena fields are left alone.
!  *    If we "untoasted" them here, they'd just get changed back to short-header
!  *    format anyway within heap_fill_tuple.
!  *
!  *    Note: generally speaking, callers should skip applying this to Datums
!  *    that are toasted overall, even just to the extent of being in short-header
!  *    format.  That implies that the Datum has previously been stored within
!  *    some tuple, array, range, etc, and therefore it should already not contain
!  *    any out-of-line fields.
   * ----------
   */
  Datum
! toast_flatten_tuple_datum(Datum value, TupleDesc tupleDesc)
  {
      HeapTupleHeader olddata;
      HeapTupleHeader new_data;
      int32        new_header_len;
      int32        new_data_len;
      int32        new_tuple_len;
      HeapTupleData tmptup;
!     Form_pg_attribute *att = tupleDesc->attrs;
!     int            numAttrs = tupleDesc->natts;
      int            i;
      bool        need_change = false;
      bool        has_nulls = false;
*************** toast_flatten_tuple_attribute(Datum valu
*** 1100,1120 ****
      bool        toast_free[MaxTupleAttributeNumber];

      /*
-      * See if it's a composite type, and get the tupdesc if so.
-      */
-     tupleDesc = lookup_rowtype_tupdesc_noerror(typeId, typeMod, true);
-     if (tupleDesc == NULL)
-         return value;            /* not a composite type */
-
-     att = tupleDesc->attrs;
-     numAttrs = tupleDesc->natts;
-
-     /*
       * Break down the tuple into fields.
       */
      olddata = DatumGetHeapTupleHeader(value);
!     Assert(typeId == HeapTupleHeaderGetTypeId(olddata));
!     Assert(typeMod == HeapTupleHeaderGetTypMod(olddata));
      /* Build a temporary HeapTuple control structure */
      tmptup.t_len = HeapTupleHeaderGetDatumLength(olddata);
      ItemPointerSetInvalid(&(tmptup.t_self));
--- 1113,1128 ----
      bool        toast_free[MaxTupleAttributeNumber];

      /*
       * Break down the tuple into fields.
+      *
+      * Note: if the supplied datum is toasted overall, DatumGetHeapTupleHeader
+      * will detoast it, and then we'll leak the detoasted copy.  This is not
+      * worth fixing because callers shouldn't call us on toasted datums
+      * anyway.
       */
      olddata = DatumGetHeapTupleHeader(value);
!     Assert(HeapTupleHeaderGetTypeId(olddata) == tupleDesc->tdtypeid);
!     Assert(HeapTupleHeaderGetTypMod(olddata) == tupleDesc->tdtypmod);
      /* Build a temporary HeapTuple control structure */
      tmptup.t_len = HeapTupleHeaderGetDatumLength(olddata);
      ItemPointerSetInvalid(&(tmptup.t_self));
*************** toast_flatten_tuple_attribute(Datum valu
*** 1150,1162 ****
      }

      /*
!      * If nothing to untoast, just return the original tuple.
       */
      if (!need_change)
-     {
-         ReleaseTupleDesc(tupleDesc);
          return value;
-     }

      /*
       * Calculate the new size of the tuple.
--- 1158,1167 ----
      }

      /*
!      * If nothing to untoast, just return the original datum.
       */
      if (!need_change)
          return value;

      /*
       * Calculate the new size of the tuple.
*************** toast_flatten_tuple_attribute(Datum valu
*** 1202,1214 ****
      for (i = 0; i < numAttrs; i++)
          if (toast_free[i])
              pfree(DatumGetPointer(toast_values[i]));
-     ReleaseTupleDesc(tupleDesc);

      return PointerGetDatum(new_data);
  }


  /* ----------
   * toast_compress_datum -
   *
   *    Create a compressed version of a varlena datum
--- 1207,1269 ----
      for (i = 0; i < numAttrs; i++)
          if (toast_free[i])
              pfree(DatumGetPointer(toast_values[i]));

      return PointerGetDatum(new_data);
  }


  /* ----------
+  * toast_flatten_tuple_attribute -
+  *
+  *    If a Datum is of composite type, "flatten" it to contain no toasted fields.
+  *    This is a convenience routine for doing toast_flatten_tuple_datum() on
+  *    tuple fields.
+  * ----------
+  */
+ Datum
+ toast_flatten_tuple_attribute(Datum value, Form_pg_attribute att)
+ {
+     TupleDesc    tupleDesc;
+
+     /*
+      * Exit quickly if the attribute couldn't possibly be of composite type.
+      * All composite datums are varlena and have alignment 'd'; furthermore
+      * they aren't arrays.  This heuristic is sufficient to eliminate all
+      * but a few non-composite types.
+      */
+     if (att->attlen != -1 ||
+         att->attalign != 'd' ||
+         att->attndims != 0)
+         return value;
+
+     /*
+      * Also, if the datum is already toasted, it must have already been stored
+      * within some larger tuple (or array, range, etc) and so it doesn't need
+      * to be flattened again.  (It is not our job here to force it to be
+      * uncompressed, so don't worry about that.)
+      */
+     if (VARATT_IS_EXTENDED(DatumGetPointer(value)))
+         return value;
+
+     /*
+      * See if it's a composite type, and get the tupdesc if so.
+      */
+     tupleDesc = lookup_rowtype_tupdesc_noerror(att->atttypid,
+                                                att->atttypmod,
+                                                true);
+     if (tupleDesc == NULL)
+         return value;            /* not a composite type */
+
+     /* OK, flatten it */
+     value = toast_flatten_tuple_datum(value, tupleDesc);
+
+     ReleaseTupleDesc(tupleDesc);
+
+     return value;
+ }
+
+
+ /* ----------
   * toast_compress_datum -
   *
   *    Create a compressed version of a varlena datum
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 91df184..530031f 100644
*** a/src/backend/utils/adt/arrayfuncs.c
--- b/src/backend/utils/adt/arrayfuncs.c
***************
*** 17,22 ****
--- 17,24 ----
  #include <ctype.h>

  #include "access/htup_details.h"
+ #include "access/tuptoaster.h"
+ #include "catalog/pg_type.h"
  #include "funcapi.h"
  #include "libpq/pqformat.h"
  #include "utils/array.h"
*************** typedef struct ArrayIteratorData
*** 74,79 ****
--- 76,120 ----
      int            current_item;    /* the item # we're at in the array */
  }    ArrayIteratorData;

+ /* Support for array element detoasting */
+ typedef struct ArrayFlattenState
+ {
+     bool        typiscomposite; /* element type is composite? */
+     TupleDesc    tupdesc;        /* element type's tupledesc, if known */
+ } ArrayFlattenState;
+
+ /* Call this on each non-NULL value about to be inserted into an array */
+ #define FLATTEN_ARRAY_ELEMENT(val, typlen, afstate) \
+     do { \
+         if ((typlen) == -1) \
+         { \
+             if ((afstate).typiscomposite) \
+                 val = flatten_composite_element(val, &(afstate)); \
+             else \
+                 val = PointerGetDatum(PG_DETOAST_DATUM(val)); \
+         } \
+     } while (0)
+
+ /* Variant that guarantees to make a copy */
+ #define FLATTEN_ARRAY_ELEMENT_COPY(val, typlen, typbyval, afstate) \
+     do { \
+         if ((typlen) == -1) \
+         { \
+             if ((afstate).typiscomposite) \
+             { \
+                 Datum _orig_val = (val); \
+                 val = flatten_composite_element(val, &(afstate)); \
+                 if (val == _orig_val) \
+                     val = datumCopy(val, typbyval, typlen); \
+             } \
+             else \
+                 val = PointerGetDatum(PG_DETOAST_DATUM_COPY(val)); \
+         } \
+         else \
+             val = datumCopy(val, typbyval, typlen); \
+     } while (0)
+
+ /* Local functions */
  static bool array_isspace(char ch);
  static int    ArrayCount(const char *str, int *dim, char typdelim);
  static void ReadArrayStr(char *arrayStr, const char *origStr,
*************** static void CopyArrayEls(ArrayType *arra
*** 92,97 ****
--- 133,143 ----
               Datum *values, bool *nulls, int nitems,
               int typlen, bool typbyval, char typalign,
               bool freedata);
+ static int    array_cmp(FunctionCallInfo fcinfo);
+ static void init_flatten_support(ArrayFlattenState *afstate, Oid elmtype,
+                      int16 typlen, char typalign);
+ static Datum flatten_composite_element(Datum value, ArrayFlattenState *afstate);
+ static void end_flatten_support(ArrayFlattenState *afstate);
  static bool array_get_isnull(const bits8 *nullbitmap, int offset);
  static void array_set_isnull(bits8 *nullbitmap, int offset, bool isNull);
  static Datum ArrayCast(char *value, bool byval, int len);
*************** static void array_insert_slice(ArrayType
*** 119,125 ****
                     int ndim, int *dim, int *lb,
                     int *st, int *endp,
                     int typlen, bool typbyval, char typalign);
- static int    array_cmp(FunctionCallInfo fcinfo);
  static ArrayType *create_array_envelope(int ndims, int *dimv, int *lbv, int nbytes,
                        Oid elmtype, int dataoffset);
  static ArrayType *array_fill_internal(ArrayType *dims, ArrayType *lbs,
--- 165,170 ----
*************** ReadArrayStr(char *arrayStr,
*** 864,870 ****
              hasnull = true;
          else
          {
!             /* let's just make sure data is not toasted */
              if (typlen == -1)
                  values[i] = PointerGetDatum(PG_DETOAST_DATUM(values[i]));
              totbytes = att_addlength_datum(totbytes, typlen, values[i]);
--- 909,922 ----
              hasnull = true;
          else
          {
!             /*
!              * Let's just make sure data is not toasted.  It seems quite
!              * unlikely that the input function could have produced a toasted
!              * value at all, and certainly it shouldn't have produced
!              * something containing nested toasted values, so we don't bother
!              * with FLATTEN_ARRAY_ELEMENT().  But a PG_DETOAST_DATUM() call is
!              * reasonably cheap when it's a no-op.
!              */
              if (typlen == -1)
                  values[i] = PointerGetDatum(PG_DETOAST_DATUM(values[i]));
              totbytes = att_addlength_datum(totbytes, typlen, values[i]);
*************** ReadArrayBinary(StringInfo buf,
*** 1466,1472 ****
              hasnull = true;
          else
          {
!             /* let's just make sure data is not toasted */
              if (typlen == -1)
                  values[i] = PointerGetDatum(PG_DETOAST_DATUM(values[i]));
              totbytes = att_addlength_datum(totbytes, typlen, values[i]);
--- 1518,1531 ----
              hasnull = true;
          else
          {
!             /*
!              * Let's just make sure data is not toasted.  It seems quite
!              * unlikely that the receive function could have produced a
!              * toasted value at all, and certainly it shouldn't have produced
!              * something containing nested toasted values, so we don't bother
!              * with FLATTEN_ARRAY_ELEMENT().  But a PG_DETOAST_DATUM() call is
!              * reasonably cheap when it's a no-op.
!              */
              if (typlen == -1)
                  values[i] = PointerGetDatum(PG_DETOAST_DATUM(values[i]));
              totbytes = att_addlength_datum(totbytes, typlen, values[i]);
*************** array_set(ArrayType *array,
*** 2060,2065 ****
--- 2119,2125 ----
            char elmalign)
  {
      ArrayType  *newarray;
+     Oid            elmtype;
      int            i,
                  ndim,
                  dim[MAXDIM],
*************** array_set(ArrayType *array,
*** 2114,2128 ****
                  (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
                   errmsg("wrong number of array subscripts")));

-     /* make sure item to be inserted is not toasted */
-     if (elmlen == -1 && !isNull)
-         dataValue = PointerGetDatum(PG_DETOAST_DATUM(dataValue));
-
      /* detoast input array if necessary */
      array = DatumGetArrayTypeP(PointerGetDatum(array));

      ndim = ARR_NDIM(array);

      /*
       * if number of dims is zero, i.e. an empty array, create an array with
       * nSubscripts dimensions, and set the lower bounds to the supplied
--- 2174,2195 ----
                  (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
                   errmsg("wrong number of array subscripts")));

      /* detoast input array if necessary */
      array = DatumGetArrayTypeP(PointerGetDatum(array));

+     elmtype = ARR_ELEMTYPE(array);
      ndim = ARR_NDIM(array);

+     /* make sure item to be inserted is not toasted */
+     if (elmlen == -1 && !isNull)
+     {
+         ArrayFlattenState afstate;
+
+         init_flatten_support(&afstate, elmtype, elmlen, elmalign);
+         FLATTEN_ARRAY_ELEMENT(dataValue, elmlen, afstate);
+         end_flatten_support(&afstate);
+     }
+
      /*
       * if number of dims is zero, i.e. an empty array, create an array with
       * nSubscripts dimensions, and set the lower bounds to the supplied
*************** array_set(ArrayType *array,
*** 2130,2137 ****
       */
      if (ndim == 0)
      {
-         Oid            elmtype = ARR_ELEMTYPE(array);
-
          for (i = 0; i < nSubscripts; i++)
          {
              dim[i] = 1;
--- 2197,2202 ----
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2688,2693 ****
--- 2753,2759 ----
      int            bitmask;
      ArrayMetaState *inp_extra;
      ArrayMetaState *ret_extra;
+     ArrayFlattenState afstate;

      /* Get input array */
      if (fcinfo->nargs < 1)
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2741,2746 ****
--- 2807,2814 ----
      typbyval = ret_extra->typbyval;
      typalign = ret_extra->typalign;

+     init_flatten_support(&afstate, retType, typlen, typalign);
+
      /* Allocate temporary arrays for new values */
      values = (Datum *) palloc(nitems * sizeof(Datum));
      nulls = (bool *) palloc(nitems * sizeof(bool));
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2800,2807 ****
          else
          {
              /* Ensure data is not toasted */
!             if (typlen == -1)
!                 values[i] = PointerGetDatum(PG_DETOAST_DATUM(values[i]));
              /* Update total result size */
              nbytes = att_addlength_datum(nbytes, typlen, values[i]);
              nbytes = att_align_nominal(nbytes, typalign);
--- 2868,2874 ----
          else
          {
              /* Ensure data is not toasted */
!             FLATTEN_ARRAY_ELEMENT(values[i], typlen, afstate);
              /* Update total result size */
              nbytes = att_addlength_datum(nbytes, typlen, values[i]);
              nbytes = att_align_nominal(nbytes, typalign);
*************** construct_md_array(Datum *elems,
*** 2921,2926 ****
--- 2988,2994 ----
      int32        dataoffset;
      int            i;
      int            nelems;
+     ArrayFlattenState afstate;

      if (ndims < 0)                /* we do allow zero-dimension arrays */
          ereport(ERROR,
*************** construct_md_array(Datum *elems,
*** 2938,2943 ****
--- 3006,3013 ----

      nelems = ArrayGetNItems(ndims, dims);

+     init_flatten_support(&afstate, elmtype, elmlen, elmalign);
+
      /* compute required space */
      nbytes = 0;
      hasnulls = false;
*************** construct_md_array(Datum *elems,
*** 2949,2956 ****
              continue;
          }
          /* make sure data is not toasted */
!         if (elmlen == -1)
!             elems[i] = PointerGetDatum(PG_DETOAST_DATUM(elems[i]));
          nbytes = att_addlength_datum(nbytes, elmlen, elems[i]);
          nbytes = att_align_nominal(nbytes, elmalign);
          /* check for overflow of total request */
--- 3019,3026 ----
              continue;
          }
          /* make sure data is not toasted */
!         FLATTEN_ARRAY_ELEMENT(elems[i], elmlen, afstate);
!         /* update total result size */
          nbytes = att_addlength_datum(nbytes, elmlen, elems[i]);
          nbytes = att_align_nominal(nbytes, elmalign);
          /* check for overflow of total request */
*************** construct_md_array(Datum *elems,
*** 2961,2966 ****
--- 3031,3038 ----
                              (int) MaxAllocSize)));
      }

+     end_flatten_support(&afstate);
+
      /* Allocate and initialize result array */
      if (hasnulls)
      {
*************** array_free_iterator(ArrayIterator iterat
*** 4070,4075 ****
--- 4142,4276 ----
  /***************************************************************************/

  /*
+  * Array element detoasting ("flattening") support
+  *
+  * It is critical that no external TOAST pointers appear within an array,
+  * either directly as an array element or within a non-scalar element type,
+  * so we must detoast any value that is about to be inserted into an array.
+  *
+  * Since we allow composite-type datums to contain toasted fields (at the
+  * first level; nested toasted values aren't allowed), we have to work extra
+  * hard to get rid of possible toasting when the array element type is
+  * composite.  We assume that other composites such as ranges contain no
+  * external pointers, though.
+  *
+  * In addition to the no-external-pointers requirement, we make a policy
+  * decision that array elements not be compressed.    We expect that compression
+  * would be better applied to the whole array rather than individual elements.
+  *
+  * The current implementation of FLATTEN_ARRAY_ELEMENT() also forces varlena
+  * elements to not be in short-header format, but this is an implementation
+  * artifact that should probably be removed.
+  */
+
+ /*
+  * Prepare for calls to FLATTEN_ARRAY_ELEMENT()
+  *
+  * Currently, we only have typlen/typbyval/typalign information cached for
+  * the array element type.    To minimize unnecessary syscache lookups, this
+  * function uses the knowledge that a composite type must have typlen -1
+  * (varlena) and alignment 'd'.  There are only a few non-composite, non-array
+  * types for which this is true, so though imperfect this heuristic should
+  * work well enough.
+  */
+ static void
+ init_flatten_support(ArrayFlattenState *afstate, Oid elmtype,
+                      int16 typlen, char typalign)
+ {
+     if (typlen == -1 && typalign == 'd')
+     {
+         /* Possibly composite */
+         if (elmtype == RECORDOID)
+         {
+             /* We can't identify the specific record type yet */
+             afstate->typiscomposite = true;
+             afstate->tupdesc = NULL;
+         }
+         else if (type_is_rowtype(elmtype))
+         {
+             /* Named composite type, we can look it up right away */
+             afstate->typiscomposite = true;
+             afstate->tupdesc = lookup_rowtype_tupdesc(elmtype, -1);
+         }
+         else
+         {
+             /* It's not composite */
+             afstate->typiscomposite = false;
+             afstate->tupdesc = NULL;
+         }
+     }
+     else
+     {
+         /* Element type cannot be composite */
+         afstate->typiscomposite = false;
+         afstate->tupdesc = NULL;
+     }
+ }
+
+ /*
+  * Detoast ("flatten") a composite value
+  *
+  * This is the out-of-line portion of the FLATTEN_ARRAY_ELEMENT macro.
+  * We assume we have a non-null, known-composite value to work on.
+  * Note: for some logic paths, it's important that this does not
+  * copy the Datum unnecessarily.
+  */
+ static Datum
+ flatten_composite_element(Datum value, ArrayFlattenState *afstate)
+ {
+     TupleDesc    tupleDesc;
+     HeapTupleHeader tupdata;
+
+     /*
+      * If the value is itself toasted (even just to the extent of having a
+      * short varlena header), then it must have been incorporated into some
+      * larger composite structure, so it should not contain any embedded toast
+      * pointers.  We need only detoast it overall and we're done.
+      */
+     if (VARATT_IS_EXTENDED(DatumGetPointer(value)))
+         return PointerGetDatum(PG_DETOAST_DATUM(value));
+
+     /*
+      * When working with an array of RECORD, it's possible that each array
+      * element is of a different record subtype, so we have to be prepared to
+      * look up the appropriate tupdesc here.  But we cache the tupdesc across
+      * calls as much as we can.
+      */
+     tupleDesc = afstate->tupdesc;
+     tupdata = DatumGetHeapTupleHeader(value);
+     if (HeapTupleHeaderGetTypeId(tupdata) == RECORDOID)
+     {
+         int32        typmod = HeapTupleHeaderGetTypMod(tupdata);
+
+         if (tupleDesc && tupleDesc->tdtypmod != typmod)
+         {
+             ReleaseTupleDesc(tupleDesc);
+             tupleDesc = NULL;
+         }
+         if (tupleDesc == NULL)
+         {
+             tupleDesc = lookup_rowtype_tupdesc(RECORDOID, typmod);
+             afstate->tupdesc = tupleDesc;
+         }
+     }
+
+     /* tuptoaster.c does the actual work */
+     return toast_flatten_tuple_datum(value, tupleDesc);
+ }
+
+ /*
+  * Clean up after calls to FLATTEN_ARRAY_ELEMENT()
+  */
+ static void
+ end_flatten_support(ArrayFlattenState *afstate)
+ {
+     /* We have to release any refcount we may hold on a typcache tupdesc */
+     if (afstate->tupdesc)
+         ReleaseTupleDesc(afstate->tupdesc);
+ }
+
+
+ /*
   * Check whether a specific array element is NULL
   *
   * nullbitmap: pointer to array's null bitmap (NULL if none)
*************** accumArrayResult(ArrayBuildState *astate
*** 4617,4628 ****
       * because construct_md_array can detoast the array elements later.
       * However, we must not let construct_md_array modify the ArrayBuildState
       * because that would mean array_agg_finalfn damages its input, which is
!      * verboten.  Also, this way frequently saves one copying step.)
       */
      if (!disnull && !astate->typbyval)
      {
          if (astate->typlen == -1)
!             dvalue = PointerGetDatum(PG_DETOAST_DATUM_COPY(dvalue));
          else
              dvalue = datumCopy(dvalue, astate->typbyval, astate->typlen);
      }
--- 4818,4837 ----
       * because construct_md_array can detoast the array elements later.
       * However, we must not let construct_md_array modify the ArrayBuildState
       * because that would mean array_agg_finalfn damages its input, which is
!      * verboten.  Also, this way frequently saves one copying step.  It's
!      * annoying however that we might be doing a syscache lookup for each item
!      * when the element type is composite.)
       */
      if (!disnull && !astate->typbyval)
      {
          if (astate->typlen == -1)
!         {
!             ArrayFlattenState afstate;
!
!             init_flatten_support(&afstate, element_type, -1, astate->typalign);
!             FLATTEN_ARRAY_ELEMENT_COPY(dvalue, -1, false, afstate);
!             end_flatten_support(&afstate);
!         }
          else
              dvalue = datumCopy(dvalue, astate->typbyval, astate->typlen);
      }
*************** array_fill_internal(ArrayType *dims, Arr
*** 5037,5043 ****

          /* make sure data is not toasted */
          if (elmlen == -1)
!             value = PointerGetDatum(PG_DETOAST_DATUM(value));

          nbytes = att_addlength_datum(0, elmlen, value);
          nbytes = att_align_nominal(nbytes, elmalign);
--- 5246,5258 ----

          /* make sure data is not toasted */
          if (elmlen == -1)
!         {
!             ArrayFlattenState afstate;
!
!             init_flatten_support(&afstate, elmtype, elmlen, elmalign);
!             FLATTEN_ARRAY_ELEMENT(value, elmlen, afstate);
!             end_flatten_support(&afstate);
!         }

          nbytes = att_addlength_datum(0, elmlen, value);
          nbytes = att_align_nominal(nbytes, elmalign);
*************** array_replace_internal(ArrayType *array,
*** 5277,5286 ****
       */
      if (typlen == -1)
      {
          if (!search_isnull)
!             search = PointerGetDatum(PG_DETOAST_DATUM(search));
          if (!replace_isnull)
!             replace = PointerGetDatum(PG_DETOAST_DATUM(replace));
      }

      /* Prepare to apply the comparison operator */
--- 5492,5505 ----
       */
      if (typlen == -1)
      {
+         ArrayFlattenState afstate;
+
+         init_flatten_support(&afstate, element_type, typlen, typalign);
          if (!search_isnull)
!             FLATTEN_ARRAY_ELEMENT(search, typlen, afstate);
          if (!replace_isnull)
!             FLATTEN_ARRAY_ELEMENT(replace, typlen, afstate);
!         end_flatten_support(&afstate);
      }

      /* Prepare to apply the comparison operator */
diff --git a/src/include/access/tuptoaster.h b/src/include/access/tuptoaster.h
index 296d016..5cdefad 100644
*** a/src/include/access/tuptoaster.h
--- b/src/include/access/tuptoaster.h
*************** extern struct varlena *heap_tuple_untoas
*** 184,199 ****
  extern HeapTuple toast_flatten_tuple(HeapTuple tup, TupleDesc tupleDesc);

  /* ----------
   * toast_flatten_tuple_attribute -
   *
   *    If a Datum is of composite type, "flatten" it to contain no toasted fields.
!  *    This must be invoked on any potentially-composite field that is to be
!  *    inserted into a tuple.    Doing this preserves the invariant that toasting
!  *    goes only one level deep in a tuple.
   * ----------
   */
! extern Datum toast_flatten_tuple_attribute(Datum value,
!                               Oid typeId, int32 typeMod);

  /* ----------
   * toast_compress_datum -
--- 184,208 ----
  extern HeapTuple toast_flatten_tuple(HeapTuple tup, TupleDesc tupleDesc);

  /* ----------
+  * toast_flatten_tuple_datum -
+  *
+  *    "Flatten" a composite Datum to contain no toasted fields.
+  *    This must be invoked on any composite value that is to be inserted into
+  *    a tuple, array, range, etc.  Doing this preserves the invariant that
+  *    toasting goes only one level deep in a tuple.
+  * ----------
+  */
+ extern Datum toast_flatten_tuple_datum(Datum value, TupleDesc tupleDesc);
+
+ /* ----------
   * toast_flatten_tuple_attribute -
   *
   *    If a Datum is of composite type, "flatten" it to contain no toasted fields.
!  *    This is a convenience routine for doing toast_flatten_tuple_datum() on
!  *    tuple fields.
   * ----------
   */
! extern Datum toast_flatten_tuple_attribute(Datum value, Form_pg_attribute att);

  /* ----------
   * toast_compress_datum -

pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Clock sweep not caching enough B-Tree leaf pages?
Next
From: Bruce Momjian
Date:
Subject: Re: WIP patch for Todo Item : Provide fallback_application_name in contrib/pgbench, oid2name, and dblink