Manipulating complex types as non-contiguous structures in-memory - Mailing list pgsql-hackers

From Tom Lane
Subject Manipulating complex types as non-contiguous structures in-memory
Date
Msg-id 20178.1423598435@sss.pgh.pa.us
Responses Re: Manipulating complex types as non-contiguous structures in-memory  (Stephen Frost <sfrost@snowman.net>)
Re: Manipulating complex types as non-contiguous structures in-memory  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
I've been fooling around with a design to support computation-oriented,
not-necessarily-contiguous-blobs representations of datatypes in memory,
along the lines I mentioned here:
http://www.postgresql.org/message-id/2355.1382710707@sss.pgh.pa.us

In particular this is meant to reduce the overhead for repeated operations
on arrays, records, etc.  We've had several previous discussions about
that, and even some single-purpose patches such as in this thread:
http://www.postgresql.org/message-id/flat/CAFj8pRAKuDU_0md-dg6Ftk0wSupvMLyrV1PB+HyC+GUBZz346w@mail.gmail.com
There was also a thread discussing how this sort of thing could be
useful to PostGIS:
http://www.postgresql.org/message-id/526A61FB.1050209@oslandia.com
and it's been discussed a few other times too, but I'm too lazy to
search the archives any further.

I've now taken this idea as far as building the required infrastructure
and revamping a couple of array operators to use it.  There's a lot yet
to do, but I've done enough to get some preliminary ideas about
performance (see below).

The core ideas of this patch are:

* Invent a new TOAST datum type "pointer to deserialized object", which
is physically just like the existing indirect-toast-pointer concept, but
it has a different va_tag code and somewhat different semantics.

* A deserialized object has a standard header (which is what the toast
pointers point to) and typically will have additional data-type-specific
fields after that.  One component of the standard header is a pointer to
a set of "method" functions that provide ways to accomplish standard
data-type-independent operations on the deserialized object.

* Another standard header component is a MemoryContext identifier: the
header, as well as all subsidiary data belonging to the deserialized
object, must live in this context.  (Well, I guess there could also be
child contexts.)  By exposing an explicit context identifier, we can
accomplish tasks like "move this object into another context" by
reparenting the object's context rather than physically copying anything.

* The only standard "methods" I've found a need for so far are functions
to re-serialize the object, that is, to generate a plain varlena value that is
semantically equivalent.  To avoid extra copying, this is split into
separate "compute the space needed" and "serialize into this memory"
steps, so that the result can be dropped exactly where the caller needs
it.

* Currently, a deserialized object will be reserialized in that way
whenever we incorporate it into a physical tuple (ie, heap_form_tuple
or index_form_tuple), or whenever somebody applies datumCopy() to it.
I'd like to relax this later, but there's an awful lot of code that
supposes that heap_form_tuple or datumCopy will produce a self-contained
value that survives beyond, eg, destruction of the memory context that
contained the source Datums.  We can get good speedups in a lot of
interesting cases without solving that problem, so I don't feel too bad
about leaving it as a future project.

* In particular, things like PG_GETARG_ARRAYTYPE_P() treat a deserialized
toast pointer as something to be detoasted, and will produce a palloc'd
re-serialized value.  This means that we do not need to convert all the
C functions concerned with a given datatype at the same time (or indeed
ever); a function that hasn't been upgraded will build a re-serialized
representation and then operate on that.  We'll invent alternate
argument-fetching functions that skip the reserialization step, for use
by functions that have been upgraded to handle either case.  This is
basically the same approach we used when we introduced short varlena
headers, and that seems to have gone smoothly enough.

* There's a concept that a deserialized object has a "primary" toast
pointer, which is physically part of the object, as well as "secondary"
toast pointers which might or might not be part of the object.  If you
have a Datum pointer to the primary toast pointer then you are authorized
to modify the object in-place; if you have a Datum pointer to a secondary
toast pointer then you must treat it as read-only (ie, you have to make a
copy if you're going to change it).  Functions that construct a new
deserialized object always return its primary toast pointer; this allows a
nest of functions to modify an object in-place without copying, which was
the primary need that the PostGIS folks expressed.  On the other hand,
plpgsql can hand out secondary toast pointers to deserialized objects
stored in plpgsql function variables, thus ensuring that the objects won't
be modified unexpectedly, while never having to physically copy them if
the called functions just need to inspect them.

* Primary and secondary pointers are physically identical, but the
primary pointer resides in a specific spot in the deserialized object's
standard header.  (So you can tell if you've got the primary pointer via
a simple address comparison.)

* I've modified the array element assignment path in plpgsql's
exec_assign_value so that, instead of passing a secondary toast pointer
to array_set() as you might expect from the above, it passes the primary
toast pointer thus allowing array_set() to modify the variable in-place.
So an operation like "array_variable[x] := y" no longer incurs recopying
of the whole array, once the variable has been converted into deserialized
form.  (If it's not yet, it becomes deserialized after the first such
assignment.)  Also, assignment of an already-deserialized value to a
variable accomplishes that with a MemoryContext parent pointer swing
instead of physical copying, if what we have is the primary toast pointer,
which implies it's not referenced anywhere else.

* Any functions that plpgsql gives a read/write pointer to need to be
exceedingly careful to not leave a corrupted object behind if they fail
partway through.  I've successfully written such a version of array_set(),
and it wasn't too painful, but this may be a limitation on the general
applicability of the whole approach.

* In the current patch, that restriction only applies to array_set()
anyway.  But I would like to allow in-place updates for non-core cases.
For example in something like
    hstore_var := hstore_var || 'foo=>bar';
we could plausibly pass a R/W pointer to hstore_concat and let it modify
hstore_var in place.  But this would require knowing which such functions
are safe, or assuming that they all are, which might be an onerous
restriction.

* I soon noticed that I was getting a lot of empty "deserialized array"
contexts floating around.  The attached patch addresses this in a quick
hack fashion by redefining ResetExprContext() to use
MemoryContextResetAndDeleteChildren() instead of MemoryContextReset(),
so that deserialized objects created within an expression evaluation
context go completely away at ResetExprContext(), rather than being left
behind as empty subcontext shells.  We've talked more than once about
redefining mcxt.c's API so that MemoryContextReset() means what's
currently meant by MemoryContextResetAndDeleteChildren(), and if you
really truly do want to keep empty child contexts around then you need to
call something else instead.  I did not go that far here, but I think we
should seriously consider biting the bullet and finally changing it.

* Although I said above that everything owned by a deserialized object
has to live in a single memory context, I do have ideas about relaxing
that.  The core idea would be to invent a "memory context reset/delete
callback" feature in mcxt.c.  Then a deserialized object could register
such a callback on its own memory context, and use the callback to clean
up resources outside its context.  This is potentially useful for instance
for something like PostGIS, where an object likely includes some data that
was allocated with malloc not palloc because it was created by library
functions that aren't Postgres-aware.  Another likely use-case is for
deserialized objects representing composite types to maintain reference
counts on their tuple descriptors instead of having to copy said
descriptors into their private contexts.  This'd be material for a
separate patch though.
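
To make the header, method table, and primary-pointer test described in the
list above more concrete, here is a minimal standalone C sketch.  It is not
code from the patch: every name in it (ObjHeader, ObjMethods, ObjPointer,
IntVec, flatten, and so on) is hypothetical, malloc stands in for palloc,
and the memory-context field is left out entirely.  It only illustrates the
two-step "compute the space needed" / "serialize into this memory" split
and the address comparison that tells a primary pointer from a secondary
one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All names here are hypothetical; this is not the patch's API. */
typedef struct ObjHeader ObjHeader;

typedef struct ObjMethods
{
    size_t (*get_serialized_size) (ObjHeader *obj);
    void   (*serialize_into) (ObjHeader *obj, void *dest, size_t allocated);
} ObjMethods;

/* Stand-in for a toast pointer: a small tagged reference to the header. */
typedef struct ObjPointer
{
    unsigned char tag;          /* would be the new va_tag code */
    ObjHeader  *target;
} ObjPointer;

struct ObjHeader
{
    const ObjMethods *methods;  /* data-type-specific "method" table */
    ObjPointer  primary;        /* the primary pointer lives in the header */
};

/* Toy datatype: an int vector whose first field is the standard header. */
typedef struct IntVec
{
    ObjHeader   hdr;
    size_t      nelems;
    int        *elems;
} IntVec;

static size_t
intvec_size(ObjHeader *obj)
{
    IntVec     *v = (IntVec *) obj;

    return sizeof(size_t) + v->nelems * sizeof(int);
}

static void
intvec_serialize(ObjHeader *obj, void *dest, size_t allocated)
{
    IntVec     *v = (IntVec *) obj;
    char       *out = dest;

    assert(allocated == intvec_size(obj));
    memcpy(out, &v->nelems, sizeof(size_t));
    memcpy(out + sizeof(size_t), v->elems, v->nelems * sizeof(int));
}

static const ObjMethods intvec_methods = {intvec_size, intvec_serialize};

static void
intvec_init(IntVec *v, int *elems, size_t nelems)
{
    v->hdr.methods = &intvec_methods;
    v->hdr.primary.tag = 1;
    v->hdr.primary.target = &v->hdr;
    v->nelems = nelems;
    v->elems = elems;
}

/* A secondary pointer has identical contents but lives somewhere else. */
static ObjPointer
make_secondary(const ObjPointer *p)
{
    return *p;
}

/* Read/write only if this is the pointer stored inside the header itself. */
static int
pointer_is_read_write(const ObjPointer *p)
{
    return p == &p->target->primary;
}

/* Two-step re-serialization: ask for the size, allocate, then fill. */
static void *
flatten(const ObjPointer *p, size_t *size_out)
{
    ObjHeader  *obj = p->target;
    size_t      sz = obj->methods->get_serialized_size(obj);
    void       *buf = malloc(sz);

    if (buf != NULL)
        obj->methods->serialize_into(obj, buf, sz);
    *size_out = sz;
    return buf;
}
```

In this sketch flatten() allocates its own buffer, but the point of keeping
the size and fill steps separate is that a caller could equally pass
serialize_into() a pointer into the middle of a tuple being built, so the
flat value is written exactly once.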


So that's the plan, and attached is a very-much-WIP patch that uses this
approach to speed up plpgsql array element assignments (and not a whole
lot else as yet).  Here's the basic test case I've been using:

create or replace function arraysetint(n int) returns int[] as $$
declare res int[] := '{}';
begin
  for i in 1 .. n loop
    res[i] := i;
  end loop;
  return res;
end
$$ language plpgsql strict;

In HEAD, this function's runtime grows as O(N^2), so for example
(with casserts off on my x86_64 workstation):

regression=# select array_dims(arraysetint(100000));
 array_dims
------------
 [1:100000]
(1 row)

Time: 7874.070 ms

With variable-length array elements, such as if you change the
int[] arrays to numeric[] (call that variant arraysetnum), it's even worse:

regression=# select array_dims(arraysetnum(100000));
 array_dims
------------
 [1:100000]
(1 row)

Time: 31177.340 ms

With the attached patch, those timings drop to 80 and 150 ms respectively.
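
The gap between those timings follows from how much data gets copied.  The
standalone C sketch below counts element copies for N one-at-a-time
assignments under two strategies: rebuilding the whole array on every
assignment, and updating in place with occasional growth.  This is an
illustration only, not code from the patch; the function names are
hypothetical, and the doubling growth policy in the second function is an
assumption, since nothing above states how the patch sizes its Datum
arrays.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Every assignment builds a fresh array one element longer and copies all
 * existing elements into it.  Total copies: n * (n - 1) / 2.
 * Returns 0 if an allocation fails.
 */
unsigned long long
copies_with_recopy(size_t n)
{
    unsigned long long copied = 0;
    int        *arr = NULL;
    size_t      len = 0;

    for (size_t i = 0; i < n; i++)
    {
        int        *next = malloc((len + 1) * sizeof(int));

        if (next == NULL)
        {
            free(arr);
            return 0;
        }
        if (len > 0)
            memcpy(next, arr, len * sizeof(int));
        copied += len;
        next[len] = (int) (i + 1);
        free(arr);
        arr = next;
        len++;
    }
    free(arr);
    return copied;
}

/*
 * Assignments modify the array in place; elements are copied only when
 * the allocation has to grow.  Doubling is an assumed policy, chosen here
 * because it keeps the total number of copies below 2 * n.
 * Returns 0 if an allocation fails.
 */
unsigned long long
copies_in_place(size_t n)
{
    unsigned long long copied = 0;
    int        *arr = NULL;
    size_t      len = 0;
    size_t      cap = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (len == cap)
        {
            size_t      newcap = (cap > 0) ? cap * 2 : 8;
            int        *next = malloc(newcap * sizeof(int));

            if (next == NULL)
            {
                free(arr);
                return 0;
            }
            if (len > 0)
                memcpy(next, arr, len * sizeof(int));
            copied += len;
            free(arr);
            arr = next;
            cap = newcap;
        }
        arr[len++] = (int) (i + 1);
    }
    free(arr);
    return copied;
}
```

For n = 1000 the first function copies 499500 elements and the second
copies 1016, which is the quadratic-versus-linear difference the timings
above reflect.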

It's not all peaches and cream: for the array_append operator, which is
also accelerated by the patch (mainly because it is too much in bed with
array_set to not fix at the same time ;-)), I tried examples like

explain analyze select array[1,2] || g || g || g from generate_series(1,1000000) g;

Very roughly, HEAD needs about 400 ns per || operator in this scenario.
With the patch, it's about 480 ns for the first operator and then 200 more
for each one accepting a prior operator's output.  (Those numbers could
perhaps be improved with more-invasive refactoring of the array code.)
The extra initial overhead represents the time to convert the array[1,2]
constant to deserialized form during each execution of the first operator.

Still, if the worst-case slowdown is around 20% on trivially-sized arrays,
I'd gladly take that to have better performance on larger arrays.  And I
think this example is close to the worst case for the patch's approach,
since it's testing small, fixed-element-length, no-nulls arrays, which is
what the existing code can handle without spending a lot of cycles.

Note that I've kept all the deserialized-array-specific code in its own file
for now, just for ease of hacking.  That stuff would need to propagate into
the main array-related files in a more complete patch.

BTW, I'm not all that thrilled with the "deserialized object" terminology.
I found myself repeatedly tripping up on which form was serialized and
which de-.  If anyone's got a better naming idea I'm willing to adopt it.

I'm not sure exactly how to push this forward.  I would not want to
commit it without converting a significant number of array functions to
understand about deserialized inputs, and by the time I've finished that
work it's likely to be too late for 9.5.  OTOH I'm sure that the PostGIS
folk would love to have this infrastructure in 9.5 not 9.6 so they could
make a start on fixing their issues.  (Further down the pike, I'd plan to
look at adapting composite-type operations, JSONB, etc, to make use of
this approach, but that certainly isn't happening for 9.5.)

Thoughts, advice, better ideas?

            regards, tom lane

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 867035d..e5fcced 100644
*** a/src/backend/access/common/heaptuple.c
--- b/src/backend/access/common/heaptuple.c
***************
*** 60,65 ****
--- 60,66 ----
  #include "access/sysattr.h"
  #include "access/tuptoaster.h"
  #include "executor/tuptable.h"
+ #include "utils/deserialized.h"


  /* Does att's datatype allow packing into the 1-byte-header varlena format? */
*************** heap_compute_data_size(TupleDesc tupleDe
*** 93,105 ****
      for (i = 0; i < numberOfAttributes; i++)
      {
          Datum        val;

          if (isnull[i])
              continue;

          val = values[i];

!         if (ATT_IS_PACKABLE(att[i]) &&
              VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
          {
              /*
--- 94,108 ----
      for (i = 0; i < numberOfAttributes; i++)
      {
          Datum        val;
+         Form_pg_attribute atti;

          if (isnull[i])
              continue;

          val = values[i];
+         atti = att[i];

!         if (ATT_IS_PACKABLE(atti) &&
              VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
          {
              /*
*************** heap_compute_data_size(TupleDesc tupleDe
*** 108,118 ****
               */
              data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val));
          }
          else
          {
!             data_length = att_align_datum(data_length, att[i]->attalign,
!                                           att[i]->attlen, val);
!             data_length = att_addlength_datum(data_length, att[i]->attlen,
                                                val);
          }
      }
--- 111,131 ----
               */
              data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val));
          }
+         else if (atti->attlen == -1 &&
+                  VARATT_IS_EXTERNAL_DESERIALIZED(DatumGetPointer(val)))
+         {
+             /*
+              * we want to re-serialize the deserialized value so that the
+              * constructed tuple doesn't depend on it
+              */
+             data_length = att_align_nominal(data_length, atti->attalign);
+             data_length += DOH_get_serialized_size(DatumGetDOHP(val));
+         }
          else
          {
!             data_length = att_align_datum(data_length, atti->attalign,
!                                           atti->attlen, val);
!             data_length = att_addlength_datum(data_length, atti->attlen,
                                                val);
          }
      }
*************** heap_fill_tuple(TupleDesc tupleDesc,
*** 203,212 ****
              *infomask |= HEAP_HASVARWIDTH;
              if (VARATT_IS_EXTERNAL(val))
              {
!                 *infomask |= HEAP_HASEXTERNAL;
!                 /* no alignment, since it's short by definition */
!                 data_length = VARSIZE_EXTERNAL(val);
!                 memcpy(data, val, data_length);
              }
              else if (VARATT_IS_SHORT(val))
              {
--- 216,241 ----
              *infomask |= HEAP_HASVARWIDTH;
              if (VARATT_IS_EXTERNAL(val))
              {
!                 if (VARATT_IS_EXTERNAL_DESERIALIZED(val))
!                 {
!                     /*
!                      * we want to re-serialize the deserialized value so that
!                      * the constructed tuple doesn't depend on it
!                      */
!                     DeserializedObjectHeader *doh = DatumGetDOHP(values[i]);
!
!                     data = (char *) att_align_nominal(data,
!                                                       att[i]->attalign);
!                     data_length = DOH_get_serialized_size(doh);
!                     DOH_serialize_into(doh, data, data_length);
!                 }
!                 else
!                 {
!                     *infomask |= HEAP_HASEXTERNAL;
!                     /* no alignment, since it's short by definition */
!                     data_length = VARSIZE_EXTERNAL(val);
!                     memcpy(data, val, data_length);
!                 }
              }
              else if (VARATT_IS_SHORT(val))
              {
diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index f8c1401..0c9dd8e 100644
*** a/src/backend/access/heap/tuptoaster.c
--- b/src/backend/access/heap/tuptoaster.c
***************
*** 37,42 ****
--- 37,43 ----
  #include "catalog/catalog.h"
  #include "common/pg_lzcompress.h"
  #include "miscadmin.h"
+ #include "utils/deserialized.h"
  #include "utils/fmgroids.h"
  #include "utils/rel.h"
  #include "utils/typcache.h"
*************** heap_tuple_fetch_attr(struct varlena * a
*** 130,135 ****
--- 131,149 ----
          result = (struct varlena *) palloc(VARSIZE_ANY(attr));
          memcpy(result, attr, VARSIZE_ANY(attr));
      }
+     else if (VARATT_IS_EXTERNAL_DESERIALIZED(attr))
+     {
+         /*
+          * This is a deserialized-object pointer --- get serialized format
+          */
+         DeserializedObjectHeader *doh;
+         Size        resultsize;
+
+         doh = DatumGetDOHP(PointerGetDatum(attr));
+         resultsize = DOH_get_serialized_size(doh);
+         result = (struct varlena *) palloc(resultsize);
+         DOH_serialize_into(doh, (void *) result, resultsize);
+     }
      else
      {
          /*
*************** heap_tuple_untoast_attr(struct varlena *
*** 196,201 ****
--- 210,224 ----
              attr = result;
          }
      }
+     else if (VARATT_IS_EXTERNAL_DESERIALIZED(attr))
+     {
+         /*
+          * This is a deserialized-object pointer --- get serialized format
+          */
+         attr = heap_tuple_fetch_attr(attr);
+         /* serializers are not allowed to produce compressed/short output */
+         Assert(!VARATT_IS_EXTENDED(attr));
+     }
      else if (VARATT_IS_COMPRESSED(attr))
      {
          /*
*************** heap_tuple_untoast_attr_slice(struct var
*** 263,268 ****
--- 286,296 ----
          return heap_tuple_untoast_attr_slice(redirect.pointer,
                                               sliceoffset, slicelength);
      }
+     else if (VARATT_IS_EXTERNAL_DESERIALIZED(attr))
+     {
+         /* pass it off to heap_tuple_fetch_attr to re-serialize */
+         preslice = heap_tuple_fetch_attr(attr);
+     }
      else
          preslice = attr;

*************** toast_raw_datum_size(Datum value)
*** 344,349 ****
--- 372,381 ----

          return toast_raw_datum_size(PointerGetDatum(toast_pointer.pointer));
      }
+     else if (VARATT_IS_EXTERNAL_DESERIALIZED(attr))
+     {
+         result = DOH_get_serialized_size(DatumGetDOHP(value));
+     }
      else if (VARATT_IS_COMPRESSED(attr))
      {
          /* here, va_rawsize is just the payload size */
*************** toast_datum_size(Datum value)
*** 400,405 ****
--- 432,441 ----

          return toast_datum_size(PointerGetDatum(toast_pointer.pointer));
      }
+     else if (VARATT_IS_EXTERNAL_DESERIALIZED(attr))
+     {
+         result = DOH_get_serialized_size(DatumGetDOHP(value));
+     }
      else if (VARATT_IS_SHORT(attr))
      {
          result = VARSIZE_SHORT(attr);
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 753754d..6c5f5dd 100644
*** a/src/backend/executor/execTuples.c
--- b/src/backend/executor/execTuples.c
***************
*** 88,93 ****
--- 88,94 ----
  #include "nodes/nodeFuncs.h"
  #include "storage/bufmgr.h"
  #include "utils/builtins.h"
+ #include "utils/deserialized.h"
  #include "utils/lsyscache.h"
  #include "utils/typcache.h"

*************** ExecCopySlot(TupleTableSlot *dstslot, Tu
*** 812,817 ****
--- 813,864 ----
      return ExecStoreTuple(newTuple, dstslot, InvalidBuffer, true);
  }

+ /* --------------------------------
+  *        ExecMakeSlotContentsReadOnly
+  *            Mark any R/W deserialized datums in the slot as read-only.
+  *
+  * This is needed when a slot that might contain R/W datum references is to be
+  * used as input for general expression evaluation.  Since the expression(s)
+  * might contain more than one Var referencing the same R/W datum, we could
+  * get wrong answers if functions acting on those Vars thought they could
+  * modify the deserialized value.
+  *
+  * For notational reasons, we return the same slot passed in.
+  * --------------------------------
+  */
+ TupleTableSlot *
+ ExecMakeSlotContentsReadOnly(TupleTableSlot *slot)
+ {
+     /*
+      * sanity checks
+      */
+     Assert(slot != NULL);
+     Assert(slot->tts_tupleDescriptor != NULL);
+     Assert(!slot->tts_isempty);
+
+     /*
+      * If the slot contains a physical tuple, it can't contain any
+      * deserialized datums, because we flatten those whenever making a
+      * physical tuple.  This might change later; but for now, we need do
+      * nothing unless the slot is virtual.
+      */
+     if (slot->tts_tuple == NULL)
+     {
+         Form_pg_attribute *att = slot->tts_tupleDescriptor->attrs;
+         int            attnum;
+
+         for (attnum = 0; attnum < slot->tts_nvalid; attnum++)
+         {
+             slot->tts_values[attnum] =
+                 MakeDeserializedObjectReadOnly(slot->tts_values[attnum],
+                                                slot->tts_isnull[attnum],
+                                                att[attnum]->attlen);
+         }
+     }
+
+     return slot;
+ }
+

  /* ----------------------------------------------------------------
   *                convenience initialization routines
diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c
index 3f66e24..e5d1e54 100644
*** a/src/backend/executor/nodeSubqueryscan.c
--- b/src/backend/executor/nodeSubqueryscan.c
*************** SubqueryNext(SubqueryScanState *node)
*** 56,62 ****
--- 56,70 ----
       * We just return the subplan's result slot, rather than expending extra
       * cycles for ExecCopySlot().  (Our own ScanTupleSlot is used only for
       * EvalPlanQual rechecks.)
+      *
+      * We do need to mark the slot contents read-only to prevent interference
+      * between different functions reading the same datum from the slot. It's
+      * a bit hokey to do this to the subplan's slot, but should be safe
+      * enough.
       */
+     if (!TupIsNull(slot))
+         slot = ExecMakeSlotContentsReadOnly(slot);
+
      return slot;
  }

diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 20e5ff1..8f2d319 100644
*** a/src/backend/utils/adt/Makefile
--- b/src/backend/utils/adt/Makefile
*************** endif
*** 16,24 ****
  endif

  # keep this list arranged alphabetically or it gets to be a mess
! OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \
!     array_userfuncs.o arrayutils.o ascii.o bool.o \
!     cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \
      encode.o enum.o float.o format_type.o formatting.o genfile.o \
      geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \
      int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \
--- 16,25 ----
  endif

  # keep this list arranged alphabetically or it gets to be a mess
! OBJS = acl.o arrayfuncs.o array_deserialized.o array_selfuncs.o \
!     array_typanalyze.o array_userfuncs.o arrayutils.o \
!     ascii.o bool.o cash.o char.o \
!     date.o datetime.o datum.o dbsize.o deserialized.o domains.o \
      encode.o enum.o float.o format_type.o formatting.o genfile.o \
      geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \
      int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \
diff --git a/src/backend/utils/adt/array_deserialized.c b/src/backend/utils/adt/array_deserialized.c
index ...092a5b1 .
*** a/src/backend/utils/adt/array_deserialized.c
--- b/src/backend/utils/adt/array_deserialized.c
***************
*** 0 ****
--- 1,936 ----
+ /*-------------------------------------------------------------------------
+  *
+  * array_deserialized.c
+  *      Functions for manipulating deserialized arrays.
+  *
+  * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *
+  * IDENTIFICATION
+  *      src/backend/utils/adt/array_deserialized.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ #include "postgres.h"
+
+ #include "access/tupmacs.h"
+ #include "utils/array.h"
+ #include "utils/builtins.h"
+ #include "utils/datum.h"
+ #include "utils/deserialized.h"
+ #include "utils/lsyscache.h"
+ #include "utils/memutils.h"
+
+
+ /*
+  * A deserialized array is contained within a private memory context (as
+  * all deserialized objects must be) and has a control structure as below.
+  *
+  * The deserialized array might contain a regular serialized array if that was
+  * the original input and we've not modified it significantly.  Otherwise, the
+  * contents are represented by Datum/isnull arrays plus dimensionality and
+  * type information.  We could also have both forms, if we've deconstructed
+  * the original array for access purposes but not yet changed it.  For pass
+  * by reference element types, the Datums would point into the serialized
+  * array in this situation.  Once we start modifying array elements, new
+  * pass-by-ref elements are separately palloc'd within the memory context.
+  */
+ #define DA_MAGIC 689375833        /* ID for debugging crosschecks */
+
+ typedef struct DeserializedArrayHeader
+ {
+     /* Standard header for deserialized objects */
+     DeserializedObjectHeader hdr;
+
+     /* Magic value identifying a deserialized array (for debugging only) */
+     int            da_magic;
+
+     /* Dimensionality info (always valid) */
+     int            ndims;            /* # of dimensions */
+     int           *dims;            /* array dimensions */
+     int           *lbound;            /* index lower bounds for each dimension */
+
+     /* Element type info (always valid) */
+     Oid            element_type;    /* element type OID */
+     int16        typlen;            /* needed info about element datatype */
+     bool        typbyval;
+     char        typalign;
+
+     /*
+      * If we have a Datum-array representation of the array, it's kept here;
+      * else dvalues/dnulls are NULL.  The dvalues and dnulls arrays are always
+      * palloc'd within the object private context, but may change size from
+      * time to time.  For pass-by-ref element types, dvalues entries might
+      * point either into the sstartptr..sendptr area, or to separately
+      * palloc'd chunks.  Elements should always be fully detoasted, as they
+      * are in the standard serialized representation.
+      *
+      * Even when dvalues is valid, dnulls can be NULL if there are no null
+      * elements.
+      */
+     Datum       *dvalues;        /* array of Datums */
+     bool       *dnulls;            /* array of is-null flags for Datums */
+     int            dvalueslen;        /* allocated length of above arrays */
+     int            nelems;            /* number of valid entries in above arrays */
+
+     /*
+      * serialized_size is the current space requirement for the serialized
+      * equivalent of the deserialized array, if known; otherwise it's 0.  We
+      * store this to make consecutive calls of get_serialized_size cheap.
+      */
+     Size        serialized_size;
+
+     /*
+      * svalue points to the serialized representation if it is valid, else it
+      * is NULL.  If we have or ever had a serialized representation then
+      * sstartptr/sendptr point to the start and end+1 of its data area; this
+      * is so that we can tell which Datum pointers point into the serialized
+      * representation rather than being pointers to separately palloc'd data.
+      */
+     ArrayType  *svalue;            /* must be a fully detoasted array */
+     char       *sstartptr;        /* start of its data area */
+     char       *sendptr;        /* end+1 of its data area */
+ } DeserializedArrayHeader;
+
+ /* "Methods" required for a deserialized object */
+ static Size DA_get_serialized_size(DeserializedObjectHeader *dohptr);
+ static void DA_serialize_into(DeserializedObjectHeader *dohptr,
+                   void *result, Size allocated_size);
+
+ static const DeserializedObjectMethods DA_methods =
+ {
+     DA_get_serialized_size,
+     DA_serialize_into
+ };
+
+ /*
+  * Functions that can handle either a "flat" varlena array or a deserialized
+  * array use this union to work with their input.
+  */
+ typedef union AnyArrayType
+ {
+     ArrayType    arr;
+     DeserializedArrayHeader des;
+ } AnyArrayType;
+
+ /*
+  * Macros for working with AnyArrayType inputs.  Beware multiple references!
+  */
+ #define AARR_NDIM(a) \
+     (VARATT_IS_DESERIALIZED_HEADER(a) ? (a)->des.ndims : ARR_NDIM(&(a)->arr))
+ #define AARR_HASNULL(a) \
+     (VARATT_IS_DESERIALIZED_HEADER(a) ? \
+      ((a)->des.dvalues != NULL ? (a)->des.dnulls != NULL : ARR_HASNULL((a)->des.svalue)) : \
+      ARR_HASNULL(&(a)->arr))
+ #define AARR_ELEMTYPE(a) \
+     (VARATT_IS_DESERIALIZED_HEADER(a) ? (a)->des.element_type : ARR_ELEMTYPE(&(a)->arr))
+ #define AARR_DIMS(a) \
+     (VARATT_IS_DESERIALIZED_HEADER(a) ? (a)->des.dims : ARR_DIMS(&(a)->arr))
+ #define AARR_LBOUND(a) \
+     (VARATT_IS_DESERIALIZED_HEADER(a) ? (a)->des.lbound : ARR_LBOUND(&(a)->arr))
+
+
+ /*
+  * deserialize_array: convert an array Datum into a deserialized array
+  *
+  * The deserialized object will be a child of parentcontext.
+  *
+  * Caller can provide element type's representational data; we do that because
+  * caller is often in a position to cache it across repeated calls.  If the
+  * caller can't do that, pass zeroes for elmlen/elmbyval/elmalign.
+  */
+ Datum
+ deserialize_array(Datum arraydatum, MemoryContext parentcontext,
+                   int elmlen, bool elmbyval, char elmalign)
+ {
+     ArrayType  *array;
+     DeserializedArrayHeader *dah;
+     MemoryContext objcxt;
+     MemoryContext oldcxt;
+
+     /* allocate private context for deserialized object */
+     objcxt = AllocSetContextCreate(parentcontext,
+                                    "deserialized array",
+                                    ALLOCSET_DEFAULT_MINSIZE,
+                                    ALLOCSET_DEFAULT_INITSIZE,
+                                    ALLOCSET_DEFAULT_MAXSIZE);
+
+     /*
+      * Detoast and copy original array into private context.  Note that this
+      * coding risks leaking some memory in the private context if we have to
+      * fetch data back from a TOAST table; however, experimentation says that
+      * the leak is minimal.  Doing it this way saves a copy step, which seems
+      * worthwhile, especially if the array is large enough to need toasting.
+      */
+     oldcxt = MemoryContextSwitchTo(objcxt);
+     array = DatumGetArrayTypePCopy(arraydatum);
+     MemoryContextSwitchTo(oldcxt);
+
+     /* set up deserialized array header */
+     dah = (DeserializedArrayHeader *)
+         MemoryContextAlloc(objcxt, sizeof(DeserializedArrayHeader));
+
+     DOH_init_header(&dah->hdr, &DA_methods, objcxt);
+     dah->da_magic = DA_MAGIC;
+
+     dah->ndims = ARR_NDIM(array);
+     /* note these pointers point into the svalue header! */
+     dah->dims = ARR_DIMS(array);
+     dah->lbound = ARR_LBOUND(array);
+
+     /* save array's element-type data for possible use later */
+     dah->element_type = ARR_ELEMTYPE(array);
+     if (elmlen)
+     {
+         /* Caller provided representational data */
+         dah->typlen = elmlen;
+         dah->typbyval = elmbyval;
+         dah->typalign = elmalign;
+     }
+     else
+     {
+         /* No, so look it up */
+         get_typlenbyvalalign(dah->element_type,
+                              &dah->typlen,
+                              &dah->typbyval,
+                              &dah->typalign);
+     }
+
+     /* we don't make a deconstructed representation now */
+     dah->dvalues = NULL;
+     dah->dnulls = NULL;
+     dah->dvalueslen = 0;
+     dah->nelems = 0;
+     dah->serialized_size = 0;
+
+     /* remember we have a serialized representation */
+     dah->svalue = array;
+     dah->sstartptr = ARR_DATA_PTR(array);
+     dah->sendptr = ((char *) array) + ARR_SIZE(array);
+
+     /* return a R/W pointer to the deserialized array */
+     return PointerGetDatum(dah->hdr.doh_primary_ptr);
+ }
+
+ /*
+  * construct_empty_deserialized_array: make an empty deserialized array
+  * given only type information.  (elmlen etc can be zeroes.)
+  */
+ static DeserializedArrayHeader *
+ construct_empty_deserialized_array(Oid element_type,
+                                    MemoryContext parentcontext,
+                                    int elmlen, bool elmbyval, char elmalign)
+ {
+     ArrayType  *array = construct_empty_array(element_type);
+     Datum        d;
+
+     d = deserialize_array(PointerGetDatum(array), parentcontext,
+                           elmlen, elmbyval, elmalign);
+     return (DeserializedArrayHeader *) DatumGetDOHP(d);
+ }
+
+
+ /*
+  * get_serialized_size method for deserialized arrays
+  */
+ static Size
+ DA_get_serialized_size(DeserializedObjectHeader *dohptr)
+ {
+     DeserializedArrayHeader *dah = (DeserializedArrayHeader *) dohptr;
+     int            nelems;
+     int            ndims;
+     Datum       *dvalues;
+     bool       *dnulls;
+     Size        nbytes;
+     int            i;
+
+     Assert(dah->da_magic == DA_MAGIC);
+
+     /* Easy if we have a valid serialized value */
+     if (dah->svalue)
+         return ARR_SIZE(dah->svalue);
+
+     /* If we have a cached size value, believe that */
+     if (dah->serialized_size)
+         return dah->serialized_size;
+
+     /*
+      * Compute space needed by examining dvalues/dnulls.  Note that the result
+      * array will have a nulls bitmap if dnulls isn't NULL, even if the array
+      * doesn't actually contain any nulls now.
+      */
+     nelems = dah->nelems;
+     ndims = dah->ndims;
+     Assert(nelems == ArrayGetNItems(ndims, dah->dims));
+     dvalues = dah->dvalues;
+     dnulls = dah->dnulls;
+     nbytes = 0;
+     for (i = 0; i < nelems; i++)
+     {
+         if (dnulls && dnulls[i])
+             continue;
+         nbytes = att_addlength_datum(nbytes, dah->typlen, dvalues[i]);
+         nbytes = att_align_nominal(nbytes, dah->typalign);
+         /* check for overflow of total request */
+         if (!AllocSizeIsValid(nbytes))
+             ereport(ERROR,
+                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+                      errmsg("array size exceeds the maximum allowed (%d)",
+                             (int) MaxAllocSize)));
+     }
+
+     if (dnulls)
+         nbytes += ARR_OVERHEAD_WITHNULLS(ndims, nelems);
+     else
+         nbytes += ARR_OVERHEAD_NONULLS(ndims);
+
+     /* cache for next time */
+     dah->serialized_size = nbytes;
+
+     return nbytes;
+ }
+
+ /*
+  * serialize_into method for deserialized arrays
+  */
+ static void
+ DA_serialize_into(DeserializedObjectHeader *dohptr,
+                   void *result, Size allocated_size)
+ {
+     DeserializedArrayHeader *dah = (DeserializedArrayHeader *) dohptr;
+     ArrayType  *aresult = (ArrayType *) result;
+     int            nelems;
+     int            ndims;
+     int32        dataoffset;
+
+     Assert(dah->da_magic == DA_MAGIC);
+
+     /* Easy if we have a valid serialized value */
+     if (dah->svalue)
+     {
+         Assert(allocated_size == ARR_SIZE(dah->svalue));
+         memcpy(result, dah->svalue, allocated_size);
+         return;
+     }
+
+     /* Else allocation should match previous get_serialized_size result */
+     Assert(allocated_size == dah->serialized_size);
+
+     /* Fill result array from dvalues/dnulls */
+     nelems = dah->nelems;
+     ndims = dah->ndims;
+
+     if (dah->dnulls)
+         dataoffset = ARR_OVERHEAD_WITHNULLS(ndims, nelems);
+     else
+         dataoffset = 0;            /* marker for no null bitmap */
+
+     /* We must ensure that any pad space is zero-filled */
+     memset(aresult, 0, allocated_size);
+
+     SET_VARSIZE(aresult, allocated_size);
+     aresult->ndim = ndims;
+     aresult->dataoffset = dataoffset;
+     aresult->elemtype = dah->element_type;
+     memcpy(ARR_DIMS(aresult), dah->dims, ndims * sizeof(int));
+     memcpy(ARR_LBOUND(aresult), dah->lbound, ndims * sizeof(int));
+
+     CopyArrayEls(aresult,
+                  dah->dvalues, dah->dnulls, nelems,
+                  dah->typlen, dah->typbyval, dah->typalign,
+                  false);
+ }
+
+ /*
+  * Argument fetching support code
+  */
+
+ #ifdef NOT_YET_USED
+
+ /*
+  * DatumGetDeserializedArray: get a writable deserialized array from an input
+  */
+ static DeserializedArrayHeader *
+ DatumGetDeserializedArray(Datum d)
+ {
+     DeserializedArrayHeader *dah;
+
+     /* If it's a writable deserialized array already, just return it */
+     if (DatumIsReadWriteDeserializedObject(d, false, -1))
+     {
+         dah = (DeserializedArrayHeader *) DatumGetDOHP(d);
+         Assert(dah->da_magic == DA_MAGIC);
+         return dah;
+     }
+
+     /*
+      * If it's a non-writable deserialized array, copy it, extracting the
+      * element representational data to save a catalog lookup.
+      */
+     if (VARATT_IS_EXTERNAL_DESERIALIZED(DatumGetPointer(d)))
+     {
+         dah = (DeserializedArrayHeader *) DatumGetDOHP(d);
+         Assert(dah->da_magic == DA_MAGIC);
+         d = deserialize_array(d, CurrentMemoryContext,
+                               dah->typlen, dah->typbyval, dah->typalign);
+         return (DeserializedArrayHeader *) DatumGetDOHP(d);
+     }
+
+     /* Else deserialize the hard way */
+     d = deserialize_array(d, CurrentMemoryContext, 0, 0, 0);
+     return (DeserializedArrayHeader *) DatumGetDOHP(d);
+ }
+
+ #define PG_GETARG_DESERIALIZED_ARRAY(n)  DatumGetDeserializedArray(PG_GETARG_DATUM(n))
+
+ #endif
+
+ /*
+  * As above, when caller has the ability to cache element type info
+  */
+ static DeserializedArrayHeader *
+ DatumGetDeserializedArrayX(Datum d,
+                            int elmlen, bool elmbyval, char elmalign)
+ {
+     DeserializedArrayHeader *dah;
+
+     /* If it's a writable deserialized array already, just return it */
+     if (DatumIsReadWriteDeserializedObject(d, false, -1))
+     {
+         dah = (DeserializedArrayHeader *) DatumGetDOHP(d);
+         Assert(dah->da_magic == DA_MAGIC);
+         Assert(dah->typlen == elmlen);
+         Assert(dah->typbyval == elmbyval);
+         Assert(dah->typalign == elmalign);
+         return dah;
+     }
+
+     /* Else deserialize using caller's data */
+     d = deserialize_array(d, CurrentMemoryContext, elmlen, elmbyval, elmalign);
+     return (DeserializedArrayHeader *) DatumGetDOHP(d);
+ }
+
+ #define PG_GETARG_DESERIALIZED_ARRAYX(n, elmlen, elmbyval, elmalign) \
+     DatumGetDeserializedArrayX(PG_GETARG_DATUM(n),    elmlen, elmbyval, elmalign)
+
+ /*
+  * DatumGetAnyArray: return either a deserialized array or a detoasted varlena
+  * array.  The result must not be modified in-place.
+  */
+ static AnyArrayType *
+ DatumGetAnyArray(Datum d)
+ {
+     DeserializedArrayHeader *dah;
+
+     /*
+      * If it's a deserialized array, return the header pointer.
+      */
+     if (VARATT_IS_EXTERNAL_DESERIALIZED(DatumGetPointer(d)))
+     {
+         dah = (DeserializedArrayHeader *) DatumGetDOHP(d);
+         Assert(dah->da_magic == DA_MAGIC);
+         return (AnyArrayType *) dah;
+     }
+
+     /* Else do regular detoasting as needed */
+     return (AnyArrayType *) PG_DETOAST_DATUM(d);
+ }
+
+ #define PG_GETARG_ANY_ARRAY(n)    DatumGetAnyArray(PG_GETARG_DATUM(n))
+
+
+ /*
+  * Equivalent of array_set() for a deserialized array
+  *
+  * array_set took care of detoasting dataValue, the rest is up to us
+  *
+  * Note: as with any operation on a read/write deserialized object, we must
+  * take pains not to leave the object in a corrupt state if we fail partway
+  * through.
+  */
+ Datum
+ array_set_deserialized(Datum arraydatum,
+                        int nSubscripts, int *indx,
+                        Datum dataValue, bool isNull,
+                        int arraytyplen,
+                        int elmlen, bool elmbyval, char elmalign)
+ {
+     DeserializedArrayHeader *dah;
+     Datum       *dvalues;
+     bool       *dnulls;
+     int            i,
+                 ndim,
+                 dim[MAXDIM],
+                 lb[MAXDIM],
+                 offset;
+     bool        dimschanged,
+                 newhasnulls;
+     int            addedbefore,
+                 addedafter;
+     char       *oldValue = NULL;
+
+     dah = (DeserializedArrayHeader *) DatumGetDOHP(arraydatum);
+     Assert(dah->da_magic == DA_MAGIC);
+
+     /* if this fails, we shouldn't be modifying this array in-place */
+     Assert(DatumGetPointer(arraydatum) == (Pointer) dah->hdr.doh_primary_ptr);
+
+     /* sanity-check caller's state; we don't use the passed data otherwise */
+     Assert(arraytyplen == -1);
+     Assert(elmlen == dah->typlen);
+     Assert(elmbyval == dah->typbyval);
+     Assert(elmalign == dah->typalign);
+
+     /*
+      * Copy dimension info into local storage.  This allows us to modify the
+      * dimensions if needed, while not messing up the deserialized value if we
+      * fail partway through.
+      */
+     ndim = dah->ndims;
+     Assert(ndim >= 0 && ndim <= MAXDIM);
+     memcpy(dim, dah->dims, ndim * sizeof(int));
+     memcpy(lb, dah->lbound, ndim * sizeof(int));
+     dimschanged = false;
+
+     /*
+      * if number of dims is zero, i.e. an empty array, create an array with
+      * nSubscripts dimensions, and set the lower bounds to the supplied
+      * subscripts.
+      */
+     if (ndim == 0)
+     {
+         /*
+          * Allocate adequate space for new dimension info.  This is harmless
+          * if we fail later.
+          */
+         Assert(nSubscripts > 0 && nSubscripts <= MAXDIM);
+         dah->dims = (int *) MemoryContextAllocZero(dah->hdr.doh_context,
+                                                    nSubscripts * sizeof(int));
+         dah->lbound = (int *) MemoryContextAllocZero(dah->hdr.doh_context,
+                                                   nSubscripts * sizeof(int));
+
+         /* Update local copies of dimension info */
+         ndim = nSubscripts;
+         for (i = 0; i < nSubscripts; i++)
+         {
+             dim[i] = 0;
+             lb[i] = indx[i];
+         }
+         dimschanged = true;
+     }
+     else if (ndim != nSubscripts)
+         ereport(ERROR,
+                 (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+                  errmsg("wrong number of array subscripts")));
+
+     /*
+      * Deconstruct array if we didn't already.  (Someday maybe add a special
+      * case path for fixed-length, no-nulls cases, where we can overwrite an
+      * element in place without ever deconstructing.  But today is not that
+      * day.)
+      */
+     if (dah->dvalues == NULL)
+     {
+         MemoryContext oldcxt = MemoryContextSwitchTo(dah->hdr.doh_context);
+         int            nelems;
+
+         dnulls = NULL;
+         deconstruct_array(dah->svalue,
+                           dah->element_type,
+                           dah->typlen, dah->typbyval, dah->typalign,
+                           &dvalues,
+                           ARR_HASNULL(dah->svalue) ? &dnulls : NULL,
+                           &nelems);
+
+         /*
+          * Update header only after successful completion of this step.  If
+          * deconstruct_array fails partway through, worst consequence is some
+          * leaked memory in the object's context.
+          */
+         dah->dvalues = dvalues;
+         dah->dnulls = dnulls;
+         dah->dvalueslen = dah->nelems = nelems;
+         MemoryContextSwitchTo(oldcxt);
+     }
+
+     /*
+      * Copy new element into array's context, if needed (we assume it's
+      * already detoasted, so no junk should be created).  If we fail further
+      * down, this memory is leaked, but that's reasonably harmless.
+      */
+     if (!dah->typbyval && !isNull)
+     {
+         MemoryContext oldcxt = MemoryContextSwitchTo(dah->hdr.doh_context);
+
+         dataValue = datumCopy(dataValue, false, dah->typlen);
+         MemoryContextSwitchTo(oldcxt);
+     }
+
+     dvalues = dah->dvalues;
+     dnulls = dah->dnulls;
+
+     newhasnulls = ((dnulls != NULL) || isNull);
+     addedbefore = addedafter = 0;
+
+     /*
+      * Check subscripts (this logic matches original array_set)
+      */
+     if (ndim == 1)
+     {
+         if (indx[0] < lb[0])
+         {
+             addedbefore = lb[0] - indx[0];
+             dim[0] += addedbefore;
+             lb[0] = indx[0];
+             dimschanged = true;
+             if (addedbefore > 1)
+                 newhasnulls = true;        /* will insert nulls */
+         }
+         if (indx[0] >= (dim[0] + lb[0]))
+         {
+             addedafter = indx[0] - (dim[0] + lb[0]) + 1;
+             dim[0] += addedafter;
+             dimschanged = true;
+             if (addedafter > 1)
+                 newhasnulls = true;        /* will insert nulls */
+         }
+         /* Physically enlarge dvalues/dnulls arrays if needed */
+         if (dim[0] > dah->dvalueslen)
+         {
+             /* We want some extra space if we're enlarging */
+             int            newlen = dim[0] + dim[0] / 8;
+
+             dah->dvalues = dvalues = (Datum *)
+                 repalloc(dvalues, newlen * sizeof(Datum));
+             if (dnulls)
+                 dah->dnulls = dnulls = (bool *)
+                     repalloc(dnulls, newlen * sizeof(bool));
+             dah->dvalueslen = newlen;
+         }
+     }
+     else
+     {
+         /*
+          * XXX currently we do not support extending multi-dimensional arrays
+          * during assignment
+          */
+         for (i = 0; i < ndim; i++)
+         {
+             if (indx[i] < lb[i] ||
+                 indx[i] >= (dim[i] + lb[i]))
+                 ereport(ERROR,
+                         (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+                          errmsg("array subscript out of range")));
+         }
+     }
+
+     /* Now we can calculate linear offset of target item in array */
+     offset = ArrayGetOffset(nSubscripts, dim, lb, indx);
+
+     /*
+      * If we need a nulls bitmap and don't already have one, create it, being
+      * sure to mark all existing entries as not null.
+      */
+     if (newhasnulls && dnulls == NULL)
+         dah->dnulls = dnulls = (bool *)
+             MemoryContextAllocZero(dah->hdr.doh_context,
+                                    dah->dvalueslen * sizeof(bool));
+
+     /*
+      * We now have all the needed space allocated, so we're ready to make
+      * irreversible changes.  Be very wary of allowing failure below here.
+      */
+
+     /* Serialized value will no longer represent array accurately */
+     dah->svalue = NULL;
+     /* And we don't know the serialized size either */
+     dah->serialized_size = 0;
+
+     /* Update dimensionality info if needed */
+     if (dimschanged)
+     {
+         dah->ndims = ndim;
+         memcpy(dah->dims, dim, ndim * sizeof(int));
+         memcpy(dah->lbound, lb, ndim * sizeof(int));
+     }
+
+     /* Reposition items if needed, and fill addedbefore items with nulls */
+     if (addedbefore > 0)
+     {
+         memmove(dvalues + addedbefore, dvalues, dah->nelems * sizeof(Datum));
+         for (i = 0; i < addedbefore; i++)
+             dvalues[i] = (Datum) 0;
+         if (dnulls)
+         {
+             memmove(dnulls + addedbefore, dnulls, dah->nelems * sizeof(bool));
+             for (i = 0; i < addedbefore; i++)
+                 dnulls[i] = true;
+         }
+         dah->nelems += addedbefore;
+     }
+
+     /* fill addedafter items with nulls */
+     if (addedafter > 0)
+     {
+         for (i = 0; i < addedafter; i++)
+             dvalues[dah->nelems + i] = (Datum) 0;
+         if (dnulls)
+         {
+             for (i = 0; i < addedafter; i++)
+                 dnulls[dah->nelems + i] = true;
+         }
+         dah->nelems += addedafter;
+     }
+
+     /* Grab old element value for pfree'ing, if needed. */
+     if (!dah->typbyval && (dnulls == NULL || !dnulls[offset]))
+         oldValue = (char *) DatumGetPointer(dvalues[offset]);
+
+     /* And finally we can insert the new element. */
+     dvalues[offset] = dataValue;
+     if (dnulls)
+         dnulls[offset] = isNull;
+
+     /*
+      * Free old element if needed; this keeps repeated element replacements
+      * from bloating the array's storage.  If the pfree somehow fails, it
+      * won't corrupt the array.
+      */
+     if (oldValue)
+     {
+         /* Don't try to pfree a part of the original serialized array */
+         if (oldValue < dah->sstartptr || oldValue >= dah->sendptr)
+             pfree(oldValue);
+     }
+
+     /* Done, return primary TOAST pointer for object */
+     return PointerGetDatum(dah->hdr.doh_primary_ptr);
+ }
+
+ /*
+  * Deserialized reimplementation of array_push
+  */
+ typedef struct ArrayPushState
+ {
+     Oid            arg0_typeid;
+     Oid            arg1_typeid;
+     bool        array_on_left;
+     Oid            element_type;
+     int16        typlen;
+     bool        typbyval;
+     char        typalign;
+ } ArrayPushState;
+
+ Datum
+ array_push_deserialized(PG_FUNCTION_ARGS)
+ {
+     DeserializedArrayHeader *dah;
+     Datum        newelem;
+     bool        isNull;
+     int           *dimv,
+                *lb;
+     ArrayType  *result;
+     int            indx;
+     int            lb0;
+     Oid            element_type;
+     int16        typlen;
+     bool        typbyval;
+     char        typalign;
+     Oid            arg0_typeid = get_fn_expr_argtype(fcinfo->flinfo, 0);
+     Oid            arg1_typeid = get_fn_expr_argtype(fcinfo->flinfo, 1);
+     ArrayPushState *my_extra;
+
+     if (arg0_typeid == InvalidOid || arg1_typeid == InvalidOid)
+         ereport(ERROR,
+                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                  errmsg("could not determine input data types")));
+
+     /*
+      * We arrange to look up info about element type only once per series of
+      * calls, assuming the element type doesn't change underneath us.
+      */
+     my_extra = (ArrayPushState *) fcinfo->flinfo->fn_extra;
+     if (my_extra == NULL)
+     {
+         fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+                                                       sizeof(ArrayPushState));
+         my_extra = (ArrayPushState *) fcinfo->flinfo->fn_extra;
+         my_extra->arg0_typeid = InvalidOid;
+     }
+
+     if (my_extra->arg0_typeid != arg0_typeid ||
+         my_extra->arg1_typeid != arg1_typeid)
+     {
+         /* Determine which input is the array */
+         Oid            arg0_elemid = get_element_type(arg0_typeid);
+         Oid            arg1_elemid = get_element_type(arg1_typeid);
+
+         if (arg0_elemid != InvalidOid)
+         {
+             my_extra->array_on_left = true;
+             element_type = arg0_elemid;
+         }
+         else if (arg1_elemid != InvalidOid)
+         {
+             my_extra->array_on_left = false;
+             element_type = arg1_elemid;
+         }
+         else
+         {
+             /* Shouldn't get here given proper type checking in parser */
+             ereport(ERROR,
+                     (errcode(ERRCODE_DATATYPE_MISMATCH),
+                      errmsg("neither input type is an array")));
+             PG_RETURN_NULL();    /* keep compiler quiet */
+         }
+
+         my_extra->arg0_typeid = arg0_typeid;
+         my_extra->arg1_typeid = arg1_typeid;
+
+         /* Get info about element type */
+         get_typlenbyvalalign(element_type,
+                              &my_extra->typlen,
+                              &my_extra->typbyval,
+                              &my_extra->typalign);
+         my_extra->element_type = element_type;
+     }
+
+     element_type = my_extra->element_type;
+     typlen = my_extra->typlen;
+     typbyval = my_extra->typbyval;
+     typalign = my_extra->typalign;
+
+     /*
+      * Now we can fetch the arguments, using cached type info if needed
+      */
+     if (my_extra->array_on_left)
+     {
+         if (PG_ARGISNULL(0))
+             dah = construct_empty_deserialized_array(element_type,
+                                                      CurrentMemoryContext,
+                                                  typlen, typbyval, typalign);
+         else
+             dah = PG_GETARG_DESERIALIZED_ARRAYX(0, typlen, typbyval, typalign);
+         isNull = PG_ARGISNULL(1);
+         if (isNull)
+             newelem = (Datum) 0;
+         else
+             newelem = PG_GETARG_DATUM(1);
+     }
+     else
+     {
+         if (PG_ARGISNULL(1))
+             dah = construct_empty_deserialized_array(element_type,
+                                                      CurrentMemoryContext,
+                                                  typlen, typbyval, typalign);
+         else
+             dah = PG_GETARG_DESERIALIZED_ARRAYX(1, typlen, typbyval, typalign);
+         isNull = PG_ARGISNULL(0);
+         if (isNull)
+             newelem = (Datum) 0;
+         else
+             newelem = PG_GETARG_DATUM(0);
+     }
+
+     Assert(element_type == dah->element_type);
+
+     /*
+      * Perform push (this logic is basically unchanged from original)
+      */
+     if (dah->ndims == 1)
+     {
+         lb = dah->lbound;
+         dimv = dah->dims;
+
+         if (my_extra->array_on_left)
+         {
+             /* append newelem */
+             int            ub = dimv[0] + lb[0] - 1;
+
+             indx = ub + 1;
+             /* overflow? */
+             if (indx < ub)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+                          errmsg("integer out of range")));
+         }
+         else
+         {
+             /* prepend newelem */
+             indx = lb[0] - 1;
+             /* overflow? */
+             if (indx > lb[0])
+                 ereport(ERROR,
+                         (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+                          errmsg("integer out of range")));
+         }
+         lb0 = lb[0];
+     }
+     else if (dah->ndims == 0)
+     {
+         indx = 1;
+         lb0 = 1;
+     }
+     else
+         ereport(ERROR,
+                 (errcode(ERRCODE_DATA_EXCEPTION),
+                  errmsg("argument must be empty or one-dimensional array")));
+
+     result = array_set((ArrayType *) dah->hdr.doh_primary_ptr,
+                        1, &indx, newelem, isNull,
+                        -1, typlen, typbyval, typalign);
+
+     Assert(result == (ArrayType *) dah->hdr.doh_primary_ptr);
+
+     /*
+      * Readjust result's LB to match the input's.  We need do nothing in the
+      * append case, but it's the simplest way to implement the prepend case.
+      */
+     if (dah->ndims == 1 && !my_extra->array_on_left)
+     {
+         /* This is ok whether we've deconstructed or not */
+         dah->lbound[0] = lb0;
+     }
+
+     PG_RETURN_POINTER(result);
+ }
+
+ /*
+  * array_dims :
+  *          returns the dimensions of the array pointed to by "v", as a "text"
+  *
+  * This is here as an example of handling either flat or deserialized inputs.
+  */
+ Datum
+ array_dims_deserialized(PG_FUNCTION_ARGS)
+ {
+     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
+     char       *p;
+     int            i;
+     int           *dimv,
+                *lb;
+
+     /*
+      * 33 since we assume 15 digits per number + ':' +'[]'
+      *
+      * +1 for trailing null
+      */
+     char        buf[MAXDIM * 33 + 1];
+
+     /* Sanity check: does it look like an array at all? */
+     if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM)
+         PG_RETURN_NULL();
+
+     dimv = AARR_DIMS(v);
+     lb = AARR_LBOUND(v);
+
+     p = buf;
+     for (i = 0; i < AARR_NDIM(v); i++)
+     {
+         sprintf(p, "[%d:%d]", lb[i], dimv[i] + lb[i] - 1);
+         p += strlen(p);
+     }
+
+     PG_RETURN_TEXT_P(cstring_to_text(buf));
+ }
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
index 600646e..4684f0a 100644
*** a/src/backend/utils/adt/array_userfuncs.c
--- b/src/backend/utils/adt/array_userfuncs.c
***************
*** 25,30 ****
--- 25,33 ----
  Datum
  array_push(PG_FUNCTION_ARGS)
  {
+ #if 1
+     return array_push_deserialized(fcinfo);
+ #else
      ArrayType  *v;
      Datum        newelem;
      bool        isNull;
*************** array_push(PG_FUNCTION_ARGS)
*** 157,162 ****
--- 160,166 ----
          ARR_LBOUND(result)[0] = ARR_LBOUND(v)[0];

      PG_RETURN_ARRAYTYPE_P(result);
+ #endif
  }

  /*-----------------------------------------------------------------------------
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 5591b46..9b3037a 100644
*** a/src/backend/utils/adt/arrayfuncs.c
--- b/src/backend/utils/adt/arrayfuncs.c
***************
*** 27,32 ****
--- 27,33 ----
  #include "utils/array.h"
  #include "utils/builtins.h"
  #include "utils/datum.h"
+ #include "utils/deserialized.h"
  #include "utils/lsyscache.h"
  #include "utils/memutils.h"
  #include "utils/typcache.h"
*************** static void ReadArrayBinary(StringInfo b
*** 93,102 ****
                  int typlen, bool typbyval, char typalign,
                  Datum *values, bool *nulls,
                  bool *hasnulls, int32 *nbytes);
- static void CopyArrayEls(ArrayType *array,
-              Datum *values, bool *nulls, int nitems,
-              int typlen, bool typbyval, char typalign,
-              bool freedata);
  static bool array_get_isnull(const bits8 *nullbitmap, int offset);
  static void array_set_isnull(bits8 *nullbitmap, int offset, bool isNull);
  static Datum ArrayCast(char *value, bool byval, int len);
--- 94,99 ----
*************** ReadArrayStr(char *arrayStr,
*** 939,945 ****
   * the values are not toasted.  (Doing it here doesn't work since the
   * caller has already allocated space for the array...)
   */
! static void
  CopyArrayEls(ArrayType *array,
               Datum *values,
               bool *nulls,
--- 936,942 ----
   * the values are not toasted.  (Doing it here doesn't work since the
   * caller has already allocated space for the array...)
   */
! void
  CopyArrayEls(ArrayType *array,
               Datum *values,
               bool *nulls,
*************** array_ndims(PG_FUNCTION_ARGS)
*** 1666,1671 ****
--- 1663,1671 ----
  Datum
  array_dims(PG_FUNCTION_ARGS)
  {
+ #if 1
+     return array_dims_deserialized(fcinfo);
+ #else
      ArrayType  *v = PG_GETARG_ARRAYTYPE_P(0);
      char       *p;
      int            i;
*************** array_dims(PG_FUNCTION_ARGS)
*** 1694,1699 ****
--- 1694,1700 ----
      }

      PG_RETURN_TEXT_P(cstring_to_text(buf));
+ #endif
  }

  /*
*************** array_set(ArrayType *array,
*** 2161,2166 ****
--- 2162,2191 ----
      if (elmlen == -1 && !isNull)
          dataValue = PointerGetDatum(PG_DETOAST_DATUM(dataValue));

+     if (1)
+     {
+         /* Convert to R/W deserialized form if not that already */
+         if (!DatumIsReadWriteDeserializedObject(PointerGetDatum(array),
+                                                 false, -1))
+         {
+             array = (ArrayType *) DatumGetPointer(
+               deserialize_array(PointerGetDatum(array), CurrentMemoryContext,
+                                 elmlen, elmbyval, elmalign));
+         }
+
+         /* And hand off to array_deserialized.c */
+         return (ArrayType *) DatumGetPointer(
+                                array_set_deserialized(PointerGetDatum(array),
+                                                       nSubscripts,
+                                                       indx,
+                                                       dataValue,
+                                                       isNull,
+                                                       arraytyplen,
+                                                       elmlen,
+                                                       elmbyval,
+                                                       elmalign));
+     }
+
      /* detoast input array if necessary */
      array = DatumGetArrayTypeP(PointerGetDatum(array));

diff --git a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c
index 014eca5..4ebf79a 100644
*** a/src/backend/utils/adt/datum.c
--- b/src/backend/utils/adt/datum.c
***************
*** 12,19 ****
   *
   *-------------------------------------------------------------------------
   */
  /*
!  * In the implementation of the next routines we assume the following:
   *
   * A) if a type is "byVal" then all the information is stored in the
   * Datum itself (i.e. no pointers involved!). In this case the
--- 12,20 ----
   *
   *-------------------------------------------------------------------------
   */
+
  /*
!  * In the implementation of these routines we assume the following:
   *
   * A) if a type is "byVal" then all the information is stored in the
   * Datum itself (i.e. no pointers involved!). In this case the
***************
*** 34,44 ****
--- 35,49 ----
   *
   * Note that we do not treat "toasted" datums specially; therefore what
   * will be copied or compared is the compressed data or toast reference.
+  * An exception is made for datumCopy() of a deserialized object, however,
+  * because most callers expect to get a simple contiguous (and pfree'able)
+  * result from datumCopy().
   */

  #include "postgres.h"

  #include "utils/datum.h"
+ #include "utils/deserialized.h"


  /*-------------------------------------------------------------------------
***************
*** 46,51 ****
--- 51,57 ----
   *
   * Find the "real" size of a datum, given the datum value,
   * whether it is a "by value", and the declared type length.
+  * (For TOAST pointer datums, this is the size of the pointer datum.)
   *
   * This is essentially an out-of-line version of the att_addlength_datum()
   * macro in access/tupmacs.h.  We do a tad more error checking though.
*************** datumGetSize(Datum value, bool typByVal,
*** 106,114 ****
  /*-------------------------------------------------------------------------
   * datumCopy
   *
!  * make a copy of a datum
   *
   * If the datatype is pass-by-reference, memory is obtained with palloc().
   *-------------------------------------------------------------------------
   */
  Datum
--- 112,127 ----
  /*-------------------------------------------------------------------------
   * datumCopy
   *
!  * Make a copy of a non-NULL datum.
   *
   * If the datatype is pass-by-reference, memory is obtained with palloc().
+  *
+  * If the value is a reference to a deserialized object, we serialize it into
+  * memory obtained with palloc().  We need to copy because one of the main
+  * uses of this function is to copy a datum out of a transient memory context
+  * that's about to be destroyed, and the deserialized object is probably in a
+  * child context that will also go away.  Moreover, many callers assume that
+  * the result is a single pfree-able chunk.
   *-------------------------------------------------------------------------
   */
  Datum
*************** datumCopy(Datum value, bool typByVal, in
*** 118,161 ****

      if (typByVal)
          res = value;
!     else
      {
!         Size        realSize;
!         char       *s;

!         if (DatumGetPointer(value) == NULL)
!             return PointerGetDatum(NULL);

!         realSize = datumGetSize(value, typByVal, typLen);

!         s = (char *) palloc(realSize);
!         memcpy(s, DatumGetPointer(value), realSize);
!         res = PointerGetDatum(s);
      }
!     return res;
! }
!
! /*-------------------------------------------------------------------------
!  * datumFree
!  *
!  * Free the space occupied by a datum CREATED BY "datumCopy"
!  *
!  * NOTE: DO NOT USE THIS ROUTINE with datums returned by heap_getattr() etc.
!  * ONLY datums created by "datumCopy" can be freed!
!  *-------------------------------------------------------------------------
!  */
! #ifdef NOT_USED
! void
! datumFree(Datum value, bool typByVal, int typLen)
! {
!     if (!typByVal)
      {
!         Pointer        s = DatumGetPointer(value);

!         pfree(s);
      }
  }
- #endif

  /*-------------------------------------------------------------------------
   * datumIsEqual
--- 131,179 ----

      if (typByVal)
          res = value;
!     else if (typLen == -1)
      {
!         /* It is a varlena datatype */
!         struct varlena *vl = (struct varlena *) DatumGetPointer(value);

!         if (VARATT_IS_EXTERNAL_DESERIALIZED(vl))
!         {
!             /* Serialize into the caller's memory context */
!             DeserializedObjectHeader *doh = DatumGetDOHP(value);
!             Size        resultsize;
!             char       *resultptr;

!             resultsize = DOH_get_serialized_size(doh);
!             resultptr = (char *) palloc(resultsize);
!             DOH_serialize_into(doh, (void *) resultptr, resultsize);
!             res = PointerGetDatum(resultptr);
!         }
!         else
!         {
!             /* Otherwise, just copy the varlena datum verbatim */
!             Size        realSize;
!             char       *resultptr;

!             realSize = (Size) VARSIZE_ANY(vl);
!             resultptr = (char *) palloc(realSize);
!             memcpy(resultptr, vl, realSize);
!             res = PointerGetDatum(resultptr);
!         }
      }
!     else
      {
!         /* Pass by reference, but not varlena, so not toasted */
!         Size        realSize;
!         char       *resultptr;

!         realSize = datumGetSize(value, typByVal, typLen);
!
!         resultptr = (char *) palloc(realSize);
!         memcpy(resultptr, DatumGetPointer(value), realSize);
!         res = PointerGetDatum(resultptr);
      }
+     return res;
  }

  /*-------------------------------------------------------------------------
   * datumIsEqual
diff --git a/src/backend/utils/adt/deserialized.c b/src/backend/utils/adt/deserialized.c
index ...4584514 .
*** a/src/backend/utils/adt/deserialized.c
--- b/src/backend/utils/adt/deserialized.c
***************
*** 0 ****
--- 1,169 ----
+ /*-------------------------------------------------------------------------
+  *
+  * deserialized.c
+  *      Support functions for "deserialized" value representations.
+  *
+  * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *
+  * IDENTIFICATION
+  *      src/backend/utils/adt/deserialized.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ #include "postgres.h"
+
+ #include "utils/deserialized.h"
+ #include "utils/memutils.h"
+
+ /*
+  * DatumGetDOHP
+  *
+  * Given a Datum that is a deserialized-object reference, extract the pointer.
+  *
+  * This is a bit tedious since the pointer may not be properly aligned;
+  * compare VARATT_EXTERNAL_GET_POINTER().
+  */
+ DeserializedObjectHeader *
+ DatumGetDOHP(Datum d)
+ {
+     varattrib_1b_e *datum = (varattrib_1b_e *) DatumGetPointer(d);
+     varatt_deserialized ptr;
+
+     Assert(VARATT_IS_EXTERNAL_DESERIALIZED(datum));
+     memcpy(&ptr, VARDATA_EXTERNAL(datum), sizeof(ptr));
+     return ptr.dohptr;
+ }
+
+ /*
+  * DOH_init_header
+  *
+  * Initialize the common header of a deserialized object.
+  *
+  * The main thing this encapsulates is initializing the TOAST pointers.
+  */
+ void
+ DOH_init_header(DeserializedObjectHeader *dohptr,
+                 const DeserializedObjectMethods *methods,
+                 MemoryContext obj_context)
+ {
+     varatt_deserialized ptr;
+
+     dohptr->vl_len_ = DOH_HEADER_MAGIC;
+     dohptr->doh_methods = methods;
+     dohptr->doh_context = obj_context;
+
+     ptr.dohptr = dohptr;
+
+     SET_VARTAG_EXTERNAL(dohptr->doh_primary_ptr, VARTAG_DESERIALIZED);
+     memcpy(VARDATA_EXTERNAL(dohptr->doh_primary_ptr), &ptr, sizeof(ptr));
+
+     SET_VARTAG_EXTERNAL(dohptr->doh_secondary_ptr, VARTAG_DESERIALIZED);
+     memcpy(VARDATA_EXTERNAL(dohptr->doh_secondary_ptr), &ptr, sizeof(ptr));
+ }
+
+ /*
+  * DOH_get_serialized_size
+  * DOH_serialize_into
+  *
+  * Convenience functions for invoking the "methods" of a deserialized object.
+  */
+
+ Size
+ DOH_get_serialized_size(DeserializedObjectHeader *dohptr)
+ {
+     return (*dohptr->doh_methods->get_serialized_size) (dohptr);
+ }
+
+ void
+ DOH_serialize_into(DeserializedObjectHeader *dohptr,
+                    void *result, Size allocated_size)
+ {
+     (*dohptr->doh_methods->serialize_into) (dohptr, result, allocated_size);
+ }
+
+ /*
+  * Does the Datum represent a writable deserialized object?
+  */
+ bool
+ DatumIsReadWriteDeserializedObject(Datum d, bool isnull, int16 typlen)
+ {
+     DeserializedObjectHeader *dohptr;
+
+     /* Reject if it's NULL or not a varlena type */
+     if (isnull || typlen != -1)
+         return false;
+
+     /* Reject if not a deserialized-object pointer */
+     if (!VARATT_IS_EXTERNAL_DESERIALIZED(DatumGetPointer(d)))
+         return false;
+
+     /* Now safe to extract the object pointer */
+     dohptr = DatumGetDOHP(d);
+
+     /* Reject if this isn't the primary TOAST pointer for the object */
+     if (DatumGetPointer(d) != (Pointer) dohptr->doh_primary_ptr)
+         return false;
+
+     return true;
+ }
+
+ /*
+  * If the Datum represents a R/W deserialized object, change it to R/O.
+  * Otherwise return the original Datum.
+  */
+ Datum
+ MakeDeserializedObjectReadOnly(Datum d, bool isnull, int16 typlen)
+ {
+     DeserializedObjectHeader *dohptr;
+
+     /* Nothing to do if it's NULL or not a varlena type */
+     if (isnull || typlen != -1)
+         return d;
+
+     /* Nothing to do if not a deserialized-object pointer */
+     if (!VARATT_IS_EXTERNAL_DESERIALIZED(DatumGetPointer(d)))
+         return d;
+
+     /* Now safe to extract the object pointer */
+     dohptr = DatumGetDOHP(d);
+
+     /* Nothing to do if this isn't the primary TOAST pointer for the object */
+     if (DatumGetPointer(d) != (Pointer) dohptr->doh_primary_ptr)
+         return d;
+
+     /* Else return the secondary pointer instead */
+     return PointerGetDatum(dohptr->doh_secondary_ptr);
+ }
+
+ /*
+  * Transfer ownership of a deserialized object to a new parent memory context.
+  * The object must be referenced by its primary (R/W) pointer.
+  */
+ void
+ TransferDeserializedObject(Datum d, MemoryContext new_parent)
+ {
+     DeserializedObjectHeader *dohptr = DatumGetDOHP(d);
+
+     /* Assert this is the primary TOAST pointer for the object */
+     Assert(DatumGetPointer(d) == (Pointer) dohptr->doh_primary_ptr);
+
+     /* Transfer ownership */
+     MemoryContextSetParent(dohptr->doh_context, new_parent);
+ }
+
+ /*
+  * Delete a deserialized object (must be referenced by its primary pointer).
+  */
+ void
+ DeleteDeserializedObject(Datum d)
+ {
+     DeserializedObjectHeader *dohptr = DatumGetDOHP(d);
+
+     /* Assert this is the primary TOAST pointer for the object */
+     Assert(DatumGetPointer(d) == (Pointer) dohptr->doh_primary_ptr);
+
+     /* Kill it */
+     MemoryContextDelete(dohptr->doh_context);
+ }
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 202bc78..4b24066 100644
*** a/src/backend/utils/mmgr/mcxt.c
--- b/src/backend/utils/mmgr/mcxt.c
*************** MemoryContextSetParent(MemoryContext con
*** 266,271 ****
--- 266,275 ----
      AssertArg(MemoryContextIsValid(context));
      AssertArg(context != new_parent);

+     /* Fast path if it's got correct parent already */
+     if (new_parent == context->parent)
+         return;
+
      /* Delink from existing parent, if any */
      if (context->parent)
      {
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 40fde83..a98a7af 100644
*** a/src/include/executor/executor.h
--- b/src/include/executor/executor.h
*************** extern void FreeExprContext(ExprContext
*** 312,318 ****
  extern void ReScanExprContext(ExprContext *econtext);

  #define ResetExprContext(econtext) \
!     MemoryContextReset((econtext)->ecxt_per_tuple_memory)

  extern ExprContext *MakePerTupleExprContext(EState *estate);

--- 312,318 ----
  extern void ReScanExprContext(ExprContext *econtext);

  #define ResetExprContext(econtext) \
!     MemoryContextResetAndDeleteChildren((econtext)->ecxt_per_tuple_memory)

  extern ExprContext *MakePerTupleExprContext(EState *estate);

diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index 48f84bf..00686b0 100644
*** a/src/include/executor/tuptable.h
--- b/src/include/executor/tuptable.h
*************** extern Datum ExecFetchSlotTupleDatum(Tup
*** 163,168 ****
--- 163,169 ----
  extern HeapTuple ExecMaterializeSlot(TupleTableSlot *slot);
  extern TupleTableSlot *ExecCopySlot(TupleTableSlot *dstslot,
               TupleTableSlot *srcslot);
+ extern TupleTableSlot *ExecMakeSlotContentsReadOnly(TupleTableSlot *slot);

  /* in access/common/heaptuple.c */
  extern Datum slot_getattr(TupleTableSlot *slot, int attnum, bool *isnull);
diff --git a/src/include/postgres.h b/src/include/postgres.h
index 082c75b..f7cea45 100644
*** a/src/include/postgres.h
--- b/src/include/postgres.h
*************** typedef struct varatt_indirect
*** 88,93 ****
--- 88,110 ----
  }    varatt_indirect;

  /*
+  * struct varatt_deserialized is a "TOAST pointer" representing an out-of-line
+  * Datum that is stored in memory, in some type-specific, not necessarily
+  * physically contiguous format that is convenient for computation not
+  * storage.  APIs for this, in particular the definition of struct
+  * DeserializedObjectHeader, are in utils/deserialized.h.
+  *
+  * Note that just as for struct varatt_external, this struct is stored
+  * unaligned within any containing tuple.
+  */
+ typedef struct DeserializedObjectHeader DeserializedObjectHeader;
+
+ typedef struct varatt_deserialized
+ {
+     DeserializedObjectHeader *dohptr;
+ } varatt_deserialized;
+
+ /*
   * Type tag for the various sorts of "TOAST pointer" datums.  The peculiar
   * value for VARTAG_ONDISK comes from a requirement for on-disk compatibility
   * with a previous notion that the tag field was the pointer datum's length.
*************** typedef struct varatt_indirect
*** 95,105 ****
--- 112,124 ----
  typedef enum vartag_external
  {
      VARTAG_INDIRECT = 1,
+     VARTAG_DESERIALIZED = 2,
      VARTAG_ONDISK = 18
  } vartag_external;

  #define VARTAG_SIZE(tag) \
      ((tag) == VARTAG_INDIRECT ? sizeof(varatt_indirect) : \
+      (tag) == VARTAG_DESERIALIZED ? sizeof(varatt_deserialized) : \
       (tag) == VARTAG_ONDISK ? sizeof(varatt_external) : \
       TrapMacro(true, "unrecognized TOAST vartag"))

*************** typedef struct
*** 294,299 ****
--- 313,320 ----
      (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_ONDISK)
  #define VARATT_IS_EXTERNAL_INDIRECT(PTR) \
      (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_INDIRECT)
+ #define VARATT_IS_EXTERNAL_DESERIALIZED(PTR) \
+     (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_DESERIALIZED)
  #define VARATT_IS_SHORT(PTR)                VARATT_IS_1B(PTR)
  #define VARATT_IS_EXTENDED(PTR)                (!VARATT_IS_4B_U(PTR))

diff --git a/src/include/utils/array.h b/src/include/utils/array.h
index 694bce7..c2255e1 100644
*** a/src/include/utils/array.h
--- b/src/include/utils/array.h
*************** extern Datum array_remove(PG_FUNCTION_AR
*** 248,253 ****
--- 248,261 ----
  extern Datum array_replace(PG_FUNCTION_ARGS);
  extern Datum width_bucket_array(PG_FUNCTION_ARGS);

+ extern void CopyArrayEls(ArrayType *array,
+              Datum *values,
+              bool *nulls,
+              int nitems,
+              int typlen,
+              bool typbyval,
+              char typalign,
+              bool freedata);
  extern Datum array_ref(ArrayType *array, int nSubscripts, int *indx,
            int arraytyplen, int elmlen, bool elmbyval, char elmalign,
            bool *isNull);
*************** extern Datum array_agg_array_transfn(PG_
*** 349,354 ****
--- 357,375 ----
  extern Datum array_agg_array_finalfn(PG_FUNCTION_ARGS);

  /*
+  * prototypes for functions defined in array_deserialized.c
+  */
+ extern Datum deserialize_array(Datum arraydatum, MemoryContext parentcontext,
+                   int elmlen, bool elmbyval, char elmalign);
+ extern Datum array_set_deserialized(Datum arraydatum,
+                        int nSubscripts, int *indx,
+                        Datum dataValue, bool isNull,
+                        int arraytyplen,
+                        int elmlen, bool elmbyval, char elmalign);
+ extern Datum array_push_deserialized(PG_FUNCTION_ARGS);
+ extern Datum array_dims_deserialized(PG_FUNCTION_ARGS);
+
+ /*
   * prototypes for functions defined in array_typanalyze.c
   */
  extern Datum array_typanalyze(PG_FUNCTION_ARGS);
diff --git a/src/include/utils/datum.h b/src/include/utils/datum.h
index 663414b..bcc203d 100644
*** a/src/include/utils/datum.h
--- b/src/include/utils/datum.h
*************** extern Size datumGetSize(Datum value, bo
*** 31,43 ****
  extern Datum datumCopy(Datum value, bool typByVal, int typLen);

  /*
-  * datumFree - free a datum previously allocated by datumCopy, if any.
-  *
-  * Does nothing if datatype is pass-by-value.
-  */
- extern void datumFree(Datum value, bool typByVal, int typLen);
-
- /*
   * datumIsEqual
   * return true if two datums of the same type are equal, false otherwise.
   *
--- 31,36 ----
diff --git a/src/include/utils/deserialized.h b/src/include/utils/deserialized.h
index ...c5a261a .
*** a/src/include/utils/deserialized.h
--- b/src/include/utils/deserialized.h
***************
*** 0 ****
--- 1,135 ----
+ /*-------------------------------------------------------------------------
+  *
+  * deserialized.h
+  *      Declarations for access to "deserialized" value representations.
+  *
+  * Complex data types, particularly container types such as arrays and records,
+  * usually have on-disk representations that are compact but not especially
+  * convenient to modify.  What's more, when we do modify them, having to
+  * recopy all the rest of the value can be extremely inefficient.  Therefore,
+  * we provide a notion of a "deserialized" representation that is used only
+  * in memory and is optimized more for computation than storage.  The format
+  * appearing on disk is called the data type's "serialized" representation,
+  * since it is required to be a contiguous blob of bytes -- but the type can
+  * have a deserialized representation that is not.  Data types must provide
+  * means to translate a deserialized representation back to serialized form.
+  *
+  * A deserialized object is meant to survive across multiple operations, but
+  * not to be enormously long-lived; for example it might be a local variable
+  * in a PL/pgSQL procedure.  So its extra bulk compared to the on-disk format
+  * is a worthwhile trade-off.
+  *
+  * References to deserialized objects are a type of TOAST pointer.
+  * Because of longstanding conventions in Postgres, this means that the
+  * serialized form of such an object must always be a varlena object.
+  * Fortunately that's no restriction in practice.
+  *
+  *
+  * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * src/include/utils/deserialized.h
+  *
+  *-------------------------------------------------------------------------
+  */
+ #ifndef DESERIALIZED_H
+ #define DESERIALIZED_H
+
+ /* Size of an EXTERNAL datum that contains a pointer to a deserialized object */
+ #define DESERIALIZED_POINTER_SIZE (VARHDRSZ_EXTERNAL + sizeof(varatt_deserialized))
+
+ /*
+  * "Methods" that must be provided for any deserialized object.
+  *
+  * get_serialized_size: compute space needed for serialized representation
+  * (which, in general, must be a valid in-line, non-compressed varlena object).
+  *
+  * serialize_into: construct serialized representation in caller-allocated
+  * space at *result, of size allocated_size (which will always be the result
+  * of a preceding get_serialized_size call; it's passed for cross-checking).
+  *
+  * Note: construction of a heap tuple from a deserialized datum calls
+  * get_serialized_size twice, so it's worthwhile to make sure that doesn't
+  * incur too much overhead.
+  */
+ typedef Size (*DOM_get_serialized_size_method) (DeserializedObjectHeader *dohptr);
+ typedef void (*DOM_serialize_into_method) (DeserializedObjectHeader *dohptr,
+                                           void *result, Size allocated_size);
+
+ /* Struct of function pointers for a deserialized object's methods */
+ typedef struct DeserializedObjectMethods
+ {
+     DOM_get_serialized_size_method get_serialized_size;
+     DOM_serialize_into_method serialize_into;
+ } DeserializedObjectMethods;
+
+ /*
+  * Every deserialized object must contain this header; typically the header
+  * is embedded in some larger struct that adds type-specific fields.
+  *
+  * It is presumed that the header object and all subsidiary data are stored
+  * in doh_context, so that the object can be freed by deleting that context,
+  * or its storage lifespan can be altered by reparenting the context.
+  * (In principle the object could own additional resources, such as malloc'd
+  * storage, and use a memory context reset callback to free them upon reset or
+  * deletion of doh_context.)
+  *
+  * We consider a Datum pointing at the "primary" TOAST pointer to be a
+  * read/write reference.  Any other TOAST pointer is a read-only reference.
+  * For convenience, a "secondary" toast pointer is also allocated in the
+  * object header, but any copied pointer would also be considered read-only.
+  *
+  * The typedef declaration for this appears in postgres.h.
+  */
+ struct DeserializedObjectHeader
+ {
+     /* Phony varlena header */
+     int32        vl_len_;        /* always DOH_HEADER_MAGIC, see below */
+
+     /* Pointer to methods required for object type */
+     const DeserializedObjectMethods *doh_methods;
+
+     /* Memory context containing this header and subsidiary data */
+     MemoryContext doh_context;
+
+     /* "Primary" (R/W) TOAST pointer for this object is kept here */
+     char        doh_primary_ptr[DESERIALIZED_POINTER_SIZE];
+
+     /* "Secondary" (R/O) TOAST pointer for this object is kept here */
+     char        doh_secondary_ptr[DESERIALIZED_POINTER_SIZE];
+ };
+
+ /*
+  * Particularly for read-only functions, it is handy to be able to work with
+  * either regular "flat" varlena inputs or deserialized inputs of the same
+  * data type.  To allow determining which case an argument-fetching macro has
+  * returned, the first int32 of a DeserializedObjectHeader always contains -1
+  * (DOH_HEADER_MAGIC to the code).  This works since no 4-byte-header varlena
+  * could have that as its first 4 bytes.  Caution: we could not reliably tell
+  * the difference between a DeserializedObjectHeader and a short-header object
+  * with this trick.  However, it works fine for cases where the argument
+  * fetching code will return either a fully-uncompressed flat object or a
+  * deserialized object.
+  */
+ #define DOH_HEADER_MAGIC (-1)
+ #define VARATT_IS_DESERIALIZED_HEADER(PTR) \
+     (((DeserializedObjectHeader *) (PTR))->vl_len_ == DOH_HEADER_MAGIC)
+
+ /*
+  * Generic support functions for deserialized objects.
+  * (Some of these might be worth inlining later.)
+  */
+
+ extern DeserializedObjectHeader *DatumGetDOHP(Datum d);
+ extern void DOH_init_header(DeserializedObjectHeader *dohptr,
+                 const DeserializedObjectMethods *methods,
+                 MemoryContext obj_context);
+ extern Size DOH_get_serialized_size(DeserializedObjectHeader *dohptr);
+ extern void DOH_serialize_into(DeserializedObjectHeader *dohptr,
+                    void *result, Size allocated_size);
+ extern bool DatumIsReadWriteDeserializedObject(Datum d, bool isnull, int16 typlen);
+ extern Datum MakeDeserializedObjectReadOnly(Datum d, bool isnull, int16 typlen);
+ extern void TransferDeserializedObject(Datum d, MemoryContext new_parent);
+ extern void DeleteDeserializedObject(Datum d);
+
+ #endif   /* DESERIALIZED_H */
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index ae5421f..833fcf0 100644
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
***************
*** 32,37 ****
--- 32,38 ----
  #include "utils/array.h"
  #include "utils/builtins.h"
  #include "utils/datum.h"
+ #include "utils/deserialized.h"
  #include "utils/lsyscache.h"
  #include "utils/memutils.h"
  #include "utils/rel.h"
*************** static void exec_assign_value(PLpgSQL_ex
*** 171,176 ****
--- 172,178 ----
                    Datum value, Oid valtype, bool *isNull);
  static void exec_eval_datum(PLpgSQL_execstate *estate,
                  PLpgSQL_datum *datum,
+                 bool getrwpointer,
                  Oid *typeid,
                  int32 *typetypmod,
                  Datum *value,
*************** plpgsql_exec_function(PLpgSQL_function *
*** 468,473 ****
--- 470,482 ----
                  Size        len;
                  void       *tmp;

+                 /* temporary hack: reserialize if retval is deserialized */
+                 if (func->fn_rettyplen == -1 &&
+                     VARATT_IS_EXTERNAL_DESERIALIZED(DatumGetPointer(estate.retval)))
+                 {
+                     estate.retval = datumCopy(estate.retval, false, -1);
+                 }
+
                  len = datumGetSize(estate.retval, false, func->fn_rettyplen);
                  tmp = SPI_palloc(len);
                  memcpy(tmp, DatumGetPointer(estate.retval), len);
*************** exec_assign_value(PLpgSQL_execstate *est
*** 4057,4084 ****
                                      var->refname)));

                  /*
!                  * If type is by-reference, copy the new value (which is
!                  * probably in the eval_econtext) into the procedure's memory
!                  * context.
                   */
!                 if (!var->datatype->typbyval && !*isNull)
!                     newvalue = datumCopy(newvalue,
!                                          false,
!                                          var->datatype->typlen);

!                 /*
!                  * Now free the old value.  (We can't do this any earlier
!                  * because of the possibility that we are assigning the var's
!                  * old value to it, eg "foo := foo".  We could optimize out
!                  * the assignment altogether in such cases, but it's too
!                  * infrequent to be worth testing for.)
!                  */
!                 free_var(var);

!                 var->value = newvalue;
!                 var->isnull = *isNull;
!                 if (!var->datatype->typbyval && !*isNull)
!                     var->freeval = true;
                  break;
              }

--- 4066,4114 ----
                                      var->refname)));

                  /*
!                  * If we're assigning the variable's existing value back
!                  * again, there's nothing to do.  We must check this case to
!                  * avoid doing the wrong thing with deserialized objects.
                   */
!                 if (var->value == newvalue && !var->isnull && !*isNull)
!                      /* no work */ ;
!                 else
!                 {
!                     /*
!                      * If type is by-reference, copy the new value (which is
!                      * probably in the eval_econtext) into the procedure's
!                      * memory context.  But if it's a read/write reference to
!                      * a deserialized object, no physical copy needs to
!                      * happen; at most we need to reparent the object's memory
!                      * context.
!                      */
!                     if (!var->datatype->typbyval && !*isNull)
!                     {
!                         if (DatumIsReadWriteDeserializedObject(newvalue,
!                                                                false,
!                                                       var->datatype->typlen))
!                             TransferDeserializedObject(newvalue,
!                                                        CurrentMemoryContext);
!                         else
!                             newvalue = datumCopy(newvalue,
!                                                  false,
!                                                  var->datatype->typlen);
!                     }

!                     /*
!                      * Now free the old value.  (We don't do this any earlier
!                      * because of the possibility that we are assigning the
!                      * var's old value to it, eg "foo := foo".  This shouldn't
!                      * happen any more because of the preceding test, but
!                      * let's be safe.)
!                      */
!                     free_var(var);

!                     var->value = newvalue;
!                     var->isnull = *isNull;
!                     if (!var->datatype->typbyval && !*isNull)
!                         var->freeval = true;
!                 }
                  break;
              }

*************** exec_assign_value(PLpgSQL_execstate *est
*** 4276,4282 ****
                  } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM);

                  /* Fetch current value of array datum */
!                 exec_eval_datum(estate, target,
                                  &parenttypoid, &parenttypmod,
                                  &oldarraydatum, &oldarrayisnull);

--- 4306,4312 ----
                  } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM);

                  /* Fetch current value of array datum */
!                 exec_eval_datum(estate, target, true,
                                  &parenttypoid, &parenttypmod,
                                  &oldarraydatum, &oldarrayisnull);

*************** exec_assign_value(PLpgSQL_execstate *est
*** 4424,4439 ****
   *
   * The type oid, typmod, value in Datum format, and null flag are returned.
   *
   * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums.
   *
!  * NOTE: caller must not modify the returned value, since it points right
!  * at the stored value in the case of pass-by-reference datatypes.  In some
!  * cases we have to palloc a return value, and in such cases we put it into
!  * the estate's short-term memory context.
   */
  static void
  exec_eval_datum(PLpgSQL_execstate *estate,
                  PLpgSQL_datum *datum,
                  Oid *typeid,
                  int32 *typetypmod,
                  Datum *value,
--- 4454,4473 ----
   *
   * The type oid, typmod, value in Datum format, and null flag are returned.
   *
+  * If getrwpointer is TRUE, we'll return a R/W pointer to any variable that
+  * is a deserialized object; otherwise we return a R/O pointer.
+  *
   * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums.
   *
!  * NOTE: in most cases caller must not modify the returned value, since
!  * it points right at the stored value in the case of pass-by-reference
!  * datatypes.  In some cases we have to palloc a return value, and in such
!  * cases we put it into the estate's short-term memory context.
   */
  static void
  exec_eval_datum(PLpgSQL_execstate *estate,
                  PLpgSQL_datum *datum,
+                 bool getrwpointer,
                  Oid *typeid,
                  int32 *typetypmod,
                  Datum *value,
*************** exec_eval_datum(PLpgSQL_execstate *estat
*** 4449,4455 ****

                  *typeid = var->datatype->typoid;
                  *typetypmod = var->datatype->atttypmod;
!                 *value = var->value;
                  *isnull = var->isnull;
                  break;
              }
--- 4483,4494 ----

                  *typeid = var->datatype->typoid;
                  *typetypmod = var->datatype->atttypmod;
!                 if (getrwpointer)
!                     *value = var->value;
!                 else
!                     *value = MakeDeserializedObjectReadOnly(var->value,
!                                                             var->isnull,
!                                                       var->datatype->typlen);
                  *isnull = var->isnull;
                  break;
              }
*************** setup_param_list(PLpgSQL_execstate *esta
*** 5285,5291 ****
                  PLpgSQL_var *var = (PLpgSQL_var *) datum;
                  ParamExternData *prm = &paramLI->params[dno];

!                 prm->value = var->value;
                  prm->isnull = var->isnull;
                  prm->pflags = PARAM_FLAG_CONST;
                  prm->ptype = var->datatype->typoid;
--- 5324,5332 ----
                  PLpgSQL_var *var = (PLpgSQL_var *) datum;
                  ParamExternData *prm = &paramLI->params[dno];

!                 prm->value = MakeDeserializedObjectReadOnly(var->value,
!                                                             var->isnull,
!                                                       var->datatype->typlen);
                  prm->isnull = var->isnull;
                  prm->pflags = PARAM_FLAG_CONST;
                  prm->ptype = var->datatype->typoid;
*************** plpgsql_param_fetch(ParamListInfo params
*** 5351,5357 ****
      /* OK, evaluate the value and store into the appropriate paramlist slot */
      datum = estate->datums[dno];
      prm = &params->params[dno];
!     exec_eval_datum(estate, datum,
                      &prm->ptype, &prmtypmod,
                      &prm->value, &prm->isnull);
  }
--- 5392,5398 ----
      /* OK, evaluate the value and store into the appropriate paramlist slot */
      datum = estate->datums[dno];
      prm = &params->params[dno];
!     exec_eval_datum(estate, datum, false,
                      &prm->ptype, &prmtypmod,
                      &prm->value, &prm->isnull);
  }
*************** make_tuple_from_row(PLpgSQL_execstate *e
*** 5543,5549 ****
          if (row->varnos[i] < 0) /* should not happen */
              elog(ERROR, "dropped rowtype entry for non-dropped column");

!         exec_eval_datum(estate, estate->datums[row->varnos[i]],
                          &fieldtypeid, &fieldtypmod,
                          &dvalues[i], &nulls[i]);
          if (fieldtypeid != tupdesc->attrs[i]->atttypid)
--- 5584,5590 ----
          if (row->varnos[i] < 0) /* should not happen */
              elog(ERROR, "dropped rowtype entry for non-dropped column");

!         exec_eval_datum(estate, estate->datums[row->varnos[i]], false,
                          &fieldtypeid, &fieldtypmod,
                          &dvalues[i], &nulls[i]);
          if (fieldtypeid != tupdesc->attrs[i]->atttypid)
*************** free_var(PLpgSQL_var *var)
*** 6336,6342 ****
  {
      if (var->freeval)
      {
!         pfree(DatumGetPointer(var->value));
          var->freeval = false;
      }
  }
--- 6377,6388 ----
  {
      if (var->freeval)
      {
!         if (DatumIsReadWriteDeserializedObject(var->value,
!                                                var->isnull,
!                                                var->datatype->typlen))
!             DeleteDeserializedObject(var->value);
!         else
!             pfree(DatumGetPointer(var->value));
          var->freeval = false;
      }
  }
*************** format_expr_params(PLpgSQL_execstate *es
*** 6543,6550 ****

          curvar = (PLpgSQL_var *) estate->datums[dno];

!         exec_eval_datum(estate, (PLpgSQL_datum *) curvar, &paramtypeid,
!                         &paramtypmod, &paramdatum, &paramisnull);

          appendStringInfo(&paramstr, "%s%s = ",
                           paramno > 0 ? ", " : "",
--- 6589,6597 ----

          curvar = (PLpgSQL_var *) estate->datums[dno];

!         exec_eval_datum(estate, (PLpgSQL_datum *) curvar, false,
!                         &paramtypeid, &paramtypmod,
!                         &paramdatum, &paramisnull);

          appendStringInfo(&paramstr, "%s%s = ",
                           paramno > 0 ? ", " : "",
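
For readers who want to experiment with the primary/secondary pointer idea
outside the backend, here is a minimal standalone sketch.  It is not part of
the patch and makes several simplifying assumptions: every name in it
(ObjHeader, ObjMethods, IntArrayObj, datum_get_object, and so on) is
hypothetical, malloc() stands in for palloc(), memory contexts are left out
entirely, and the "pointer datum" is just one tag byte followed by an
unaligned object pointer rather than a real varattrib_1b_e.  It only models
three points from deserialized.h: a Datum that points at the primary pointer
is the read/write reference, any other pointer (the secondary one or a copy)
is read-only, and flattening goes through the two methods.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the PostgreSQL definitions. */
typedef struct ObjHeader ObjHeader;

typedef struct ObjMethods
{
    size_t (*get_serialized_size) (ObjHeader *obj);
    void   (*serialize_into) (ObjHeader *obj, void *result,
                              size_t allocated_size);
} ObjMethods;

/* Simplified pointer datum: one tag byte, then an unaligned pointer */
#define PTR_TAG  2
#define PTR_SIZE (1 + sizeof(ObjHeader *))

struct ObjHeader
{
    const ObjMethods *methods;
    char        primary_ptr[PTR_SIZE];     /* read/write reference */
    char        secondary_ptr[PTR_SIZE];   /* read-only reference */
};

/* Fill in both pointer datums so that they point back at the header */
static void
init_header(ObjHeader *obj, const ObjMethods *methods)
{
    obj->methods = methods;
    obj->primary_ptr[0] = PTR_TAG;
    memcpy(obj->primary_ptr + 1, &obj, sizeof(obj));
    obj->secondary_ptr[0] = PTR_TAG;
    memcpy(obj->secondary_ptr + 1, &obj, sizeof(obj));
}

/* Extract the object pointer; memcpy because it may be unaligned */
static ObjHeader *
datum_get_object(const char *datum)
{
    ObjHeader  *obj;

    assert(datum[0] == PTR_TAG);
    memcpy(&obj, datum + 1, sizeof(obj));
    return obj;
}

/* Read/write only if the datum IS the primary pointer, by address */
static bool
datum_is_read_write(const char *datum)
{
    return datum == datum_get_object(datum)->primary_ptr;
}

/* Swap the primary pointer for the secondary; leave anything else alone */
static const char *
make_read_only(const char *datum)
{
    ObjHeader  *obj = datum_get_object(datum);

    return (datum == obj->primary_ptr) ? obj->secondary_ptr : datum;
}

/* Toy object type: up to four ints; flat form is a count, then the items */
typedef struct IntArrayObj
{
    ObjHeader   hdr;            /* header must come first */
    int         nitems;
    int         items[4];
} IntArrayObj;

static size_t
intarray_size(ObjHeader *obj)
{
    return sizeof(int) * (size_t) (1 + ((IntArrayObj *) obj)->nitems);
}

static void
intarray_serialize(ObjHeader *obj, void *result, size_t allocated_size)
{
    IntArrayObj *a = (IntArrayObj *) obj;

    assert(allocated_size == intarray_size(obj));   /* cross-check */
    memcpy(result, &a->nitems, sizeof(int));
    memcpy((char *) result + sizeof(int), a->items,
           sizeof(int) * (size_t) a->nitems);
}

static const ObjMethods intarray_methods = {intarray_size, intarray_serialize};

/* Exercise the scheme; returns 0 if every check passes */
int
demo_rw_ro(void)
{
    IntArrayObj arr;
    char        copy[PTR_SIZE];
    const char *rw;
    const char *ro;
    size_t      size;
    int        *flat;
    int         rc = 0;

    init_header(&arr.hdr, &intarray_methods);
    arr.nitems = 3;
    arr.items[0] = 10;
    arr.items[1] = 20;
    arr.items[2] = 30;

    rw = arr.hdr.primary_ptr;
    ro = make_read_only(rw);
    memcpy(copy, rw, PTR_SIZE); /* a copied pointer is read-only too */

    if (!datum_is_read_write(rw) || datum_is_read_write(ro) ||
        datum_is_read_write(copy) || make_read_only(ro) != ro)
        return 1;

    /* Flatten through the methods, as the varlena branch of datumCopy does */
    size = arr.hdr.methods->get_serialized_size(datum_get_object(ro));
    flat = malloc(size);
    if (flat == NULL)
        return 2;
    arr.hdr.methods->serialize_into(&arr.hdr, flat, size);
    if (size != 4 * sizeof(int) || flat[0] != 3 || flat[3] != 30)
        rc = 3;
    free(flat);
    return rc;
}
```

The read/write test is purely an address comparison against the primary
pointer stored in the header, which is why copying the pointer bytes anywhere
else silently downgrades the reference to read-only.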
