Thread: Manipulating complex types as non-contiguous structures in-memory

I've been fooling around with a design to support computation-oriented, not-necessarily-contiguous-blobs representations of datatypes in memory, along the lines I mentioned here:
http://www.postgresql.org/message-id/2355.1382710707@sss.pgh.pa.us
In particular this is meant to reduce the overhead for repeated operations on arrays, records, etc. We've had several previous discussions about that, and even some single-purpose patches such as in this thread:
http://www.postgresql.org/message-id/flat/CAFj8pRAKuDU_0md-dg6Ftk0wSupvMLyrV1PB+HyC+GUBZz346w@mail.gmail.com
There was also a thread discussing how this sort of thing could be useful to PostGIS:
http://www.postgresql.org/message-id/526A61FB.1050209@oslandia.com
and it's been discussed a few other times too, but I'm too lazy to search the archives any further.

I've now taken this idea as far as building the required infrastructure and revamping a couple of array operators to use it. There's a lot yet to do, but I've done enough to get some preliminary ideas about performance (see below).

The core ideas of this patch are:

* Invent a new TOAST datum type "pointer to deserialized object", which is physically just like the existing indirect-toast-pointer concept, but it has a different va_tag code and somewhat different semantics.

* A deserialized object has a standard header (which is what the toast pointers point to) and typically will have additional data-type-specific fields after that. One component of the standard header is a pointer to a set of "method" functions that provide ways to accomplish standard data-type-independent operations on the deserialized object.

* Another standard header component is a MemoryContext identifier: the header, as well as all subsidiary data belonging to the deserialized object, must live in this context. (Well, I guess there could also be child contexts.)
By exposing an explicit context identifier, we can accomplish tasks like "move this object into another context" by reparenting the object's context rather than physically copying anything.

* The only standard "methods" I've found a need for so far are functions to re-serialize the object, that is generate a plain varlena value that is semantically equivalent. To avoid extra copying, this is split into separate "compute the space needed" and "serialize into this memory" steps, so that the result can be dropped exactly where the caller needs it.

* Currently, a deserialized object will be reserialized in that way whenever we incorporate it into a physical tuple (ie, heap_form_tuple or index_form_tuple), or whenever somebody applies datumCopy() to it. I'd like to relax this later, but there's an awful lot of code that supposes that heap_form_tuple or datumCopy will produce a self-contained value that survives beyond, eg, destruction of the memory context that contained the source Datums. We can get good speedups in a lot of interesting cases without solving that problem, so I don't feel too bad about leaving it as a future project.

* In particular, things like PG_GETARG_ARRAYTYPE_P() treat a deserialized toast pointer as something to be detoasted, and will produce a palloc'd re-serialized value. This means that we do not need to convert all the C functions concerned with a given datatype at the same time (or indeed ever); a function that hasn't been upgraded will build a re-serialized representation and then operate on that. We'll invent alternate argument-fetching functions that skip the reserialization step, for use by functions that have been upgraded to handle either case. This is basically the same approach we used when we introduced short varlena headers, and that seems to have gone smoothly enough.
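
To make the two-step re-serialization idea concrete, here is a minimal standalone sketch in plain C. It is not the patch's code: ObjHeader, ObjMethods, IntVec and flatten_at are hypothetical names invented for illustration, and the real header also carries a MemoryContext and the primary toast pointer. The only point is that a generic caller can ask for the size first and then have the object written directly at the spot it chose.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the standard header and its method table. */
typedef struct ObjHeader ObjHeader;

typedef struct ObjMethods
{
    size_t (*get_serialized_size) (ObjHeader *obj);
    void   (*serialize_into) (ObjHeader *obj, void *dest, size_t allocated);
} ObjMethods;

struct ObjHeader
{
    const ObjMethods *methods;  /* data-type-independent entry points */
};

/* A toy "deserialized" int vector: standard header first, then its own fields. */
typedef struct IntVec
{
    ObjHeader hdr;
    int       nelems;
    int      *elems;
} IntVec;

/* Step 1: report how many bytes the flat form needs (count + elements). */
static size_t
intvec_size(ObjHeader *obj)
{
    IntVec *v = (IntVec *) obj;

    return sizeof(int) + (size_t) v->nelems * sizeof(int);
}

/* Step 2: write the flat form into memory the caller already reserved. */
static void
intvec_serialize(ObjHeader *obj, void *dest, size_t allocated)
{
    IntVec *v = (IntVec *) obj;
    char   *p = dest;

    assert(allocated == intvec_size(obj));
    memcpy(p, &v->nelems, sizeof(int));
    if (v->nelems > 0)
        memcpy(p + sizeof(int), v->elems, (size_t) v->nelems * sizeof(int));
}

static const ObjMethods intvec_methods = {intvec_size, intvec_serialize};

/*
 * Generic caller that knows nothing about IntVec: it drops the flat form at
 * buf + offset (think "inside a tuple being built") and returns the new offset.
 */
static size_t
flatten_at(ObjHeader *obj, char *buf, size_t offset)
{
    size_t need = obj->methods->get_serialized_size(obj);

    obj->methods->serialize_into(obj, buf + offset, need);
    return offset + need;
}
```

A caller would build an IntVec whose hdr.methods points at intvec_methods and pass &vec.hdr to flatten_at; no intermediate palloc'd copy of the flat value is needed, which is the motivation the bullet above gives for splitting the operation in two.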

* There's a concept that a deserialized object has a "primary" toast pointer, which is physically part of the object, as well as "secondary" toast pointers which might or might not be part of the object. If you have a Datum pointer to the primary toast pointer then you are authorized to modify the object in-place; if you have a Datum pointer to a secondary toast pointer then you must treat it as read-only (ie, you have to make a copy if you're going to change it). Functions that construct a new deserialized object always return its primary toast pointer; this allows a nest of functions to modify an object in-place without copying, which was the primary need that the PostGIS folks expressed. On the other hand, plpgsql can hand out secondary toast pointers to deserialized objects stored in plpgsql function variables, thus ensuring that the objects won't be modified unexpectedly, while never having to physically copy them if the called functions just need to inspect them.

* Primary and secondary pointers are physically identical, but the primary pointer resides in a specific spot in the deserialized object's standard header. (So you can tell if you've got the primary pointer via a simple address comparison.)

* I've modified the array element assignment path in plpgsql's exec_assign_value so that, instead of passing a secondary toast pointer to array_set() as you might expect from the above, it passes the primary toast pointer thus allowing array_set() to modify the variable in-place. So an operation like "array_variable[x] := y" no longer incurs recopying of the whole array, once the variable has been converted into deserialized form. (If it's not yet, it becomes deserialized after the first such assignment.) Also, assignment of an already-deserialized value to a variable accomplishes that with a MemoryContext parent pointer swing instead of physical copying, if what we have is the primary toast pointer, which implies it's not referenced anywhere else.
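
The primary/secondary distinction can be sketched in a few lines of standalone C. This is an illustration under assumed names (ToyObject, ToyPointer, toy_set and friends are hypothetical), not the patch's actual layout; it shows that the read/write test is a plain address comparison, and that a function handed a secondary pointer copies before modifying.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ToyObject ToyObject;

/* Hypothetical pointer datum: a tag plus the address of the object. */
typedef struct ToyPointer
{
    unsigned char tag;
    ToyObject    *obj;
} ToyPointer;

struct ToyObject
{
    ToyPointer primary;     /* the one read/write pointer lives in the header */
    int        value;       /* stand-in for the type-specific contents */
};

static void
toy_init(ToyObject *o, int value)
{
    o->primary.tag = 1;
    o->primary.obj = o;
    o->value = value;
}

/* A secondary pointer is a physically identical copy stored somewhere else. */
static ToyPointer
toy_make_readonly(const ToyPointer *p)
{
    return *p;
}

/* Primary iff the pointer datum itself sits at the header's reserved spot. */
static bool
toy_is_read_write(const ToyPointer *p)
{
    return p == &p->obj->primary;
}

/*
 * Update through a pointer: modify in place when given the primary pointer,
 * otherwise copy into caller-supplied scratch storage and modify the copy.
 * Returns the primary pointer of whichever object now holds the new value.
 */
static ToyPointer *
toy_set(ToyPointer *p, int newvalue, ToyObject *scratch)
{
    ToyObject *target = p->obj;

    if (!toy_is_read_write(p))
    {
        toy_init(scratch, target->value);
        target = scratch;
    }
    target->value = newvalue;
    return &target->primary;
}
```

In this toy model, a caller that keeps the object in a variable and hands out toy_make_readonly() copies can be sure callees never change it, while a caller that passes &obj.primary gets the in-place update.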

* Any functions that plpgsql gives a read/write pointer to need to be exceedingly careful to not leave a corrupted object behind if they fail partway through. I've successfully written such a version of array_set(), and it wasn't too painful, but this may be a limitation on the general applicability of the whole approach.

* In the current patch, that restriction only applies to array_set() anyway. But I would like to allow in-place updates for non-core cases. For example in something like
	hstore_var := hstore_var || 'foo=>bar';
we could plausibly pass a R/W pointer to hstore_concat and let it modify hstore_var in place. But this would require knowing which such functions are safe, or assuming that they all are, which might be an onerous restriction.

* I soon noticed that I was getting a lot of empty "deserialized array" contexts floating around. The attached patch addresses this in a quick hack fashion by redefining ResetExprContext() to use MemoryContextResetAndDeleteChildren() instead of MemoryContextReset(), so that deserialized objects created within an expression evaluation context go completely away at ResetExprContext(), rather than being left behind as empty subcontext shells. We've talked more than once about redefining mcxt.c's API so that MemoryContextReset() means what's currently meant by MemoryContextResetAndDeleteChildren(), and if you really truly do want to keep empty child contexts around then you need to call something else instead. I did not go that far here, but I think we should seriously consider biting the bullet and finally changing it.

* Although I said above that everything owned by a deserialized object has to live in a single memory context, I do have ideas about relaxing that. The core idea would be to invent a "memory context reset/delete callback" feature in mcxt.c. Then a deserialized object could register such a callback on its own memory context, and use the callback to clean up resources outside its context.
This is potentially useful for instance for something like PostGIS, where an object likely includes some data that was allocated with malloc not palloc because it was created by library functions that aren't Postgres-aware. Another likely use-case is for deserialized objects representing composite types to maintain reference counts on their tuple descriptors instead of having to copy said descriptors into their private contexts. This'd be material for a separate patch though.

So that's the plan, and attached is a very-much-WIP patch that uses this approach to speed up plpgsql array element assignments (and not a whole lot else as yet). Here's the basic test case I've been using:

create or replace function arraysetint(n int) returns int[] as $$
declare res int[] := '{}';
begin
  for i in 1 .. n loop
    res[i] := i;
  end loop;
  return res;
end
$$ language plpgsql strict;

In HEAD, this function's runtime grows as O(N^2), so for example (with casserts off on my x86_64 workstation):

regression=# select array_dims(arraysetint(100000));
 array_dims
------------
 [1:100000]
(1 row)

Time: 7874.070 ms

With variable-length array elements, such as if you change the int[] arrays to numeric[], it's even worse:

regression=# select array_dims(arraysetnum(100000));
 array_dims
------------
 [1:100000]
(1 row)

Time: 31177.340 ms

With the attached patch, those timings drop to 80 and 150 ms respectively.

It's not all peaches and cream: for the array_append operator, which is also accelerated by the patch (mainly because it is too much in bed with array_set to not fix at the same time ;-)), I tried examples like

explain analyze select array[1,2] || g || g || g from generate_series(1,1000000) g;

Very roughly, HEAD needs about 400 ns per || operator in this scenario. With the patch, it's about 480 ns for the first operator and then 200 more for each one accepting a prior operator's output. (Those numbers could perhaps be improved with more-invasive refactoring of the array code.)
The extra initial overhead represents the time to convert the array[1,2] constant to deserialized form during each execution of the first operator.

Still, if the worst-case slowdown is around 20% on trivially-sized arrays, I'd gladly take that to have better performance on larger arrays. And I think this example is close to the worst case for the patch's approach, since it's testing small, fixed-element-length, no-nulls arrays, which is what the existing code can handle without spending a lot of cycles.

Note that I've kept all the deserialized-array-specific code in its own file for now, just for ease of hacking. That stuff would need to propagate into the main array-related files in a more complete patch.

BTW, I'm not all that thrilled with the "deserialized object" terminology. I found myself repeatedly tripping up on which form was serialized and which de-. If anyone's got a better naming idea I'm willing to adopt it.

I'm not sure exactly how to push this forward. I would not want to commit it without converting a significant number of array functions to understand about deserialized inputs, and by the time I've finished that work it's likely to be too late for 9.5. OTOH I'm sure that the PostGIS folk would love to have this infrastructure in 9.5 not 9.6 so they could make a start on fixing their issues. (Further down the pike, I'd plan to look at adapting composite-type operations, JSONB, etc, to make use of this approach, but that certainly isn't happening for 9.5.)

Thoughts, advice, better ideas?
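
For anyone who wants to see where the O(N^2) in the arraysetint test case earlier in this message comes from, here is a rough standalone model in plain C (not PostgreSQL code). It only counts element copies: one variant recopies the whole array on every element assignment, which is the behavior described for HEAD; the other updates in place and enlarges its buffer by doubling. The doubling policy and both function names are assumptions made for the sketch, not something taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Element copies performed when every assignment builds a fresh array. */
static long
copies_recopy_each_time(int n)
{
    int  *arr = NULL;
    long  copied = 0;

    for (int i = 0; i < n; i++)
    {
        int *fresh = malloc((size_t) (i + 1) * sizeof(int));

        if (fresh == NULL)
            abort();
        if (i > 0)
            memcpy(fresh, arr, (size_t) i * sizeof(int));
        copied += i;            /* the i old elements were copied */
        fresh[i] = i;
        free(arr);
        arr = fresh;
    }
    free(arr);
    return copied;
}

/* Element copies when assigning in place, doubling capacity when full. */
static long
copies_in_place(int n)
{
    int  *arr = NULL;
    int   cap = 0;
    long  copied = 0;

    for (int i = 0; i < n; i++)
    {
        if (i == cap)
        {
            int  newcap = cap ? cap * 2 : 8;
            int *bigger = malloc((size_t) newcap * sizeof(int));

            if (bigger == NULL)
                abort();
            if (i > 0)
                memcpy(bigger, arr, (size_t) i * sizeof(int));
            copied += i;        /* copies happen only on enlargement */
            free(arr);
            arr = bigger;
            cap = newcap;
        }
        arr[i] = i;
    }
    free(arr);
    return copied;
}
```

The first function's count grows as n*(n-1)/2 while the second stays linear in n, which is the same shape as the gap between the HEAD and patched timings quoted above.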

			regards, tom lane

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 867035d..e5fcced 100644
*** a/src/backend/access/common/heaptuple.c
--- b/src/backend/access/common/heaptuple.c
***************
*** 60,65 ****
--- 60,66 ----
  #include "access/sysattr.h"
  #include "access/tuptoaster.h"
  #include "executor/tuptable.h"
+ #include "utils/deserialized.h"


  /* Does att's datatype allow packing into the 1-byte-header varlena format? */
*************** heap_compute_data_size(TupleDesc tupleDe
*** 93,105 ****
      for (i = 0; i < numberOfAttributes; i++)
      {
          Datum val;

          if (isnull[i])
              continue;

          val = values[i];

!         if (ATT_IS_PACKABLE(att[i]) &&
              VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
          {
              /*
--- 94,108 ----
      for (i = 0; i < numberOfAttributes; i++)
      {
          Datum val;
+         Form_pg_attribute atti;

          if (isnull[i])
              continue;

          val = values[i];
+         atti = att[i];

!         if (ATT_IS_PACKABLE(atti) &&
              VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
          {
              /*
*************** heap_compute_data_size(TupleDesc tupleDe
*** 108,118 ****
               */
              data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val));
          }
          else
          {
!             data_length = att_align_datum(data_length, att[i]->attalign,
!                                           att[i]->attlen, val);
!             data_length = att_addlength_datum(data_length, att[i]->attlen,
                                                val);
          }
      }
--- 111,131 ----
               */
              data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val));
          }
+         else if (atti->attlen == -1 &&
+                  VARATT_IS_EXTERNAL_DESERIALIZED(DatumGetPointer(val)))
+         {
+             /*
+              * we want to re-serialize the deserialized value so that the
+              * constructed tuple doesn't depend on it
+              */
+             data_length = att_align_nominal(data_length, atti->attalign);
+             data_length += DOH_get_serialized_size(DatumGetDOHP(val));
+         }
          else
          {
!             data_length = att_align_datum(data_length, atti->attalign,
!                                           atti->attlen, val);
!             data_length = att_addlength_datum(data_length, atti->attlen,
                                                val);
          }
      }
*************** heap_fill_tuple(TupleDesc tupleDesc,
*** 203,212 ****
              *infomask |= HEAP_HASVARWIDTH;
              if (VARATT_IS_EXTERNAL(val))
              {
!                 *infomask |= HEAP_HASEXTERNAL;
!                 /* no alignment, since it's short by definition */
!                 data_length = VARSIZE_EXTERNAL(val);
!                 memcpy(data, val, data_length);
              }
              else if (VARATT_IS_SHORT(val))
              {
--- 216,241 ----
              *infomask |= HEAP_HASVARWIDTH;
              if (VARATT_IS_EXTERNAL(val))
              {
!                 if (VARATT_IS_EXTERNAL_DESERIALIZED(val))
!                 {
!                     /*
!                      * we want to re-serialize the deserialized value so that
!                      * the constructed tuple doesn't depend on it
!                      */
!                     DeserializedObjectHeader *doh = DatumGetDOHP(values[i]);
!
!                     data = (char *) att_align_nominal(data,
!                                                       att[i]->attalign);
!                     data_length = DOH_get_serialized_size(doh);
!                     DOH_serialize_into(doh, data, data_length);
!                 }
!                 else
!                 {
!                     *infomask |= HEAP_HASEXTERNAL;
!                     /* no alignment, since it's short by definition */
!                     data_length = VARSIZE_EXTERNAL(val);
!                     memcpy(data, val, data_length);
!                 }
              }
              else if (VARATT_IS_SHORT(val))
              {
diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index f8c1401..0c9dd8e 100644
*** a/src/backend/access/heap/tuptoaster.c
--- b/src/backend/access/heap/tuptoaster.c
***************
*** 37,42 ****
--- 37,43 ----
  #include "catalog/catalog.h"
  #include "common/pg_lzcompress.h"
  #include "miscadmin.h"
+ #include "utils/deserialized.h"
  #include "utils/fmgroids.h"
  #include "utils/rel.h"
  #include "utils/typcache.h"
*************** heap_tuple_fetch_attr(struct varlena * a
*** 130,135 ****
--- 131,149 ----
          result = (struct varlena *) palloc(VARSIZE_ANY(attr));
          memcpy(result, attr, VARSIZE_ANY(attr));
      }
+     else if (VARATT_IS_EXTERNAL_DESERIALIZED(attr))
+     {
+         /*
+          * This is a deserialized-object pointer --- get serialized format
+          */
+         DeserializedObjectHeader *doh;
+         Size resultsize;
+
+         doh = DatumGetDOHP(PointerGetDatum(attr));
+         resultsize = DOH_get_serialized_size(doh);
+         result = (struct varlena *) palloc(resultsize);
+         DOH_serialize_into(doh, (void *) result, resultsize);
+     }
      else
      {
          /*
*************** heap_tuple_untoast_attr(struct varlena *
*** 196,201 ****
--- 210,224 ----
              attr = result;
          }
      }
+     else if (VARATT_IS_EXTERNAL_DESERIALIZED(attr))
+     {
+         /*
+          * This is a deserialized-object pointer --- get serialized format
+          */
+         attr = heap_tuple_fetch_attr(attr);
+         /* deserializers are not allowed to produce compressed/short output */
+         Assert(!VARATT_IS_EXTENDED(attr));
+     }
      else if (VARATT_IS_COMPRESSED(attr))
      {
          /*
*************** heap_tuple_untoast_attr_slice(struct var
*** 263,268 ****
--- 286,296 ----
          return heap_tuple_untoast_attr_slice(redirect.pointer,
                                               sliceoffset, slicelength);
      }
+     else if (VARATT_IS_EXTERNAL_DESERIALIZED(attr))
+     {
+         /* pass it off to heap_tuple_fetch_attr to deserialize */
+         preslice = heap_tuple_fetch_attr(attr);
+     }
      else
          preslice = attr;

*************** toast_raw_datum_size(Datum value)
*** 344,349 ****
--- 372,381 ----
          return toast_raw_datum_size(PointerGetDatum(toast_pointer.pointer));
      }
+     else if (VARATT_IS_EXTERNAL_DESERIALIZED(attr))
+     {
+         result = DOH_get_serialized_size(DatumGetDOHP(value));
+     }
      else if (VARATT_IS_COMPRESSED(attr))
      {
          /* here, va_rawsize is just the payload size */
*************** toast_datum_size(Datum value)
*** 400,405 ****
--- 432,441 ----
          return toast_datum_size(PointerGetDatum(toast_pointer.pointer));
      }
+     else if (VARATT_IS_EXTERNAL_DESERIALIZED(attr))
+     {
+         result = DOH_get_serialized_size(DatumGetDOHP(value));
+     }
      else if (VARATT_IS_SHORT(attr))
      {
          result = VARSIZE_SHORT(attr);
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 753754d..6c5f5dd 100644
*** a/src/backend/executor/execTuples.c
--- b/src/backend/executor/execTuples.c
***************
*** 88,93 ****
--- 88,94 ----
  #include "nodes/nodeFuncs.h"
  #include "storage/bufmgr.h"
  #include "utils/builtins.h"
+ #include "utils/deserialized.h"
  #include "utils/lsyscache.h"
  #include "utils/typcache.h"

*************** ExecCopySlot(TupleTableSlot *dstslot, Tu
*** 812,817 ****
--- 813,864 ----
      return ExecStoreTuple(newTuple, dstslot, InvalidBuffer, true);
  }

+ /* --------------------------------
+  *		ExecMakeSlotContentsReadOnly
+  *			Mark any R/W deserialized datums in the slot as read-only.
+  *
+  * This is needed when a slot that might contain R/W datum references is to be
+  * used as input for general expression evaluation. Since the expression(s)
+  * might contain more than one Var referencing the same R/W datum, we could
+  * get wrong answers if functions acting on those Vars thought they could
+  * modify the deserialized value.
+  *
+  * For notational reasons, we return the same slot passed in.
+  * --------------------------------
+  */
+ TupleTableSlot *
+ ExecMakeSlotContentsReadOnly(TupleTableSlot *slot)
+ {
+     /*
+      * sanity checks
+      */
+     Assert(slot != NULL);
+     Assert(slot->tts_tupleDescriptor != NULL);
+     Assert(!slot->tts_isempty);
+
+     /*
+      * If the slot contains a physical tuple, it can't contain any
+      * deserialized datums, because we flatten those whenever making a
+      * physical tuple. This might change later; but for now, we need do
+      * nothing unless the slot is virtual.
+      */
+     if (slot->tts_tuple == NULL)
+     {
+         Form_pg_attribute *att = slot->tts_tupleDescriptor->attrs;
+         int attnum;
+
+         for (attnum = 0; attnum < slot->tts_nvalid; attnum++)
+         {
+             slot->tts_values[attnum] =
+                 MakeDeserializedObjectReadOnly(slot->tts_values[attnum],
+                                                slot->tts_isnull[attnum],
+                                                att[attnum]->attlen);
+         }
+     }
+
+     return slot;
+ }
+
  /* ----------------------------------------------------------------
   *		convenience initialization routines
diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c
index 3f66e24..e5d1e54 100644
*** a/src/backend/executor/nodeSubqueryscan.c
--- b/src/backend/executor/nodeSubqueryscan.c
*************** SubqueryNext(SubqueryScanState *node)
*** 56,62 ****
--- 56,70 ----
       * We just return the subplan's result slot, rather than expending extra
       * cycles for ExecCopySlot(). (Our own ScanTupleSlot is used only for
       * EvalPlanQual rechecks.)
+      *
+      * We do need to mark the slot contents read-only to prevent interference
+      * between different functions reading the same datum from the slot. It's
+      * a bit hokey to do this to the subplan's slot, but should be safe
+      * enough.
       */
+     if (!TupIsNull(slot))
+         slot = ExecMakeSlotContentsReadOnly(slot);
+
      return slot;
  }

diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 20e5ff1..8f2d319 100644
*** a/src/backend/utils/adt/Makefile
--- b/src/backend/utils/adt/Makefile
*************** endif
*** 16,24 ****
  endif

  # keep this list arranged alphabetically or it gets to be a mess
! OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \
! 	array_userfuncs.o arrayutils.o ascii.o bool.o \
! 	cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \
  	encode.o enum.o float.o format_type.o formatting.o genfile.o \
  	geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \
  	int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \
--- 16,25 ----
  endif

  # keep this list arranged alphabetically or it gets to be a mess
! OBJS = acl.o arrayfuncs.o array_deserialized.o array_selfuncs.o \
! 	array_typanalyze.o array_userfuncs.o arrayutils.o \
! 	ascii.o bool.o cash.o char.o \
! 	date.o datetime.o datum.o dbsize.o deserialized.o domains.o \
  	encode.o enum.o float.o format_type.o formatting.o genfile.o \
  	geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \
  	int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \
diff --git a/src/backend/utils/adt/array_deserialized.c b/src/backend/utils/adt/array_deserialized.c
index ...092a5b1 .
*** a/src/backend/utils/adt/array_deserialized.c
--- b/src/backend/utils/adt/array_deserialized.c
***************
*** 0 ****
--- 1,936 ----
+ /*-------------------------------------------------------------------------
+  *
+  * array_deserialized.c
+  *	  Functions for manipulating deserialized arrays.
+  *
+  * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *
+  * IDENTIFICATION
+  *	  src/backend/utils/adt/array_deserialized.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ #include "postgres.h"
+
+ #include "access/tupmacs.h"
+ #include "utils/array.h"
+ #include "utils/builtins.h"
+ #include "utils/datum.h"
+ #include "utils/deserialized.h"
+ #include "utils/lsyscache.h"
+ #include "utils/memutils.h"
+
+
+ /*
+  * A deserialized array is contained within a private memory context (as
+  * all deserialized objects must be) and has a control structure as below.
+  *
+  * The deserialized array might contain a regular serialized array if that was
+  * the original input and we've not modified it significantly. Otherwise, the
+  * contents are represented by Datum/isnull arrays plus dimensionality and
+  * type information. We could also have both forms, if we've deconstructed
+  * the original array for access purposes but not yet changed it. For pass
+  * by reference element types, the Datums would point into the serialized
+  * array in this situation. Once we start modifying array elements, new
+  * pass-by-ref elements are separately palloc'd within the memory context.
+  */
+ #define DA_MAGIC 689375833 /* ID for debugging crosschecks */
+
+ typedef struct DeserializedArrayHeader
+ {
+     /* Standard header for deserialized objects */
+     DeserializedObjectHeader hdr;
+
+     /* Magic value identifying a deserialized array (for debugging only) */
+     int da_magic;
+
+     /* Dimensionality info (always valid) */
+     int ndims; /* # of dimensions */
+     int *dims; /* array dimensions */
+     int *lbound; /* index lower bounds for each dimension */
+
+     /* Element type info (always valid) */
+     Oid element_type; /* element type OID */
+     int16 typlen; /* needed info about element datatype */
+     bool typbyval;
+     char typalign;
+
+     /*
+      * If we have a Datum-array representation of the array, it's kept here;
+      * else dvalues/dnulls are NULL. The dvalues and dnulls arrays are always
+      * palloc'd within the object private context, but may change size from
+      * time to time. For pass-by-ref element types, dvalues entries might
+      * point either into the sstartptr..sendptr area, or to separately
+      * palloc'd chunks. Elements should always be fully detoasted, as they
+      * are in the standard serialized representation.
+      *
+      * Even when dvalues is valid, dnulls can be NULL if there are no null
+      * elements.
+      */
+     Datum *dvalues; /* array of Datums */
+     bool *dnulls; /* array of is-null flags for Datums */
+     int dvalueslen; /* allocated length of above arrays */
+     int nelems; /* number of valid entries in above arrays */
+
+     /*
+      * serialized_size is the current space requirement for the serialized
+      * equivalent of the deserialized array, if known; otherwise it's 0. We
+      * store this to make consecutive calls of get_serialized_size cheap.
+      */
+     Size serialized_size;
+
+     /*
+      * svalue points to the serialized representation if it is valid, else it
+      * is NULL. If we have or ever had a serialized representation then
+      * sstartptr/sendptr point to the start and end+1 of its data area; this
+      * is so that we can tell which Datum pointers point into the serialized
+      * representation rather than being pointers to separately palloc'd data.
+      */
+     ArrayType *svalue; /* must be a fully detoasted array */
+     char *sstartptr; /* start of its data area */
+     char *sendptr; /* end+1 of its data area */
+ } DeserializedArrayHeader;
+
+ /* "Methods" required for a deserialized object */
+ static Size DA_get_serialized_size(DeserializedObjectHeader *dohptr);
+ static void DA_serialize_into(DeserializedObjectHeader *dohptr,
+                               void *result, Size allocated_size);
+
+ static const DeserializedObjectMethods DA_methods =
+ {
+     DA_get_serialized_size,
+     DA_serialize_into
+ };
+
+ /*
+  * Functions that can handle either a "flat" varlena array or a deserialized
+  * array use this union to work with their input.
+  */
+ typedef union AnyArrayType
+ {
+     ArrayType arr;
+     DeserializedArrayHeader des;
+ } AnyArrayType;
+
+ /*
+  * Macros for working with AnyArrayType inputs. Beware multiple references!
+  */
+ #define AARR_NDIM(a) \
+     (VARATT_IS_DESERIALIZED_HEADER(a) ? (a)->des.ndims : ARR_NDIM(&(a)->arr))
+ #define AARR_HASNULL(a) \
+     (VARATT_IS_DESERIALIZED_HEADER(a) ? \
+      ((a)->des.dvalues != NULL ? (a)->des.dnulls != NULL : ARR_HASNULL((a)->des.svalue)) : \
+      ARR_HASNULL(&(a)->arr))
+ #define AARR_ELEMTYPE(a) \
+     (VARATT_IS_DESERIALIZED_HEADER(a) ? (a)->des.element_type : ARR_ELEMTYPE(&(a)->arr))
+ #define AARR_DIMS(a) \
+     (VARATT_IS_DESERIALIZED_HEADER(a) ? (a)->des.dims : ARR_DIMS(&(a)->arr))
+ #define AARR_LBOUND(a) \
+     (VARATT_IS_DESERIALIZED_HEADER(a) ? (a)->des.lbound : ARR_LBOUND(&(a)->arr))
+
+
+ /*
+  * deserialize_array: convert an array Datum into a deserialized array
+  *
+  * The deserialized object will be a child of parentcontext.
+  *
+  * Caller can provide element type's representational data; we do that because
+  * caller is often in a position to cache it across repeated calls. If the
+  * caller can't do that, pass zeroes for elmlen/elmbyval/elmalign.
+  */
+ Datum
+ deserialize_array(Datum arraydatum, MemoryContext parentcontext,
+                   int elmlen, bool elmbyval, char elmalign)
+ {
+     ArrayType *array;
+     DeserializedArrayHeader *dah;
+     MemoryContext objcxt;
+     MemoryContext oldcxt;
+
+     /* allocate private context for deserialized object */
+     objcxt = AllocSetContextCreate(parentcontext,
+                                    "deserialized array",
+                                    ALLOCSET_DEFAULT_MINSIZE,
+                                    ALLOCSET_DEFAULT_INITSIZE,
+                                    ALLOCSET_DEFAULT_MAXSIZE);
+
+     /*
+      * Detoast and copy original array into private context. Note that this
+      * coding risks leaking some memory in the private context if we have to
+      * fetch data back from a TOAST table; however, experimentation says that
+      * the leak is minimal. Doing it this way saves a copy step, which seems
+      * worthwhile, especially if the array is large enough to need toasting.
+      */
+     oldcxt = MemoryContextSwitchTo(objcxt);
+     array = DatumGetArrayTypePCopy(arraydatum);
+     MemoryContextSwitchTo(oldcxt);
+
+     /* set up deserialized array header */
+     dah = (DeserializedArrayHeader *)
+         MemoryContextAlloc(objcxt, sizeof(DeserializedArrayHeader));
+
+     DOH_init_header(&dah->hdr, &DA_methods, objcxt);
+     dah->da_magic = DA_MAGIC;
+
+     dah->ndims = ARR_NDIM(array);
+     /* note these pointers point into the svalue header! */
+     dah->dims = ARR_DIMS(array);
+     dah->lbound = ARR_LBOUND(array);
+
+     /* save array's element-type data for possible use later */
+     dah->element_type = ARR_ELEMTYPE(array);
+     if (elmlen)
+     {
+         /* Caller provided representational data */
+         dah->typlen = elmlen;
+         dah->typbyval = elmbyval;
+         dah->typalign = elmalign;
+     }
+     else
+     {
+         /* No, so look it up */
+         get_typlenbyvalalign(dah->element_type,
+                              &dah->typlen,
+                              &dah->typbyval,
+                              &dah->typalign);
+     }
+
+     /* we don't make a deconstructed representation now */
+     dah->dvalues = NULL;
+     dah->dnulls = NULL;
+     dah->dvalueslen = 0;
+     dah->nelems = 0;
+     dah->serialized_size = 0;
+
+     /* remember we have a serialized representation */
+     dah->svalue = array;
+     dah->sstartptr = ARR_DATA_PTR(array);
+     dah->sendptr = ((char *) array) + ARR_SIZE(array);
+
+     /* return a R/W pointer to the deserialized array */
+     return PointerGetDatum(dah->hdr.doh_primary_ptr);
+ }
+
+ /*
+  * construct_empty_deserialized_array: make an empty deserialized array
+  * given only type information. (elmlen etc can be zeroes.)
+  */
+ static DeserializedArrayHeader *
+ construct_empty_deserialized_array(Oid element_type,
+                                    MemoryContext parentcontext,
+                                    int elmlen, bool elmbyval, char elmalign)
+ {
+     ArrayType *array = construct_empty_array(element_type);
+     Datum d;
+
+     d = deserialize_array(PointerGetDatum(array), parentcontext,
+                           elmlen, elmbyval, elmalign);
+     return (DeserializedArrayHeader *) DatumGetDOHP(d);
+ }
+
+
+ /*
+  * get_serialized_size method for deserialized arrays
+  */
+ static Size
+ DA_get_serialized_size(DeserializedObjectHeader *dohptr)
+ {
+     DeserializedArrayHeader *dah = (DeserializedArrayHeader *) dohptr;
+     int nelems;
+     int ndims;
+     Datum *dvalues;
+     bool *dnulls;
+     Size nbytes;
+     int i;
+
+     Assert(dah->da_magic == DA_MAGIC);
+
+     /* Easy if we have a valid serialized value */
+     if (dah->svalue)
+         return ARR_SIZE(dah->svalue);
+
+     /* If we have a cached size value, believe that */
+     if (dah->serialized_size)
+         return dah->serialized_size;
+
+     /*
+      * Compute space needed by examining dvalues/dnulls. Note that the result
+      * array will have a nulls bitmap if dnulls isn't NULL, even if the array
+      * doesn't actually contain any nulls now.
+      */
+     nelems = dah->nelems;
+     ndims = dah->ndims;
+     Assert(nelems == ArrayGetNItems(ndims, dah->dims));
+     dvalues = dah->dvalues;
+     dnulls = dah->dnulls;
+     nbytes = 0;
+     for (i = 0; i < nelems; i++)
+     {
+         if (dnulls && dnulls[i])
+             continue;
+         nbytes = att_addlength_datum(nbytes, dah->typlen, dvalues[i]);
+         nbytes = att_align_nominal(nbytes, dah->typalign);
+         /* check for overflow of total request */
+         if (!AllocSizeIsValid(nbytes))
+             ereport(ERROR,
+                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+                      errmsg("array size exceeds the maximum allowed (%d)",
+                             (int) MaxAllocSize)));
+     }
+
+     if (dnulls)
+         nbytes += ARR_OVERHEAD_WITHNULLS(ndims, nelems);
+     else
+         nbytes += ARR_OVERHEAD_NONULLS(ndims);
+
+     /* cache for next time */
+     dah->serialized_size = nbytes;
+
+     return nbytes;
+ }
+
+ /*
+  * serialize_into method for deserialized arrays
+  */
+ static void
+ DA_serialize_into(DeserializedObjectHeader *dohptr,
+                   void *result, Size allocated_size)
+ {
+     DeserializedArrayHeader *dah = (DeserializedArrayHeader *) dohptr;
+     ArrayType *aresult = (ArrayType *) result;
+     int nelems;
+     int ndims;
+     int32 dataoffset;
+
+     Assert(dah->da_magic == DA_MAGIC);
+
+     /* Easy if we have a valid serialized value */
+     if (dah->svalue)
+     {
+         Assert(allocated_size == ARR_SIZE(dah->svalue));
+         memcpy(result, dah->svalue, allocated_size);
+         return;
+     }
+
+     /* Else allocation should match previous get_serialized_size result */
+     Assert(allocated_size == dah->serialized_size);
+
+     /* Fill result array from dvalues/dnulls */
+     nelems = dah->nelems;
+     ndims = dah->ndims;
+
+     if (dah->dnulls)
+         dataoffset = ARR_OVERHEAD_WITHNULLS(ndims, nelems);
+     else
+         dataoffset = 0; /* marker for no null bitmap */
+
+     /* We must ensure that any pad space is zero-filled */
+     memset(aresult, 0, allocated_size);
+
+     SET_VARSIZE(aresult, allocated_size);
+     aresult->ndim = ndims;
+     aresult->dataoffset = dataoffset;
+     aresult->elemtype = dah->element_type;
+     memcpy(ARR_DIMS(aresult), dah->dims, ndims * sizeof(int));
+     memcpy(ARR_LBOUND(aresult), dah->lbound, ndims * sizeof(int));
+
+     CopyArrayEls(aresult,
+                  dah->dvalues, dah->dnulls, nelems,
+                  dah->typlen, dah->typbyval, dah->typalign,
+                  false);
+ }
+
+ /*
+  * Argument fetching support code
+  */
+
+ #ifdef NOT_YET_USED
+
+ /*
+  * DatumGetDeserializedArray: get a writable deserialized array from an input
+  */
+ static DeserializedArrayHeader *
+ DatumGetDeserializedArray(Datum d)
+ {
+     DeserializedArrayHeader *dah;
+
+     /* If it's a writable deserialized array already, just return it */
+     if (DatumIsReadWriteDeserializedObject(d, false, -1))
+     {
+         dah = (DeserializedArrayHeader *) DatumGetDOHP(d);
+         Assert(dah->da_magic == DA_MAGIC);
+         return dah;
+     }
+
+     /*
+      * If it's a non-writable deserialized array, copy it, extracting the
+      * element representational data to save a catalog lookup.
+      */
+     if (VARATT_IS_EXTERNAL_DESERIALIZED(DatumGetPointer(d)))
+     {
+         dah = (DeserializedArrayHeader *) DatumGetDOHP(d);
+         Assert(dah->da_magic == DA_MAGIC);
+         d = deserialize_array(d, CurrentMemoryContext,
+                               dah->typlen, dah->typbyval, dah->typalign);
+         return (DeserializedArrayHeader *) DatumGetDOHP(d);
+     }
+
+     /* Else deserialize the hard way */
+     d = deserialize_array(d, CurrentMemoryContext, 0, 0, 0);
+     return (DeserializedArrayHeader *) DatumGetDOHP(d);
+ }
+
+ #define PG_GETARG_DESERIALIZED_ARRAY(n) DatumGetDeserializedArray(PG_GETARG_DATUM(n))
+
+ #endif
+
+ /*
+  * As above, when caller has the ability to cache element type info
+  */
+ static DeserializedArrayHeader *
+ DatumGetDeserializedArrayX(Datum d,
+                            int elmlen, bool elmbyval, char elmalign)
+ {
+     DeserializedArrayHeader *dah;
+
+     /* If it's a writable deserialized array already, just return it */
+     if (DatumIsReadWriteDeserializedObject(d, false, -1))
+     {
+         dah = (DeserializedArrayHeader *) DatumGetDOHP(d);
+         Assert(dah->da_magic == DA_MAGIC);
+         Assert(dah->typlen == elmlen);
+         Assert(dah->typbyval == elmbyval);
+         Assert(dah->typalign == elmalign);
+         return dah;
+     }
+
+     /* Else deserialize using caller's data */
+     d = deserialize_array(d, CurrentMemoryContext, elmlen, elmbyval, elmalign);
+     return (DeserializedArrayHeader *) DatumGetDOHP(d);
+ }
+
+ #define PG_GETARG_DESERIALIZED_ARRAYX(n, elmlen, elmbyval, elmalign) \
+     DatumGetDeserializedArrayX(PG_GETARG_DATUM(n), elmlen, elmbyval, elmalign)
+
+ /*
+  * DatumGetAnyArray: return either a deserialized array or a detoasted varlena
+  * array. The result must not be modified in-place.
+  */
+ static AnyArrayType *
+ DatumGetAnyArray(Datum d)
+ {
+     DeserializedArrayHeader *dah;
+
+     /*
+      * If it's a deserialized array, return the header pointer.
+      */
+     if (VARATT_IS_EXTERNAL_DESERIALIZED(DatumGetPointer(d)))
+     {
+         dah = (DeserializedArrayHeader *) DatumGetDOHP(d);
+         Assert(dah->da_magic == DA_MAGIC);
+         return (AnyArrayType *) dah;
+     }
+
+     /* Else do regular detoasting as needed */
+     return (AnyArrayType *) PG_DETOAST_DATUM(d);
+ }
+
+ #define PG_GETARG_ANY_ARRAY(n) DatumGetAnyArray(PG_GETARG_DATUM(n))
+
+
+ /*
+  * Equivalent of array_set() for a deserialized array
+  *
+  * array_set took care of detoasting dataValue, the rest is up to us
+  *
+  * Note: as with any operation on a read/write deserialized object, we must
+  * take pains not to leave the object in a corrupt state if we fail partway
+  * through.
+ */ + Datum + array_set_deserialized(Datum arraydatum, + int nSubscripts, int *indx, + Datum dataValue, bool isNull, + int arraytyplen, + int elmlen, bool elmbyval, char elmalign) + { + DeserializedArrayHeader *dah; + Datum *dvalues; + bool *dnulls; + int i, + ndim, + dim[MAXDIM], + lb[MAXDIM], + offset; + bool dimschanged, + newhasnulls; + int addedbefore, + addedafter; + char *oldValue = NULL; + + dah = (DeserializedArrayHeader *) DatumGetDOHP(arraydatum); + Assert(dah->da_magic == DA_MAGIC); + + /* if this fails, we shouldn't be modifying this array in-place */ + Assert(DatumGetPointer(arraydatum) == (Pointer) dah->hdr.doh_primary_ptr); + + /* sanity-check caller's state; we don't use the passed data otherwise */ + Assert(arraytyplen == -1); + Assert(elmlen == dah->typlen); + Assert(elmbyval == dah->typbyval); + Assert(elmalign == dah->typalign); + + /* + * Copy dimension info into local storage. This allows us to modify the + * dimensions if needed, while not messing up the deserialized value if we + * fail partway through. + */ + ndim = dah->ndims; + Assert(ndim >= 0 && ndim <= MAXDIM); + memcpy(dim, dah->dims, ndim * sizeof(int)); + memcpy(lb, dah->lbound, ndim * sizeof(int)); + dimschanged = false; + + /* + * if number of dims is zero, i.e. an empty array, create an array with + * nSubscripts dimensions, and set the lower bounds to the supplied + * subscripts. + */ + if (ndim == 0) + { + /* + * Allocate adequate space for new dimension info. This is harmless + * if we fail later. 
+ */ + Assert(nSubscripts > 0 && nSubscripts <= MAXDIM); + dah->dims = (int *) MemoryContextAllocZero(dah->hdr.doh_context, + nSubscripts * sizeof(int)); + dah->lbound = (int *) MemoryContextAllocZero(dah->hdr.doh_context, + nSubscripts * sizeof(int)); + + /* Update local copies of dimension info */ + ndim = nSubscripts; + for (i = 0; i < nSubscripts; i++) + { + dim[i] = 0; + lb[i] = indx[i]; + } + dimschanged = true; + } + else if (ndim != nSubscripts) + ereport(ERROR, + (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR), + errmsg("wrong number of array subscripts"))); + + /* + * Deconstruct array if we didn't already. (Someday maybe add a special + * case path for fixed-length, no-nulls cases, where we can overwrite an + * element in place without ever deconstructing. But today is not that + * day.) + */ + if (dah->dvalues == NULL) + { + MemoryContext oldcxt = MemoryContextSwitchTo(dah->hdr.doh_context); + int nelems; + + dnulls = NULL; + deconstruct_array(dah->svalue, + dah->element_type, + dah->typlen, dah->typbyval, dah->typalign, + &dvalues, + ARR_HASNULL(dah->svalue) ? &dnulls : NULL, + &nelems); + + /* + * Update header only after successful completion of this step. If + * deconstruct_array fails partway through, worst consequence is some + * leaked memory in the object's context. + */ + dah->dvalues = dvalues; + dah->dnulls = dnulls; + dah->dvalueslen = dah->nelems = nelems; + MemoryContextSwitchTo(oldcxt); + } + + /* + * Copy new element into array's context, if needed (we assume it's + * already detoasted, so no junk should be created). If we fail further + * down, this memory is leaked, but that's reasonably harmless. 
+ */ + if (!dah->typbyval && !isNull) + { + MemoryContext oldcxt = MemoryContextSwitchTo(dah->hdr.doh_context); + + dataValue = datumCopy(dataValue, false, dah->typlen); + MemoryContextSwitchTo(oldcxt); + } + + dvalues = dah->dvalues; + dnulls = dah->dnulls; + + newhasnulls = ((dnulls != NULL) || isNull); + addedbefore = addedafter = 0; + + /* + * Check subscripts (this logic matches original array_set) + */ + if (ndim == 1) + { + if (indx[0] < lb[0]) + { + addedbefore = lb[0] - indx[0]; + dim[0] += addedbefore; + lb[0] = indx[0]; + dimschanged = true; + if (addedbefore > 1) + newhasnulls = true; /* will insert nulls */ + } + if (indx[0] >= (dim[0] + lb[0])) + { + addedafter = indx[0] - (dim[0] + lb[0]) + 1; + dim[0] += addedafter; + dimschanged = true; + if (addedafter > 1) + newhasnulls = true; /* will insert nulls */ + } + /* Physically enlarge dvalues/dnulls arrays if needed */ + if (dim[0] > dah->dvalueslen) + { + /* We want some extra space if we're enlarging */ + int newlen = dim[0] + dim[0] / 8; + + dah->dvalues = dvalues = (Datum *) + repalloc(dvalues, newlen * sizeof(Datum)); + if (dnulls) + dah->dnulls = dnulls = (bool *) + repalloc(dnulls, newlen * sizeof(bool)); + dah->dvalueslen = newlen; + } + } + else + { + /* + * XXX currently we do not support extending multi-dimensional arrays + * during assignment + */ + for (i = 0; i < ndim; i++) + { + if (indx[i] < lb[i] || + indx[i] >= (dim[i] + lb[i])) + ereport(ERROR, + (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR), + errmsg("array subscript out of range"))); + } + } + + /* Now we can calculate linear offset of target item in array */ + offset = ArrayGetOffset(nSubscripts, dim, lb, indx); + + /* + * If we need a nulls bitmap and don't already have one, create it, being + * sure to mark all existing entries as not null. 
+ */ + if (newhasnulls && dnulls == NULL) + dah->dnulls = dnulls = (bool *) + MemoryContextAllocZero(dah->hdr.doh_context, + dah->dvalueslen * sizeof(bool)); + + /* + * We now have all the needed space allocated, so we're ready to make + * irreversible changes. Be very wary of allowing failure below here. + */ + + /* Serialized value will no longer represent array accurately */ + dah->svalue = NULL; + /* And we don't know the serialized size either */ + dah->serialized_size = 0; + + /* Update dimensionality info if needed */ + if (dimschanged) + { + dah->ndims = ndim; + memcpy(dah->dims, dim, ndim * sizeof(int)); + memcpy(dah->lbound, lb, ndim * sizeof(int)); + } + + /* Reposition items if needed, and fill addedbefore items with nulls */ + if (addedbefore > 0) + { + memmove(dvalues + addedbefore, dvalues, dah->nelems * sizeof(Datum)); + for (i = 0; i < addedbefore; i++) + dvalues[i] = (Datum) 0; + if (dnulls) + { + memmove(dnulls + addedbefore, dnulls, dah->nelems * sizeof(bool)); + for (i = 0; i < addedbefore; i++) + dnulls[i] = true; + } + dah->nelems += addedbefore; + } + + /* fill addedafter items with nulls */ + if (addedafter > 0) + { + for (i = 0; i < addedafter; i++) + dvalues[dah->nelems + i] = (Datum) 0; + if (dnulls) + { + for (i = 0; i < addedafter; i++) + dnulls[dah->nelems + i] = true; + } + dah->nelems += addedafter; + } + + /* Grab old element value for pfree'ing, if needed. */ + if (!dah->typbyval && (dnulls == NULL || !dnulls[offset])) + oldValue = (char *) DatumGetPointer(dvalues[offset]); + + /* And finally we can insert the new element. */ + dvalues[offset] = dataValue; + if (dnulls) + dnulls[offset] = isNull; + + /* + * Free old element if needed; this keeps repeated element replacements + * from bloating the array's storage. If the pfree somehow fails, it + * won't corrupt the array. 
+ */ + if (oldValue) + { + /* Don't try to pfree a part of the original serialized array */ + if (oldValue < dah->sstartptr || oldValue >= dah->sendptr) + pfree(oldValue); + } + + /* Done, return primary TOAST pointer for object */ + return PointerGetDatum(dah->hdr.doh_primary_ptr); + } + + /* + * Deserialized reimplementation of array_push + */ + typedef struct ArrayPushState + { + Oid arg0_typeid; + Oid arg1_typeid; + bool array_on_left; + Oid element_type; + int16 typlen; + bool typbyval; + char typalign; + } ArrayPushState; + + Datum + array_push_deserialized(PG_FUNCTION_ARGS) + { + DeserializedArrayHeader *dah; + Datum newelem; + bool isNull; + int *dimv, + *lb; + ArrayType *result; + int indx; + int lb0; + Oid element_type; + int16 typlen; + bool typbyval; + char typalign; + Oid arg0_typeid = get_fn_expr_argtype(fcinfo->flinfo, 0); + Oid arg1_typeid = get_fn_expr_argtype(fcinfo->flinfo, 1); + ArrayPushState *my_extra; + + if (arg0_typeid == InvalidOid || arg1_typeid == InvalidOid) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("could not determine input data types"))); + + /* + * We arrange to look up info about element type only once per series of + * calls, assuming the element type doesn't change underneath us. 
+ */ + my_extra = (ArrayPushState *) fcinfo->flinfo->fn_extra; + if (my_extra == NULL) + { + fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt, + sizeof(ArrayPushState)); + my_extra = (ArrayPushState *) fcinfo->flinfo->fn_extra; + my_extra->arg0_typeid = InvalidOid; + } + + if (my_extra->arg0_typeid != arg0_typeid || + my_extra->arg1_typeid != arg1_typeid) + { + /* Determine which input is the array */ + Oid arg0_elemid = get_element_type(arg0_typeid); + Oid arg1_elemid = get_element_type(arg1_typeid); + + if (arg0_elemid != InvalidOid) + { + my_extra->array_on_left = true; + element_type = arg0_elemid; + } + else if (arg1_elemid != InvalidOid) + { + my_extra->array_on_left = false; + element_type = arg1_elemid; + } + else + { + /* Shouldn't get here given proper type checking in parser */ + ereport(ERROR, + (errcode(ERRCODE_DATATYPE_MISMATCH), + errmsg("neither input type is an array"))); + PG_RETURN_NULL(); /* keep compiler quiet */ + } + + my_extra->arg0_typeid = arg0_typeid; + my_extra->arg1_typeid = arg1_typeid; + + /* Get info about element type */ + get_typlenbyvalalign(element_type, + &my_extra->typlen, + &my_extra->typbyval, + &my_extra->typalign); + my_extra->element_type = element_type; + } + + element_type = my_extra->element_type; + typlen = my_extra->typlen; + typbyval = my_extra->typbyval; + typalign = my_extra->typalign; + + /* + * Now we can fetch the arguments, using cached type info if needed + */ + if (my_extra->array_on_left) + { + if (PG_ARGISNULL(0)) + dah = construct_empty_deserialized_array(element_type, + CurrentMemoryContext, + typlen, typbyval, typalign); + else + dah = PG_GETARG_DESERIALIZED_ARRAYX(0, typlen, typbyval, typalign); + isNull = PG_ARGISNULL(1); + if (isNull) + newelem = (Datum) 0; + else + newelem = PG_GETARG_DATUM(1); + } + else + { + if (PG_ARGISNULL(1)) + dah = construct_empty_deserialized_array(element_type, + CurrentMemoryContext, + typlen, typbyval, typalign); + else + dah = 
PG_GETARG_DESERIALIZED_ARRAYX(1, typlen, typbyval, typalign); + isNull = PG_ARGISNULL(0); + if (isNull) + newelem = (Datum) 0; + else + newelem = PG_GETARG_DATUM(0); + } + + Assert(element_type == dah->element_type); + + /* + * Perform push (this logic is basically unchanged from original) + */ + if (dah->ndims == 1) + { + lb = dah->lbound; + dimv = dah->dims; + + if (my_extra->array_on_left) + { + /* append newelem */ + int ub = dimv[0] + lb[0] - 1; + + indx = ub + 1; + /* overflow? */ + if (indx < ub) + ereport(ERROR, + (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), + errmsg("integer out of range"))); + } + else + { + /* prepend newelem */ + indx = lb[0] - 1; + /* overflow? */ + if (indx > lb[0]) + ereport(ERROR, + (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), + errmsg("integer out of range"))); + } + lb0 = lb[0]; + } + else if (dah->ndims == 0) + { + indx = 1; + lb0 = 1; + } + else + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg("argument must be empty or one-dimensional array"))); + + result = array_set((ArrayType *) dah->hdr.doh_primary_ptr, + 1, &indx, newelem, isNull, + -1, typlen, typbyval, typalign); + + Assert(result == (ArrayType *) dah->hdr.doh_primary_ptr); + + /* + * Readjust result's LB to match the input's. We need do nothing in the + * append case, but it's the simplest way to implement the prepend case. + */ + if (dah->ndims == 1 && !my_extra->array_on_left) + { + /* This is ok whether we've deconstructed or not */ + dah->lbound[0] = lb0; + } + + PG_RETURN_POINTER(result); + } + + /* + * array_dims : + * returns the dimensions of the array pointed to by "v", as a "text" + * + * This is here as an example of handling either flat or deserialized inputs. 
+ */ + Datum + array_dims_deserialized(PG_FUNCTION_ARGS) + { + AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); + char *p; + int i; + int *dimv, + *lb; + + /* + * 33 since we assume 15 digits per number + ':' +'[]' + * + * +1 for trailing null + */ + char buf[MAXDIM * 33 + 1]; + + /* Sanity check: does it look like an array at all? */ + if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) + PG_RETURN_NULL(); + + dimv = AARR_DIMS(v); + lb = AARR_LBOUND(v); + + p = buf; + for (i = 0; i < AARR_NDIM(v); i++) + { + sprintf(p, "[%d:%d]", lb[i], dimv[i] + lb[i] - 1); + p += strlen(p); + } + + PG_RETURN_TEXT_P(cstring_to_text(buf)); + } diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c index 600646e..4684f0a 100644 *** a/src/backend/utils/adt/array_userfuncs.c --- b/src/backend/utils/adt/array_userfuncs.c *************** *** 25,30 **** --- 25,33 ---- Datum array_push(PG_FUNCTION_ARGS) { + #if 1 + return array_push_deserialized(fcinfo); + #else ArrayType *v; Datum newelem; bool isNull; *************** array_push(PG_FUNCTION_ARGS) *** 157,162 **** --- 160,166 ---- ARR_LBOUND(result)[0] = ARR_LBOUND(v)[0]; PG_RETURN_ARRAYTYPE_P(result); + #endif } /*----------------------------------------------------------------------------- diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c index 5591b46..9b3037a 100644 *** a/src/backend/utils/adt/arrayfuncs.c --- b/src/backend/utils/adt/arrayfuncs.c *************** *** 27,32 **** --- 27,33 ---- #include "utils/array.h" #include "utils/builtins.h" #include "utils/datum.h" + #include "utils/deserialized.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/typcache.h" *************** static void ReadArrayBinary(StringInfo b *** 93,102 **** int typlen, bool typbyval, char typalign, Datum *values, bool *nulls, bool *hasnulls, int32 *nbytes); - static void CopyArrayEls(ArrayType *array, - Datum *values, bool *nulls, int nitems, - int typlen, bool typbyval, 
char typalign, - bool freedata); static bool array_get_isnull(const bits8 *nullbitmap, int offset); static void array_set_isnull(bits8 *nullbitmap, int offset, bool isNull); static Datum ArrayCast(char *value, bool byval, int len); --- 94,99 ---- *************** ReadArrayStr(char *arrayStr, *** 939,945 **** * the values are not toasted. (Doing it here doesn't work since the * caller has already allocated space for the array...) */ ! static void CopyArrayEls(ArrayType *array, Datum *values, bool *nulls, --- 936,942 ---- * the values are not toasted. (Doing it here doesn't work since the * caller has already allocated space for the array...) */ ! void CopyArrayEls(ArrayType *array, Datum *values, bool *nulls, *************** array_ndims(PG_FUNCTION_ARGS) *** 1666,1671 **** --- 1663,1671 ---- Datum array_dims(PG_FUNCTION_ARGS) { + #if 1 + return array_dims_deserialized(fcinfo); + #else ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); char *p; int i; *************** array_dims(PG_FUNCTION_ARGS) *** 1694,1699 **** --- 1694,1700 ---- } PG_RETURN_TEXT_P(cstring_to_text(buf)); + #endif } /* *************** array_set(ArrayType *array, *** 2161,2166 **** --- 2162,2191 ---- if (elmlen == -1 && !isNull) dataValue = PointerGetDatum(PG_DETOAST_DATUM(dataValue)); + if (1) + { + /* Convert to R/W deserialized form if not that already */ + if (!DatumIsReadWriteDeserializedObject(PointerGetDatum(array), + false, -1)) + { + array = (ArrayType *) DatumGetPointer( + deserialize_array(PointerGetDatum(array), CurrentMemoryContext, + elmlen, elmbyval, elmalign)); + } + + /* And hand off to array_deserialized.c */ + return (ArrayType *) DatumGetPointer( + array_set_deserialized(PointerGetDatum(array), + nSubscripts, + indx, + dataValue, + isNull, + arraytyplen, + elmlen, + elmbyval, + elmalign)); + } + /* detoast input array if necessary */ array = DatumGetArrayTypeP(PointerGetDatum(array)); diff --git a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c index 014eca5..4ebf79a 100644 
*** a/src/backend/utils/adt/datum.c --- b/src/backend/utils/adt/datum.c *************** *** 12,19 **** * *------------------------------------------------------------------------- */ /* ! * In the implementation of the next routines we assume the following: * * A) if a type is "byVal" then all the information is stored in the * Datum itself (i.e. no pointers involved!). In this case the --- 12,20 ---- * *------------------------------------------------------------------------- */ + /* ! * In the implementation of these routines we assume the following: * * A) if a type is "byVal" then all the information is stored in the * Datum itself (i.e. no pointers involved!). In this case the *************** *** 34,44 **** --- 35,49 ---- * * Note that we do not treat "toasted" datums specially; therefore what * will be copied or compared is the compressed data or toast reference. + * An exception is made for datumCopy() of a deserialized object, however, + * because most callers expect to get a simple contiguous (and pfree'able) + * result from datumCopy(). */ #include "postgres.h" #include "utils/datum.h" + #include "utils/deserialized.h" /*------------------------------------------------------------------------- *************** *** 46,51 **** --- 51,57 ---- * * Find the "real" size of a datum, given the datum value, * whether it is a "by value", and the declared type length. + * (For TOAST pointer datums, this is the size of the pointer datum.) * * This is essentially an out-of-line version of the att_addlength_datum() * macro in access/tupmacs.h. We do a tad more error checking though. *************** datumGetSize(Datum value, bool typByVal, *** 106,114 **** /*------------------------------------------------------------------------- * datumCopy * ! * make a copy of a datum * * If the datatype is pass-by-reference, memory is obtained with palloc(). 
*------------------------------------------------------------------------- */ Datum --- 112,127 ---- /*------------------------------------------------------------------------- * datumCopy * ! * Make a copy of a non-NULL datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). + * + * If the value is a reference to a deserialized object, we serialize it into + * memory obtained with palloc(). We need to copy because one of the main + * uses of this function is to copy a datum out of a transient memory context + * that's about to be destroyed, and the deserialized object is probably in a + * child context that will also go away. Moreover, many callers assume that + * the result is a single pfree-able chunk. *------------------------------------------------------------------------- */ Datum *************** datumCopy(Datum value, bool typByVal, in *** 118,161 **** if (typByVal) res = value; ! else { ! Size realSize; ! char *s; ! if (DatumGetPointer(value) == NULL) ! return PointerGetDatum(NULL); ! realSize = datumGetSize(value, typByVal, typLen); ! s = (char *) palloc(realSize); ! memcpy(s, DatumGetPointer(value), realSize); ! res = PointerGetDatum(s); } ! return res; ! } ! ! /*------------------------------------------------------------------------- ! * datumFree ! * ! * Free the space occupied by a datum CREATED BY "datumCopy" ! * ! * NOTE: DO NOT USE THIS ROUTINE with datums returned by heap_getattr() etc. ! * ONLY datums created by "datumCopy" can be freed! ! *------------------------------------------------------------------------- ! */ ! #ifdef NOT_USED ! void ! datumFree(Datum value, bool typByVal, int typLen) ! { ! if (!typByVal) { ! Pointer s = DatumGetPointer(value); ! pfree(s); } } - #endif /*------------------------------------------------------------------------- * datumIsEqual --- 131,179 ---- if (typByVal) res = value; ! else if (typLen == -1) { ! /* It is a varlena datatype */ ! 
struct varlena *vl = (struct varlena *) DatumGetPointer(value); ! if (VARATT_IS_EXTERNAL_DESERIALIZED(vl)) ! { ! /* Serialize into the caller's memory context */ ! DeserializedObjectHeader *doh = DatumGetDOHP(value); ! Size resultsize; ! char *resultptr; ! resultsize = DOH_get_serialized_size(doh); ! resultptr = (char *) palloc(resultsize); ! DOH_serialize_into(doh, (void *) resultptr, resultsize); ! res = PointerGetDatum(resultptr); ! } ! else ! { ! /* Otherwise, just copy the varlena datum verbatim */ ! Size realSize; ! char *resultptr; ! realSize = (Size) VARSIZE_ANY(vl); ! resultptr = (char *) palloc(realSize); ! memcpy(resultptr, vl, realSize); ! res = PointerGetDatum(resultptr); ! } } ! else { ! /* Pass by reference, but not varlena, so not toasted */ ! Size realSize; ! char *resultptr; ! realSize = datumGetSize(value, typByVal, typLen); ! ! resultptr = (char *) palloc(realSize); ! memcpy(resultptr, DatumGetPointer(value), realSize); ! res = PointerGetDatum(resultptr); } + return res; } /*------------------------------------------------------------------------- * datumIsEqual diff --git a/src/backend/utils/adt/deserialized.c b/src/backend/utils/adt/deserialized.c index ...4584514 . *** a/src/backend/utils/adt/deserialized.c --- b/src/backend/utils/adt/deserialized.c *************** *** 0 **** --- 1,169 ---- + /*------------------------------------------------------------------------- + * + * deserialized.c + * Support functions for "deserialized" value representations. 
+ * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * + * IDENTIFICATION + * src/backend/utils/adt/deserialized.c + * + *------------------------------------------------------------------------- + */ + #include "postgres.h" + + #include "utils/deserialized.h" + #include "utils/memutils.h" + + /* + * DatumGetDOHP + * + * Given a Datum that is a deserialized-object reference, extract the pointer. + * + * This is a bit tedious since the pointer may not be properly aligned; + * compare VARATT_EXTERNAL_GET_POINTER(). + */ + DeserializedObjectHeader * + DatumGetDOHP(Datum d) + { + varattrib_1b_e *datum = (varattrib_1b_e *) DatumGetPointer(d); + varatt_deserialized ptr; + + Assert(VARATT_IS_EXTERNAL_DESERIALIZED(datum)); + memcpy(&ptr, VARDATA_EXTERNAL(datum), sizeof(ptr)); + return ptr.dohptr; + } + + /* + * DOH_init_header + * + * Initialize the common header of a deserialized object. + * + * The main thing this encapsulates is initializing the TOAST pointers. + */ + void + DOH_init_header(DeserializedObjectHeader *dohptr, + const DeserializedObjectMethods *methods, + MemoryContext obj_context) + { + varatt_deserialized ptr; + + dohptr->vl_len_ = DOH_HEADER_MAGIC; + dohptr->doh_methods = methods; + dohptr->doh_context = obj_context; + + ptr.dohptr = dohptr; + + SET_VARTAG_EXTERNAL(dohptr->doh_primary_ptr, VARTAG_DESERIALIZED); + memcpy(VARDATA_EXTERNAL(dohptr->doh_primary_ptr), &ptr, sizeof(ptr)); + + SET_VARTAG_EXTERNAL(dohptr->doh_secondary_ptr, VARTAG_DESERIALIZED); + memcpy(VARDATA_EXTERNAL(dohptr->doh_secondary_ptr), &ptr, sizeof(ptr)); + } + + /* + * DOH_get_serialized_size + * DOH_serialize_into + * + * Convenience functions for invoking the "methods" of a deserialized object. 
+ */ + + Size + DOH_get_serialized_size(DeserializedObjectHeader *dohptr) + { + return (*dohptr->doh_methods->get_serialized_size) (dohptr); + } + + void + DOH_serialize_into(DeserializedObjectHeader *dohptr, + void *result, Size allocated_size) + { + (*dohptr->doh_methods->serialize_into) (dohptr, result, allocated_size); + } + + /* + * Does the Datum represent a writable deserialized object? + */ + bool + DatumIsReadWriteDeserializedObject(Datum d, bool isnull, int16 typlen) + { + DeserializedObjectHeader *dohptr; + + /* Reject if it's NULL or not a varlena type */ + if (isnull || typlen != -1) + return false; + + /* Reject if not a deserialized-object pointer */ + if (!VARATT_IS_EXTERNAL_DESERIALIZED(DatumGetPointer(d))) + return false; + + /* Now safe to extract the object pointer */ + dohptr = DatumGetDOHP(d); + + /* Reject if this isn't the primary TOAST pointer for the object */ + if (DatumGetPointer(d) != (Pointer) dohptr->doh_primary_ptr) + return false; + + return true; + } + + /* + * If the Datum represents a R/W deserialized object, change it to R/O. + * Otherwise return the original Datum. + */ + Datum + MakeDeserializedObjectReadOnly(Datum d, bool isnull, int16 typlen) + { + DeserializedObjectHeader *dohptr; + + /* Nothing to do if it's NULL or not a varlena type */ + if (isnull || typlen != -1) + return d; + + /* Nothing to do if not a deserialized-object pointer */ + if (!VARATT_IS_EXTERNAL_DESERIALIZED(DatumGetPointer(d))) + return d; + + /* Now safe to extract the object pointer */ + dohptr = DatumGetDOHP(d); + + /* Nothing to do if this isn't the primary TOAST pointer for the object */ + if (DatumGetPointer(d) != (Pointer) dohptr->doh_primary_ptr) + return d; + + /* Else return the secondary pointer instead */ + return PointerGetDatum(dohptr->doh_secondary_ptr); + } + + /* + * Transfer ownership of a deserialized object to a new parent memory context. + * The object must be referenced by its primary (R/W) pointer. 
+ */ + void + TransferDeserializedObject(Datum d, MemoryContext new_parent) + { + DeserializedObjectHeader *dohptr = DatumGetDOHP(d); + + /* Assert this is the primary TOAST pointer for the object */ + Assert(DatumGetPointer(d) == (Pointer) dohptr->doh_primary_ptr); + + /* Transfer ownership */ + MemoryContextSetParent(dohptr->doh_context, new_parent); + } + + /* + * Delete a deserialized object (must be referenced by its primary pointer). + */ + void + DeleteDeserializedObject(Datum d) + { + DeserializedObjectHeader *dohptr = DatumGetDOHP(d); + + /* Assert this is the primary TOAST pointer for the object */ + Assert(DatumGetPointer(d) == (Pointer) dohptr->doh_primary_ptr); + + /* Kill it */ + MemoryContextDelete(dohptr->doh_context); + } diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c index 202bc78..4b24066 100644 *** a/src/backend/utils/mmgr/mcxt.c --- b/src/backend/utils/mmgr/mcxt.c *************** MemoryContextSetParent(MemoryContext con *** 266,271 **** --- 266,275 ---- AssertArg(MemoryContextIsValid(context)); AssertArg(context != new_parent); + /* Fast path if it's got correct parent already */ + if (new_parent == context->parent) + return; + /* Delink from existing parent, if any */ if (context->parent) { diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h index 40fde83..a98a7af 100644 *** a/src/include/executor/executor.h --- b/src/include/executor/executor.h *************** extern void FreeExprContext(ExprContext *** 312,318 **** extern void ReScanExprContext(ExprContext *econtext); #define ResetExprContext(econtext) \ ! MemoryContextReset((econtext)->ecxt_per_tuple_memory) extern ExprContext *MakePerTupleExprContext(EState *estate); --- 312,318 ---- extern void ReScanExprContext(ExprContext *econtext); #define ResetExprContext(econtext) \ ! 
MemoryContextResetAndDeleteChildren((econtext)->ecxt_per_tuple_memory) extern ExprContext *MakePerTupleExprContext(EState *estate); diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h index 48f84bf..00686b0 100644 *** a/src/include/executor/tuptable.h --- b/src/include/executor/tuptable.h *************** extern Datum ExecFetchSlotTupleDatum(Tup *** 163,168 **** --- 163,169 ---- extern HeapTuple ExecMaterializeSlot(TupleTableSlot *slot); extern TupleTableSlot *ExecCopySlot(TupleTableSlot *dstslot, TupleTableSlot *srcslot); + extern TupleTableSlot *ExecMakeSlotContentsReadOnly(TupleTableSlot *slot); /* in access/common/heaptuple.c */ extern Datum slot_getattr(TupleTableSlot *slot, int attnum, bool *isnull); diff --git a/src/include/postgres.h b/src/include/postgres.h index 082c75b..f7cea45 100644 *** a/src/include/postgres.h --- b/src/include/postgres.h *************** typedef struct varatt_indirect *** 88,93 **** --- 88,110 ---- } varatt_indirect; /* + * struct varatt_deserialized is a "TOAST pointer" representing an out-of-line + * Datum that is stored in memory, in some type-specific, not necessarily + * physically contiguous format that is convenient for computation not + * storage. APIs for this, in particular the definition of struct + * DeserializedObjectHeader, are in utils/deserialized.h. + * + * Note that just as for struct varatt_external, this struct is stored + * unaligned within any containing tuple. + */ + typedef struct DeserializedObjectHeader DeserializedObjectHeader; + + typedef struct varatt_deserialized + { + DeserializedObjectHeader *dohptr; + } varatt_deserialized; + + /* * Type tag for the various sorts of "TOAST pointer" datums. The peculiar * value for VARTAG_ONDISK comes from a requirement for on-disk compatibility * with a previous notion that the tag field was the pointer datum's length. 
*************** typedef struct varatt_indirect *** 95,105 **** --- 112,124 ---- typedef enum vartag_external { VARTAG_INDIRECT = 1, + VARTAG_DESERIALIZED = 2, VARTAG_ONDISK = 18 } vartag_external; #define VARTAG_SIZE(tag) \ ((tag) == VARTAG_INDIRECT ? sizeof(varatt_indirect) : \ + (tag) == VARTAG_DESERIALIZED ? sizeof(varatt_deserialized) : \ (tag) == VARTAG_ONDISK ? sizeof(varatt_external) : \ TrapMacro(true, "unrecognized TOAST vartag")) *************** typedef struct *** 294,299 **** --- 313,320 ---- (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_ONDISK) #define VARATT_IS_EXTERNAL_INDIRECT(PTR) \ (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_INDIRECT) + #define VARATT_IS_EXTERNAL_DESERIALIZED(PTR) \ + (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_DESERIALIZED) #define VARATT_IS_SHORT(PTR) VARATT_IS_1B(PTR) #define VARATT_IS_EXTENDED(PTR) (!VARATT_IS_4B_U(PTR)) diff --git a/src/include/utils/array.h b/src/include/utils/array.h index 694bce7..c2255e1 100644 *** a/src/include/utils/array.h --- b/src/include/utils/array.h *************** extern Datum array_remove(PG_FUNCTION_AR *** 248,253 **** --- 248,261 ---- extern Datum array_replace(PG_FUNCTION_ARGS); extern Datum width_bucket_array(PG_FUNCTION_ARGS); + extern void CopyArrayEls(ArrayType *array, + Datum *values, + bool *nulls, + int nitems, + int typlen, + bool typbyval, + char typalign, + bool freedata); extern Datum array_ref(ArrayType *array, int nSubscripts, int *indx, int arraytyplen, int elmlen, bool elmbyval, char elmalign, bool *isNull); *************** extern Datum array_agg_array_transfn(PG_ *** 349,354 **** --- 357,375 ---- extern Datum array_agg_array_finalfn(PG_FUNCTION_ARGS); /* + * prototypes for functions defined in array_deserialized.c + */ + extern Datum deserialize_array(Datum arraydatum, MemoryContext parentcontext, + int elmlen, bool elmbyval, char elmalign); + extern Datum array_set_deserialized(Datum arraydatum, + int nSubscripts, int *indx, + Datum 
dataValue, bool isNull, + int arraytyplen, + int elmlen, bool elmbyval, char elmalign); + extern Datum array_push_deserialized(PG_FUNCTION_ARGS); + extern Datum array_dims_deserialized(PG_FUNCTION_ARGS); + + /* * prototypes for functions defined in array_typanalyze.c */ extern Datum array_typanalyze(PG_FUNCTION_ARGS); diff --git a/src/include/utils/datum.h b/src/include/utils/datum.h index 663414b..bcc203d 100644 *** a/src/include/utils/datum.h --- b/src/include/utils/datum.h *************** extern Size datumGetSize(Datum value, bo *** 31,43 **** extern Datum datumCopy(Datum value, bool typByVal, int typLen); /* - * datumFree - free a datum previously allocated by datumCopy, if any. - * - * Does nothing if datatype is pass-by-value. - */ - extern void datumFree(Datum value, bool typByVal, int typLen); - - /* * datumIsEqual * return true if two datums of the same type are equal, false otherwise. * --- 31,36 ---- diff --git a/src/include/utils/deserialized.h b/src/include/utils/deserialized.h index ...c5a261a . *** a/src/include/utils/deserialized.h --- b/src/include/utils/deserialized.h *************** *** 0 **** --- 1,135 ---- + /*------------------------------------------------------------------------- + * + * deserialized.h + * Declarations for access to "deserialized" value representations. + * + * Complex data types, particularly container types such as arrays and records, + * usually have on-disk representations that are compact but not especially + * convenient to modify. What's more, when we do modify them, having to + * recopy all the rest of the value can be extremely inefficient. Therefore, + * we provide a notion of a "deserialized" representation that is used only + * in memory and is optimized more for computation than storage. The format + * appearing on disk is called the data type's "serialized" representation, + * since it is required to be a contiguous blob of bytes -- but the type can + * have a deserialized representation that is not. 
Data types must provide + * means to translate a deserialized representation back to serialized form. + * + * A deserialized object is meant to survive across multiple operations, but + * not to be enormously long-lived; for example it might be a local variable + * in a PL/pgSQL procedure. So its extra bulk compared to the on-disk format + * is a worthwhile trade-off. + * + * References to deserialized objects are a type of TOAST pointer. + * Because of longstanding conventions in Postgres, this means that the + * serialized form of such an object must always be a varlena object. + * Fortunately that's no restriction in practice. + * + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * src/include/utils/deserialized.h + * + *------------------------------------------------------------------------- + */ + #ifndef DESERIALIZED_H + #define DESERIALIZED_H + + /* Size of an EXTERNAL datum that contains a pointer to a deserialized object */ + #define DESERIALIZED_POINTER_SIZE (VARHDRSZ_EXTERNAL + sizeof(varatt_deserialized)) + + /* + * "Methods" that must be provided for any deserialized object. + * + * get_serialized_size: compute space needed for serialized representation + * (which, in general, must be a valid in-line, non-compressed varlena object). + * + * serialize_into: construct serialized representation in caller-allocated + * space at *result, of size allocated_size (which will always be the result + * of a preceding get_serialized_size call; it's passed for cross-checking). + * + * Note: construction of a heap tuple from a deserialized datum calls + * get_serialized_size twice, so it's worthwhile to make sure that doesn't + * incur too much overhead. 
+ */ + typedef Size (*DOM_get_serialized_size_method) (DeserializedObjectHeader *dohptr); + typedef void (*DOM_serialize_into_method) (DeserializedObjectHeader *dohptr, + void *result, Size allocated_size); + + /* Struct of function pointers for a deserialized object's methods */ + typedef struct DeserializedObjectMethods + { + DOM_get_serialized_size_method get_serialized_size; + DOM_serialize_into_method serialize_into; + } DeserializedObjectMethods; + + /* + * Every deserialized object must contain this header; typically the header + * is embedded in some larger struct that adds type-specific fields. + * + * It is presumed that the header object and all subsidiary data are stored + * in doh_context, so that the object can be freed by deleting that context, + * or its storage lifespan can be altered by reparenting the context. + * (In principle the object could own additional resources, such as malloc'd + * storage, and use a memory context reset callback to free them upon reset or + * deletion of doh_context.) + * + * We consider a Datum pointing at the "primary" TOAST pointer to be a + * read/write reference. Any other TOAST pointer is a read-only reference. + * For convenience, a "secondary" toast pointer is also allocated in the + * object header, but any copied pointer would also be considered read-only. + * + * The typedef declaration for this appears in postgres.h. 
+ */ + struct DeserializedObjectHeader + { + /* Phony varlena header */ + int32 vl_len_; /* always DOH_HEADER_MAGIC, see below */ + + /* Pointer to methods required for object type */ + const DeserializedObjectMethods *doh_methods; + + /* Memory context containing this header and subsidiary data */ + MemoryContext doh_context; + + /* "Primary" (R/W) TOAST pointer for this object is kept here */ + char doh_primary_ptr[DESERIALIZED_POINTER_SIZE]; + + /* "Secondary" (R/O) TOAST pointer for this object is kept here */ + char doh_secondary_ptr[DESERIALIZED_POINTER_SIZE]; + }; + + /* + * Particularly for read-only functions, it is handy to be able to work with + * either regular "flat" varlena inputs or deserialized inputs of the same + * data type. To allow determining which case an argument-fetching macro has + * returned, the first int32 of a DeserializedObjectHeader always contains -1 + * (DOH_HEADER_MAGIC to the code). This works since no 4-byte-header varlena + * could have that as its first 4 bytes. Caution: we could not reliably tell + * the difference between a DeserializedObjectHeader and a short-header object + * with this trick. However, it works fine for cases where the argument + * fetching code will return either a fully-uncompressed flat object or a + * deserialized object. + */ + #define DOH_HEADER_MAGIC (-1) + #define VARATT_IS_DESERIALIZED_HEADER(PTR) \ + (((DeserializedObjectHeader *) (PTR))->vl_len_ == DOH_HEADER_MAGIC) + + /* + * Generic support functions for deserialized objects. + * (Some of these might be worth inlining later.) 
+ */ + + extern DeserializedObjectHeader *DatumGetDOHP(Datum d); + extern void DOH_init_header(DeserializedObjectHeader *dohptr, + const DeserializedObjectMethods *methods, + MemoryContext obj_context); + extern Size DOH_get_serialized_size(DeserializedObjectHeader *dohptr); + extern void DOH_serialize_into(DeserializedObjectHeader *dohptr, + void *result, Size allocated_size); + extern bool DatumIsReadWriteDeserializedObject(Datum d, bool isnull, int16 typlen); + extern Datum MakeDeserializedObjectReadOnly(Datum d, bool isnull, int16 typlen); + extern void TransferDeserializedObject(Datum d, MemoryContext new_parent); + extern void DeleteDeserializedObject(Datum d); + + #endif /* DESERIALIZED_H */ diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index ae5421f..833fcf0 100644 *** a/src/pl/plpgsql/src/pl_exec.c --- b/src/pl/plpgsql/src/pl_exec.c *************** *** 32,37 **** --- 32,38 ---- #include "utils/array.h" #include "utils/builtins.h" #include "utils/datum.h" + #include "utils/deserialized.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/rel.h" *************** static void exec_assign_value(PLpgSQL_ex *** 171,176 **** --- 172,178 ---- Datum value, Oid valtype, bool *isNull); static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, + bool getrwpointer, Oid *typeid, int32 *typetypmod, Datum *value, *************** plpgsql_exec_function(PLpgSQL_function * *** 468,473 **** --- 470,482 ---- Size len; void *tmp; + /* temporary hack: reserialize if retval is deserialized */ + if (func->fn_rettyplen == -1 && + VARATT_IS_EXTERNAL_DESERIALIZED(DatumGetPointer(estate.retval))) + { + estate.retval = datumCopy(estate.retval, false, -1); + } + len = datumGetSize(estate.retval, false, func->fn_rettyplen); tmp = SPI_palloc(len); memcpy(tmp, DatumGetPointer(estate.retval), len); *************** exec_assign_value(PLpgSQL_execstate *est *** 4057,4084 **** var->refname))); /* ! 
* If type is by-reference, copy the new value (which is ! * probably in the eval_econtext) into the procedure's memory ! * context. */ ! if (!var->datatype->typbyval && !*isNull) ! newvalue = datumCopy(newvalue, ! false, ! var->datatype->typlen); ! /* ! * Now free the old value. (We can't do this any earlier ! * because of the possibility that we are assigning the var's ! * old value to it, eg "foo := foo". We could optimize out ! * the assignment altogether in such cases, but it's too ! * infrequent to be worth testing for.) ! */ ! free_var(var); ! var->value = newvalue; ! var->isnull = *isNull; ! if (!var->datatype->typbyval && !*isNull) ! var->freeval = true; break; } --- 4066,4114 ---- var->refname))); /* ! * If we're assigning the variable's existing value back ! * again, there's nothing to do. We must check this case to ! * avoid doing the wrong thing with deserialized objects. */ ! if (var->value == newvalue && !var->isnull && !*isNull) ! /* no work */ ; ! else ! { ! /* ! * If type is by-reference, copy the new value (which is ! * probably in the eval_econtext) into the procedure's ! * memory context. But if it's a read/write reference to ! * a deserialized object, no physical copy needs to ! * happen; at most we need to reparent the object's memory ! * context. ! */ ! if (!var->datatype->typbyval && !*isNull) ! { ! if (DatumIsReadWriteDeserializedObject(newvalue, ! false, ! var->datatype->typlen)) ! TransferDeserializedObject(newvalue, ! CurrentMemoryContext); ! else ! newvalue = datumCopy(newvalue, ! false, ! var->datatype->typlen); ! } ! /* ! * Now free the old value. (We don't do this any earlier ! * because of the possibility that we are assigning the ! * var's old value to it, eg "foo := foo". This shouldn't ! * happen any more because of the preceding test, but ! * let's be safe.) ! */ ! free_var(var); ! var->value = newvalue; ! var->isnull = *isNull; ! if (!var->datatype->typbyval && !*isNull) ! var->freeval = true; ! 
} break; } *************** exec_assign_value(PLpgSQL_execstate *est *** 4276,4282 **** } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM); /* Fetch current value of array datum */ ! exec_eval_datum(estate, target, &parenttypoid, &parenttypmod, &oldarraydatum, &oldarrayisnull); --- 4306,4312 ---- } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM); /* Fetch current value of array datum */ ! exec_eval_datum(estate, target, true, &parenttypoid, &parenttypmod, &oldarraydatum, &oldarrayisnull); *************** exec_assign_value(PLpgSQL_execstate *est *** 4424,4439 **** * * The type oid, typmod, value in Datum format, and null flag are returned. * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: caller must not modify the returned value, since it points right ! * at the stored value in the case of pass-by-reference datatypes. In some ! * cases we have to palloc a return value, and in such cases we put it into ! * the estate's short-term memory context. */ static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, Oid *typeid, int32 *typetypmod, Datum *value, --- 4454,4473 ---- * * The type oid, typmod, value in Datum format, and null flag are returned. * + * If getrwpointer is TRUE, we'll return a R/W pointer to any variable that + * is a deserialized object; otherwise we return a R/O pointer. + * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: in most cases caller must not modify the returned value, since ! * it points right at the stored value in the case of pass-by-reference ! * datatypes. In some cases we have to palloc a return value, and in such ! * cases we put it into the estate's short-term memory context. 
*/ static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, + bool getrwpointer, Oid *typeid, int32 *typetypmod, Datum *value, *************** exec_eval_datum(PLpgSQL_execstate *estat *** 4449,4455 **** *typeid = var->datatype->typoid; *typetypmod = var->datatype->atttypmod; ! *value = var->value; *isnull = var->isnull; break; } --- 4483,4494 ---- *typeid = var->datatype->typoid; *typetypmod = var->datatype->atttypmod; ! if (getrwpointer) ! *value = var->value; ! else ! *value = MakeDeserializedObjectReadOnly(var->value, ! var->isnull, ! var->datatype->typlen); *isnull = var->isnull; break; } *************** setup_param_list(PLpgSQL_execstate *esta *** 5285,5291 **** PLpgSQL_var *var = (PLpgSQL_var *) datum; ParamExternData *prm = &paramLI->params[dno]; ! prm->value = var->value; prm->isnull = var->isnull; prm->pflags = PARAM_FLAG_CONST; prm->ptype = var->datatype->typoid; --- 5324,5332 ---- PLpgSQL_var *var = (PLpgSQL_var *) datum; ParamExternData *prm = &paramLI->params[dno]; ! prm->value = MakeDeserializedObjectReadOnly(var->value, ! var->isnull, ! var->datatype->typlen); prm->isnull = var->isnull; prm->pflags = PARAM_FLAG_CONST; prm->ptype = var->datatype->typoid; *************** plpgsql_param_fetch(ParamListInfo params *** 5351,5357 **** /* OK, evaluate the value and store into the appropriate paramlist slot */ datum = estate->datums[dno]; prm = &params->params[dno]; ! exec_eval_datum(estate, datum, &prm->ptype, &prmtypmod, &prm->value, &prm->isnull); } --- 5392,5398 ---- /* OK, evaluate the value and store into the appropriate paramlist slot */ datum = estate->datums[dno]; prm = &params->params[dno]; ! exec_eval_datum(estate, datum, false, &prm->ptype, &prmtypmod, &prm->value, &prm->isnull); } *************** make_tuple_from_row(PLpgSQL_execstate *e *** 5543,5549 **** if (row->varnos[i] < 0) /* should not happen */ elog(ERROR, "dropped rowtype entry for non-dropped column"); ! 
exec_eval_datum(estate, estate->datums[row->varnos[i]], &fieldtypeid, &fieldtypmod, &dvalues[i], &nulls[i]); if (fieldtypeid != tupdesc->attrs[i]->atttypid) --- 5584,5590 ---- if (row->varnos[i] < 0) /* should not happen */ elog(ERROR, "dropped rowtype entry for non-dropped column"); ! exec_eval_datum(estate, estate->datums[row->varnos[i]], false, &fieldtypeid, &fieldtypmod, &dvalues[i], &nulls[i]); if (fieldtypeid != tupdesc->attrs[i]->atttypid) *************** free_var(PLpgSQL_var *var) *** 6336,6342 **** { if (var->freeval) { ! pfree(DatumGetPointer(var->value)); var->freeval = false; } } --- 6377,6388 ---- { if (var->freeval) { ! if (DatumIsReadWriteDeserializedObject(var->value, ! var->isnull, ! var->datatype->typlen)) ! DeleteDeserializedObject(var->value); ! else ! pfree(DatumGetPointer(var->value)); var->freeval = false; } } *************** format_expr_params(PLpgSQL_execstate *es *** 6543,6550 **** curvar = (PLpgSQL_var *) estate->datums[dno]; ! exec_eval_datum(estate, (PLpgSQL_datum *) curvar, &paramtypeid, ! &paramtypmod, &paramdatum, &paramisnull); appendStringInfo(&paramstr, "%s%s = ", paramno > 0 ? ", " : "", --- 6589,6597 ---- curvar = (PLpgSQL_var *) estate->datums[dno]; ! exec_eval_datum(estate, (PLpgSQL_datum *) curvar, false, ! &paramtypeid, &paramtypmod, ! &paramdatum, &paramisnull); appendStringInfo(&paramstr, "%s%s = ", paramno > 0 ? ", " : "",
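To make the two-step method contract described in the deserialized.h comments concrete, here is a minimal standalone sketch. It is not code from the patch: ToyHeader, ToyMethods, ToyIntList and toy_flatten are hypothetical names invented for illustration, and plain malloc stands in for palloc and memory contexts. It only shows the calling pattern: ask the object for its serialized size, allocate exactly that much, then have the object serialize itself into the caller's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the patch's object header and method table. */
typedef struct ToyHeader ToyHeader;

typedef struct ToyMethods
{
    size_t (*get_serialized_size) (ToyHeader *hdr);
    void   (*serialize_into) (ToyHeader *hdr, void *result,
                              size_t allocated_size);
} ToyMethods;

struct ToyHeader
{
    const ToyMethods *methods;
};

/* A toy "deserialized" object: the elements live in a separate array. */
typedef struct ToyIntList
{
    ToyHeader hdr;      /* must be first, so a ToyHeader * can be cast back */
    int32_t   nitems;
    int32_t  *items;
} ToyIntList;

/* Serialized form: an int32 count followed by the int32 elements. */
static size_t
toy_get_size(ToyHeader *hdr)
{
    ToyIntList *list = (ToyIntList *) hdr;

    return sizeof(int32_t) + (size_t) list->nitems * sizeof(int32_t);
}

static void
toy_serialize_into(ToyHeader *hdr, void *result, size_t allocated_size)
{
    ToyIntList *list = (ToyIntList *) hdr;
    char       *out = (char *) result;

    /* allocated_size is passed only for cross-checking */
    assert(allocated_size == toy_get_size(hdr));
    memcpy(out, &list->nitems, sizeof(int32_t));
    memcpy(out + sizeof(int32_t), list->items,
           (size_t) list->nitems * sizeof(int32_t));
}

static const ToyMethods toy_methods = {toy_get_size, toy_serialize_into};

/* Caller side: size first, allocate exactly that, then serialize in place. */
static void *
toy_flatten(ToyHeader *hdr, size_t *size_out)
{
    size_t size = hdr->methods->get_serialized_size(hdr);
    void  *buf = malloc(size);

    if (buf != NULL)
        hdr->methods->serialize_into(hdr, buf, size);
    *size_out = size;
    return buf;
}
```

Because the size query and the copy are separate calls, a caller that is assembling a larger structure (a tuple, say) could point serialize_into at a slot inside its own buffer instead of allocating a temporary, which is the extra copy the split is meant to avoid.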
Tom, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > I've now taken this idea as far as building the required infrastructure > and revamping a couple of array operators to use it. There's a lot yet > to do, but I've done enough to get some preliminary ideas about > performance (see below). Nice! > * Although I said above that everything owned by a deserialized object > has to live in a single memory context, I do have ideas about relaxing > that. The core idea would be to invent a "memory context reset/delete > callback" feature in mcxt.c. Then a deserialized object could register > such a callback on its own memory context, and use the callback to clean > up resources outside its context. This is potentially useful for instance > for something like PostGIS, where an object likely includes some data that > was allocated with malloc not palloc because it was created by library > functions that aren't Postgres-aware. Another likely use-case is for > deserialized objects representing composite types to maintain reference > counts on their tuple descriptors instead of having to copy said > descriptors into their private contexts. This'd be material for a > separate patch though. Being able to register a callback to be used on deletion of the context would certainly be very nice and strikes me as pretty independent of the rest of this. You've probably thought of this already, but registering the callback should probably allow the caller to pass in a pointer to be passed back to the callback function when the delete happens, so that there's a place for the metadata to be stored about what the callback function needs to clean up when it's called. > So that's the plan, and attached is a very-much-WIP patch that uses this > approach to speed up plpgsql array element assignments (and not a whole > lot else as yet). Here's the basic test case I've been using: I've not looked at the code at all as yet, but it makes sense to me. 
> With the attached patch, those timings drop to 80 and 150 ms respectively. And those numbers are pretty fantastic and would address an area we regularly get dinged on. > Still, if the worst-case slowdown is around 20% on trivially-sized arrays, > I'd gladly take that to have better performance on larger arrays. And I > think this example is close to the worst case for the patch's approach, > since it's testing small, fixed-element-length, no-nulls arrays, which is > what the existing code can handle without spending a lot of cycles. Agreed. > BTW, I'm not all that thrilled with the "deserialized object" terminology. > I found myself repeatedly tripping up on which form was serialized and > which de-. If anyone's got a better naming idea I'm willing to adopt it. Unfortunately, nothing comes to mind. Serialization is, at least, a pretty well understood concept and so the naming will likely make sense to newcomers, even if it's difficult to keep track of which is serialized and which is deserialized. > I'm not sure exactly how to push this forward. I would not want to > commit it without converting a significant number of array functions to > understand about deserialized inputs, and by the time I've finished that > work it's likely to be too late for 9.5. OTOH I'm sure that the PostGIS > folk would love to have this infrastructure in 9.5 not 9.6 so they could > make a start on fixing their issues. (Further down the pike, I'd plan to > look at adapting composite-type operations, JSONB, etc, to make use of > this approach, but that certainly isn't happening for 9.5.) > > Thoughts, advice, better ideas? I'm not really a big fan of putting an infrastructure out there for modules to use that we don't use ourselves (particularly when it's clear that there are places where we could/should be). On the other hand, this doesn't impact on-disk format and therefore I'm less worried that we'll end up with a release-critical issue when we're getting ready to put 9.5 out there. 
So, I'm on the fence about it. I'd love to see all of this in 9.5 with the array functions converted, but I don't think it'd be horrible if only a subset had been done in time for 9.5. The others aren't going to go anywhere and will still work. I do think it'd be better to have at least some core users of this new infrastructure rather than just putting it out there for modules to use but I agree it'd be a bit grotty to have only some of the array functions converted. Thanks! Stephen
[ this is addressing a tangential point ... ] Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> * Although I said above that everything owned by a deserialized object >> has to live in a single memory context, I do have ideas about relaxing >> that. The core idea would be to invent a "memory context reset/delete >> callback" feature in mcxt.c. Then a deserialized object could register >> such a callback on its own memory context, and use the callback to clean >> up resources outside its context. > Being able to register a callback to be used on deletion of the context > would certainly be very nice and strikes me as pretty independent of the > rest of this. You've probably thought of this already, but registering > the callback should probably allow the caller to pass in a pointer to be > passed back to the callback function when the delete happens, so that > there's a place for the metadata to be stored about what the callback > function needs to clean up when it's called. Yeah, there would likely be use-cases for that outside of deserialized objects. I could submit a separate patch for that now, but I'm hesitant to add a mechanism without any use-case in the same patch. But maybe we could find a caller somewhere in the core aggregate code --- there are some aggregates that need cleanup callbacks already, IIRC, and maybe we could change them to use a memory context callback instead of whatever they're doing now. regards, tom lane
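A rough sketch of the callback shape under discussion, including the caller-supplied pointer Stephen suggests. This is a standalone toy, not mcxt.c code: ToyContext, ToyCallback, toy_register_callback and toy_context_delete are hypothetical names, and whatever API eventually gets proposed may look quite different.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*ToyCallbackFn) (void *arg);

/* One registered callback; the caller supplies the storage for this node. */
typedef struct ToyCallback
{
    ToyCallbackFn func;
    void         *arg;          /* handed back to func unchanged */
    struct ToyCallback *next;
} ToyCallback;

typedef struct ToyContext
{
    ToyCallback *callbacks;     /* most recently registered first */
} ToyContext;

static void
toy_register_callback(ToyContext *cxt, ToyCallback *cb)
{
    assert(cb->func != NULL);
    cb->next = cxt->callbacks;
    cxt->callbacks = cb;
}

/* "Deleting" the toy context just fires each callback exactly once. */
static void
toy_context_delete(ToyContext *cxt)
{
    while (cxt->callbacks != NULL)
    {
        ToyCallback *cb = cxt->callbacks;

        cxt->callbacks = cb->next;
        cb->func(cb->arg);
    }
}

/* Example cleanup target: a malloc'd buffer that lives outside the context. */
typedef struct ExternalResource
{
    void *buffer;
    int   released;
} ExternalResource;

static void
release_external(void *arg)
{
    ExternalResource *res = (ExternalResource *) arg;

    free(res->buffer);
    res->buffer = NULL;
    res->released = 1;
}
```

The arg pointer is what lets one generic callback function clean up per-object state, such as library-allocated memory in the PostGIS case or a reference count on a tuple descriptor.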
Without having read the patch, I think this is great. I've been wishing for something like this while working on my variant data type. Are there any cases where we would want to use this on a non-variant? Perhaps types where we're paying an alignment penalty? On 2/10/15 3:00 PM, Stephen Frost wrote: >> >BTW, I'm not all that thrilled with the "deserialized object" terminology. >> >I found myself repeatedly tripping up on which form was serialized and >> >which de-. If anyone's got a better naming idea I'm willing to adopt it. > Unfortunately, nothing comes to mind. Serialization is, at least, a > pretty well understood concept and so the naming will likely make sense > to newcomers, even if it's difficult to keep track of which is > serialized and which is deserialized. Apologies if I'm just dense, but what's the confusion? Is it what a serialized datum means in this context? (de)serialized seems like a perfectly logical name to me... >> >I'm not sure exactly how to push this forward. I would not want to >> >commit it without converting a significant number of array functions to >> >understand about deserialized inputs, and by the time I've finished that >> >work it's likely to be too late for 9.5. OTOH I'm sure that the PostGIS >> >folk would love to have this infrastructure in 9.5 not 9.6 so they could >> >make a start on fixing their issues. (Further down the pike, I'd plan to >> >look at adapting composite-type operations, JSONB, etc, to make use of >> >this approach, but that certainly isn't happening for 9.5.) >> > >> >Thoughts, advice, better ideas? > I'm not really a big fan of putting an infrastructure out there for > modules to use that we don't use ourselves (particularly when it's clear > that there are places where we could/should be). On the other hand, > this doesn't impact on-disk format and therefore I'm less worried that > we'll end up with a release-critical issue when we're getting ready to > put 9.5 out there. > > So, I'm on the fence about it. 
I'd love to see all of this in 9.5 with > the array functions converted, but I don't think it'd be horrible if > only a subset had been done in time for 9.5. The others aren't going to > go anywhere and will still work. I do think it'd be better to have at > least some core users of this new infrastructure rather than just > putting it out there for modules to use but I agree it'd be a bit grotty > to have only some of the array functions converted. I think the solution here is to have people other than Tom do the gruntwork of applying this to the remaining array code. My thought is that if Tom shows how to do this correctly in a rather complex case (ie, where you need to worry about primary vs secondary), then less experienced hackers should be able to take the ball and run with it. Maybe we won't get complete array coverage, but I think any performance gains here are a win. And really, don't we just need enough usage so the buildfarm will tell us if we accidentally break something? -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Jim Nasby <Jim.Nasby@BlueTreble.com> writes: > Without having read the patch, I think this is great. I've been wishing > for something like this while working on my variant data type. > Are there any cases where we would want to use this on a non-variant? > Perhaps types where we're paying an alignment penalty? What do you mean by non-variant? The use cases that have come to mind for me are: * arrays, of course * composite types (records) * PostGIS geometry type * JSONB, hstore * possibly regex patterns (we could invent a data type representing these and then store the compiled form as a deserialized representation; although there would be some issues to be worked out to get any actual win, probably) The principal thing that's a bit hard to figure out is when it's a win to convert to a deserialized representation and when you should just leave well enough alone. I'm planning to investigate that further in the context of plpgsql array variables, but I'm not sure how well those answers will carry over to datatypes that plpgsql has no intrinsic understanding of. regards, tom lane
On 2/10/15 5:19 PM, Tom Lane wrote: > Jim Nasby <Jim.Nasby@BlueTreble.com> writes: >> Without having read the patch, I think this is great. I've been wishing >> for something like this while working on my variant data type. > >> Are there any cases where we would want to use this on a non-variant? >> Perhaps types where we're paying an alignment penalty? > > What do you mean by non-variant? Ugh, sorry, brainfart. I meant to say non-varlena. I can't think of any non-varlena types we'd want this for, but maybe someone else can think of a case. If there is a use-case I wouldn't handle it with this patch, but we'd want to consider it... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Jim Nasby <Jim.Nasby@BlueTreble.com> writes: > On 2/10/15 5:19 PM, Tom Lane wrote: >> What do you mean by non-variant? > Ugh, sorry, brainfart. I meant to say non-varlena. > I can't think of any non-varlena types we'd want this for, but maybe > someone else can think of a case. If there is a use-case I wouldn't > handle it with this patch, but we'd want to consider it... There isn't any practical way to interpose TOAST pointers for non-varlena types, since we make no assumptions about the bit contents of fixed-length types. But I'm having a hard time thinking of a fixed-length type in which you'd have any need for a deserialized representation, either. I think restricting this feature to varlena types is just fine. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > I think restricting this feature to varlena types is just fine. Agreed. Thanks, Stephen
On Tue, Feb 10, 2015 at 3:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I've now taken this idea as far as building the required infrastructure > and revamping a couple of array operators to use it. There's a lot yet > to do, but I've done enough to get some preliminary ideas about > performance (see below). Very impressive. This is something that's been mentioned before and which I always thought would be great to have, but I didn't expect it would be this easy to cobble together a working implementation. Or maybe "easy" isn't the right term, but this isn't a very big patch. > BTW, I'm not all that thrilled with the "deserialized object" terminology. > I found myself repeatedly tripping up on which form was serialized and > which de-. If anyone's got a better naming idea I'm willing to adopt it. My first thought is that we should form some kind of TOAST-like backronym, like Serialization Avoidance Loading and Access Device (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I don't think there is anything per se wrong with the terms serialization and deserialization; indeed, I used the same ones in the parallel-mode stuff. But they are fairly general terms, so it might be nice to have something more specific that applies just to this particular usage. I found the notion of "primary" and "secondary" TOAST pointers to be quite confusing. I *think* what you are doing is storing two pointers to the object in the object, and a pointer to the object is really a pointer to one of those two pointers to the object. Depending on which one it is, you can write the object, or not. This is a clever representation, but it's hard to wrap your head around, and I'm not sure "primary" and "secondary" are the best names, although I don't have an idea as to what would be better. I'm a bit confused, though: once you give out a secondary pointer, how is it safe to write the object through the primary pointer? 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Feb 10, 2015 at 3:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> BTW, I'm not all that thrilled with the "deserialized object" terminology. >> I found myself repeatedly tripping up on which form was serialized and >> which de-. If anyone's got a better naming idea I'm willing to adopt it. > My first thought is that we should form some kind of TOAST-like > backronym, like Serialization Avoidance Loading and Access Device > (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I > don't think there is anything per se wrong with the terms > serialization and deserialization; indeed, I used the same ones in the > parallel-mode stuff. But they are fairly general terms, so it might > be nice to have something more specific that applies just to this > particular usage. Hm. I'm not against the concept, but those particular suggestions don't grab me. > I found the notion of "primary" and "secondary" TOAST pointers to be > quite confusing. I *think* what you are doing is storing two pointers > to the object in the object, and a pointer to the object is really a > pointer to one of those two pointers to the object. Depending on > which one it is, you can write the object, or not. There's more to it than that. (Writing more docs is one of the to-do items ;-).) We could alternatively have done that with two different va_tag values for "read write" and "read only", which indeed was my initial intention before I thought of this dodge. However, then you have to figure out where to store such pointers, which is problematic both for plpgsql variable assignment and for ExecMakeSlotContentsReadOnly, especially the latter which would have to put any freshly-made pointer in a long-lived context resulting in query-lifespan memory leaks. 
So I early decided that the read-write pointer should live right in the object's own context where it need not be copied when swinging the context ownership someplace else, and later realized that there should also be a permanent read-only pointer in there for the use of ExecMakeSlotContentsReadOnly, and then realized that they didn't need to have different va_tag values if we implemented the "is read-write pointer" test as it's done in the patch. Having only one va_tag value not two saves cycles, I think, because there are a lot of low-level tests that don't need to distinguish, eg VARTAG_SIZE(). However it does make it more expensive when you do need to distinguish, so I might reconsider that decision later. (Since these will never go to disk, we can whack the representation around pretty freely if needed.) Also, I have hopes of allowing deserialized-object pointers to be copied into tuples as pointers rather than by reserialization, if we can establish that the tuple is short-lived enough that the pointer will stay good, which would be true in a lot of cases during execution of queries by plpgsql. With the patch's design, a pointer so copied will automatically be considered read-only, which I *think* is the behavior we'd need. If it turns out that it's okay to propagate read-write-ness through such a copy step then that would argue in favor of using two va_tag values. It may be that this solution is overly cute and we should just use two tag values. But I wanted to be sure it was possible for copying of a pointer to automatically lose read-write-ness, in case we need to have such a guarantee. > This is a clever > representation, but it's hard to wrap your head around, and I'm not > sure "primary" and "secondary" are the best names, although I don't > have an idea as to what would be better. I'm a bit confused, though: > once you give out a secondary pointer, how is it safe to write the > object through the primary pointer? 
It's no different from allowing plpgsql to update the values of variables of pass-by-reference types even though it has previously given out Datums that are pointers to them: by the time we're ready to execute an assignment, any query execution that had such a pointer is over and done with. (This implies that cursor parameters have to be physically copied into the cursor's execution state, which is one of a depressingly large number of reasons why datumCopy() has to physically copy a deserialized value rather than just copying the pointer. But otherwise it works.) There is more work to do to figure out how we can safely give out a read/write pointer for cases like hstore_var := hstore_concat(hstore_var, ...); Aside from the question of whether hstore_concat guarantees not to trash the value on failure, we'd have to restrict this (I think) to expressions in which there is only one reference to the target variable and it's an argument of the topmost function/operator. But that's something I've not tried to implement yet. regards, tom lane
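For readers trying to picture the primary/secondary scheme, here is a toy model of it as described in this message. It is a guess at the idea, not the patch's code: ToyObject, ToyRef and the toy_* functions are hypothetical, and the real test in the patch may differ in detail. The point it illustrates is that read/write-ness is a property of where the reference lives, so any copy of a reference is read-only without anyone having to mark it.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyObject ToyObject;

/* Toy stand-in for a TOAST pointer: it only records which object it means. */
typedef struct ToyRef
{
    ToyObject *obj;
} ToyRef;

struct ToyObject
{
    ToyRef primary;     /* a reference at this address is read/write */
    ToyRef secondary;   /* a reference at this address is read-only */
    int    payload;
};

static void
toy_init(ToyObject *obj, int payload)
{
    obj->primary.obj = obj;
    obj->secondary.obj = obj;
    obj->payload = payload;
}

/* Read/write-ness depends on the reference's address, not its contents. */
static int
toy_is_read_write(const ToyRef *ref)
{
    return ref == &ref->obj->primary;
}

/* Hand out the permanent read-only reference; nothing is allocated. */
static const ToyRef *
toy_make_read_only(const ToyRef *ref)
{
    return &ref->obj->secondary;
}
```

In this model, producing a read-only reference needs no allocation at all, which matches the motivation given above for ExecMakeSlotContentsReadOnly: there is no freshly made pointer that would have to be parked in a long-lived context.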
On Thu, Feb 12, 2015 at 9:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> My first thought is that we should form some kind of TOAST-like
>> backronym, like Serialization Avoidance Loading and Access Device
>> (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I
>> don't think there is anything per se wrong with the terms
>> serialization and deserialization; indeed, I used the same ones in the
>> parallel-mode stuff. But they are fairly general terms, so it might
>> be nice to have something more specific that applies just to this
>> particular usage.
>
> Hm. I'm not against the concept, but those particular suggestions don't
> grab me.

Fair enough. I guess the core of my point is just that I suggest we invent a name for this thing. "Serialize" and "deserialize" describe what you are doing just fine, but the mechanism itself should be called something, I think. When you say "varlena header" or "TOAST pointer" that is a name for a very particular thing, not just a general category of things you might do. If we replaced every instance of "TOAST pointer" with "reference to where the full value is stored", it would be much less clear, and naming all of the related functions would be harder.

>> This is a clever
>> representation, but it's hard to wrap your head around, and I'm not
>> sure "primary" and "secondary" are the best names, although I don't
>> have an idea as to what would be better. I'm a bit confused, though:
>> once you give out a secondary pointer, how is it safe to write the
>> object through the primary pointer?
>
> It's no different from allowing plpgsql to update the values of variables
> of pass-by-reference types even though it has previously given out Datums
> that are pointers to them: by the time we're ready to execute an
> assignment, any query execution that had such a pointer is over and done
> with. (This implies that cursor parameters have to be physically copied
> into the cursor's execution state, which is one of a depressingly large
> number of reasons why datumCopy() has to physically copy a deserialized
> value rather than just copying the pointer. But otherwise it works.)

OK, I see. So giving out a secondary pointer doesn't necessarily preclude further changes via the primary pointer, but you'd better be sure that you don't try until such time as all of those secondary references are gone.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
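The lifetime rule being agreed on here can be illustrated with a small standalone C sketch. It is only an analogy under stated assumptions: ToyValue stands in for a pass-by-reference variable, toy_borrow for handing out a secondary pointer, and toy_copy for the physical copy that datumCopy() is said to make for cursor parameters. None of these names exist in PostgreSQL.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX 4

/* Hypothetical stand-in for a pass-by-reference variable value. */
typedef struct ToyValue
{
    int         nelems;
    int         elems[TOY_MAX];
} ToyValue;

/* "Secondary" access: hand out a pointer to the variable's own storage. */
static const ToyValue *
toy_borrow(const ToyValue *var)
{
    return var;
}

/* What a long-lived consumer such as a cursor must do: take its own copy. */
static ToyValue
toy_copy(const ToyValue *var)
{
    ToyValue    copy;

    memcpy(&copy, var, sizeof(ToyValue));
    return copy;
}

/* "Primary" access: update one element of the variable in place. */
static void
toy_assign_elem(ToyValue *var, int idx, int newval)
{
    assert(idx >= 0 && idx < var->nelems);
    var->elems[idx] = newval;
}
```

A borrowed pointer that is still alive when toy_assign_elem() runs will see the new element value, while the copy keeps the old one. That is why the in-place update is safe only once every borrowed pointer is out of use, and why anything that must outlive the statement needs a real copy.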
Re: Manipulating complex types as non-contiguous structures in-memory
From: Martijn van Oosterhout
On Thu, Feb 12, 2015 at 08:52:56AM -0500, Robert Haas wrote:
> > BTW, I'm not all that thrilled with the "deserialized object" terminology.
> > I found myself repeatedly tripping up on which form was serialized and
> > which de-. If anyone's got a better naming idea I'm willing to adopt it.
>
> My first thought is that we should form some kind of TOAST-like
> backronym, like Serialization Avoidance Loading and Access Device
> (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I
> don't think there is anything per se wrong with the terms
> serialization and deserialization; indeed, I used the same ones in the
> parallel-mode stuff. But they are fairly general terms, so it might
> be nice to have something more specific that applies just to this
> particular usage.

The words that sprung to mind for me were: packed/unpacked.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer
On 2/13/15 2:04 AM, Martijn van Oosterhout wrote:
> On Thu, Feb 12, 2015 at 08:52:56AM -0500, Robert Haas wrote:
>>> BTW, I'm not all that thrilled with the "deserialized object" terminology.
>>> I found myself repeatedly tripping up on which form was serialized and
>>> which de-. If anyone's got a better naming idea I'm willing to adopt it.
>>
>> My first thought is that we should form some kind of TOAST-like
>> backronym, like Serialization Avoidance Loading and Access Device
>> (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I
>> don't think there is anything per se wrong with the terms
>> serialization and deserialization; indeed, I used the same ones in the
>> parallel-mode stuff. But they are fairly general terms, so it might
>> be nice to have something more specific that applies just to this
>> particular usage.
> The words that sprung to mind for me were: packed/unpacked.

+1

After thinking about it, I don't think having a more distinctive name (like TOAST) is necessary for this feature. TOAST is something that's rather visible to end users, whereas packing would only matter to someone creating a new varlena type.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Thu, Feb 12, 2015 at 08:52:56AM -0500, Robert Haas wrote:
>>> BTW, I'm not all that thrilled with the "deserialized object" terminology.
>>> I found myself repeatedly tripping up on which form was serialized and
>>> which de-. If anyone's got a better naming idea I'm willing to adopt it.

>> My first thought is that we should form some kind of TOAST-like
>> backronym, like Serialization Avoidance Loading and Access Device
>> (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I
>> don't think there is anything per se wrong with the terms
>> serialization and deserialization; indeed, I used the same ones in the
>> parallel-mode stuff. But they are fairly general terms, so it might
>> be nice to have something more specific that applies just to this
>> particular usage.

> The words that sprung to mind for me were: packed/unpacked.

Trouble is that we're already using "packed" with a specific connotation in that same area of the code, namely for short-header varlena values. (See pg_detoast_datum_packed() etc.) So I don't think this will work. But maybe a different adjective?

regards, tom lane
On Sat, Feb 14, 2015 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
>> On Thu, Feb 12, 2015 at 08:52:56AM -0500, Robert Haas wrote:
>>>> BTW, I'm not all that thrilled with the "deserialized object" terminology.
>>>> I found myself repeatedly tripping up on which form was serialized and
>>>> which de-. If anyone's got a better naming idea I'm willing to adopt it.
>
>>> My first thought is that we should form some kind of TOAST-like
>>> backronym, like Serialization Avoidance Loading and Access Device
>>> (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I
>>> don't think there is anything per se wrong with the terms
>>> serialization and deserialization; indeed, I used the same ones in the
>>> parallel-mode stuff. But they are fairly general terms, so it might
>>> be nice to have something more specific that applies just to this
>>> particular usage.
>
>> The words that sprung to mind for me were: packed/unpacked.
>
> Trouble is that we're already using "packed" with a specific connotation
> in that same area of the code, namely for short-header varlena values.
> (See pg_detoast_datum_packed() etc.) So I don't think this will work.
> But maybe a different adjective?

expanded?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Feb 14, 2015 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Martijn van Oosterhout <kleptog@svana.org> writes:
>>> The words that sprung to mind for me were: packed/unpacked.

>> Trouble is that we're already using "packed" with a specific connotation
>> in that same area of the code, namely for short-header varlena values.
>> (See pg_detoast_datum_packed() etc.) So I don't think this will work.
>> But maybe a different adjective?

> expanded?

That seems to work from the standpoint of not conflicting with other nearby usages in our code, and it's got the right semantics I think. Any other suggestions out there? Otherwise I'll probably go with this.

regards, tom lane
Here's an updated version of the patch I sent before. Notable changes:

* I switched over to calling "deserialized" objects "expanded" objects, and the default representation is now called "flat" or "flattened" instead of "reserialized". Per suggestion from Robert.

* I got rid of the bit about detecting read-write pointers by address comparison. Instead there are now two vartag values for R/W and R/O pointers. After further reflection I concluded that my previous worry about wanting copied pointers to automatically become read-only was probably wrong, so there's no need for extra confusion here.

* I added support for extracting array elements from expanded values (array_ref).

* I hacked plpgsql to force values of array-type variables into expanded form; this is needed to get any win from the array_ref change if the function doesn't do any assignments to elements of the array. This is an improvement over the original patch, which hardwired array_set to force expansion, but I remain unsatisfied with it as a long-term answer. It's not clear that it's always a win to do this (but the tradeoff will change as we convert more array support functions to handle expanded inputs, so it's probably not worth getting too excited about that aspect of it yet). A bigger complaint is that this approach cannot fix things for non-builtin types such as hstore. I'm hesitant to add a pg_type column carrying an expansion function OID, but there may be no other workable answer for extension types.

The patch as it stands is able to do nice things with

create or replace function arraysetnum(n int) returns numeric[] as $$
declare res numeric[] := '{}';
begin
  for i in 1 .. n loop
    res[i] := i;
  end loop;
  return res;
end
$$ language plpgsql strict;

create or replace function arraysumnum(arr numeric[]) returns numeric as $$
declare res numeric := 0;
begin
  for i in array_lower(arr, 1) .. array_upper(arr, 1) loop
    res := res + arr[i];
  end loop;
  return res;
end
$$ language plpgsql strict;

regression=# select arraysumnum(arraysetnum(100000));
 arraysumnum
-------------
  5000050000
(1 row)

Time: 304.336 ms

(versus approximately 1 minute in 9.4, although these numbers are for cassert builds so should be taken with a grain of salt.) There are still a couple more flattening/expansion conversions than I'd like, in particular the array returned by arraysetnum() gets flattened on its way out, which would be good to avoid.

I'm going to stick this into the commitfest even though it's not really close to being committable; I see some other people doing likewise with their pet patches ;-). What it could particularly do with some reviewing help on is exploring the performance changes it creates; what cases does it make substantially worse?

regards, tom lane

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c index 867035d..860ad78 100644 *** a/src/backend/access/common/heaptuple.c --- b/src/backend/access/common/heaptuple.c *************** *** 60,65 **** --- 60,66 ---- #include "access/sysattr.h" #include "access/tuptoaster.h" #include "executor/tuptable.h" + #include "utils/expandeddatum.h" /* Does att's datatype allow packing into the 1-byte-header varlena format? */ *************** heap_compute_data_size(TupleDesc tupleDe *** 93,105 **** for (i = 0; i < numberOfAttributes; i++) { Datum val; if (isnull[i]) continue; val = values[i]; ! if (ATT_IS_PACKABLE(att[i]) && VARATT_CAN_MAKE_SHORT(DatumGetPointer(val))) { /* --- 94,108 ---- for (i = 0; i < numberOfAttributes; i++) { Datum val; + Form_pg_attribute atti; if (isnull[i]) continue; val = values[i]; + atti = att[i]; ! if (ATT_IS_PACKABLE(atti) && VARATT_CAN_MAKE_SHORT(DatumGetPointer(val))) { /* *************** heap_compute_data_size(TupleDesc tupleDe *** 108,118 **** */ data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val)); } else { !
data_length = att_align_datum(data_length, att[i]->attalign, ! att[i]->attlen, val); ! data_length = att_addlength_datum(data_length, att[i]->attlen, val); } } --- 111,131 ---- */ data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val)); } + else if (atti->attlen == -1 && + VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(val))) + { + /* + * we want to flatten the expanded value so that the constructed + * tuple doesn't depend on it + */ + data_length = att_align_nominal(data_length, atti->attalign); + data_length += EOH_get_flat_size(DatumGetEOHP(val)); + } else { ! data_length = att_align_datum(data_length, atti->attalign, ! atti->attlen, val); ! data_length = att_addlength_datum(data_length, atti->attlen, val); } } *************** heap_fill_tuple(TupleDesc tupleDesc, *** 203,212 **** *infomask |= HEAP_HASVARWIDTH; if (VARATT_IS_EXTERNAL(val)) { ! *infomask |= HEAP_HASEXTERNAL; ! /* no alignment, since it's short by definition */ ! data_length = VARSIZE_EXTERNAL(val); ! memcpy(data, val, data_length); } else if (VARATT_IS_SHORT(val)) { --- 216,241 ---- *infomask |= HEAP_HASVARWIDTH; if (VARATT_IS_EXTERNAL(val)) { ! if (VARATT_IS_EXTERNAL_EXPANDED(val)) ! { ! /* ! * we want to flatten the expanded value so that the ! * constructed tuple doesn't depend on it ! */ ! ExpandedObjectHeader *eoh = DatumGetEOHP(values[i]); ! ! data = (char *) att_align_nominal(data, ! att[i]->attalign); ! data_length = EOH_get_flat_size(eoh); ! EOH_flatten_into(eoh, data, data_length); ! } ! else ! { ! *infomask |= HEAP_HASEXTERNAL; ! /* no alignment, since it's short by definition */ ! data_length = VARSIZE_EXTERNAL(val); ! memcpy(data, val, data_length); ! 
} } else if (VARATT_IS_SHORT(val)) { diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c index f8c1401..ebcbbc4 100644 *** a/src/backend/access/heap/tuptoaster.c --- b/src/backend/access/heap/tuptoaster.c *************** *** 37,42 **** --- 37,43 ---- #include "catalog/catalog.h" #include "common/pg_lzcompress.h" #include "miscadmin.h" + #include "utils/expandeddatum.h" #include "utils/fmgroids.h" #include "utils/rel.h" #include "utils/typcache.h" *************** heap_tuple_fetch_attr(struct varlena * a *** 130,135 **** --- 131,149 ---- result = (struct varlena *) palloc(VARSIZE_ANY(attr)); memcpy(result, attr, VARSIZE_ANY(attr)); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* + * This is an expanded-object pointer --- get flat format + */ + ExpandedObjectHeader *eoh; + Size resultsize; + + eoh = DatumGetEOHP(PointerGetDatum(attr)); + resultsize = EOH_get_flat_size(eoh); + result = (struct varlena *) palloc(resultsize); + EOH_flatten_into(eoh, (void *) result, resultsize); + } else { /* *************** heap_tuple_untoast_attr(struct varlena * *** 196,201 **** --- 210,224 ---- attr = result; } } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* + * This is an expanded-object pointer --- get flat format + */ + attr = heap_tuple_fetch_attr(attr); + /* flatteners are not allowed to produce compressed/short output */ + Assert(!VARATT_IS_EXTENDED(attr)); + } else if (VARATT_IS_COMPRESSED(attr)) { /* *************** heap_tuple_untoast_attr_slice(struct var *** 263,268 **** --- 286,296 ---- return heap_tuple_untoast_attr_slice(redirect.pointer, sliceoffset, slicelength); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* pass it off to heap_tuple_fetch_attr to flatten */ + preslice = heap_tuple_fetch_attr(attr); + } else preslice = attr; *************** toast_raw_datum_size(Datum value) *** 344,349 **** --- 372,381 ---- return toast_raw_datum_size(PointerGetDatum(toast_pointer.pointer)); } + else if 
(VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + result = EOH_get_flat_size(DatumGetEOHP(value)); + } else if (VARATT_IS_COMPRESSED(attr)) { /* here, va_rawsize is just the payload size */ *************** toast_datum_size(Datum value) *** 400,405 **** --- 432,441 ---- return toast_datum_size(PointerGetDatum(toast_pointer.pointer)); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + result = EOH_get_flat_size(DatumGetEOHP(value)); + } else if (VARATT_IS_SHORT(attr)) { result = VARSIZE_SHORT(attr); diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c index 753754d..a05d8b1 100644 *** a/src/backend/executor/execTuples.c --- b/src/backend/executor/execTuples.c *************** *** 88,93 **** --- 88,94 ---- #include "nodes/nodeFuncs.h" #include "storage/bufmgr.h" #include "utils/builtins.h" + #include "utils/expandeddatum.h" #include "utils/lsyscache.h" #include "utils/typcache.h" *************** ExecCopySlot(TupleTableSlot *dstslot, Tu *** 812,817 **** --- 813,864 ---- return ExecStoreTuple(newTuple, dstslot, InvalidBuffer, true); } + /* -------------------------------- + * ExecMakeSlotContentsReadOnly + * Mark any R/W expanded datums in the slot as read-only. + * + * This is needed when a slot that might contain R/W datum references is to be + * used as input for general expression evaluation. Since the expression(s) + * might contain more than one Var referencing the same R/W datum, we could + * get wrong answers if functions acting on those Vars thought they could + * modify the expanded value in-place. + * + * For notational reasons, we return the same slot passed in. 
+ * -------------------------------- + */ + TupleTableSlot * + ExecMakeSlotContentsReadOnly(TupleTableSlot *slot) + { + /* + * sanity checks + */ + Assert(slot != NULL); + Assert(slot->tts_tupleDescriptor != NULL); + Assert(!slot->tts_isempty); + + /* + * If the slot contains a physical tuple, it can't contain any expanded + * datums, because we flatten those when making a physical tuple. This + * might change later; but for now, we need do nothing unless the slot is + * virtual. + */ + if (slot->tts_tuple == NULL) + { + Form_pg_attribute *att = slot->tts_tupleDescriptor->attrs; + int attnum; + + for (attnum = 0; attnum < slot->tts_nvalid; attnum++) + { + slot->tts_values[attnum] = + MakeExpandedObjectReadOnly(slot->tts_values[attnum], + slot->tts_isnull[attnum], + att[attnum]->attlen); + } + } + + return slot; + } + /* ---------------------------------------------------------------- * convenience initialization routines diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c index 3f66e24..e5d1e54 100644 *** a/src/backend/executor/nodeSubqueryscan.c --- b/src/backend/executor/nodeSubqueryscan.c *************** SubqueryNext(SubqueryScanState *node) *** 56,62 **** --- 56,70 ---- * We just return the subplan's result slot, rather than expending extra * cycles for ExecCopySlot(). (Our own ScanTupleSlot is used only for * EvalPlanQual rechecks.) + * + * We do need to mark the slot contents read-only to prevent interference + * between different functions reading the same datum from the slot. It's + * a bit hokey to do this to the subplan's slot, but should be safe + * enough. 
*/ + if (!TupIsNull(slot)) + slot = ExecMakeSlotContentsReadOnly(slot); + return slot; } diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 4b86e91..daa9f69 100644 *** a/src/backend/executor/spi.c --- b/src/backend/executor/spi.c *************** SPI_pfree(void *pointer) *** 1014,1019 **** --- 1014,1040 ---- pfree(pointer); } + Datum + SPI_datumTransfer(Datum value, bool typByVal, int typLen) + { + MemoryContext oldcxt = NULL; + Datum result; + + if (_SPI_curid + 1 == _SPI_connected) /* connected */ + { + if (_SPI_current != &(_SPI_stack[_SPI_curid + 1])) + elog(ERROR, "SPI stack corrupted"); + oldcxt = MemoryContextSwitchTo(_SPI_current->savedcxt); + } + + result = datumTransfer(value, typByVal, typLen); + + if (oldcxt) + MemoryContextSwitchTo(oldcxt); + + return result; + } + void SPI_freetuple(HeapTuple tuple) { diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile index 20e5ff1..d1ed33f 100644 *** a/src/backend/utils/adt/Makefile --- b/src/backend/utils/adt/Makefile *************** endif *** 16,25 **** endif # keep this list arranged alphabetically or it gets to be a mess ! OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \ ! array_userfuncs.o arrayutils.o ascii.o bool.o \ ! cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \ ! encode.o enum.o float.o format_type.o formatting.o genfile.o \ geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \ int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \ jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \ --- 16,26 ---- endif # keep this list arranged alphabetically or it gets to be a mess ! OBJS = acl.o arrayfuncs.o array_expanded.o array_selfuncs.o \ ! array_typanalyze.o array_userfuncs.o arrayutils.o ascii.o \ ! bool.o cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \ ! encode.o enum.o expandeddatum.o \ ! 
float.o format_type.o formatting.o genfile.o \ geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \ int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \ jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \ diff --git a/src/backend/utils/adt/array_expanded.c b/src/backend/utils/adt/array_expanded.c index ...d0ba8e6 . *** a/src/backend/utils/adt/array_expanded.c --- b/src/backend/utils/adt/array_expanded.c *************** *** 0 **** --- 1,1038 ---- + /*------------------------------------------------------------------------- + * + * array_expanded.c + * Functions for manipulating expanded arrays. + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * + * IDENTIFICATION + * src/backend/utils/adt/array_expanded.c + * + *------------------------------------------------------------------------- + */ + #include "postgres.h" + + #include "access/tupmacs.h" + #include "utils/array.h" + #include "utils/builtins.h" + #include "utils/datum.h" + #include "utils/expandeddatum.h" + #include "utils/lsyscache.h" + #include "utils/memutils.h" + + + /* + * An expanded array is contained within a private memory context (as + * all expanded objects must be) and has a control structure as below. + * + * The expanded array might contain a regular "flat" array if that was the + * original input and we've not modified it significantly. Otherwise, the + * contents are represented by Datum/isnull arrays plus dimensionality and + * type information. We could also have both forms, if we've deconstructed + * the original array for access purposes but not yet changed it. For pass- + * by-reference element types, the Datums would point into the flat array in + * this situation. Once we start modifying array elements, new pass-by-ref + * elements are separately palloc'd within the memory context. 
+ */ + #define EA_MAGIC 689375833 /* ID for debugging crosschecks */ + + typedef struct ExpandedArrayHeader + { + /* Standard header for expanded objects */ + ExpandedObjectHeader hdr; + + /* Magic value identifying an expanded array (for debugging only) */ + int ea_magic; + + /* Dimensionality info (always valid) */ + int ndims; /* # of dimensions */ + int *dims; /* array dimensions */ + int *lbound; /* index lower bounds for each dimension */ + + /* Element type info (always valid) */ + Oid element_type; /* element type OID */ + int16 typlen; /* needed info about element datatype */ + bool typbyval; + char typalign; + + /* + * If we have a Datum-array representation of the array, it's kept here; + * else dvalues/dnulls are NULL. The dvalues and dnulls arrays are always + * palloc'd within the object private context, but may change size from + * time to time. For pass-by-ref element types, dvalues entries might + * point either into the fstartptr..fendptr area, or to separately + * palloc'd chunks. Elements should always be fully detoasted, as they + * are in the standard flat representation. + * + * Even when dvalues is valid, dnulls can be NULL if there are no null + * elements. + */ + Datum *dvalues; /* array of Datums */ + bool *dnulls; /* array of is-null flags for Datums */ + int dvalueslen; /* allocated length of above arrays */ + int nelems; /* number of valid entries in above arrays */ + + /* + * flat_size is the current space requirement for the flat equivalent of + * the expanded array, if known; otherwise it's 0. We store this to make + * consecutive calls of get_flat_size cheap. + */ + Size flat_size; + + /* + * fvalue points to the flat representation if it is valid, else it is + * NULL. If we have or ever had a flat representation then + * fstartptr/fendptr point to the start and end+1 of its data area; this + * is so that we can tell which Datum pointers point into the flat + * representation rather than being pointers to separately palloc'd data. 
+ */ + ArrayType *fvalue; /* must be a fully detoasted array */ + char *fstartptr; /* start of its data area */ + char *fendptr; /* end+1 of its data area */ + } ExpandedArrayHeader; + + /* "Methods" required for an expanded object */ + static Size EA_get_flat_size(ExpandedObjectHeader *eohptr); + static void EA_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + + static const ExpandedObjectMethods EA_methods = + { + EA_get_flat_size, + EA_flatten_into + }; + + /* + * Functions that can handle either a "flat" varlena array or an expanded + * array use this union to work with their input. + */ + typedef union AnyArrayType + { + ArrayType flt; + ExpandedArrayHeader xpn; + } AnyArrayType; + + /* + * Macros for working with AnyArrayType inputs. Beware multiple references! + */ + #define AARR_NDIM(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.ndims : ARR_NDIM(&(a)->flt)) + #define AARR_HASNULL(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? \ + ((a)->xpn.dvalues != NULL ? (a)->xpn.dnulls != NULL : ARR_HASNULL((a)->xpn.fvalue)) : \ + ARR_HASNULL(&(a)->flt)) + #define AARR_ELEMTYPE(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.element_type : ARR_ELEMTYPE(&(a)->flt)) + #define AARR_DIMS(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.dims : ARR_DIMS(&(a)->flt)) + #define AARR_LBOUND(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.lbound : ARR_LBOUND(&(a)->flt)) + + + /* + * expand_array: convert an array Datum into an expanded array + * + * The expanded object will be a child of parentcontext. + * + * Caller can provide element type's representational data; we do that because + * caller is often in a position to cache it across repeated calls. If the + * caller can't do that, pass zeroes for elmlen/elmbyval/elmalign. 
+ */ + Datum + expand_array(Datum arraydatum, MemoryContext parentcontext, + int elmlen, bool elmbyval, char elmalign) + { + ArrayType *array; + ExpandedArrayHeader *eah; + MemoryContext objcxt; + MemoryContext oldcxt; + + /* allocate private context for expanded object */ + /* TODO: should we use some other memory context size parameters? */ + objcxt = AllocSetContextCreate(parentcontext, + "expanded array", + ALLOCSET_DEFAULT_MINSIZE, + ALLOCSET_DEFAULT_INITSIZE, + ALLOCSET_DEFAULT_MAXSIZE); + + /* set up expanded array header */ + eah = (ExpandedArrayHeader *) + MemoryContextAlloc(objcxt, sizeof(ExpandedArrayHeader)); + + EOH_init_header(&eah->hdr, &EA_methods, objcxt); + eah->ea_magic = EA_MAGIC; + + /* + * Detoast and copy original array into private context, as a flat array. + * We flatten it even if it's in expanded form; it's not clear that adding + * a special-case path for that would be worth the trouble. + * + * Note that this coding risks leaking some memory in the private context + * if we have to fetch data from a TOAST table; however, experimentation + * says that the leak is minimal. Doing it this way saves a copy step, + * which seems worthwhile, especially if the array is large enough to need + * external storage. + */ + oldcxt = MemoryContextSwitchTo(objcxt); + array = DatumGetArrayTypePCopy(arraydatum); + MemoryContextSwitchTo(oldcxt); + + eah->ndims = ARR_NDIM(array); + /* note these pointers point into the fvalue header! 
*/ + eah->dims = ARR_DIMS(array); + eah->lbound = ARR_LBOUND(array); + + /* save array's element-type data for possible use later */ + eah->element_type = ARR_ELEMTYPE(array); + if (elmlen) + { + /* Caller provided representational data */ + eah->typlen = elmlen; + eah->typbyval = elmbyval; + eah->typalign = elmalign; + } + else + { + /* No, so look it up */ + get_typlenbyvalalign(eah->element_type, + &eah->typlen, + &eah->typbyval, + &eah->typalign); + } + + /* we don't make a deconstructed representation now */ + eah->dvalues = NULL; + eah->dnulls = NULL; + eah->dvalueslen = 0; + eah->nelems = 0; + eah->flat_size = 0; + + /* remember we have a flat representation */ + eah->fvalue = array; + eah->fstartptr = ARR_DATA_PTR(array); + eah->fendptr = ((char *) array) + ARR_SIZE(array); + + /* return a R/W pointer to the expanded array */ + return PointerGetDatum(eah->hdr.eoh_rw_ptr); + } + + /* + * construct_empty_expanded_array: make an empty expanded array + * given only type information. (elmlen etc can be zeroes.) + */ + static ExpandedArrayHeader * + construct_empty_expanded_array(Oid element_type, + MemoryContext parentcontext, + int elmlen, bool elmbyval, char elmalign) + { + ArrayType *array = construct_empty_array(element_type); + Datum d; + + d = expand_array(PointerGetDatum(array), parentcontext, + elmlen, elmbyval, elmalign); + return (ExpandedArrayHeader *) DatumGetEOHP(d); + } + + + /* + * get_flat_size method for expanded arrays + */ + static Size + EA_get_flat_size(ExpandedObjectHeader *eohptr) + { + ExpandedArrayHeader *eah = (ExpandedArrayHeader *) eohptr; + int nelems; + int ndims; + Datum *dvalues; + bool *dnulls; + Size nbytes; + int i; + + Assert(eah->ea_magic == EA_MAGIC); + + /* Easy if we have a valid flattened value */ + if (eah->fvalue) + return ARR_SIZE(eah->fvalue); + + /* If we have a cached size value, believe that */ + if (eah->flat_size) + return eah->flat_size; + + /* + * Compute space needed by examining dvalues/dnulls. 
Note that the result + * array will have a nulls bitmap if dnulls isn't NULL, even if the array + * doesn't actually contain any nulls now. + */ + nelems = eah->nelems; + ndims = eah->ndims; + Assert(nelems == ArrayGetNItems(ndims, eah->dims)); + dvalues = eah->dvalues; + dnulls = eah->dnulls; + nbytes = 0; + for (i = 0; i < nelems; i++) + { + if (dnulls && dnulls[i]) + continue; + nbytes = att_addlength_datum(nbytes, eah->typlen, dvalues[i]); + nbytes = att_align_nominal(nbytes, eah->typalign); + /* check for overflow of total request */ + if (!AllocSizeIsValid(nbytes)) + ereport(ERROR, + (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), + errmsg("array size exceeds the maximum allowed (%d)", + (int) MaxAllocSize))); + } + + if (dnulls) + nbytes += ARR_OVERHEAD_WITHNULLS(ndims, nelems); + else + nbytes += ARR_OVERHEAD_NONULLS(ndims); + + /* cache for next time */ + eah->flat_size = nbytes; + + return nbytes; + } + + /* + * flatten_into method for expanded arrays + */ + static void + EA_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size) + { + ExpandedArrayHeader *eah = (ExpandedArrayHeader *) eohptr; + ArrayType *aresult = (ArrayType *) result; + int nelems; + int ndims; + int32 dataoffset; + + Assert(eah->ea_magic == EA_MAGIC); + + /* Easy if we have a valid flattened value */ + if (eah->fvalue) + { + Assert(allocated_size == ARR_SIZE(eah->fvalue)); + memcpy(result, eah->fvalue, allocated_size); + return; + } + + /* Else allocation should match previous get_flat_size result */ + Assert(allocated_size == eah->flat_size); + + /* Fill result array from dvalues/dnulls */ + nelems = eah->nelems; + ndims = eah->ndims; + + if (eah->dnulls) + dataoffset = ARR_OVERHEAD_WITHNULLS(ndims, nelems); + else + dataoffset = 0; /* marker for no null bitmap */ + + /* We must ensure that any pad space is zero-filled */ + memset(aresult, 0, allocated_size); + + SET_VARSIZE(aresult, allocated_size); + aresult->ndim = ndims; + aresult->dataoffset = dataoffset; + 
aresult->elemtype = eah->element_type; + memcpy(ARR_DIMS(aresult), eah->dims, ndims * sizeof(int)); + memcpy(ARR_LBOUND(aresult), eah->lbound, ndims * sizeof(int)); + + CopyArrayEls(aresult, + eah->dvalues, eah->dnulls, nelems, + eah->typlen, eah->typbyval, eah->typalign, + false); + } + + /* + * Argument fetching support code + */ + + #ifdef NOT_YET_USED + + /* + * DatumGetExpandedArray: get a writable expanded array from an input argument + */ + static ExpandedArrayHeader * + DatumGetExpandedArray(Datum d) + { + ExpandedArrayHeader *eah; + + /* If it's a writable expanded array already, just return it */ + if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + { + eah = (ExpandedArrayHeader *) DatumGetEOHP(d); + Assert(eah->ea_magic == EA_MAGIC); + return eah; + } + + /* + * If it's a non-writable expanded array, copy it, extracting the element + * representational data to save a catalog lookup. + */ + if (VARATT_IS_EXTERNAL_EXPANDED_RO(DatumGetPointer(d))) + { + eah = (ExpandedArrayHeader *) DatumGetEOHP(d); + Assert(eah->ea_magic == EA_MAGIC); + d = expand_array(d, CurrentMemoryContext, + eah->typlen, eah->typbyval, eah->typalign); + return (ExpandedArrayHeader *) DatumGetEOHP(d); + } + + /* Else expand the hard way */ + d = expand_array(d, CurrentMemoryContext, 0, 0, 0); + return (ExpandedArrayHeader *) DatumGetEOHP(d); + } + + #define PG_GETARG_EXPANDED_ARRAY(n) DatumGetExpandedArray(PG_GETARG_DATUM(n)) + + #endif + + /* + * As above, when caller has the ability to cache element type info + */ + static ExpandedArrayHeader * + DatumGetExpandedArrayX(Datum d, + int elmlen, bool elmbyval, char elmalign) + { + ExpandedArrayHeader *eah; + + /* If it's a writable expanded array already, just return it */ + if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + { + eah = (ExpandedArrayHeader *) DatumGetEOHP(d); + Assert(eah->ea_magic == EA_MAGIC); + Assert(eah->typlen == elmlen); + Assert(eah->typbyval == elmbyval); + Assert(eah->typalign == elmalign); + 
return eah; + } + + /* Else expand using caller's data */ + d = expand_array(d, CurrentMemoryContext, elmlen, elmbyval, elmalign); + return (ExpandedArrayHeader *) DatumGetEOHP(d); + } + + #define PG_GETARG_EXPANDED_ARRAYX(n, elmlen, elmbyval, elmalign) \ + DatumGetExpandedArrayX(PG_GETARG_DATUM(n), elmlen, elmbyval, elmalign) + + /* + * DatumGetAnyArray: return either an expanded array or a detoasted varlena + * array. The result must not be modified in-place. + */ + static AnyArrayType * + DatumGetAnyArray(Datum d) + { + ExpandedArrayHeader *eah; + + /* + * If it's an expanded array (RW or RO), return the header pointer. + */ + if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(d))) + { + eah = (ExpandedArrayHeader *) DatumGetEOHP(d); + Assert(eah->ea_magic == EA_MAGIC); + return (AnyArrayType *) eah; + } + + /* Else do regular detoasting as needed */ + return (AnyArrayType *) PG_DETOAST_DATUM(d); + } + + #define PG_GETARG_ANY_ARRAY(n) DatumGetAnyArray(PG_GETARG_DATUM(n)) + + /* + * Create the Datum/isnull representation if we didn't do so previously + */ + static void + deconstruct_expanded_array(ExpandedArrayHeader *eah) + { + if (eah->dvalues == NULL) + { + MemoryContext oldcxt = MemoryContextSwitchTo(eah->hdr.eoh_context); + Datum *dvalues; + bool *dnulls; + int nelems; + + dnulls = NULL; + deconstruct_array(eah->fvalue, + eah->element_type, + eah->typlen, eah->typbyval, eah->typalign, + &dvalues, + ARR_HASNULL(eah->fvalue) ? &dnulls : NULL, + &nelems); + + /* + * Update header only after successful completion of this step. If + * deconstruct_array fails partway through, worst consequence is some + * leaked memory in the object's context. If the caller fails at a + * later point, that's fine, since the deconstructed representation is + * valid anyhow. 
+ */ + eah->dvalues = dvalues; + eah->dnulls = dnulls; + eah->dvalueslen = eah->nelems = nelems; + MemoryContextSwitchTo(oldcxt); + } + } + + /* + * Equivalent of array_ref() for an expanded array + */ + Datum + array_ref_expanded(Datum arraydatum, + int nSubscripts, int *indx, + int arraytyplen, + int elmlen, bool elmbyval, char elmalign, + bool *isNull) + { + ExpandedArrayHeader *eah; + int i, + ndim, + *dim, + *lb, + offset; + Datum *dvalues; + bool *dnulls; + + eah = (ExpandedArrayHeader *) DatumGetEOHP(arraydatum); + Assert(eah->ea_magic == EA_MAGIC); + + /* sanity-check caller's state; we don't use the passed data otherwise */ + Assert(arraytyplen == -1); + Assert(elmlen == eah->typlen); + Assert(elmbyval == eah->typbyval); + Assert(elmalign == eah->typalign); + + ndim = eah->ndims; + dim = eah->dims; + lb = eah->lbound; + + /* + * Return NULL for invalid subscript + */ + if (ndim != nSubscripts || ndim <= 0 || ndim > MAXDIM) + { + *isNull = true; + return (Datum) 0; + } + for (i = 0; i < ndim; i++) + { + if (indx[i] < lb[i] || indx[i] >= (dim[i] + lb[i])) + { + *isNull = true; + return (Datum) 0; + } + } + + /* + * Calculate the element number + */ + offset = ArrayGetOffset(nSubscripts, dim, lb, indx); + + /* + * Deconstruct array if we didn't already. Note that we apply this even + * if the input is nominally read-only: it should be safe enough. + */ + deconstruct_expanded_array(eah); + + dvalues = eah->dvalues; + dnulls = eah->dnulls; + + /* + * Check for NULL array element + */ + if (dnulls && dnulls[offset]) + { + *isNull = true; + return (Datum) 0; + } + + /* + * OK, get the element. It's OK to return a pass-by-ref value as a + * pointer into the expanded array, for the same reason that array_ref can + * return a pointer into flat arrays: the value is assumed not to change + * for as long as the Datum reference can exist. 
+ */ + *isNull = false; + return dvalues[offset]; + } + + /* + * Equivalent of array_set() for an expanded array + * + * array_set took care of detoasting dataValue, the rest is up to us + * + * Note: as with any operation on a read/write expanded object, we must + * take pains not to leave the object in a corrupt state if we fail partway + * through. + */ + Datum + array_set_expanded(Datum arraydatum, + int nSubscripts, int *indx, + Datum dataValue, bool isNull, + int arraytyplen, + int elmlen, bool elmbyval, char elmalign) + { + ExpandedArrayHeader *eah; + Datum *dvalues; + bool *dnulls; + int i, + ndim, + dim[MAXDIM], + lb[MAXDIM], + offset; + bool dimschanged, + newhasnulls; + int addedbefore, + addedafter; + char *oldValue; + + eah = (ExpandedArrayHeader *) DatumGetEOHP(arraydatum); + Assert(eah->ea_magic == EA_MAGIC); + + /* if this fails, we shouldn't be modifying this array in-place */ + Assert(VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(arraydatum))); + + /* sanity-check caller's state; we don't use the passed data otherwise */ + Assert(arraytyplen == -1); + Assert(elmlen == eah->typlen); + Assert(elmbyval == eah->typbyval); + Assert(elmalign == eah->typalign); + + /* + * Copy dimension info into local storage. This allows us to modify the + * dimensions if needed, while not messing up the expanded value if we + * fail partway through. + */ + ndim = eah->ndims; + Assert(ndim >= 0 && ndim <= MAXDIM); + memcpy(dim, eah->dims, ndim * sizeof(int)); + memcpy(lb, eah->lbound, ndim * sizeof(int)); + dimschanged = false; + + /* + * if number of dims is zero, i.e. an empty array, create an array with + * nSubscripts dimensions, and set the lower bounds to the supplied + * subscripts. + */ + if (ndim == 0) + { + /* + * Allocate adequate space for new dimension info. This is harmless + * if we fail later. 
+ */ + Assert(nSubscripts > 0 && nSubscripts <= MAXDIM); + eah->dims = (int *) MemoryContextAllocZero(eah->hdr.eoh_context, + nSubscripts * sizeof(int)); + eah->lbound = (int *) MemoryContextAllocZero(eah->hdr.eoh_context, + nSubscripts * sizeof(int)); + + /* Update local copies of dimension info */ + ndim = nSubscripts; + for (i = 0; i < nSubscripts; i++) + { + dim[i] = 0; + lb[i] = indx[i]; + } + dimschanged = true; + } + else if (ndim != nSubscripts) + ereport(ERROR, + (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR), + errmsg("wrong number of array subscripts"))); + + /* + * Deconstruct array if we didn't already. (Someday maybe add a special + * case path for fixed-length, no-nulls cases, where we can overwrite an + * element in place without ever deconstructing. But today is not that + * day.) + */ + deconstruct_expanded_array(eah); + + /* + * Copy new element into array's context, if needed (we assume it's + * already detoasted, so no junk should be created). If we fail further + * down, this memory is leaked, but that's reasonably harmless. 
+ */ + if (!eah->typbyval && !isNull) + { + MemoryContext oldcxt = MemoryContextSwitchTo(eah->hdr.eoh_context); + + dataValue = datumCopy(dataValue, false, eah->typlen); + MemoryContextSwitchTo(oldcxt); + } + + dvalues = eah->dvalues; + dnulls = eah->dnulls; + + newhasnulls = ((dnulls != NULL) || isNull); + addedbefore = addedafter = 0; + + /* + * Check subscripts (this logic matches original array_set) + */ + if (ndim == 1) + { + if (indx[0] < lb[0]) + { + addedbefore = lb[0] - indx[0]; + dim[0] += addedbefore; + lb[0] = indx[0]; + dimschanged = true; + if (addedbefore > 1) + newhasnulls = true; /* will insert nulls */ + } + if (indx[0] >= (dim[0] + lb[0])) + { + addedafter = indx[0] - (dim[0] + lb[0]) + 1; + dim[0] += addedafter; + dimschanged = true; + if (addedafter > 1) + newhasnulls = true; /* will insert nulls */ + } + } + else + { + /* + * XXX currently we do not support extending multi-dimensional arrays + * during assignment + */ + for (i = 0; i < ndim; i++) + { + if (indx[i] < lb[i] || + indx[i] >= (dim[i] + lb[i])) + ereport(ERROR, + (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR), + errmsg("array subscript out of range"))); + } + } + + /* Now we can calculate linear offset of target item in array */ + offset = ArrayGetOffset(nSubscripts, dim, lb, indx); + + /* Physically enlarge existing dvalues/dnulls arrays if needed */ + if (dim[0] > eah->dvalueslen) + { + /* We want some extra space if we're enlarging */ + int newlen = dim[0] + dim[0] / 8; + + eah->dvalues = dvalues = (Datum *) + repalloc(dvalues, newlen * sizeof(Datum)); + if (dnulls) + eah->dnulls = dnulls = (bool *) + repalloc(dnulls, newlen * sizeof(bool)); + eah->dvalueslen = newlen; + } + + /* + * If we need a nulls bitmap and don't already have one, create it, being + * sure to mark all existing entries as not null. 
+ */ + if (newhasnulls && dnulls == NULL) + eah->dnulls = dnulls = (bool *) + MemoryContextAllocZero(eah->hdr.eoh_context, + eah->dvalueslen * sizeof(bool)); + + /* + * We now have all the needed space allocated, so we're ready to make + * irreversible changes. Be very wary of allowing failure below here. + */ + + /* Flattened value will no longer represent array accurately */ + eah->fvalue = NULL; + /* And we don't know the flattened size either */ + eah->flat_size = 0; + + /* Update dimensionality info if needed */ + if (dimschanged) + { + eah->ndims = ndim; + memcpy(eah->dims, dim, ndim * sizeof(int)); + memcpy(eah->lbound, lb, ndim * sizeof(int)); + } + + /* Reposition items if needed, and fill addedbefore items with nulls */ + if (addedbefore > 0) + { + memmove(dvalues + addedbefore, dvalues, eah->nelems * sizeof(Datum)); + for (i = 0; i < addedbefore; i++) + dvalues[i] = (Datum) 0; + if (dnulls) + { + memmove(dnulls + addedbefore, dnulls, eah->nelems * sizeof(bool)); + for (i = 0; i < addedbefore; i++) + dnulls[i] = true; + } + eah->nelems += addedbefore; + } + + /* fill addedafter items with nulls */ + if (addedafter > 0) + { + for (i = 0; i < addedafter; i++) + dvalues[eah->nelems + i] = (Datum) 0; + if (dnulls) + { + for (i = 0; i < addedafter; i++) + dnulls[eah->nelems + i] = true; + } + eah->nelems += addedafter; + } + + /* Grab old element value for pfree'ing, if needed. */ + if (!eah->typbyval && (dnulls == NULL || !dnulls[offset])) + oldValue = (char *) DatumGetPointer(dvalues[offset]); + else + oldValue = NULL; + + /* And finally we can insert the new element. */ + dvalues[offset] = dataValue; + if (dnulls) + dnulls[offset] = isNull; + + /* + * Free old element if needed; this keeps repeated element replacements + * from bloating the array's storage. If the pfree somehow fails, it + * won't corrupt the array. 
+ */ + if (oldValue) + { + /* Don't try to pfree a part of the original flat array */ + if (oldValue < eah->fstartptr || oldValue >= eah->fendptr) + pfree(oldValue); + } + + /* Done, return standard TOAST pointer for object */ + return PointerGetDatum(eah->hdr.eoh_rw_ptr); + } + + /* + * Reimplementation of array_push for expanded arrays + */ + typedef struct ArrayPushState + { + Oid arg0_typeid; + Oid arg1_typeid; + bool array_on_left; + Oid element_type; + int16 typlen; + bool typbyval; + char typalign; + } ArrayPushState; + + Datum + array_push_expanded(PG_FUNCTION_ARGS) + { + ExpandedArrayHeader *eah; + Datum newelem; + bool isNull; + int *dimv, + *lb; + ArrayType *result; + int indx; + int lb0; + Oid element_type; + int16 typlen; + bool typbyval; + char typalign; + Oid arg0_typeid = get_fn_expr_argtype(fcinfo->flinfo, 0); + Oid arg1_typeid = get_fn_expr_argtype(fcinfo->flinfo, 1); + ArrayPushState *my_extra; + + if (arg0_typeid == InvalidOid || arg1_typeid == InvalidOid) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("could not determine input data types"))); + + /* + * We arrange to look up info about element type only once per series of + * calls, assuming the element type doesn't change underneath us. 
+ */ + my_extra = (ArrayPushState *) fcinfo->flinfo->fn_extra; + if (my_extra == NULL) + { + fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt, + sizeof(ArrayPushState)); + my_extra = (ArrayPushState *) fcinfo->flinfo->fn_extra; + my_extra->arg0_typeid = InvalidOid; + } + + if (my_extra->arg0_typeid != arg0_typeid || + my_extra->arg1_typeid != arg1_typeid) + { + /* Determine which input is the array */ + Oid arg0_elemid = get_element_type(arg0_typeid); + Oid arg1_elemid = get_element_type(arg1_typeid); + + if (arg0_elemid != InvalidOid) + { + my_extra->array_on_left = true; + element_type = arg0_elemid; + } + else if (arg1_elemid != InvalidOid) + { + my_extra->array_on_left = false; + element_type = arg1_elemid; + } + else + { + /* Shouldn't get here given proper type checking in parser */ + ereport(ERROR, + (errcode(ERRCODE_DATATYPE_MISMATCH), + errmsg("neither input type is an array"))); + PG_RETURN_NULL(); /* keep compiler quiet */ + } + + my_extra->arg0_typeid = arg0_typeid; + my_extra->arg1_typeid = arg1_typeid; + + /* Get info about element type */ + get_typlenbyvalalign(element_type, + &my_extra->typlen, + &my_extra->typbyval, + &my_extra->typalign); + my_extra->element_type = element_type; + } + + element_type = my_extra->element_type; + typlen = my_extra->typlen; + typbyval = my_extra->typbyval; + typalign = my_extra->typalign; + + /* + * Now we can fetch the arguments, using cached type info if needed + */ + if (my_extra->array_on_left) + { + if (PG_ARGISNULL(0)) + eah = construct_empty_expanded_array(element_type, + CurrentMemoryContext, + typlen, typbyval, typalign); + else + eah = PG_GETARG_EXPANDED_ARRAYX(0, typlen, typbyval, typalign); + isNull = PG_ARGISNULL(1); + if (isNull) + newelem = (Datum) 0; + else + newelem = PG_GETARG_DATUM(1); + } + else + { + if (PG_ARGISNULL(1)) + eah = construct_empty_expanded_array(element_type, + CurrentMemoryContext, + typlen, typbyval, typalign); + else + eah = PG_GETARG_EXPANDED_ARRAYX(1, 
typlen, typbyval, typalign); + isNull = PG_ARGISNULL(0); + if (isNull) + newelem = (Datum) 0; + else + newelem = PG_GETARG_DATUM(0); + } + + Assert(element_type == eah->element_type); + + /* + * Perform push (this logic is basically unchanged from original) + */ + if (eah->ndims == 1) + { + lb = eah->lbound; + dimv = eah->dims; + + if (my_extra->array_on_left) + { + /* append newelem */ + int ub = dimv[0] + lb[0] - 1; + + indx = ub + 1; + /* overflow? */ + if (indx < ub) + ereport(ERROR, + (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), + errmsg("integer out of range"))); + } + else + { + /* prepend newelem */ + indx = lb[0] - 1; + /* overflow? */ + if (indx > lb[0]) + ereport(ERROR, + (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), + errmsg("integer out of range"))); + } + lb0 = lb[0]; + } + else if (eah->ndims == 0) + { + indx = 1; + lb0 = 1; + } + else + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg("argument must be empty or one-dimensional array"))); + + result = array_set((ArrayType *) eah->hdr.eoh_rw_ptr, + 1, &indx, newelem, isNull, + -1, typlen, typbyval, typalign); + + Assert(result == (ArrayType *) eah->hdr.eoh_rw_ptr); + + /* + * Readjust result's LB to match the input's. We need do nothing in the + * append case, but it's the simplest way to implement the prepend case. + */ + if (eah->ndims == 1 && !my_extra->array_on_left) + { + /* This is ok whether we've deconstructed or not */ + eah->lbound[0] = lb0; + } + + PG_RETURN_POINTER(result); + } + + /* + * array_dims : + * returns the dimensions of the array pointed to by "v", as a "text" + * + * This is here as an example of handling either flat or expanded inputs. + */ + Datum + array_dims_expanded(PG_FUNCTION_ARGS) + { + AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); + char *p; + int i; + int *dimv, + *lb; + + /* + * 33 since we assume 15 digits per number + ':' +'[]' + * + * +1 for trailing null + */ + char buf[MAXDIM * 33 + 1]; + + /* Sanity check: does it look like an array at all? 
*/ + if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) + PG_RETURN_NULL(); + + dimv = AARR_DIMS(v); + lb = AARR_LBOUND(v); + + p = buf; + for (i = 0; i < AARR_NDIM(v); i++) + { + sprintf(p, "[%d:%d]", lb[i], dimv[i] + lb[i] - 1); + p += strlen(p); + } + + PG_RETURN_TEXT_P(cstring_to_text(buf)); + } diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c index 600646e..7eee40c 100644 *** a/src/backend/utils/adt/array_userfuncs.c --- b/src/backend/utils/adt/array_userfuncs.c *************** *** 25,30 **** --- 25,33 ---- Datum array_push(PG_FUNCTION_ARGS) { + #if 1 + return array_push_expanded(fcinfo); + #else ArrayType *v; Datum newelem; bool isNull; *************** array_push(PG_FUNCTION_ARGS) *** 157,162 **** --- 160,166 ---- ARR_LBOUND(result)[0] = ARR_LBOUND(v)[0]; PG_RETURN_ARRAYTYPE_P(result); + #endif } /*----------------------------------------------------------------------------- diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c index 5591b46..3994843 100644 *** a/src/backend/utils/adt/arrayfuncs.c --- b/src/backend/utils/adt/arrayfuncs.c *************** *** 27,32 **** --- 27,33 ---- #include "utils/array.h" #include "utils/builtins.h" #include "utils/datum.h" + #include "utils/expandeddatum.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/typcache.h" *************** static void ReadArrayBinary(StringInfo b *** 93,102 **** int typlen, bool typbyval, char typalign, Datum *values, bool *nulls, bool *hasnulls, int32 *nbytes); - static void CopyArrayEls(ArrayType *array, - Datum *values, bool *nulls, int nitems, - int typlen, bool typbyval, char typalign, - bool freedata); static bool array_get_isnull(const bits8 *nullbitmap, int offset); static void array_set_isnull(bits8 *nullbitmap, int offset, bool isNull); static Datum ArrayCast(char *value, bool byval, int len); --- 94,99 ---- *************** ReadArrayStr(char *arrayStr, *** 939,945 **** * the values are not 
toasted. (Doing it here doesn't work since the * caller has already allocated space for the array...) */ ! static void CopyArrayEls(ArrayType *array, Datum *values, bool *nulls, --- 936,942 ---- * the values are not toasted. (Doing it here doesn't work since the * caller has already allocated space for the array...) */ ! void CopyArrayEls(ArrayType *array, Datum *values, bool *nulls, *************** array_ndims(PG_FUNCTION_ARGS) *** 1666,1671 **** --- 1663,1671 ---- Datum array_dims(PG_FUNCTION_ARGS) { + #if 1 + return array_dims_expanded(fcinfo); + #else ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); char *p; int i; *************** array_dims(PG_FUNCTION_ARGS) *** 1694,1699 **** --- 1694,1700 ---- } PG_RETURN_TEXT_P(cstring_to_text(buf)); + #endif } /* *************** array_ref(ArrayType *array, *** 1849,1854 **** --- 1850,1867 ---- arraydataptr = (char *) array; arraynullsptr = NULL; } + else if (VARATT_IS_EXTERNAL_EXPANDED(array)) + { + /* hand off to array_expanded.c */ + return array_ref_expanded(PointerGetDatum(array), + nSubscripts, + indx, + arraytyplen, + elmlen, + elmbyval, + elmalign, + isNull); + } else { /* detoast input array if necessary */ *************** array_set(ArrayType *array, *** 2161,2166 **** --- 2174,2202 ---- if (elmlen == -1 && !isNull) dataValue = PointerGetDatum(PG_DETOAST_DATUM(dataValue)); + /* if array is in expanded form, hand off to array_expanded.c */ + if (VARATT_IS_EXTERNAL_EXPANDED(array)) + { + /* Convert to R/W form if not that already */ + if (!VARATT_IS_EXTERNAL_EXPANDED_RW(array)) + { + array = (ArrayType *) DatumGetPointer( + expand_array(PointerGetDatum(array), CurrentMemoryContext, + elmlen, elmbyval, elmalign)); + } + + return (ArrayType *) DatumGetPointer( + array_set_expanded(PointerGetDatum(array), + nSubscripts, + indx, + dataValue, + isNull, + arraytyplen, + elmlen, + elmbyval, + elmalign)); + } + /* detoast input array if necessary */ array = DatumGetArrayTypeP(PointerGetDatum(array)); diff --git 
a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c index 014eca5..e8af030 100644 *** a/src/backend/utils/adt/datum.c --- b/src/backend/utils/adt/datum.c *************** *** 12,19 **** * *------------------------------------------------------------------------- */ /* ! * In the implementation of the next routines we assume the following: * * A) if a type is "byVal" then all the information is stored in the * Datum itself (i.e. no pointers involved!). In this case the --- 12,20 ---- * *------------------------------------------------------------------------- */ + /* ! * In the implementation of these routines we assume the following: * * A) if a type is "byVal" then all the information is stored in the * Datum itself (i.e. no pointers involved!). In this case the *************** *** 34,44 **** --- 35,49 ---- * * Note that we do not treat "toasted" datums specially; therefore what * will be copied or compared is the compressed data or toast reference. + * An exception is made for datumCopy() of an expanded object, however, + * because most callers expect to get a simple contiguous (and pfree'able) + * result from datumCopy(). See also datumTransfer(). */ #include "postgres.h" #include "utils/datum.h" + #include "utils/expandeddatum.h" /*------------------------------------------------------------------------- *************** *** 46,51 **** --- 51,57 ---- * * Find the "real" size of a datum, given the datum value, * whether it is a "by value", and the declared type length. + * (For TOAST pointer datums, this is the size of the pointer datum.) * * This is essentially an out-of-line version of the att_addlength_datum() * macro in access/tupmacs.h. We do a tad more error checking though. *************** datumGetSize(Datum value, bool typByVal, *** 106,114 **** /*------------------------------------------------------------------------- * datumCopy * ! * make a copy of a datum * * If the datatype is pass-by-reference, memory is obtained with palloc(). 
*------------------------------------------------------------------------- */ Datum --- 112,127 ---- /*------------------------------------------------------------------------- * datumCopy * ! * Make a copy of a non-NULL datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). + * + * If the value is a reference to an expanded object, we flatten into memory + * obtained with palloc(). We need to copy because one of the main uses of + * this function is to copy a datum out of a transient memory context that's + * about to be destroyed, and the expanded object is probably in a child + * context that will also go away. Moreover, many callers assume that the + * result is a single pfree-able chunk. *------------------------------------------------------------------------- */ Datum *************** datumCopy(Datum value, bool typByVal, in *** 118,161 **** if (typByVal) res = value; else { Size realSize; ! char *s; ! ! if (DatumGetPointer(value) == NULL) ! return PointerGetDatum(NULL); realSize = datumGetSize(value, typByVal, typLen); ! s = (char *) palloc(realSize); ! memcpy(s, DatumGetPointer(value), realSize); ! res = PointerGetDatum(s); } return res; } /*------------------------------------------------------------------------- ! * datumFree * ! * Free the space occupied by a datum CREATED BY "datumCopy" * ! * NOTE: DO NOT USE THIS ROUTINE with datums returned by heap_getattr() etc. ! * ONLY datums created by "datumCopy" can be freed! *------------------------------------------------------------------------- */ ! #ifdef NOT_USED ! void ! datumFree(Datum value, bool typByVal, int typLen) { ! if (!typByVal) ! { ! Pointer s = DatumGetPointer(value); ! ! pfree(s); ! 
} } - #endif /*------------------------------------------------------------------------- * datumIsEqual --- 131,201 ---- if (typByVal) res = value; + else if (typLen == -1) + { + /* It is a varlena datatype */ + struct varlena *vl = (struct varlena *) DatumGetPointer(value); + + if (VARATT_IS_EXTERNAL_EXPANDED(vl)) + { + /* Flatten into the caller's memory context */ + ExpandedObjectHeader *eoh = DatumGetEOHP(value); + Size resultsize; + char *resultptr; + + resultsize = EOH_get_flat_size(eoh); + resultptr = (char *) palloc(resultsize); + EOH_flatten_into(eoh, (void *) resultptr, resultsize); + res = PointerGetDatum(resultptr); + } + else + { + /* Otherwise, just copy the varlena datum verbatim */ + Size realSize; + char *resultptr; + + realSize = (Size) VARSIZE_ANY(vl); + resultptr = (char *) palloc(realSize); + memcpy(resultptr, vl, realSize); + res = PointerGetDatum(resultptr); + } + } else { + /* Pass by reference, but not varlena, so not toasted */ Size realSize; ! char *resultptr; realSize = datumGetSize(value, typByVal, typLen); ! resultptr = (char *) palloc(realSize); ! memcpy(resultptr, DatumGetPointer(value), realSize); ! res = PointerGetDatum(resultptr); } return res; } /*------------------------------------------------------------------------- ! * datumTransfer * ! * Transfer a non-NULL datum into the current memory context. * ! * This is equivalent to datumCopy() except when the datum is a read-write ! * pointer to an expanded object. In that case we merely reparent the object ! * into the current context, and return its standard R/W pointer (in case the ! * given one is a transient pointer of shorter lifespan). *------------------------------------------------------------------------- */ ! Datum ! datumTransfer(Datum value, bool typByVal, int typLen) { ! if (!typByVal && typLen == -1 && ! VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(value))) ! value = TransferExpandedObject(value, CurrentMemoryContext); ! else ! 
value = datumCopy(value, typByVal, typLen); ! return value; } /*------------------------------------------------------------------------- * datumIsEqual diff --git a/src/backend/utils/adt/expandeddatum.c b/src/backend/utils/adt/expandeddatum.c index ...d43f437 . *** a/src/backend/utils/adt/expandeddatum.c --- b/src/backend/utils/adt/expandeddatum.c *************** *** 0 **** --- 1,162 ---- + /*------------------------------------------------------------------------- + * + * expandeddatum.c + * Support functions for "expanded" value representations. + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * + * IDENTIFICATION + * src/backend/utils/adt/expandeddatum.c + * + *------------------------------------------------------------------------- + */ + #include "postgres.h" + + #include "utils/expandeddatum.h" + #include "utils/memutils.h" + + /* + * DatumGetEOHP + * + * Given a Datum that is an expanded-object reference, extract the pointer. + * + * This is a bit tedious since the pointer may not be properly aligned; + * compare VARATT_EXTERNAL_GET_POINTER(). + */ + ExpandedObjectHeader * + DatumGetEOHP(Datum d) + { + varattrib_1b_e *datum = (varattrib_1b_e *) DatumGetPointer(d); + varatt_expanded ptr; + + Assert(VARATT_IS_EXTERNAL_EXPANDED(datum)); + memcpy(&ptr, VARDATA_EXTERNAL(datum), sizeof(ptr)); + return ptr.eohptr; + } + + /* + * EOH_init_header + * + * Initialize the common header of an expanded object. + * + * The main thing this encapsulates is initializing the TOAST pointers. 
+ */ + void + EOH_init_header(ExpandedObjectHeader *eohptr, + const ExpandedObjectMethods *methods, + MemoryContext obj_context) + { + varatt_expanded ptr; + + eohptr->vl_len_ = EOH_HEADER_MAGIC; + eohptr->eoh_methods = methods; + eohptr->eoh_context = obj_context; + + ptr.eohptr = eohptr; + + SET_VARTAG_EXTERNAL(eohptr->eoh_rw_ptr, VARTAG_EXPANDED_RW); + memcpy(VARDATA_EXTERNAL(eohptr->eoh_rw_ptr), &ptr, sizeof(ptr)); + + SET_VARTAG_EXTERNAL(eohptr->eoh_ro_ptr, VARTAG_EXPANDED_RO); + memcpy(VARDATA_EXTERNAL(eohptr->eoh_ro_ptr), &ptr, sizeof(ptr)); + } + + /* + * EOH_get_flat_size + * EOH_flatten_into + * + * Convenience functions for invoking the "methods" of an expanded object. + */ + + Size + EOH_get_flat_size(ExpandedObjectHeader *eohptr) + { + return (*eohptr->eoh_methods->get_flat_size) (eohptr); + } + + void + EOH_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size) + { + (*eohptr->eoh_methods->flatten_into) (eohptr, result, allocated_size); + } + + /* + * Does the Datum represent a writable expanded object? + */ + bool + DatumIsReadWriteExpandedObject(Datum d, bool isnull, int16 typlen) + { + /* Reject if it's NULL or not a varlena type */ + if (isnull || typlen != -1) + return false; + + /* Reject if not a read-write expanded-object pointer */ + if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + return false; + + return true; + } + + /* + * If the Datum represents a R/W expanded object, change it to R/O. + * Otherwise return the original Datum. 
+ */ + Datum + MakeExpandedObjectReadOnly(Datum d, bool isnull, int16 typlen) + { + ExpandedObjectHeader *eohptr; + + /* Nothing to do if it's NULL or not a varlena type */ + if (isnull || typlen != -1) + return d; + + /* Nothing to do if not a read-write expanded-object pointer */ + if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + return d; + + /* Now safe to extract the object pointer */ + eohptr = DatumGetEOHP(d); + + /* Return the built-in read-only pointer instead of given pointer */ + return PointerGetDatum(eohptr->eoh_ro_ptr); + } + + /* + * Transfer ownership of an expanded object to a new parent memory context. + * The object must be referenced by a R/W pointer, and what we return is + * always its "standard" R/W pointer, which is certain to have the same + * lifespan as the object itself. (The passed-in pointer might not, and + * in any case wouldn't provide a unique identifier if it's not that one.) + */ + Datum + TransferExpandedObject(Datum d, MemoryContext new_parent) + { + ExpandedObjectHeader *eohptr = DatumGetEOHP(d); + + /* Assert caller gave a R/W pointer */ + Assert(VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))); + + /* Transfer ownership */ + MemoryContextSetParent(eohptr->eoh_context, new_parent); + + /* Return the object's standard read-write pointer */ + return PointerGetDatum(eohptr->eoh_rw_ptr); + } + + /* + * Delete an expanded object (must be referenced by a R/W pointer). 
+ */ + void + DeleteExpandedObject(Datum d) + { + ExpandedObjectHeader *eohptr = DatumGetEOHP(d); + + /* Assert caller gave a R/W pointer */ + Assert(VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))); + + /* Kill it */ + MemoryContextDelete(eohptr->eoh_context); + } diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c index 202bc78..4b24066 100644 *** a/src/backend/utils/mmgr/mcxt.c --- b/src/backend/utils/mmgr/mcxt.c *************** MemoryContextSetParent(MemoryContext con *** 266,271 **** --- 266,275 ---- AssertArg(MemoryContextIsValid(context)); AssertArg(context != new_parent); + /* Fast path if it's got correct parent already */ + if (new_parent == context->parent) + return; + /* Delink from existing parent, if any */ if (context->parent) { diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h index 40fde83..a98a7af 100644 *** a/src/include/executor/executor.h --- b/src/include/executor/executor.h *************** extern void FreeExprContext(ExprContext *** 312,318 **** extern void ReScanExprContext(ExprContext *econtext); #define ResetExprContext(econtext) \ ! MemoryContextReset((econtext)->ecxt_per_tuple_memory) extern ExprContext *MakePerTupleExprContext(EState *estate); --- 312,318 ---- extern void ReScanExprContext(ExprContext *econtext); #define ResetExprContext(econtext) \ ! 
MemoryContextResetAndDeleteChildren((econtext)->ecxt_per_tuple_memory) extern ExprContext *MakePerTupleExprContext(EState *estate); diff --git a/src/include/executor/spi.h b/src/include/executor/spi.h index 9e912ba..fbcae0c 100644 *** a/src/include/executor/spi.h --- b/src/include/executor/spi.h *************** extern char *SPI_getnspname(Relation rel *** 124,129 **** --- 124,130 ---- extern void *SPI_palloc(Size size); extern void *SPI_repalloc(void *pointer, Size size); extern void SPI_pfree(void *pointer); + extern Datum SPI_datumTransfer(Datum value, bool typByVal, int typLen); extern void SPI_freetuple(HeapTuple pointer); extern void SPI_freetuptable(SPITupleTable *tuptable); diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h index 48f84bf..00686b0 100644 *** a/src/include/executor/tuptable.h --- b/src/include/executor/tuptable.h *************** extern Datum ExecFetchSlotTupleDatum(Tup *** 163,168 **** --- 163,169 ---- extern HeapTuple ExecMaterializeSlot(TupleTableSlot *slot); extern TupleTableSlot *ExecCopySlot(TupleTableSlot *dstslot, TupleTableSlot *srcslot); + extern TupleTableSlot *ExecMakeSlotContentsReadOnly(TupleTableSlot *slot); /* in access/common/heaptuple.c */ extern Datum slot_getattr(TupleTableSlot *slot, int attnum, bool *isnull); diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h index 1d06f42..932a96b 100644 *** a/src/include/nodes/primnodes.h --- b/src/include/nodes/primnodes.h *************** typedef struct WindowFunc *** 310,315 **** --- 310,319 ---- * Note: the result datatype is the element type when fetching a single * element; but it is the array type when doing subarray fetch or either * type of store. + * + * Note: for the cases where an array is returned, if refexpr yields a R/W + * expanded array, then the implementation is allowed to modify that object + * in-place and return the same object.) 
* ---------------- */ typedef struct ArrayRef diff --git a/src/include/postgres.h b/src/include/postgres.h index 082c75b..5dd897a 100644 *** a/src/include/postgres.h --- b/src/include/postgres.h *************** typedef struct varatt_indirect *** 88,93 **** --- 88,110 ---- } varatt_indirect; /* + * struct varatt_expanded is a "TOAST pointer" representing an out-of-line + * Datum that is stored in memory, in some type-specific, not necessarily + * physically contiguous format that is convenient for computation not + * storage. APIs for this, in particular the definition of struct + * ExpandedObjectHeader, are in src/include/utils/expandeddatum.h. + * + * Note that just as for struct varatt_external, this struct is stored + * unaligned within any containing tuple. + */ + typedef struct ExpandedObjectHeader ExpandedObjectHeader; + + typedef struct varatt_expanded + { + ExpandedObjectHeader *eohptr; + } varatt_expanded; + + /* * Type tag for the various sorts of "TOAST pointer" datums. The peculiar * value for VARTAG_ONDISK comes from a requirement for on-disk compatibility * with a previous notion that the tag field was the pointer datum's length. *************** typedef struct varatt_indirect *** 95,105 **** --- 112,129 ---- typedef enum vartag_external { VARTAG_INDIRECT = 1, + VARTAG_EXPANDED_RO = 2, + VARTAG_EXPANDED_RW = 3, VARTAG_ONDISK = 18 } vartag_external; + /* this test relies on the specific tag values above */ + #define VARTAG_IS_EXPANDED(tag) \ + (((tag) & ~1) == VARTAG_EXPANDED_RO) + #define VARTAG_SIZE(tag) \ ((tag) == VARTAG_INDIRECT ? sizeof(varatt_indirect) : \ + VARTAG_IS_EXPANDED(tag) ? sizeof(varatt_expanded) : \ (tag) == VARTAG_ONDISK ? 
sizeof(varatt_external) : \ TrapMacro(true, "unrecognized TOAST vartag")) *************** typedef struct *** 294,299 **** --- 318,329 ---- (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_ONDISK) #define VARATT_IS_EXTERNAL_INDIRECT(PTR) \ (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_INDIRECT) + #define VARATT_IS_EXTERNAL_EXPANDED_RO(PTR) \ + (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_EXPANDED_RO) + #define VARATT_IS_EXTERNAL_EXPANDED_RW(PTR) \ + (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_EXPANDED_RW) + #define VARATT_IS_EXTERNAL_EXPANDED(PTR) \ + (VARATT_IS_EXTERNAL(PTR) && VARTAG_IS_EXPANDED(VARTAG_EXTERNAL(PTR))) #define VARATT_IS_SHORT(PTR) VARATT_IS_1B(PTR) #define VARATT_IS_EXTENDED(PTR) (!VARATT_IS_4B_U(PTR)) diff --git a/src/include/utils/array.h b/src/include/utils/array.h index 694bce7..7360f7d 100644 *** a/src/include/utils/array.h --- b/src/include/utils/array.h *************** extern Datum array_remove(PG_FUNCTION_AR *** 248,253 **** --- 248,261 ---- extern Datum array_replace(PG_FUNCTION_ARGS); extern Datum width_bucket_array(PG_FUNCTION_ARGS); + extern void CopyArrayEls(ArrayType *array, + Datum *values, + bool *nulls, + int nitems, + int typlen, + bool typbyval, + char typalign, + bool freedata); extern Datum array_ref(ArrayType *array, int nSubscripts, int *indx, int arraytyplen, int elmlen, bool elmbyval, char elmalign, bool *isNull); *************** extern Datum array_agg_array_transfn(PG_ *** 349,354 **** --- 357,380 ---- extern Datum array_agg_array_finalfn(PG_FUNCTION_ARGS); /* + * prototypes for functions defined in array_expanded.c + */ + extern Datum expand_array(Datum arraydatum, MemoryContext parentcontext, + int elmlen, bool elmbyval, char elmalign); + extern Datum array_ref_expanded(Datum arraydatum, + int nSubscripts, int *indx, + int arraytyplen, + int elmlen, bool elmbyval, char elmalign, + bool *isNull); + extern Datum array_set_expanded(Datum arraydatum, + int nSubscripts, int 
*indx, + Datum dataValue, bool isNull, + int arraytyplen, + int elmlen, bool elmbyval, char elmalign); + extern Datum array_push_expanded(PG_FUNCTION_ARGS); + extern Datum array_dims_expanded(PG_FUNCTION_ARGS); + + /* * prototypes for functions defined in array_typanalyze.c */ extern Datum array_typanalyze(PG_FUNCTION_ARGS); diff --git a/src/include/utils/datum.h b/src/include/utils/datum.h index 663414b..c572f79 100644 *** a/src/include/utils/datum.h --- b/src/include/utils/datum.h *************** *** 24,41 **** extern Size datumGetSize(Datum value, bool typByVal, int typLen); /* ! * datumCopy - make a copy of a datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). */ extern Datum datumCopy(Datum value, bool typByVal, int typLen); /* ! * datumFree - free a datum previously allocated by datumCopy, if any. * ! * Does nothing if datatype is pass-by-value. */ ! extern void datumFree(Datum value, bool typByVal, int typLen); /* * datumIsEqual --- 24,41 ---- extern Size datumGetSize(Datum value, bool typByVal, int typLen); /* ! * datumCopy - make a copy of a non-NULL datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). */ extern Datum datumCopy(Datum value, bool typByVal, int typLen); /* ! * datumTransfer - transfer a non-NULL datum into the current memory context. * ! * Differs from datumCopy() in its handling of read-write expanded objects. */ ! extern Datum datumTransfer(Datum value, bool typByVal, int typLen); /* * datumIsEqual diff --git a/src/include/utils/expandeddatum.h b/src/include/utils/expandeddatum.h index ...584d0c6 . *** a/src/include/utils/expandeddatum.h --- b/src/include/utils/expandeddatum.h *************** *** 0 **** --- 1,146 ---- + /*------------------------------------------------------------------------- + * + * expandeddatum.h + * Declarations for access to "expanded" value representations. 
+ * + * Complex data types, particularly container types such as arrays and + * records, usually have on-disk representations that are compact but not + * especially convenient to modify. What's more, when we do modify them, + * having to recopy all the rest of the value can be extremely inefficient. + * Therefore, we provide a notion of an "expanded" representation that is used + * only in memory and is optimized more for computation than storage. + * The format appearing on disk is called the data type's "flattened" + * representation, since it is required to be a contiguous blob of bytes -- + * but the type can have an expanded representation that is not. Data types + * must provide means to translate an expanded representation back to + * flattened form. + * + * An expanded object is meant to survive across multiple operations, but + * not to be enormously long-lived; for example it might be a local variable + * in a PL/pgSQL procedure. So its extra bulk compared to the on-disk format + * is a worthwhile trade-off. + * + * References to expanded objects are a type of TOAST pointer. + * Because of longstanding conventions in Postgres, this means that the + * flattened form of such an object must always be a varlena object. + * Fortunately that's no restriction in practice. + * + * There are actually two kinds of TOAST pointers for expanded objects: + * read-only and read-write pointers. Possession of one of the latter + * authorizes a function to modify the value in-place rather than copying it + * as would normally be required. Functions should always return a read-write + * pointer to any new expanded object they create. Functions that modify an + * argument value in-place must take care that they do not corrupt the old + * value if they fail partway through. 
+ * + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * src/include/utils/expandeddatum.h + * + *------------------------------------------------------------------------- + */ + #ifndef EXPANDEDDATUM_H + #define EXPANDEDDATUM_H + + /* Size of an EXTERNAL datum that contains a pointer to an expanded object */ + #define EXPANDED_POINTER_SIZE (VARHDRSZ_EXTERNAL + sizeof(varatt_expanded)) + + /* + * "Methods" that must be provided for any expanded object. + * + * get_flat_size: compute space needed for flattened representation (which + * must be a valid in-line, non-compressed, 4-byte-header varlena object). + * + * flatten_into: construct flattened representation in the caller-allocated + * space at *result, of size allocated_size (which will always be the result + * of a preceding get_flat_size call; it's passed for cross-checking). + * + * Note: construction of a heap tuple from an expanded datum calls + * get_flat_size twice, so it's worthwhile to make sure that that doesn't + * incur too much overhead. + */ + typedef Size (*EOM_get_flat_size_method) (ExpandedObjectHeader *eohptr); + typedef void (*EOM_flatten_into_method) (ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + + /* Struct of function pointers for an expanded object's methods */ + typedef struct ExpandedObjectMethods + { + EOM_get_flat_size_method get_flat_size; + EOM_flatten_into_method flatten_into; + } ExpandedObjectMethods; + + /* + * Every expanded object must contain this header; typically the header + * is embedded in some larger struct that adds type-specific fields. + * + * It is presumed that the header object and all subsidiary data are stored + * in eoh_context, so that the object can be freed by deleting that context, + * or its storage lifespan can be altered by reparenting the context. 
+ * (In principle the object could own additional resources, such as malloc'd + * storage, and use a memory context reset callback to free them upon reset or + * deletion of eoh_context.) + * + * We set up two TOAST pointers within the standard header, one read-write + * and one read-only. This allows functions to return either kind of pointer + * without making an additional allocation, and in particular without worrying + * whether a separately palloc'd object would have sufficient lifespan. + * But note that these pointers are just a convenience; a pointer object + * appearing somewhere else would still be legal. + * + * The typedef declaration for this appears in postgres.h. + */ + struct ExpandedObjectHeader + { + /* Phony varlena header */ + int32 vl_len_; /* always EOH_HEADER_MAGIC, see below */ + + /* Pointer to methods required for object type */ + const ExpandedObjectMethods *eoh_methods; + + /* Memory context containing this header and subsidiary data */ + MemoryContext eoh_context; + + /* Standard R/W TOAST pointer for this object is kept here */ + char eoh_rw_ptr[EXPANDED_POINTER_SIZE]; + + /* Standard R/O TOAST pointer for this object is kept here */ + char eoh_ro_ptr[EXPANDED_POINTER_SIZE]; + }; + + /* + * Particularly for read-only functions, it is handy to be able to work with + * either regular "flat" varlena inputs or expanded inputs of the same data + * type. To allow determining which case an argument-fetching macro has + * returned, the first int32 of an ExpandedObjectHeader always contains -1 + * (EOH_HEADER_MAGIC to the code). This works since no 4-byte-header varlena + * could have that as its first 4 bytes. Caution: we could not reliably tell + * the difference between an ExpandedObjectHeader and a short-header object + * with this trick. However, it works fine for cases where the argument + * fetching code will return either a fully-uncompressed flat object or a + * expanded object. 
+ */ + #define EOH_HEADER_MAGIC (-1) + #define VARATT_IS_EXPANDED_HEADER(PTR) \ + (((ExpandedObjectHeader *) (PTR))->vl_len_ == EOH_HEADER_MAGIC) + + /* + * Generic support functions for expanded objects. + * (Some of these might be worth inlining later.) + */ + + extern ExpandedObjectHeader *DatumGetEOHP(Datum d); + extern void EOH_init_header(ExpandedObjectHeader *eohptr, + const ExpandedObjectMethods *methods, + MemoryContext obj_context); + extern Size EOH_get_flat_size(ExpandedObjectHeader *eohptr); + extern void EOH_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + extern bool DatumIsReadWriteExpandedObject(Datum d, bool isnull, int16 typlen); + extern Datum MakeExpandedObjectReadOnly(Datum d, bool isnull, int16 typlen); + extern Datum TransferExpandedObject(Datum d, MemoryContext new_parent); + extern void DeleteExpandedObject(Datum d); + + #endif /* EXPANDEDDATUM_H */ diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c index f364ce4..d021145 100644 *** a/src/pl/plpgsql/src/pl_comp.c --- b/src/pl/plpgsql/src/pl_comp.c *************** build_datatype(HeapTuple typeTup, int32 *** 2202,2207 **** --- 2202,2223 ---- typ->typbyval = typeStruct->typbyval; typ->typrelid = typeStruct->typrelid; typ->typioparam = getTypeIOParam(typeTup); + /* Detect if type is true array, or domain thereof */ + /* NB: this is only used to decide whether to apply expand_array */ + if (typeStruct->typtype == TYPTYPE_BASE) + { + /* this test should match what get_element_type() checks */ + typ->typisarray = (typeStruct->typlen == -1 && + OidIsValid(typeStruct->typelem)); + } + else if (typeStruct->typtype == TYPTYPE_DOMAIN) + { + /* we can short-circuit looking up base types if it's not varlena */ + typ->typisarray = (typeStruct->typlen == -1 && + OidIsValid(get_base_element_type(typeStruct->typbasetype))); + } + else + typ->typisarray = false; typ->collation = typeStruct->typcollation; if (OidIsValid(collation) && 
OidIsValid(typ->collation)) typ->collation = collation; diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index ae5421f..ae68afd 100644 *** a/src/pl/plpgsql/src/pl_exec.c --- b/src/pl/plpgsql/src/pl_exec.c *************** *** 32,37 **** --- 32,38 ---- #include "utils/array.h" #include "utils/builtins.h" #include "utils/datum.h" + #include "utils/expandeddatum.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/rel.h" *************** static void exec_assign_value(PLpgSQL_ex *** 171,176 **** --- 172,178 ---- Datum value, Oid valtype, bool *isNull); static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, + bool getrwpointer, Oid *typeid, int32 *typetypmod, Datum *value, *************** plpgsql_exec_function(PLpgSQL_function * *** 295,300 **** --- 297,310 ---- var->value = fcinfo->arg[i]; var->isnull = fcinfo->argnull[i]; var->freeval = false; + /* Hack to force array parameters into expanded form */ + if (!var->isnull && var->datatype->typisarray) + { + var->value = expand_array(var->value, + CurrentMemoryContext, + 0, 0, 0); + var->freeval = true; + } } break; *************** plpgsql_exec_function(PLpgSQL_function * *** 461,478 **** /* * If the function's return type isn't by value, copy the value ! * into upper executor memory context. */ if (!fcinfo->isnull && !func->fn_retbyval) ! { ! Size len; ! void *tmp; ! ! len = datumGetSize(estate.retval, false, func->fn_rettyplen); ! tmp = SPI_palloc(len); ! memcpy(tmp, DatumGetPointer(estate.retval), len); ! estate.retval = PointerGetDatum(tmp); ! } } } --- 471,484 ---- /* * If the function's return type isn't by value, copy the value ! * into upper executor memory context. However, if we have a R/W ! * expanded datum, we can just transfer its ownership out to the ! * upper executor context. */ if (!fcinfo->isnull && !func->fn_retbyval) ! estate.retval = SPI_datumTransfer(estate.retval, ! false, ! 
func->fn_rettyplen); } } *************** exec_assign_value(PLpgSQL_execstate *est *** 4059,4084 **** /* * If type is by-reference, copy the new value (which is * probably in the eval_econtext) into the procedure's memory ! * context. */ if (!var->datatype->typbyval && !*isNull) ! newvalue = datumCopy(newvalue, ! false, ! var->datatype->typlen); /* ! * Now free the old value. (We can't do this any earlier ! * because of the possibility that we are assigning the var's ! * old value to it, eg "foo := foo". We could optimize out ! * the assignment altogether in such cases, but it's too ! * infrequent to be worth testing for.) */ ! free_var(var); var->value = newvalue; var->isnull = *isNull; ! if (!var->datatype->typbyval && !*isNull) ! var->freeval = true; break; } --- 4065,4115 ---- /* * If type is by-reference, copy the new value (which is * probably in the eval_econtext) into the procedure's memory ! * context. But if it's a read/write reference to an expanded ! * object, no physical copy needs to happen; at most we need ! * to reparent the object's memory context. */ if (!var->datatype->typbyval && !*isNull) ! { ! newvalue = datumTransfer(newvalue, ! false, ! var->datatype->typlen); ! ! /* ! * If it's an array, force the value to be stored in ! * expanded form. This wins if the function later does, ! * eg, a lot of array subscripting operations on the ! * variable, and otherwise might lose badly. We might ! * need to use a different heuristic, but it's too soon to ! * tell. Also, what of cases where it'd be useful to ! * force non-array values into expanded form? ! */ ! if (var->datatype->typisarray && ! !VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(newvalue))) ! { ! Datum avalue; ! ! avalue = expand_array(newvalue, CurrentMemoryContext, ! 0, 0, 0); ! pfree(DatumGetPointer(newvalue)); ! newvalue = avalue; ! } ! } /* ! * Now free the old value, unless it's the same as the new ! * value (ie, we're doing "foo := foo"). Note that for ! 
* expanded objects, this test is necessary and cannot ! * reliably be made any earlier; we have to be looking at the ! * object's standard R/W pointer to be sure pointer equality ! * is meaningful. */ ! if (var->value != newvalue || var->isnull || *isNull) ! free_var(var); var->value = newvalue; var->isnull = *isNull; ! var->freeval = (!var->datatype->typbyval && !*isNull); break; } *************** exec_assign_value(PLpgSQL_execstate *est *** 4276,4282 **** } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM); /* Fetch current value of array datum */ ! exec_eval_datum(estate, target, &parenttypoid, &parenttypmod, &oldarraydatum, &oldarrayisnull); --- 4307,4313 ---- } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM); /* Fetch current value of array datum */ ! exec_eval_datum(estate, target, true, &parenttypoid, &parenttypmod, &oldarraydatum, &oldarrayisnull); *************** exec_assign_value(PLpgSQL_execstate *est *** 4424,4439 **** * * The type oid, typmod, value in Datum format, and null flag are returned. * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: caller must not modify the returned value, since it points right ! * at the stored value in the case of pass-by-reference datatypes. In some ! * cases we have to palloc a return value, and in such cases we put it into ! * the estate's short-term memory context. */ static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, Oid *typeid, int32 *typetypmod, Datum *value, --- 4455,4474 ---- * * The type oid, typmod, value in Datum format, and null flag are returned. * + * If getrwpointer is TRUE, we'll return a R/W pointer to any variable that + * is an expanded object; otherwise we return a R/O pointer. + * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: in most cases caller must not modify the returned value, since ! * it points right at the stored value in the case of pass-by-reference ! * datatypes. 
In some cases we have to palloc a return value, and in such
!  * cases we put it into the estate's short-term memory context.
   */
  static void
  exec_eval_datum(PLpgSQL_execstate *estate,
                  PLpgSQL_datum *datum,
+                 bool getrwpointer,
                  Oid *typeid,
                  int32 *typetypmod,
                  Datum *value,
*************** exec_eval_datum(PLpgSQL_execstate *estat
*** 4449,4455 ****
                  *typeid = var->datatype->typoid;
                  *typetypmod = var->datatype->atttypmod;
!                 *value = var->value;
                  *isnull = var->isnull;
                  break;
              }
--- 4484,4495 ----
                  *typeid = var->datatype->typoid;
                  *typetypmod = var->datatype->atttypmod;
!                 if (getrwpointer)
!                     *value = var->value;
!                 else
!                     *value = MakeExpandedObjectReadOnly(var->value,
!                                                         var->isnull,
!                                                         var->datatype->typlen);
                  *isnull = var->isnull;
                  break;
              }
*************** setup_param_list(PLpgSQL_execstate *esta
*** 5285,5291 ****
                  PLpgSQL_var *var = (PLpgSQL_var *) datum;
                  ParamExternData *prm = &paramLI->params[dno];
!                 prm->value = var->value;
                  prm->isnull = var->isnull;
                  prm->pflags = PARAM_FLAG_CONST;
                  prm->ptype = var->datatype->typoid;
--- 5325,5333 ----
                  PLpgSQL_var *var = (PLpgSQL_var *) datum;
                  ParamExternData *prm = &paramLI->params[dno];
!                 prm->value = MakeExpandedObjectReadOnly(var->value,
!                                                         var->isnull,
!                                                         var->datatype->typlen);
                  prm->isnull = var->isnull;
                  prm->pflags = PARAM_FLAG_CONST;
                  prm->ptype = var->datatype->typoid;
*************** plpgsql_param_fetch(ParamListInfo params
*** 5351,5357 ****
      /* OK, evaluate the value and store into the appropriate paramlist slot */
      datum = estate->datums[dno];
      prm = &params->params[dno];
!     exec_eval_datum(estate, datum,
                      &prm->ptype, &prmtypmod,
                      &prm->value, &prm->isnull);
  }
--- 5393,5399 ----
      /* OK, evaluate the value and store into the appropriate paramlist slot */
      datum = estate->datums[dno];
      prm = &params->params[dno];
!     exec_eval_datum(estate, datum, false,
                      &prm->ptype, &prmtypmod,
                      &prm->value, &prm->isnull);
  }
*************** make_tuple_from_row(PLpgSQL_execstate *e
*** 5543,5549 ****
          if (row->varnos[i] < 0) /* should not happen */
              elog(ERROR, "dropped rowtype entry for non-dropped column");
!         exec_eval_datum(estate, estate->datums[row->varnos[i]],
                          &fieldtypeid, &fieldtypmod,
                          &dvalues[i], &nulls[i]);
          if (fieldtypeid != tupdesc->attrs[i]->atttypid)
--- 5585,5591 ----
          if (row->varnos[i] < 0) /* should not happen */
              elog(ERROR, "dropped rowtype entry for non-dropped column");
!         exec_eval_datum(estate, estate->datums[row->varnos[i]], false,
                          &fieldtypeid, &fieldtypmod,
                          &dvalues[i], &nulls[i]);
          if (fieldtypeid != tupdesc->attrs[i]->atttypid)
*************** free_var(PLpgSQL_var *var)
*** 6336,6342 ****
  {
      if (var->freeval)
      {
!         pfree(DatumGetPointer(var->value));
          var->freeval = false;
      }
  }
--- 6378,6389 ----
  {
      if (var->freeval)
      {
!         if (DatumIsReadWriteExpandedObject(var->value,
!                                            var->isnull,
!                                            var->datatype->typlen))
!             DeleteExpandedObject(var->value);
!         else
!             pfree(DatumGetPointer(var->value));
          var->freeval = false;
      }
  }
*************** format_expr_params(PLpgSQL_execstate *es
*** 6543,6550 ****
          curvar = (PLpgSQL_var *) estate->datums[dno];
!         exec_eval_datum(estate, (PLpgSQL_datum *) curvar, &paramtypeid,
!                         &paramtypmod, &paramdatum, &paramisnull);
          appendStringInfo(&paramstr, "%s%s = ",
                           paramno > 0 ? ", " : "",
--- 6590,6598 ----
          curvar = (PLpgSQL_var *) estate->datums[dno];
!         exec_eval_datum(estate, (PLpgSQL_datum *) curvar, false,
!                         &paramtypeid, &paramtypmod,
!                         &paramdatum, &paramisnull);
          appendStringInfo(&paramstr, "%s%s = ",
                           paramno > 0 ?
", " : "", diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h index 00f2f77..c95087c 100644 *** a/src/pl/plpgsql/src/plpgsql.h --- b/src/pl/plpgsql/src/plpgsql.h *************** typedef struct *** 180,185 **** --- 180,186 ---- bool typbyval; Oid typrelid; Oid typioparam; + bool typisarray; /* is "true" array, or domain over one */ Oid collation; /* from pg_type, but can be overridden */ FmgrInfo typinput; /* lookup info for typinput function */ int32 atttypmod; /* typmod (taken from someplace else) */
On Sun, Feb 15, 2015 at 6:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm going to stick this into the commitfest even though it's not really
> close to being committable; I see some other people doing likewise with
> their pet patches ;-). What it could particularly do with some reviewing
> help on is exploring the performance changes it creates; what cases does
> it make substantially worse?

It's perfectly reasonable to add stuff that isn't committable yet to the CF app; the point of the CF app is to track what needs reviewing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attached is an 0.3 version, rebased over today's HEAD changes (applies to commit 9e3ad1aac52454569393a947c06be0d301749362 or later), and with some better logic for transferring expanded array values into and out of plpgsql functions. Using this example:

create or replace function arraysetnum(n int) returns numeric[] as $$
declare res numeric[] := '{}';
begin
  for i in 1 .. n loop
    res[i] := i;
  end loop;
  return res;
end
$$ language plpgsql strict;

create or replace function arraysumnum(arr numeric[]) returns numeric as $$
declare res numeric := 0;
begin
  for i in array_lower(arr, 1) .. array_upper(arr, 1) loop
    res := res + arr[i];
  end loop;
  return res;
end
$$ language plpgsql strict;

create or replace function arraytimenum(n int) returns numeric as $$
declare tmp numeric[];
begin
  tmp := arraysetnum(n);
  return arraysumnum(tmp);
end
$$ language plpgsql strict;

either of the test cases

select arraysumnum(arraysetnum(100000));
select arraytimenum(100000);

involve exactly one coercion from flat to expanded array (during the initial assignment of the '{}' constant to the "res" variable), no coercions from expanded to flat, and no bulk copy operations. So I'm starting to feel fairly good about this.

Obviously there's a nontrivial amount of work to do with integrating the array-code changes and teaching the rest of the array functions about expanded arrays (or at least as many of them as seem performance-critical). But that looks like just a few days of basically-mechanical effort.

A larger question is what we ought to do about extending the array-favoring hacks in plpgsql to support this type of optimization for non-built-in types. Realize that what this patch is able to improve are basically two types of cases:

* nests of function calls that take and return the same complex datatype, think foo(bar(baz(x))), where x is stored in some flat format but foo() bar() and baz() all agree on an expanded format that's easier to process.
* plpgsql variables stored in an expanded format that's easier to process for most functions that might work with their values.

The first case can be implemented by mutual agreement among the functions of the datatype; it does not need any additional help beyond what's in this patch. But the second case does not work very well unless plpgsql takes some proactive step to force variable values into the expanded format. Otherwise you get a win only if the last assignment to the variable happened to come from a source that supplied a read-write expanded value. You can make that happen with appropriate coding in the plpgsql function, of course, but it's klugy to have to do that.

I would not be ashamed to ship this in 9.5 as just an array optimization and leave the larger question for next time ... but it does feel a bit unfinished like this. OTOH, I'm not sure whether the PostGIS folk care all that much about the intermediate-values-in-plpgsql-variables scenario. They didn't bring it up in the discussion a year or so back about their requirements. We do know very well that plpgsql array variables are a performance pain point, so maybe fixing that is enough of a goal for 9.5.

(BTW, the nested-function-calls case sure seems like it's dead center of the wheelhouse for JSONB. Just sayin'. I do not myself have time to think about applying this technology to JSONB right now, but does anyone else want to step up?)

regards, tom lane

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 867035d..860ad78 100644
*** a/src/backend/access/common/heaptuple.c
--- b/src/backend/access/common/heaptuple.c
***************
*** 60,65 ****
--- 60,66 ----
  #include "access/sysattr.h"
  #include "access/tuptoaster.h"
  #include "executor/tuptable.h"
+ #include "utils/expandeddatum.h"
  /* Does att's datatype allow packing into the 1-byte-header varlena format?
*/
*************** heap_compute_data_size(TupleDesc tupleDe
*** 93,105 ****
      for (i = 0; i < numberOfAttributes; i++)
      {
          Datum val;

          if (isnull[i])
              continue;

          val = values[i];

!         if (ATT_IS_PACKABLE(att[i]) &&
              VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
          {
              /*
--- 94,108 ----
      for (i = 0; i < numberOfAttributes; i++)
      {
          Datum val;
+         Form_pg_attribute atti;

          if (isnull[i])
              continue;

          val = values[i];
+         atti = att[i];

!         if (ATT_IS_PACKABLE(atti) &&
              VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
          {
              /*
*************** heap_compute_data_size(TupleDesc tupleDe
*** 108,118 ****
               */
              data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val));
          }
          else
          {
!             data_length = att_align_datum(data_length, att[i]->attalign,
!                                           att[i]->attlen, val);
!             data_length = att_addlength_datum(data_length, att[i]->attlen,
                                                val);
          }
      }
--- 111,131 ----
               */
              data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val));
          }
+         else if (atti->attlen == -1 &&
+                  VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(val)))
+         {
+             /*
+              * we want to flatten the expanded value so that the constructed
+              * tuple doesn't depend on it
+              */
+             data_length = att_align_nominal(data_length, atti->attalign);
+             data_length += EOH_get_flat_size(DatumGetEOHP(val));
+         }
          else
          {
!             data_length = att_align_datum(data_length, atti->attalign,
!                                           atti->attlen, val);
!             data_length = att_addlength_datum(data_length, atti->attlen,
                                                val);
          }
      }
*************** heap_fill_tuple(TupleDesc tupleDesc,
*** 203,212 ****
              *infomask |= HEAP_HASVARWIDTH;
              if (VARATT_IS_EXTERNAL(val))
              {
!                 *infomask |= HEAP_HASEXTERNAL;
!                 /* no alignment, since it's short by definition */
!                 data_length = VARSIZE_EXTERNAL(val);
!                 memcpy(data, val, data_length);
              }
              else if (VARATT_IS_SHORT(val))
              {
--- 216,241 ----
              *infomask |= HEAP_HASVARWIDTH;
              if (VARATT_IS_EXTERNAL(val))
              {
!                 if (VARATT_IS_EXTERNAL_EXPANDED(val))
!                 {
!                     /*
!                      * we want to flatten the expanded value so that the
!                      * constructed tuple doesn't depend on it
!                      */
!                     ExpandedObjectHeader *eoh = DatumGetEOHP(values[i]);
!
!                     data = (char *) att_align_nominal(data,
!                                                       att[i]->attalign);
!                     data_length = EOH_get_flat_size(eoh);
!                     EOH_flatten_into(eoh, data, data_length);
!                 }
!                 else
!                 {
!                     *infomask |= HEAP_HASEXTERNAL;
!                     /* no alignment, since it's short by definition */
!                     data_length = VARSIZE_EXTERNAL(val);
!                     memcpy(data, val, data_length);
!                 }
              }
              else if (VARATT_IS_SHORT(val))
              {
diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index f8c1401..ebcbbc4 100644
*** a/src/backend/access/heap/tuptoaster.c
--- b/src/backend/access/heap/tuptoaster.c
***************
*** 37,42 ****
--- 37,43 ----
  #include "catalog/catalog.h"
  #include "common/pg_lzcompress.h"
  #include "miscadmin.h"
+ #include "utils/expandeddatum.h"
  #include "utils/fmgroids.h"
  #include "utils/rel.h"
  #include "utils/typcache.h"
*************** heap_tuple_fetch_attr(struct varlena * a
*** 130,135 ****
--- 131,149 ----
          result = (struct varlena *) palloc(VARSIZE_ANY(attr));
          memcpy(result, attr, VARSIZE_ANY(attr));
      }
+     else if (VARATT_IS_EXTERNAL_EXPANDED(attr))
+     {
+         /*
+          * This is an expanded-object pointer --- get flat format
+          */
+         ExpandedObjectHeader *eoh;
+         Size resultsize;
+
+         eoh = DatumGetEOHP(PointerGetDatum(attr));
+         resultsize = EOH_get_flat_size(eoh);
+         result = (struct varlena *) palloc(resultsize);
+         EOH_flatten_into(eoh, (void *) result, resultsize);
+     }
      else
      {
          /*
*************** heap_tuple_untoast_attr(struct varlena *
*** 196,201 ****
--- 210,224 ----
              attr = result;
          }
      }
+     else if (VARATT_IS_EXTERNAL_EXPANDED(attr))
+     {
+         /*
+          * This is an expanded-object pointer --- get flat format
+          */
+         attr = heap_tuple_fetch_attr(attr);
+         /* flatteners are not allowed to produce compressed/short output */
+         Assert(!VARATT_IS_EXTENDED(attr));
+     }
      else if (VARATT_IS_COMPRESSED(attr))
      {
          /*
*************** heap_tuple_untoast_attr_slice(struct var
*** 263,268 ****
--- 286,296 ----
          return heap_tuple_untoast_attr_slice(redirect.pointer,
                                               sliceoffset, slicelength);
      }
+     else if (VARATT_IS_EXTERNAL_EXPANDED(attr))
+     {
+         /* pass it off to heap_tuple_fetch_attr to flatten */
+         preslice = heap_tuple_fetch_attr(attr);
+     }
      else
          preslice = attr;

*************** toast_raw_datum_size(Datum value)
*** 344,349 ****
--- 372,381 ----

          return toast_raw_datum_size(PointerGetDatum(toast_pointer.pointer));
      }
+     else if (VARATT_IS_EXTERNAL_EXPANDED(attr))
+     {
+         result = EOH_get_flat_size(DatumGetEOHP(value));
+     }
      else if (VARATT_IS_COMPRESSED(attr))
      {
          /* here, va_rawsize is just the payload size */
*************** toast_datum_size(Datum value)
*** 400,405 ****
--- 432,441 ----

          return toast_datum_size(PointerGetDatum(toast_pointer.pointer));
      }
+     else if (VARATT_IS_EXTERNAL_EXPANDED(attr))
+     {
+         result = EOH_get_flat_size(DatumGetEOHP(value));
+     }
      else if (VARATT_IS_SHORT(attr))
      {
          result = VARSIZE_SHORT(attr);
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 753754d..a05d8b1 100644
*** a/src/backend/executor/execTuples.c
--- b/src/backend/executor/execTuples.c
***************
*** 88,93 ****
--- 88,94 ----
  #include "nodes/nodeFuncs.h"
  #include "storage/bufmgr.h"
  #include "utils/builtins.h"
+ #include "utils/expandeddatum.h"
  #include "utils/lsyscache.h"
  #include "utils/typcache.h"

*************** ExecCopySlot(TupleTableSlot *dstslot, Tu
*** 812,817 ****
--- 813,864 ----
      return ExecStoreTuple(newTuple, dstslot, InvalidBuffer, true);
  }

+ /* --------------------------------
+  *      ExecMakeSlotContentsReadOnly
+  *          Mark any R/W expanded datums in the slot as read-only.
+  *
+  * This is needed when a slot that might contain R/W datum references is to be
+  * used as input for general expression evaluation. Since the expression(s)
+  * might contain more than one Var referencing the same R/W datum, we could
+  * get wrong answers if functions acting on those Vars thought they could
+  * modify the expanded value in-place.
+  *
+  * For notational reasons, we return the same slot passed in.
+  * --------------------------------
+  */
+ TupleTableSlot *
+ ExecMakeSlotContentsReadOnly(TupleTableSlot *slot)
+ {
+     /*
+      * sanity checks
+      */
+     Assert(slot != NULL);
+     Assert(slot->tts_tupleDescriptor != NULL);
+     Assert(!slot->tts_isempty);
+
+     /*
+      * If the slot contains a physical tuple, it can't contain any expanded
+      * datums, because we flatten those when making a physical tuple. This
+      * might change later; but for now, we need do nothing unless the slot is
+      * virtual.
+      */
+     if (slot->tts_tuple == NULL)
+     {
+         Form_pg_attribute *att = slot->tts_tupleDescriptor->attrs;
+         int attnum;
+
+         for (attnum = 0; attnum < slot->tts_nvalid; attnum++)
+         {
+             slot->tts_values[attnum] =
+                 MakeExpandedObjectReadOnly(slot->tts_values[attnum],
+                                            slot->tts_isnull[attnum],
+                                            att[attnum]->attlen);
+         }
+     }
+
+     return slot;
+ }
+

  /* ----------------------------------------------------------------
   *      convenience initialization routines
diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c
index 3f66e24..e5d1e54 100644
*** a/src/backend/executor/nodeSubqueryscan.c
--- b/src/backend/executor/nodeSubqueryscan.c
*************** SubqueryNext(SubqueryScanState *node)
*** 56,62 ****
--- 56,70 ----
       * We just return the subplan's result slot, rather than expending extra
       * cycles for ExecCopySlot(). (Our own ScanTupleSlot is used only for
       * EvalPlanQual rechecks.)
+      *
+      * We do need to mark the slot contents read-only to prevent interference
+      * between different functions reading the same datum from the slot. It's
+      * a bit hokey to do this to the subplan's slot, but should be safe
+      * enough.
       */
+     if (!TupIsNull(slot))
+         slot = ExecMakeSlotContentsReadOnly(slot);
+
      return slot;
  }

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 4b86e91..daa9f69 100644
*** a/src/backend/executor/spi.c
--- b/src/backend/executor/spi.c
*************** SPI_pfree(void *pointer)
*** 1014,1019 ****
--- 1014,1040 ----
      pfree(pointer);
  }

+ Datum
+ SPI_datumTransfer(Datum value, bool typByVal, int typLen)
+ {
+     MemoryContext oldcxt = NULL;
+     Datum result;
+
+     if (_SPI_curid + 1 == _SPI_connected) /* connected */
+     {
+         if (_SPI_current != &(_SPI_stack[_SPI_curid + 1]))
+             elog(ERROR, "SPI stack corrupted");
+         oldcxt = MemoryContextSwitchTo(_SPI_current->savedcxt);
+     }
+
+     result = datumTransfer(value, typByVal, typLen);
+
+     if (oldcxt)
+         MemoryContextSwitchTo(oldcxt);
+
+     return result;
+ }
+
  void
  SPI_freetuple(HeapTuple tuple)
  {
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 20e5ff1..d1ed33f 100644
*** a/src/backend/utils/adt/Makefile
--- b/src/backend/utils/adt/Makefile
*************** endif
*** 16,25 ****
  endif

  # keep this list arranged alphabetically or it gets to be a mess
! OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \
!     array_userfuncs.o arrayutils.o ascii.o bool.o \
!     cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \
!     encode.o enum.o float.o format_type.o formatting.o genfile.o \
      geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \
      int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \
      jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \
--- 16,26 ----
  endif

  # keep this list arranged alphabetically or it gets to be a mess
! OBJS = acl.o arrayfuncs.o array_expanded.o array_selfuncs.o \
!     array_typanalyze.o array_userfuncs.o arrayutils.o ascii.o \
!     bool.o cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \
!     encode.o enum.o expandeddatum.o \
!     float.o format_type.o formatting.o genfile.o \
      geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \
      int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \
      jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \
diff --git a/src/backend/utils/adt/array_expanded.c b/src/backend/utils/adt/array_expanded.c
index ...4879a75 .
*** a/src/backend/utils/adt/array_expanded.c
--- b/src/backend/utils/adt/array_expanded.c
***************
*** 0 ****
--- 1,1101 ----
+ /*-------------------------------------------------------------------------
+  *
+  * array_expanded.c
+  *    Functions for manipulating expanded arrays.
+  *
+  * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *
+  * IDENTIFICATION
+  *    src/backend/utils/adt/array_expanded.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ #include "postgres.h"
+
+ #include "access/tupmacs.h"
+ #include "utils/array.h"
+ #include "utils/builtins.h"
+ #include "utils/datum.h"
+ #include "utils/expandeddatum.h"
+ #include "utils/lsyscache.h"
+ #include "utils/memutils.h"
+
+
+ /*
+  * An expanded array is contained within a private memory context (as
+  * all expanded objects must be) and has a control structure as below.
+  *
+  * The expanded array might contain a regular "flat" array if that was the
+  * original input and we've not modified it significantly. Otherwise, the
+  * contents are represented by Datum/isnull arrays plus dimensionality and
+  * type information. We could also have both forms, if we've deconstructed
+  * the original array for access purposes but not yet changed it. For pass-
+  * by-reference element types, the Datums would point into the flat array in
+  * this situation. Once we start modifying array elements, new pass-by-ref
+  * elements are separately palloc'd within the memory context.
+  */
+ #define EA_MAGIC 689375833 /* ID for debugging crosschecks */
+
+ typedef struct ExpandedArrayHeader
+ {
+     /* Standard header for expanded objects */
+     ExpandedObjectHeader hdr;
+
+     /* Magic value identifying an expanded array (for debugging only) */
+     int ea_magic;
+
+     /* Dimensionality info (always valid) */
+     int ndims; /* # of dimensions */
+     int *dims; /* array dimensions */
+     int *lbound; /* index lower bounds for each dimension */
+
+     /* Element type info (always valid) */
+     Oid element_type; /* element type OID */
+     int16 typlen; /* needed info about element datatype */
+     bool typbyval;
+     char typalign;
+
+     /*
+      * If we have a Datum-array representation of the array, it's kept here;
+      * else dvalues/dnulls are NULL. The dvalues and dnulls arrays are always
+      * palloc'd within the object private context, but may change size from
+      * time to time. For pass-by-ref element types, dvalues entries might
+      * point either into the fstartptr..fendptr area, or to separately
+      * palloc'd chunks. Elements should always be fully detoasted, as they
+      * are in the standard flat representation.
+      *
+      * Even when dvalues is valid, dnulls can be NULL if there are no null
+      * elements.
+      */
+     Datum *dvalues; /* array of Datums */
+     bool *dnulls; /* array of is-null flags for Datums */
+     int dvalueslen; /* allocated length of above arrays */
+     int nelems; /* number of valid entries in above arrays */
+
+     /*
+      * flat_size is the current space requirement for the flat equivalent of
+      * the expanded array, if known; otherwise it's 0. We store this to make
+      * consecutive calls of get_flat_size cheap.
+      */
+     Size flat_size;
+
+     /*
+      * fvalue points to the flat representation if it is valid, else it is
+      * NULL. If we have or ever had a flat representation then
+      * fstartptr/fendptr point to the start and end+1 of its data area; this
+      * is so that we can tell which Datum pointers point into the flat
+      * representation rather than being pointers to separately palloc'd data.
+      */
+     ArrayType *fvalue; /* must be a fully detoasted array */
+     char *fstartptr; /* start of its data area */
+     char *fendptr; /* end+1 of its data area */
+ } ExpandedArrayHeader;
+
+ /* "Methods" required for an expanded object */
+ static Size EA_get_flat_size(ExpandedObjectHeader *eohptr);
+ static void EA_flatten_into(ExpandedObjectHeader *eohptr,
+                             void *result, Size allocated_size);
+
+ static const ExpandedObjectMethods EA_methods =
+ {
+     EA_get_flat_size,
+     EA_flatten_into
+ };
+
+ /*
+  * Functions that can handle either a "flat" varlena array or an expanded
+  * array use this union to work with their input.
+  */
+ typedef union AnyArrayType
+ {
+     ArrayType flt;
+     ExpandedArrayHeader xpn;
+ } AnyArrayType;
+
+ /*
+  * Macros for working with AnyArrayType inputs. Beware multiple references!
+  */
+ #define AARR_NDIM(a) \
+     (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.ndims : ARR_NDIM(&(a)->flt))
+ #define AARR_HASNULL(a) \
+     (VARATT_IS_EXPANDED_HEADER(a) ? \
+      ((a)->xpn.dvalues != NULL ? (a)->xpn.dnulls != NULL : ARR_HASNULL((a)->xpn.fvalue)) : \
+      ARR_HASNULL(&(a)->flt))
+ #define AARR_ELEMTYPE(a) \
+     (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.element_type : ARR_ELEMTYPE(&(a)->flt))
+ #define AARR_DIMS(a) \
+     (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.dims : ARR_DIMS(&(a)->flt))
+ #define AARR_LBOUND(a) \
+     (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.lbound : ARR_LBOUND(&(a)->flt))
+
+
+ /*
+  * expand_array: convert an array Datum into an expanded array
+  *
+  * The expanded object will be a child of parentcontext.
+  *
+  * Caller can provide element type's representational data; we do that because
+  * caller is often in a position to cache it across repeated calls. If the
+  * caller can't do that, pass zeroes for elmlen/elmbyval/elmalign.
+  */
+ Datum
+ expand_array(Datum arraydatum, MemoryContext parentcontext,
+              int elmlen, bool elmbyval, char elmalign)
+ {
+     ArrayType *array;
+     ExpandedArrayHeader *eah;
+     MemoryContext objcxt;
+     MemoryContext oldcxt;
+
+     /* allocate private context for expanded object */
+     /* TODO: should we use some other memory context size parameters? */
+     objcxt = AllocSetContextCreate(parentcontext,
+                                    "expanded array",
+                                    ALLOCSET_DEFAULT_MINSIZE,
+                                    ALLOCSET_DEFAULT_INITSIZE,
+                                    ALLOCSET_DEFAULT_MAXSIZE);
+
+     /* set up expanded array header */
+     eah = (ExpandedArrayHeader *)
+         MemoryContextAlloc(objcxt, sizeof(ExpandedArrayHeader));
+
+     EOH_init_header(&eah->hdr, &EA_methods, objcxt);
+     eah->ea_magic = EA_MAGIC;
+
+     /*
+      * Detoast and copy original array into private context, as a flat array.
+      * We flatten it even if it's in expanded form; it's not clear that adding
+      * a special-case path for that would be worth the trouble.
+      *
+      * Note that this coding risks leaking some memory in the private context
+      * if we have to fetch data from a TOAST table; however, experimentation
+      * says that the leak is minimal. Doing it this way saves a copy step,
+      * which seems worthwhile, especially if the array is large enough to need
+      * external storage.
+      */
+     oldcxt = MemoryContextSwitchTo(objcxt);
+     array = DatumGetArrayTypePCopy(arraydatum);
+     MemoryContextSwitchTo(oldcxt);
+
+     eah->ndims = ARR_NDIM(array);
+     /* note these pointers point into the fvalue header! */
+     eah->dims = ARR_DIMS(array);
+     eah->lbound = ARR_LBOUND(array);
+
+     /* save array's element-type data for possible use later */
+     eah->element_type = ARR_ELEMTYPE(array);
+     if (elmlen)
+     {
+         /* Caller provided representational data */
+         eah->typlen = elmlen;
+         eah->typbyval = elmbyval;
+         eah->typalign = elmalign;
+     }
+     else
+     {
+         /* No, so look it up */
+         get_typlenbyvalalign(eah->element_type,
+                              &eah->typlen,
+                              &eah->typbyval,
+                              &eah->typalign);
+     }
+
+     /* we don't make a deconstructed representation now */
+     eah->dvalues = NULL;
+     eah->dnulls = NULL;
+     eah->dvalueslen = 0;
+     eah->nelems = 0;
+     eah->flat_size = 0;
+
+     /* remember we have a flat representation */
+     eah->fvalue = array;
+     eah->fstartptr = ARR_DATA_PTR(array);
+     eah->fendptr = ((char *) array) + ARR_SIZE(array);
+
+     /* return a R/W pointer to the expanded array */
+     return PointerGetDatum(eah->hdr.eoh_rw_ptr);
+ }
+
+ /*
+  * construct_empty_expanded_array: make an empty expanded array
+  * given only type information. (elmlen etc can be zeroes.)
+  */
+ static ExpandedArrayHeader *
+ construct_empty_expanded_array(Oid element_type,
+                                MemoryContext parentcontext,
+                                int elmlen, bool elmbyval, char elmalign)
+ {
+     ArrayType *array = construct_empty_array(element_type);
+     Datum d;
+
+     d = expand_array(PointerGetDatum(array), parentcontext,
+                      elmlen, elmbyval, elmalign);
+     return (ExpandedArrayHeader *) DatumGetEOHP(d);
+ }
+
+
+ /*
+  * get_flat_size method for expanded arrays
+  */
+ static Size
+ EA_get_flat_size(ExpandedObjectHeader *eohptr)
+ {
+     ExpandedArrayHeader *eah = (ExpandedArrayHeader *) eohptr;
+     int nelems;
+     int ndims;
+     Datum *dvalues;
+     bool *dnulls;
+     Size nbytes;
+     int i;
+
+     Assert(eah->ea_magic == EA_MAGIC);
+
+     /* Easy if we have a valid flattened value */
+     if (eah->fvalue)
+         return ARR_SIZE(eah->fvalue);
+
+     /* If we have a cached size value, believe that */
+     if (eah->flat_size)
+         return eah->flat_size;
+
+     /*
+      * Compute space needed by examining dvalues/dnulls. Note that the result
+      * array will have a nulls bitmap if dnulls isn't NULL, even if the array
+      * doesn't actually contain any nulls now.
+      */
+     nelems = eah->nelems;
+     ndims = eah->ndims;
+     Assert(nelems == ArrayGetNItems(ndims, eah->dims));
+     dvalues = eah->dvalues;
+     dnulls = eah->dnulls;
+     nbytes = 0;
+     for (i = 0; i < nelems; i++)
+     {
+         if (dnulls && dnulls[i])
+             continue;
+         nbytes = att_addlength_datum(nbytes, eah->typlen, dvalues[i]);
+         nbytes = att_align_nominal(nbytes, eah->typalign);
+         /* check for overflow of total request */
+         if (!AllocSizeIsValid(nbytes))
+             ereport(ERROR,
+                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+                      errmsg("array size exceeds the maximum allowed (%d)",
+                             (int) MaxAllocSize)));
+     }
+
+     if (dnulls)
+         nbytes += ARR_OVERHEAD_WITHNULLS(ndims, nelems);
+     else
+         nbytes += ARR_OVERHEAD_NONULLS(ndims);
+
+     /* cache for next time */
+     eah->flat_size = nbytes;
+
+     return nbytes;
+ }
+
+ /*
+  * flatten_into method for expanded arrays
+  */
+ static void
+ EA_flatten_into(ExpandedObjectHeader *eohptr,
+                 void *result, Size allocated_size)
+ {
+     ExpandedArrayHeader *eah = (ExpandedArrayHeader *) eohptr;
+     ArrayType *aresult = (ArrayType *) result;
+     int nelems;
+     int ndims;
+     int32 dataoffset;
+
+     Assert(eah->ea_magic == EA_MAGIC);
+
+     /* Easy if we have a valid flattened value */
+     if (eah->fvalue)
+     {
+         Assert(allocated_size == ARR_SIZE(eah->fvalue));
+         memcpy(result, eah->fvalue, allocated_size);
+         return;
+     }
+
+     /* Else allocation should match previous get_flat_size result */
+     Assert(allocated_size == eah->flat_size);
+
+     /* Fill result array from dvalues/dnulls */
+     nelems = eah->nelems;
+     ndims = eah->ndims;
+
+     if (eah->dnulls)
+         dataoffset = ARR_OVERHEAD_WITHNULLS(ndims, nelems);
+     else
+         dataoffset = 0; /* marker for no null bitmap */
+
+     /* We must ensure that any pad space is zero-filled */
+     memset(aresult, 0, allocated_size);
+
+     SET_VARSIZE(aresult, allocated_size);
+     aresult->ndim = ndims;
+     aresult->dataoffset = dataoffset;
+     aresult->elemtype = eah->element_type;
+     memcpy(ARR_DIMS(aresult), eah->dims, ndims * sizeof(int));
+     memcpy(ARR_LBOUND(aresult), eah->lbound, ndims * sizeof(int));
+
+     CopyArrayEls(aresult,
+                  eah->dvalues, eah->dnulls, nelems,
+                  eah->typlen, eah->typbyval, eah->typalign,
+                  false);
+ }
+
+ /*
+  * Argument fetching support code
+  */
+
+ #ifdef NOT_YET_USED
+
+ /*
+  * DatumGetExpandedArray: get a writable expanded array from an input argument
+  */
+ static ExpandedArrayHeader *
+ DatumGetExpandedArray(Datum d)
+ {
+     ExpandedArrayHeader *eah;
+
+     /* If it's a writable expanded array already, just return it */
+     if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)))
+     {
+         eah = (ExpandedArrayHeader *) DatumGetEOHP(d);
+         Assert(eah->ea_magic == EA_MAGIC);
+         return eah;
+     }
+
+     /*
+      * If it's a non-writable expanded array, copy it, extracting the element
+      * representational data to save a catalog lookup.
+      */
+     if (VARATT_IS_EXTERNAL_EXPANDED_RO(DatumGetPointer(d)))
+     {
+         eah = (ExpandedArrayHeader *) DatumGetEOHP(d);
+         Assert(eah->ea_magic == EA_MAGIC);
+         d = expand_array(d, CurrentMemoryContext,
+                          eah->typlen, eah->typbyval, eah->typalign);
+         return (ExpandedArrayHeader *) DatumGetEOHP(d);
+     }
+
+     /* Else expand the hard way */
+     d = expand_array(d, CurrentMemoryContext, 0, 0, 0);
+     return (ExpandedArrayHeader *) DatumGetEOHP(d);
+ }
+
+ #define PG_GETARG_EXPANDED_ARRAY(n) DatumGetExpandedArray(PG_GETARG_DATUM(n))
+
+ #endif
+
+ /*
+  * As above, when caller has the ability to cache element type info
+  */
+ static ExpandedArrayHeader *
+ DatumGetExpandedArrayX(Datum d,
+                        int elmlen, bool elmbyval, char elmalign)
+ {
+     ExpandedArrayHeader *eah;
+
+     /* If it's a writable expanded array already, just return it */
+     if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)))
+     {
+         eah = (ExpandedArrayHeader *) DatumGetEOHP(d);
+         Assert(eah->ea_magic == EA_MAGIC);
+         Assert(eah->typlen == elmlen);
+         Assert(eah->typbyval == elmbyval);
+         Assert(eah->typalign == elmalign);
+         return eah;
+     }
+
+     /* Else expand using caller's data */
+     d = expand_array(d, CurrentMemoryContext, elmlen, elmbyval, elmalign);
+     return (ExpandedArrayHeader *) DatumGetEOHP(d);
+ }
+
+ #define PG_GETARG_EXPANDED_ARRAYX(n, elmlen, elmbyval, elmalign) \
+     DatumGetExpandedArrayX(PG_GETARG_DATUM(n), elmlen, elmbyval, elmalign)
+
+ /*
+  * DatumGetAnyArray: return either an expanded array or a detoasted varlena
+  * array. The result must not be modified in-place.
+  */
+ static AnyArrayType *
+ DatumGetAnyArray(Datum d)
+ {
+     ExpandedArrayHeader *eah;
+
+     /*
+      * If it's an expanded array (RW or RO), return the header pointer.
+      */
+     if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(d)))
+     {
+         eah = (ExpandedArrayHeader *) DatumGetEOHP(d);
+         Assert(eah->ea_magic == EA_MAGIC);
+         return (AnyArrayType *) eah;
+     }
+
+     /* Else do regular detoasting as needed */
+     return (AnyArrayType *) PG_DETOAST_DATUM(d);
+ }
+
+ #define PG_GETARG_ANY_ARRAY(n) DatumGetAnyArray(PG_GETARG_DATUM(n))
+
+ /*
+  * Create the Datum/isnull representation if we didn't do so previously
+  */
+ static void
+ deconstruct_expanded_array(ExpandedArrayHeader *eah)
+ {
+     if (eah->dvalues == NULL)
+     {
+         MemoryContext oldcxt = MemoryContextSwitchTo(eah->hdr.eoh_context);
+         Datum *dvalues;
+         bool *dnulls;
+         int nelems;
+
+         dnulls = NULL;
+         deconstruct_array(eah->fvalue,
+                           eah->element_type,
+                           eah->typlen, eah->typbyval, eah->typalign,
+                           &dvalues,
+                           ARR_HASNULL(eah->fvalue) ? &dnulls : NULL,
+                           &nelems);
+
+         /*
+          * Update header only after successful completion of this step. If
+          * deconstruct_array fails partway through, worst consequence is some
+          * leaked memory in the object's context. If the caller fails at a
+          * later point, that's fine, since the deconstructed representation is
+          * valid anyhow.
+          */
+         eah->dvalues = dvalues;
+         eah->dnulls = dnulls;
+         eah->dvalueslen = eah->nelems = nelems;
+         MemoryContextSwitchTo(oldcxt);
+     }
+ }
+
+ /*
+  * Equivalent of array_get_element() for an expanded array
+  */
+ Datum
+ array_get_element_expanded(Datum arraydatum,
+                            int nSubscripts, int *indx,
+                            int arraytyplen,
+                            int elmlen, bool elmbyval, char elmalign,
+                            bool *isNull)
+ {
+     ExpandedArrayHeader *eah;
+     int i,
+         ndim,
+         *dim,
+         *lb,
+         offset;
+     Datum *dvalues;
+     bool *dnulls;
+
+     eah = (ExpandedArrayHeader *) DatumGetEOHP(arraydatum);
+     Assert(eah->ea_magic == EA_MAGIC);
+
+     /* sanity-check caller's info against object */
+     Assert(arraytyplen == -1);
+     Assert(elmlen == eah->typlen);
+     Assert(elmbyval == eah->typbyval);
+     Assert(elmalign == eah->typalign);
+
+     ndim = eah->ndims;
+     dim = eah->dims;
+     lb = eah->lbound;
+
+     /*
+      * Return NULL for invalid subscript
+      */
+     if (ndim != nSubscripts || ndim <= 0 || ndim > MAXDIM)
+     {
+         *isNull = true;
+         return (Datum) 0;
+     }
+     for (i = 0; i < ndim; i++)
+     {
+         if (indx[i] < lb[i] || indx[i] >= (dim[i] + lb[i]))
+         {
+             *isNull = true;
+             return (Datum) 0;
+         }
+     }
+
+     /*
+      * Calculate the element number
+      */
+     offset = ArrayGetOffset(nSubscripts, dim, lb, indx);
+
+     /*
+      * Deconstruct array if we didn't already. Note that we apply this even
+      * if the input is nominally read-only: it should be safe enough.
+      */
+     deconstruct_expanded_array(eah);
+
+     dvalues = eah->dvalues;
+     dnulls = eah->dnulls;
+
+     /*
+      * Check for NULL array element
+      */
+     if (dnulls && dnulls[offset])
+     {
+         *isNull = true;
+         return (Datum) 0;
+     }
+
+     /*
+      * OK, get the element. It's OK to return a pass-by-ref value as a
+      * pointer into the expanded array, for the same reason that
+      * array_get_element can return a pointer into flat arrays: the value is
+      * assumed not to change for as long as the Datum reference can exist.
+      */
+     *isNull = false;
+     return dvalues[offset];
+ }
+
+ /*
+  * Equivalent of array_set_element() for an expanded array
+  *
+  * array_set_element took care of detoasting dataValue, the rest is up to us
+  *
+  * Note: as with any operation on a read/write expanded object, we must
+  * take pains not to leave the object in a corrupt state if we fail partway
+  * through.
+  */
+ Datum
+ array_set_element_expanded(Datum arraydatum,
+                            int nSubscripts, int *indx,
+                            Datum dataValue, bool isNull,
+                            int arraytyplen,
+                            int elmlen, bool elmbyval, char elmalign)
+ {
+     ExpandedArrayHeader *eah;
+     Datum *dvalues;
+     bool *dnulls;
+     int i,
+         ndim,
+         dim[MAXDIM],
+         lb[MAXDIM],
+         offset;
+     bool dimschanged,
+          newhasnulls;
+     int addedbefore,
+         addedafter;
+     char *oldValue;
+
+     /* Convert to R/W object if not so already */
+     if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(arraydatum)))
+         arraydatum = expand_array(arraydatum, CurrentMemoryContext,
+                                   elmlen, elmbyval, elmalign);
+
+     eah = (ExpandedArrayHeader *) DatumGetEOHP(arraydatum);
+     Assert(eah->ea_magic == EA_MAGIC);
+
+     /* sanity-check caller's info against object */
+     Assert(arraytyplen == -1);
+     Assert(elmlen == eah->typlen);
+     Assert(elmbyval == eah->typbyval);
+     Assert(elmalign == eah->typalign);
+
+     /*
+      * Copy dimension info into local storage. This allows us to modify the
+      * dimensions if needed, while not messing up the expanded value if we
+      * fail partway through.
+      */
+     ndim = eah->ndims;
+     Assert(ndim >= 0 && ndim <= MAXDIM);
+     memcpy(dim, eah->dims, ndim * sizeof(int));
+     memcpy(lb, eah->lbound, ndim * sizeof(int));
+     dimschanged = false;
+
+     /*
+      * if number of dims is zero, i.e. an empty array, create an array with
+      * nSubscripts dimensions, and set the lower bounds to the supplied
+      * subscripts.
+      */
+     if (ndim == 0)
+     {
+         /*
+          * Allocate adequate space for new dimension info. This is harmless
+          * if we fail later.
+          */
+         Assert(nSubscripts > 0 && nSubscripts <= MAXDIM);
+         eah->dims = (int *) MemoryContextAllocZero(eah->hdr.eoh_context,
+                                                    nSubscripts * sizeof(int));
+         eah->lbound = (int *) MemoryContextAllocZero(eah->hdr.eoh_context,
+                                                      nSubscripts * sizeof(int));
+
+         /* Update local copies of dimension info */
+         ndim = nSubscripts;
+         for (i = 0; i < nSubscripts; i++)
+         {
+             dim[i] = 0;
+             lb[i] = indx[i];
+         }
+         dimschanged = true;
+     }
+     else if (ndim != nSubscripts)
+         ereport(ERROR,
+                 (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+                  errmsg("wrong number of array subscripts")));
+
+     /*
+      * Deconstruct array if we didn't already. (Someday maybe add a special
+      * case path for fixed-length, no-nulls cases, where we can overwrite an
+      * element in place without ever deconstructing. But today is not that
+      * day.)
+      */
+     deconstruct_expanded_array(eah);
+
+     /*
+      * Copy new element into array's context, if needed (we assume it's
+      * already detoasted, so no junk should be created). If we fail further
+      * down, this memory is leaked, but that's reasonably harmless.
+      */
+     if (!eah->typbyval && !isNull)
+     {
+         MemoryContext oldcxt = MemoryContextSwitchTo(eah->hdr.eoh_context);
+
+         dataValue = datumCopy(dataValue, false, eah->typlen);
+         MemoryContextSwitchTo(oldcxt);
+     }
+
+     dvalues = eah->dvalues;
+     dnulls = eah->dnulls;
+
+     newhasnulls = ((dnulls != NULL) || isNull);
+     addedbefore = addedafter = 0;
+
+     /*
+      * Check subscripts (this logic matches original array_set_element)
+      */
+     if (ndim == 1)
+     {
+         if (indx[0] < lb[0])
+         {
+             addedbefore = lb[0] - indx[0];
+             dim[0] += addedbefore;
+             lb[0] = indx[0];
+             dimschanged = true;
+             if (addedbefore > 1)
+                 newhasnulls = true; /* will insert nulls */
+         }
+         if (indx[0] >= (dim[0] + lb[0]))
+         {
+             addedafter = indx[0] - (dim[0] + lb[0]) + 1;
+             dim[0] += addedafter;
+             dimschanged = true;
+             if (addedafter > 1)
+                 newhasnulls = true; /* will insert nulls */
+         }
+     }
+     else
+     {
+         /*
+          * XXX currently we do not support extending multi-dimensional arrays
+          * during assignment
+          */
+         for (i = 0; i < ndim; i++)
+         {
+             if (indx[i] < lb[i] ||
+                 indx[i] >= (dim[i] + lb[i]))
+                 ereport(ERROR,
+                         (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+                          errmsg("array subscript out of range")));
+         }
+     }
+
+     /* Now we can calculate linear offset of target item in array */
+     offset = ArrayGetOffset(nSubscripts, dim, lb, indx);
+
+     /* Physically enlarge existing dvalues/dnulls arrays if needed */
+     if (dim[0] > eah->dvalueslen)
+     {
+         /* We want some extra space if we're enlarging */
+         int newlen = dim[0] + dim[0] / 8;
+
+         eah->dvalues = dvalues = (Datum *)
+             repalloc(dvalues, newlen * sizeof(Datum));
+         if (dnulls)
+             eah->dnulls = dnulls = (bool *)
+                 repalloc(dnulls, newlen * sizeof(bool));
+         eah->dvalueslen = newlen;
+     }
+
+     /*
+      * If we need a nulls bitmap and don't already have one, create it, being
+      * sure to mark all existing entries as not null.
+      */
+     if (newhasnulls && dnulls == NULL)
+         eah->dnulls = dnulls = (bool *)
+             MemoryContextAllocZero(eah->hdr.eoh_context,
+                                    eah->dvalueslen * sizeof(bool));
+
+     /*
+      * We now have all the needed space allocated, so we're ready to make
+      * irreversible changes. Be very wary of allowing failure below here.
+      */
+
+     /* Flattened value will no longer represent array accurately */
+     eah->fvalue = NULL;
+     /* And we don't know the flattened size either */
+     eah->flat_size = 0;
+
+     /* Update dimensionality info if needed */
+     if (dimschanged)
+     {
+         eah->ndims = ndim;
+         memcpy(eah->dims, dim, ndim * sizeof(int));
+         memcpy(eah->lbound, lb, ndim * sizeof(int));
+     }
+
+     /* Reposition items if needed, and fill addedbefore items with nulls */
+     if (addedbefore > 0)
+     {
+         memmove(dvalues + addedbefore, dvalues, eah->nelems * sizeof(Datum));
+         for (i = 0; i < addedbefore; i++)
+             dvalues[i] = (Datum) 0;
+         if (dnulls)
+         {
+             memmove(dnulls + addedbefore, dnulls, eah->nelems * sizeof(bool));
+             for (i = 0; i < addedbefore; i++)
+                 dnulls[i] = true;
+         }
+         eah->nelems += addedbefore;
+     }
+
+     /* fill addedafter items with nulls */
+     if (addedafter > 0)
+     {
+         for (i = 0; i < addedafter; i++)
+             dvalues[eah->nelems + i] = (Datum) 0;
+         if (dnulls)
+         {
+             for (i = 0; i < addedafter; i++)
+                 dnulls[eah->nelems + i] = true;
+         }
+         eah->nelems += addedafter;
+     }
+
+     /* Grab old element value for pfree'ing, if needed. */
+     if (!eah->typbyval && (dnulls == NULL || !dnulls[offset]))
+         oldValue = (char *) DatumGetPointer(dvalues[offset]);
+     else
+         oldValue = NULL;
+
+     /* And finally we can insert the new element. */
+     dvalues[offset] = dataValue;
+     if (dnulls)
+         dnulls[offset] = isNull;
+
+     /*
+      * Free old element if needed; this keeps repeated element replacements
+      * from bloating the array's storage. If the pfree somehow fails, it
+      * won't corrupt the array.
+      */
+     if (oldValue)
+     {
+         /* Don't try to pfree a part of the original flat array */
+         if (oldValue < eah->fstartptr || oldValue >= eah->fendptr)
+             pfree(oldValue);
+     }
+
+     /* Done, return standard TOAST pointer for object */
+     return PointerGetDatum(eah->hdr.eoh_rw_ptr);
+ }
+
+ /*
+  * Reimplementation of array_push for expanded arrays
+  */
+ typedef struct ArrayPushState
+ {
+     Oid arg0_typeid;
+     Oid arg1_typeid;
+     bool array_on_left;
+     Oid element_type;
+     int16 typlen;
+     bool typbyval;
+     char typalign;
+ } ArrayPushState;
+
+ Datum
+ array_push_expanded(PG_FUNCTION_ARGS)
+ {
+     ExpandedArrayHeader *eah;
+     Datum result;
+     Datum newelem;
+     bool isNull;
+     int *dimv,
+         *lb;
+     int indx;
+     int lb0;
+     Oid element_type;
+     int16 typlen;
+     bool typbyval;
+     char typalign;
+     Oid arg0_typeid = get_fn_expr_argtype(fcinfo->flinfo, 0);
+     Oid arg1_typeid = get_fn_expr_argtype(fcinfo->flinfo, 1);
+     ArrayPushState *my_extra;
+
+     if (arg0_typeid == InvalidOid || arg1_typeid == InvalidOid)
+         ereport(ERROR,
+                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                  errmsg("could not determine input data types")));
+
+     /*
+      * We arrange to look up info about element type only once per series of
+      * calls, assuming the element type doesn't change underneath us.
+      */
+     my_extra = (ArrayPushState *) fcinfo->flinfo->fn_extra;
+     if (my_extra == NULL)
+     {
+         fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+                                                       sizeof(ArrayPushState));
+         my_extra = (ArrayPushState *) fcinfo->flinfo->fn_extra;
+         my_extra->arg0_typeid = InvalidOid;
+     }
+
+     if (my_extra->arg0_typeid != arg0_typeid ||
+         my_extra->arg1_typeid != arg1_typeid)
+     {
+         /* Determine which input is the array */
+         Oid arg0_elemid = get_element_type(arg0_typeid);
+         Oid arg1_elemid = get_element_type(arg1_typeid);
+
+         if (arg0_elemid != InvalidOid)
+         {
+             my_extra->array_on_left = true;
+             element_type = arg0_elemid;
+         }
+         else if (arg1_elemid != InvalidOid)
+         {
+             my_extra->array_on_left = false;
+             element_type = arg1_elemid;
+         }
+         else
+         {
+             /* Shouldn't get here given proper type checking in parser */
+             ereport(ERROR,
+                     (errcode(ERRCODE_DATATYPE_MISMATCH),
+                      errmsg("neither input type is an array")));
+             PG_RETURN_NULL(); /* keep compiler quiet */
+         }
+
+         my_extra->arg0_typeid = arg0_typeid;
+         my_extra->arg1_typeid = arg1_typeid;
+
+         /* Get info about element type */
+         get_typlenbyvalalign(element_type,
+                              &my_extra->typlen,
+                              &my_extra->typbyval,
+                              &my_extra->typalign);
+         my_extra->element_type = element_type;
+     }
+
+     element_type = my_extra->element_type;
+     typlen = my_extra->typlen;
+     typbyval = my_extra->typbyval;
+     typalign = my_extra->typalign;
+
+     /*
+      * Now we can fetch the arguments, using cached type info if needed
+      */
+     if (my_extra->array_on_left)
+     {
+         if (PG_ARGISNULL(0))
+             eah = construct_empty_expanded_array(element_type,
+                                                  CurrentMemoryContext,
+                                                  typlen, typbyval, typalign);
+         else
+             eah = PG_GETARG_EXPANDED_ARRAYX(0, typlen, typbyval, typalign);
+         isNull = PG_ARGISNULL(1);
+         if (isNull)
+             newelem = (Datum) 0;
+         else
+             newelem = PG_GETARG_DATUM(1);
+     }
+     else
+     {
+         if (PG_ARGISNULL(1))
+             eah = construct_empty_expanded_array(element_type,
+                                                  CurrentMemoryContext,
+                                                  typlen, typbyval, typalign);
+         else
+             eah = PG_GETARG_EXPANDED_ARRAYX(1, typlen, typbyval, typalign);
+         isNull = PG_ARGISNULL(0);
+         if (isNull)
+             newelem = (Datum) 0;
+         else
+             newelem = PG_GETARG_DATUM(0);
+     }
+
+     Assert(element_type == eah->element_type);
+
+     /*
+      * Perform push (this logic is basically unchanged from original)
+      */
+     if (eah->ndims == 1)
+     {
+         lb = eah->lbound;
+         dimv = eah->dims;
+
+         if (my_extra->array_on_left)
+         {
+             /* append newelem */
+             int ub = dimv[0] + lb[0] - 1;
+
+             indx = ub + 1;
+             /* overflow? */
+             if (indx < ub)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+                          errmsg("integer out of range")));
+         }
+         else
+         {
+             /* prepend newelem */
+             indx = lb[0] - 1;
+             /* overflow? */
+             if (indx > lb[0])
+                 ereport(ERROR,
+                         (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+                          errmsg("integer out of range")));
+         }
+         lb0 = lb[0];
+     }
+     else if (eah->ndims == 0)
+     {
+         indx = 1;
+         lb0 = 1;
+     }
+     else
+         ereport(ERROR,
+                 (errcode(ERRCODE_DATA_EXCEPTION),
+                  errmsg("argument must be empty or one-dimensional array")));
+
+     result = array_set_element_expanded(PointerGetDatum(eah->hdr.eoh_rw_ptr),
+                                         1, &indx, newelem, isNull,
+                                         -1, typlen, typbyval, typalign);
+
+     Assert(result == PointerGetDatum(eah->hdr.eoh_rw_ptr));
+
+     /*
+      * Readjust result's LB to match the input's. We need do nothing in the
+      * append case, but it's the simplest way to implement the prepend case.
+      */
+     if (eah->ndims == 1 && !my_extra->array_on_left)
+     {
+         /* This is ok whether we've deconstructed or not */
+         eah->lbound[0] = lb0;
+     }
+
+     PG_RETURN_DATUM(result);
+ }
+
+ /*
+  * array_dims :
+  *    returns the dimensions of the array pointed to by "v", as a "text"
+  *
+  * This is here as an example of handling either flat or expanded inputs.
+ */ + Datum + array_dims_expanded(PG_FUNCTION_ARGS) + { + AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); + char *p; + int i; + int *dimv, + *lb; + + /* + * 33 since we assume 15 digits per number + ':' +'[]' + * + * +1 for trailing null + */ + char buf[MAXDIM * 33 + 1]; + + /* Sanity check: does it look like an array at all? */ + if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) + PG_RETURN_NULL(); + + dimv = AARR_DIMS(v); + lb = AARR_LBOUND(v); + + p = buf; + for (i = 0; i < AARR_NDIM(v); i++) + { + sprintf(p, "[%d:%d]", lb[i], dimv[i] + lb[i] - 1); + p += strlen(p); + } + + PG_RETURN_TEXT_P(cstring_to_text(buf)); + } + + /* + * array_lower : + * returns the lower dimension, of the DIM requested, for + * the array pointed to by "v", as an int4 + * + * This is here as an example of handling either flat or expanded inputs. + */ + Datum + array_lower_expanded(PG_FUNCTION_ARGS) + { + AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); + int reqdim = PG_GETARG_INT32(1); + int *lb; + int result; + + /* Sanity check: does it look like an array at all? */ + if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) + PG_RETURN_NULL(); + + /* Sanity check: was the requested dim valid */ + if (reqdim <= 0 || reqdim > AARR_NDIM(v)) + PG_RETURN_NULL(); + + lb = AARR_LBOUND(v); + result = lb[reqdim - 1]; + + PG_RETURN_INT32(result); + } + + /* + * array_upper : + * returns the upper dimension, of the DIM requested, for + * the array pointed to by "v", as an int4 + * + * This is here as an example of handling either flat or expanded inputs. + */ + Datum + array_upper_expanded(PG_FUNCTION_ARGS) + { + AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); + int reqdim = PG_GETARG_INT32(1); + int *dimv, + *lb; + int result; + + /* Sanity check: does it look like an array at all? 
*/ + if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) + PG_RETURN_NULL(); + + /* Sanity check: was the requested dim valid */ + if (reqdim <= 0 || reqdim > AARR_NDIM(v)) + PG_RETURN_NULL(); + + lb = AARR_LBOUND(v); + dimv = AARR_DIMS(v); + + result = dimv[reqdim - 1] + lb[reqdim - 1] - 1; + + PG_RETURN_INT32(result); + } diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c index 600646e..7eee40c 100644 *** a/src/backend/utils/adt/array_userfuncs.c --- b/src/backend/utils/adt/array_userfuncs.c *************** *** 25,30 **** --- 25,33 ---- Datum array_push(PG_FUNCTION_ARGS) { + #if 1 + return array_push_expanded(fcinfo); + #else ArrayType *v; Datum newelem; bool isNull; *************** array_push(PG_FUNCTION_ARGS) *** 157,162 **** --- 160,166 ---- ARR_LBOUND(result)[0] = ARR_LBOUND(v)[0]; PG_RETURN_ARRAYTYPE_P(result); + #endif } /*----------------------------------------------------------------------------- diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c index 79aefaf..de107b6 100644 *** a/src/backend/utils/adt/arrayfuncs.c --- b/src/backend/utils/adt/arrayfuncs.c *************** *** 27,32 **** --- 27,33 ---- #include "utils/array.h" #include "utils/builtins.h" #include "utils/datum.h" + #include "utils/expandeddatum.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/typcache.h" *************** static void ReadArrayBinary(StringInfo b *** 93,102 **** int typlen, bool typbyval, char typalign, Datum *values, bool *nulls, bool *hasnulls, int32 *nbytes); - static void CopyArrayEls(ArrayType *array, - Datum *values, bool *nulls, int nitems, - int typlen, bool typbyval, char typalign, - bool freedata); static bool array_get_isnull(const bits8 *nullbitmap, int offset); static void array_set_isnull(bits8 *nullbitmap, int offset, bool isNull); static Datum ArrayCast(char *value, bool byval, int len); --- 94,99 ---- *************** ReadArrayStr(char *arrayStr, *** 939,945 
**** * the values are not toasted. (Doing it here doesn't work since the * caller has already allocated space for the array...) */ ! static void CopyArrayEls(ArrayType *array, Datum *values, bool *nulls, --- 936,942 ---- * the values are not toasted. (Doing it here doesn't work since the * caller has already allocated space for the array...) */ ! void CopyArrayEls(ArrayType *array, Datum *values, bool *nulls, *************** array_ndims(PG_FUNCTION_ARGS) *** 1666,1671 **** --- 1663,1671 ---- Datum array_dims(PG_FUNCTION_ARGS) { + #if 1 + return array_dims_expanded(fcinfo); + #else ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); char *p; int i; *************** array_dims(PG_FUNCTION_ARGS) *** 1694,1699 **** --- 1694,1700 ---- } PG_RETURN_TEXT_P(cstring_to_text(buf)); + #endif } /* *************** array_dims(PG_FUNCTION_ARGS) *** 1704,1709 **** --- 1705,1713 ---- Datum array_lower(PG_FUNCTION_ARGS) { + #if 1 + return array_lower_expanded(fcinfo); + #else ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); int reqdim = PG_GETARG_INT32(1); int *lb; *************** array_lower(PG_FUNCTION_ARGS) *** 1721,1726 **** --- 1725,1731 ---- result = lb[reqdim - 1]; PG_RETURN_INT32(result); + #endif } /* *************** array_lower(PG_FUNCTION_ARGS) *** 1731,1736 **** --- 1736,1744 ---- Datum array_upper(PG_FUNCTION_ARGS) { + #if 1 + return array_upper_expanded(fcinfo); + #else ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); int reqdim = PG_GETARG_INT32(1); int *dimv, *************** array_upper(PG_FUNCTION_ARGS) *** 1751,1756 **** --- 1759,1765 ---- result = dimv[reqdim - 1] + lb[reqdim - 1] - 1; PG_RETURN_INT32(result); + #endif } /* *************** array_get_element(Datum arraydatum, *** 1850,1855 **** --- 1859,1876 ---- arraydataptr = (char *) DatumGetPointer(arraydatum); arraynullsptr = NULL; } + else if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(arraydatum))) + { + /* hand off to array_expanded.c */ + return array_get_element_expanded(arraydatum, + nSubscripts, + indx, + arraytyplen, + elmlen, + 
elmbyval, + elmalign, + isNull); + } else { /* detoast input array if necessary */ *************** array_get_slice(Datum arraydatum, *** 2083,2089 **** * * Result: * A new array is returned, just like the old except for the one ! * modified entry. The original array object is not changed. * * For one-dimensional arrays only, we allow the array to be extended * by assigning to a position outside the existing subscript range; any --- 2104,2112 ---- * * Result: * A new array is returned, just like the old except for the one ! * modified entry. The original array object is not changed, ! * unless what is passed is a read-write reference to an expanded ! * array object; in that case the expanded array is updated in-place. * * For one-dimensional arrays only, we allow the array to be extended * by assigning to a position outside the existing subscript range; any *************** array_set_element(Datum arraydatum, *** 2166,2171 **** --- 2189,2206 ---- if (elmlen == -1 && !isNull) dataValue = PointerGetDatum(PG_DETOAST_DATUM(dataValue)); + /* if array is in expanded form, hand off to array_expanded.c */ + if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(arraydatum))) + return array_set_element_expanded(arraydatum, + nSubscripts, + indx, + dataValue, + isNull, + arraytyplen, + elmlen, + elmbyval, + elmalign); + /* detoast input array if necessary */ array = DatumGetArrayTypeP(arraydatum); diff --git a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c index 014eca5..e8af030 100644 *** a/src/backend/utils/adt/datum.c --- b/src/backend/utils/adt/datum.c *************** *** 12,19 **** * *------------------------------------------------------------------------- */ /* ! * In the implementation of the next routines we assume the following: * * A) if a type is "byVal" then all the information is stored in the * Datum itself (i.e. no pointers involved!). In this case the --- 12,20 ---- * *------------------------------------------------------------------------- */ + /* ! 
* In the implementation of these routines we assume the following: * * A) if a type is "byVal" then all the information is stored in the * Datum itself (i.e. no pointers involved!). In this case the *************** *** 34,44 **** --- 35,49 ---- * * Note that we do not treat "toasted" datums specially; therefore what * will be copied or compared is the compressed data or toast reference. + * An exception is made for datumCopy() of an expanded object, however, + * because most callers expect to get a simple contiguous (and pfree'able) + * result from datumCopy(). See also datumTransfer(). */ #include "postgres.h" #include "utils/datum.h" + #include "utils/expandeddatum.h" /*------------------------------------------------------------------------- *************** *** 46,51 **** --- 51,57 ---- * * Find the "real" size of a datum, given the datum value, * whether it is a "by value", and the declared type length. + * (For TOAST pointer datums, this is the size of the pointer datum.) * * This is essentially an out-of-line version of the att_addlength_datum() * macro in access/tupmacs.h. We do a tad more error checking though. *************** datumGetSize(Datum value, bool typByVal, *** 106,114 **** /*------------------------------------------------------------------------- * datumCopy * ! * make a copy of a datum * * If the datatype is pass-by-reference, memory is obtained with palloc(). *------------------------------------------------------------------------- */ Datum --- 112,127 ---- /*------------------------------------------------------------------------- * datumCopy * ! * Make a copy of a non-NULL datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). + * + * If the value is a reference to an expanded object, we flatten into memory + * obtained with palloc(). 
We need to copy because one of the main uses of + * this function is to copy a datum out of a transient memory context that's + * about to be destroyed, and the expanded object is probably in a child + * context that will also go away. Moreover, many callers assume that the + * result is a single pfree-able chunk. *------------------------------------------------------------------------- */ Datum *************** datumCopy(Datum value, bool typByVal, in *** 118,161 **** if (typByVal) res = value; else { Size realSize; ! char *s; ! ! if (DatumGetPointer(value) == NULL) ! return PointerGetDatum(NULL); realSize = datumGetSize(value, typByVal, typLen); ! s = (char *) palloc(realSize); ! memcpy(s, DatumGetPointer(value), realSize); ! res = PointerGetDatum(s); } return res; } /*------------------------------------------------------------------------- ! * datumFree * ! * Free the space occupied by a datum CREATED BY "datumCopy" * ! * NOTE: DO NOT USE THIS ROUTINE with datums returned by heap_getattr() etc. ! * ONLY datums created by "datumCopy" can be freed! *------------------------------------------------------------------------- */ ! #ifdef NOT_USED ! void ! datumFree(Datum value, bool typByVal, int typLen) { ! if (!typByVal) ! { ! Pointer s = DatumGetPointer(value); ! ! pfree(s); ! 
} } - #endif /*------------------------------------------------------------------------- * datumIsEqual --- 131,201 ---- if (typByVal) res = value; + else if (typLen == -1) + { + /* It is a varlena datatype */ + struct varlena *vl = (struct varlena *) DatumGetPointer(value); + + if (VARATT_IS_EXTERNAL_EXPANDED(vl)) + { + /* Flatten into the caller's memory context */ + ExpandedObjectHeader *eoh = DatumGetEOHP(value); + Size resultsize; + char *resultptr; + + resultsize = EOH_get_flat_size(eoh); + resultptr = (char *) palloc(resultsize); + EOH_flatten_into(eoh, (void *) resultptr, resultsize); + res = PointerGetDatum(resultptr); + } + else + { + /* Otherwise, just copy the varlena datum verbatim */ + Size realSize; + char *resultptr; + + realSize = (Size) VARSIZE_ANY(vl); + resultptr = (char *) palloc(realSize); + memcpy(resultptr, vl, realSize); + res = PointerGetDatum(resultptr); + } + } else { + /* Pass by reference, but not varlena, so not toasted */ Size realSize; ! char *resultptr; realSize = datumGetSize(value, typByVal, typLen); ! resultptr = (char *) palloc(realSize); ! memcpy(resultptr, DatumGetPointer(value), realSize); ! res = PointerGetDatum(resultptr); } return res; } /*------------------------------------------------------------------------- ! * datumTransfer * ! * Transfer a non-NULL datum into the current memory context. * ! * This is equivalent to datumCopy() except when the datum is a read-write ! * pointer to an expanded object. In that case we merely reparent the object ! * into the current context, and return its standard R/W pointer (in case the ! * given one is a transient pointer of shorter lifespan). *------------------------------------------------------------------------- */ ! Datum ! datumTransfer(Datum value, bool typByVal, int typLen) { ! if (!typByVal && typLen == -1 && ! VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(value))) ! value = TransferExpandedObject(value, CurrentMemoryContext); ! else ! 
value = datumCopy(value, typByVal, typLen); ! return value; } /*------------------------------------------------------------------------- * datumIsEqual diff --git a/src/backend/utils/adt/expandeddatum.c b/src/backend/utils/adt/expandeddatum.c index ...d43f437 . *** a/src/backend/utils/adt/expandeddatum.c --- b/src/backend/utils/adt/expandeddatum.c *************** *** 0 **** --- 1,162 ---- + /*------------------------------------------------------------------------- + * + * expandeddatum.c + * Support functions for "expanded" value representations. + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * + * IDENTIFICATION + * src/backend/utils/adt/expandeddatum.c + * + *------------------------------------------------------------------------- + */ + #include "postgres.h" + + #include "utils/expandeddatum.h" + #include "utils/memutils.h" + + /* + * DatumGetEOHP + * + * Given a Datum that is an expanded-object reference, extract the pointer. + * + * This is a bit tedious since the pointer may not be properly aligned; + * compare VARATT_EXTERNAL_GET_POINTER(). + */ + ExpandedObjectHeader * + DatumGetEOHP(Datum d) + { + varattrib_1b_e *datum = (varattrib_1b_e *) DatumGetPointer(d); + varatt_expanded ptr; + + Assert(VARATT_IS_EXTERNAL_EXPANDED(datum)); + memcpy(&ptr, VARDATA_EXTERNAL(datum), sizeof(ptr)); + return ptr.eohptr; + } + + /* + * EOH_init_header + * + * Initialize the common header of an expanded object. + * + * The main thing this encapsulates is initializing the TOAST pointers. 
+ */ + void + EOH_init_header(ExpandedObjectHeader *eohptr, + const ExpandedObjectMethods *methods, + MemoryContext obj_context) + { + varatt_expanded ptr; + + eohptr->vl_len_ = EOH_HEADER_MAGIC; + eohptr->eoh_methods = methods; + eohptr->eoh_context = obj_context; + + ptr.eohptr = eohptr; + + SET_VARTAG_EXTERNAL(eohptr->eoh_rw_ptr, VARTAG_EXPANDED_RW); + memcpy(VARDATA_EXTERNAL(eohptr->eoh_rw_ptr), &ptr, sizeof(ptr)); + + SET_VARTAG_EXTERNAL(eohptr->eoh_ro_ptr, VARTAG_EXPANDED_RO); + memcpy(VARDATA_EXTERNAL(eohptr->eoh_ro_ptr), &ptr, sizeof(ptr)); + } + + /* + * EOH_get_flat_size + * EOH_flatten_into + * + * Convenience functions for invoking the "methods" of an expanded object. + */ + + Size + EOH_get_flat_size(ExpandedObjectHeader *eohptr) + { + return (*eohptr->eoh_methods->get_flat_size) (eohptr); + } + + void + EOH_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size) + { + (*eohptr->eoh_methods->flatten_into) (eohptr, result, allocated_size); + } + + /* + * Does the Datum represent a writable expanded object? + */ + bool + DatumIsReadWriteExpandedObject(Datum d, bool isnull, int16 typlen) + { + /* Reject if it's NULL or not a varlena type */ + if (isnull || typlen != -1) + return false; + + /* Reject if not a read-write expanded-object pointer */ + if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + return false; + + return true; + } + + /* + * If the Datum represents a R/W expanded object, change it to R/O. + * Otherwise return the original Datum. 
+ */ + Datum + MakeExpandedObjectReadOnly(Datum d, bool isnull, int16 typlen) + { + ExpandedObjectHeader *eohptr; + + /* Nothing to do if it's NULL or not a varlena type */ + if (isnull || typlen != -1) + return d; + + /* Nothing to do if not a read-write expanded-object pointer */ + if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + return d; + + /* Now safe to extract the object pointer */ + eohptr = DatumGetEOHP(d); + + /* Return the built-in read-only pointer instead of given pointer */ + return PointerGetDatum(eohptr->eoh_ro_ptr); + } + + /* + * Transfer ownership of an expanded object to a new parent memory context. + * The object must be referenced by a R/W pointer, and what we return is + * always its "standard" R/W pointer, which is certain to have the same + * lifespan as the object itself. (The passed-in pointer might not, and + * in any case wouldn't provide a unique identifier if it's not that one.) + */ + Datum + TransferExpandedObject(Datum d, MemoryContext new_parent) + { + ExpandedObjectHeader *eohptr = DatumGetEOHP(d); + + /* Assert caller gave a R/W pointer */ + Assert(VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))); + + /* Transfer ownership */ + MemoryContextSetParent(eohptr->eoh_context, new_parent); + + /* Return the object's standard read-write pointer */ + return PointerGetDatum(eohptr->eoh_rw_ptr); + } + + /* + * Delete an expanded object (must be referenced by a R/W pointer). 
+ */ + void + DeleteExpandedObject(Datum d) + { + ExpandedObjectHeader *eohptr = DatumGetEOHP(d); + + /* Assert caller gave a R/W pointer */ + Assert(VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))); + + /* Kill it */ + MemoryContextDelete(eohptr->eoh_context); + } diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c index 202bc78..4b24066 100644 *** a/src/backend/utils/mmgr/mcxt.c --- b/src/backend/utils/mmgr/mcxt.c *************** MemoryContextSetParent(MemoryContext con *** 266,271 **** --- 266,275 ---- AssertArg(MemoryContextIsValid(context)); AssertArg(context != new_parent); + /* Fast path if it's got correct parent already */ + if (new_parent == context->parent) + return; + /* Delink from existing parent, if any */ if (context->parent) { diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h index 40fde83..a98a7af 100644 *** a/src/include/executor/executor.h --- b/src/include/executor/executor.h *************** extern void FreeExprContext(ExprContext *** 312,318 **** extern void ReScanExprContext(ExprContext *econtext); #define ResetExprContext(econtext) \ ! MemoryContextReset((econtext)->ecxt_per_tuple_memory) extern ExprContext *MakePerTupleExprContext(EState *estate); --- 312,318 ---- extern void ReScanExprContext(ExprContext *econtext); #define ResetExprContext(econtext) \ ! 
MemoryContextResetAndDeleteChildren((econtext)->ecxt_per_tuple_memory) extern ExprContext *MakePerTupleExprContext(EState *estate); diff --git a/src/include/executor/spi.h b/src/include/executor/spi.h index 9e912ba..fbcae0c 100644 *** a/src/include/executor/spi.h --- b/src/include/executor/spi.h *************** extern char *SPI_getnspname(Relation rel *** 124,129 **** --- 124,130 ---- extern void *SPI_palloc(Size size); extern void *SPI_repalloc(void *pointer, Size size); extern void SPI_pfree(void *pointer); + extern Datum SPI_datumTransfer(Datum value, bool typByVal, int typLen); extern void SPI_freetuple(HeapTuple pointer); extern void SPI_freetuptable(SPITupleTable *tuptable); diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h index 48f84bf..00686b0 100644 *** a/src/include/executor/tuptable.h --- b/src/include/executor/tuptable.h *************** extern Datum ExecFetchSlotTupleDatum(Tup *** 163,168 **** --- 163,169 ---- extern HeapTuple ExecMaterializeSlot(TupleTableSlot *slot); extern TupleTableSlot *ExecCopySlot(TupleTableSlot *dstslot, TupleTableSlot *srcslot); + extern TupleTableSlot *ExecMakeSlotContentsReadOnly(TupleTableSlot *slot); /* in access/common/heaptuple.c */ extern Datum slot_getattr(TupleTableSlot *slot, int attnum, bool *isnull); diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h index 1d06f42..932a96b 100644 *** a/src/include/nodes/primnodes.h --- b/src/include/nodes/primnodes.h *************** typedef struct WindowFunc *** 310,315 **** --- 310,319 ---- * Note: the result datatype is the element type when fetching a single * element; but it is the array type when doing subarray fetch or either * type of store. + * + * Note: for the cases where an array is returned, if refexpr yields a R/W + * expanded array, then the implementation is allowed to modify that object + * in-place and return the same object.
* ---------------- */ typedef struct ArrayRef diff --git a/src/include/postgres.h b/src/include/postgres.h index 082c75b..5dd897a 100644 *** a/src/include/postgres.h --- b/src/include/postgres.h *************** typedef struct varatt_indirect *** 88,93 **** --- 88,110 ---- } varatt_indirect; /* + * struct varatt_expanded is a "TOAST pointer" representing an out-of-line + * Datum that is stored in memory, in some type-specific, not necessarily + * physically contiguous format that is convenient for computation not + * storage. APIs for this, in particular the definition of struct + * ExpandedObjectHeader, are in src/include/utils/expandeddatum.h. + * + * Note that just as for struct varatt_external, this struct is stored + * unaligned within any containing tuple. + */ + typedef struct ExpandedObjectHeader ExpandedObjectHeader; + + typedef struct varatt_expanded + { + ExpandedObjectHeader *eohptr; + } varatt_expanded; + + /* * Type tag for the various sorts of "TOAST pointer" datums. The peculiar * value for VARTAG_ONDISK comes from a requirement for on-disk compatibility * with a previous notion that the tag field was the pointer datum's length. *************** typedef struct varatt_indirect *** 95,105 **** --- 112,129 ---- typedef enum vartag_external { VARTAG_INDIRECT = 1, + VARTAG_EXPANDED_RO = 2, + VARTAG_EXPANDED_RW = 3, VARTAG_ONDISK = 18 } vartag_external; + /* this test relies on the specific tag values above */ + #define VARTAG_IS_EXPANDED(tag) \ + (((tag) & ~1) == VARTAG_EXPANDED_RO) + #define VARTAG_SIZE(tag) \ ((tag) == VARTAG_INDIRECT ? sizeof(varatt_indirect) : \ + VARTAG_IS_EXPANDED(tag) ? sizeof(varatt_expanded) : \ (tag) == VARTAG_ONDISK ? 
sizeof(varatt_external) : \ TrapMacro(true, "unrecognized TOAST vartag")) *************** typedef struct *** 294,299 **** --- 318,329 ---- (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_ONDISK) #define VARATT_IS_EXTERNAL_INDIRECT(PTR) \ (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_INDIRECT) + #define VARATT_IS_EXTERNAL_EXPANDED_RO(PTR) \ + (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_EXPANDED_RO) + #define VARATT_IS_EXTERNAL_EXPANDED_RW(PTR) \ + (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_EXPANDED_RW) + #define VARATT_IS_EXTERNAL_EXPANDED(PTR) \ + (VARATT_IS_EXTERNAL(PTR) && VARTAG_IS_EXPANDED(VARTAG_EXTERNAL(PTR))) #define VARATT_IS_SHORT(PTR) VARATT_IS_1B(PTR) #define VARATT_IS_EXTENDED(PTR) (!VARATT_IS_4B_U(PTR)) diff --git a/src/include/utils/array.h b/src/include/utils/array.h index dff69eb..1f18161 100644 *** a/src/include/utils/array.h --- b/src/include/utils/array.h *************** extern Datum array_remove(PG_FUNCTION_AR *** 248,253 **** --- 248,262 ---- extern Datum array_replace(PG_FUNCTION_ARGS); extern Datum width_bucket_array(PG_FUNCTION_ARGS); + extern void CopyArrayEls(ArrayType *array, + Datum *values, + bool *nulls, + int nitems, + int typlen, + bool typbyval, + char typalign, + bool freedata); + extern Datum array_get_element(Datum arraydatum, int nSubscripts, int *indx, int arraytyplen, int elmlen, bool elmbyval, char elmalign, bool *isNull); *************** extern Datum array_agg_array_transfn(PG_ *** 356,361 **** --- 365,390 ---- extern Datum array_agg_array_finalfn(PG_FUNCTION_ARGS); /* + * prototypes for functions defined in array_expanded.c + */ + extern Datum expand_array(Datum arraydatum, MemoryContext parentcontext, + int elmlen, bool elmbyval, char elmalign); + extern Datum array_get_element_expanded(Datum arraydatum, + int nSubscripts, int *indx, + int arraytyplen, + int elmlen, bool elmbyval, char elmalign, + bool *isNull); + extern Datum array_set_element_expanded(Datum 
arraydatum, + int nSubscripts, int *indx, + Datum dataValue, bool isNull, + int arraytyplen, + int elmlen, bool elmbyval, char elmalign); + extern Datum array_push_expanded(PG_FUNCTION_ARGS); + extern Datum array_dims_expanded(PG_FUNCTION_ARGS); + extern Datum array_lower_expanded(PG_FUNCTION_ARGS); + extern Datum array_upper_expanded(PG_FUNCTION_ARGS); + + /* * prototypes for functions defined in array_typanalyze.c */ extern Datum array_typanalyze(PG_FUNCTION_ARGS); diff --git a/src/include/utils/datum.h b/src/include/utils/datum.h index 663414b..c572f79 100644 *** a/src/include/utils/datum.h --- b/src/include/utils/datum.h *************** *** 24,41 **** extern Size datumGetSize(Datum value, bool typByVal, int typLen); /* ! * datumCopy - make a copy of a datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). */ extern Datum datumCopy(Datum value, bool typByVal, int typLen); /* ! * datumFree - free a datum previously allocated by datumCopy, if any. * ! * Does nothing if datatype is pass-by-value. */ ! extern void datumFree(Datum value, bool typByVal, int typLen); /* * datumIsEqual --- 24,41 ---- extern Size datumGetSize(Datum value, bool typByVal, int typLen); /* ! * datumCopy - make a copy of a non-NULL datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). */ extern Datum datumCopy(Datum value, bool typByVal, int typLen); /* ! * datumTransfer - transfer a non-NULL datum into the current memory context. * ! * Differs from datumCopy() in its handling of read-write expanded objects. */ ! extern Datum datumTransfer(Datum value, bool typByVal, int typLen); /* * datumIsEqual diff --git a/src/include/utils/expandeddatum.h b/src/include/utils/expandeddatum.h index ...584d0c6 . 
*** a/src/include/utils/expandeddatum.h --- b/src/include/utils/expandeddatum.h *************** *** 0 **** --- 1,146 ---- + /*------------------------------------------------------------------------- + * + * expandeddatum.h + * Declarations for access to "expanded" value representations. + * + * Complex data types, particularly container types such as arrays and + * records, usually have on-disk representations that are compact but not + * especially convenient to modify. What's more, when we do modify them, + * having to recopy all the rest of the value can be extremely inefficient. + * Therefore, we provide a notion of an "expanded" representation that is used + * only in memory and is optimized more for computation than storage. + * The format appearing on disk is called the data type's "flattened" + * representation, since it is required to be a contiguous blob of bytes -- + * but the type can have an expanded representation that is not. Data types + * must provide means to translate an expanded representation back to + * flattened form. + * + * An expanded object is meant to survive across multiple operations, but + * not to be enormously long-lived; for example it might be a local variable + * in a PL/pgSQL procedure. So its extra bulk compared to the on-disk format + * is a worthwhile trade-off. + * + * References to expanded objects are a type of TOAST pointer. + * Because of longstanding conventions in Postgres, this means that the + * flattened form of such an object must always be a varlena object. + * Fortunately that's no restriction in practice. + * + * There are actually two kinds of TOAST pointers for expanded objects: + * read-only and read-write pointers. Possession of one of the latter + * authorizes a function to modify the value in-place rather than copying it + * as would normally be required. Functions should always return a read-write + * pointer to any new expanded object they create. 
Functions that modify an + * argument value in-place must take care that they do not corrupt the old + * value if they fail partway through. + * + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * src/include/utils/expandeddatum.h + * + *------------------------------------------------------------------------- + */ + #ifndef EXPANDEDDATUM_H + #define EXPANDEDDATUM_H + + /* Size of an EXTERNAL datum that contains a pointer to an expanded object */ + #define EXPANDED_POINTER_SIZE (VARHDRSZ_EXTERNAL + sizeof(varatt_expanded)) + + /* + * "Methods" that must be provided for any expanded object. + * + * get_flat_size: compute space needed for flattened representation (which + * must be a valid in-line, non-compressed, 4-byte-header varlena object). + * + * flatten_into: construct flattened representation in the caller-allocated + * space at *result, of size allocated_size (which will always be the result + * of a preceding get_flat_size call; it's passed for cross-checking). + * + * Note: construction of a heap tuple from an expanded datum calls + * get_flat_size twice, so it's worthwhile to make sure that that doesn't + * incur too much overhead. + */ + typedef Size (*EOM_get_flat_size_method) (ExpandedObjectHeader *eohptr); + typedef void (*EOM_flatten_into_method) (ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + + /* Struct of function pointers for an expanded object's methods */ + typedef struct ExpandedObjectMethods + { + EOM_get_flat_size_method get_flat_size; + EOM_flatten_into_method flatten_into; + } ExpandedObjectMethods; + + /* + * Every expanded object must contain this header; typically the header + * is embedded in some larger struct that adds type-specific fields. 
+ * + * It is presumed that the header object and all subsidiary data are stored + * in eoh_context, so that the object can be freed by deleting that context, + * or its storage lifespan can be altered by reparenting the context. + * (In principle the object could own additional resources, such as malloc'd + * storage, and use a memory context reset callback to free them upon reset or + * deletion of eoh_context.) + * + * We set up two TOAST pointers within the standard header, one read-write + * and one read-only. This allows functions to return either kind of pointer + * without making an additional allocation, and in particular without worrying + * whether a separately palloc'd object would have sufficient lifespan. + * But note that these pointers are just a convenience; a pointer object + * appearing somewhere else would still be legal. + * + * The typedef declaration for this appears in postgres.h. + */ + struct ExpandedObjectHeader + { + /* Phony varlena header */ + int32 vl_len_; /* always EOH_HEADER_MAGIC, see below */ + + /* Pointer to methods required for object type */ + const ExpandedObjectMethods *eoh_methods; + + /* Memory context containing this header and subsidiary data */ + MemoryContext eoh_context; + + /* Standard R/W TOAST pointer for this object is kept here */ + char eoh_rw_ptr[EXPANDED_POINTER_SIZE]; + + /* Standard R/O TOAST pointer for this object is kept here */ + char eoh_ro_ptr[EXPANDED_POINTER_SIZE]; + }; + + /* + * Particularly for read-only functions, it is handy to be able to work with + * either regular "flat" varlena inputs or expanded inputs of the same data + * type. To allow determining which case an argument-fetching macro has + * returned, the first int32 of an ExpandedObjectHeader always contains -1 + * (EOH_HEADER_MAGIC to the code). This works since no 4-byte-header varlena + * could have that as its first 4 bytes. 
Caution: we could not reliably tell + * the difference between an ExpandedObjectHeader and a short-header object + * with this trick. However, it works fine for cases where the argument + * fetching code will return either a fully-uncompressed flat object or an + * expanded object. + */ + #define EOH_HEADER_MAGIC (-1) + #define VARATT_IS_EXPANDED_HEADER(PTR) \ + (((ExpandedObjectHeader *) (PTR))->vl_len_ == EOH_HEADER_MAGIC) + + /* + * Generic support functions for expanded objects. + * (Some of these might be worth inlining later.) + */ + + extern ExpandedObjectHeader *DatumGetEOHP(Datum d); + extern void EOH_init_header(ExpandedObjectHeader *eohptr, + const ExpandedObjectMethods *methods, + MemoryContext obj_context); + extern Size EOH_get_flat_size(ExpandedObjectHeader *eohptr); + extern void EOH_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + extern bool DatumIsReadWriteExpandedObject(Datum d, bool isnull, int16 typlen); + extern Datum MakeExpandedObjectReadOnly(Datum d, bool isnull, int16 typlen); + extern Datum TransferExpandedObject(Datum d, MemoryContext new_parent); + extern void DeleteExpandedObject(Datum d); + + #endif /* EXPANDEDDATUM_H */ diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c index f364ce4..d021145 100644 *** a/src/pl/plpgsql/src/pl_comp.c --- b/src/pl/plpgsql/src/pl_comp.c *************** build_datatype(HeapTuple typeTup, int32 *** 2202,2207 **** --- 2202,2223 ---- typ->typbyval = typeStruct->typbyval; typ->typrelid = typeStruct->typrelid; typ->typioparam = getTypeIOParam(typeTup); + /* Detect if type is true array, or domain thereof */ + /* NB: this is only used to decide whether to apply expand_array */ + if (typeStruct->typtype == TYPTYPE_BASE) + { + /* this test should match what get_element_type() checks */ + typ->typisarray = (typeStruct->typlen == -1 && + OidIsValid(typeStruct->typelem)); + } + else if (typeStruct->typtype == TYPTYPE_DOMAIN) + { + /* we can short-circuit
looking up base types if it's not varlena */ + typ->typisarray = (typeStruct->typlen == -1 && + OidIsValid(get_base_element_type(typeStruct->typbasetype))); + } + else + typ->typisarray = false; typ->collation = typeStruct->typcollation; if (OidIsValid(collation) && OidIsValid(typ->collation)) typ->collation = collation; diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index b7e3bc4..25ca236 100644 *** a/src/pl/plpgsql/src/pl_exec.c --- b/src/pl/plpgsql/src/pl_exec.c *************** *** 32,37 **** --- 32,38 ---- #include "utils/array.h" #include "utils/builtins.h" #include "utils/datum.h" + #include "utils/expandeddatum.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/rel.h" *************** static void exec_assign_value(PLpgSQL_ex *** 171,176 **** --- 172,178 ---- Datum value, Oid valtype, bool *isNull); static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, + bool getrwpointer, Oid *typeid, int32 *typetypmod, Datum *value, *************** plpgsql_exec_function(PLpgSQL_function * *** 295,300 **** --- 297,332 ---- var->value = fcinfo->arg[i]; var->isnull = fcinfo->argnull[i]; var->freeval = false; + + /* + * Hack to force array parameters into expanded form. + * Special cases: If passed a R/W expanded pointer, assume + * we can commandeer the object rather than having to copy + * it. If passed a R/O expanded pointer, just keep it as + * the value of the variable for the moment. 
+ */ + if (!var->isnull && var->datatype->typisarray) + { + if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(var->value))) + { + /* take ownership of R/W object */ + var->value = TransferExpandedObject(var->value, + CurrentMemoryContext); + var->freeval = true; + } + else if (VARATT_IS_EXTERNAL_EXPANDED_RO(DatumGetPointer(var->value))) + { + /* R/O pointer, keep it as-is until assigned to */ + } + else + { + /* flat array, so force to expanded form */ + var->value = expand_array(var->value, + CurrentMemoryContext, + 0, 0, 0); + var->freeval = true; + } + } } break; *************** plpgsql_exec_function(PLpgSQL_function * *** 461,478 **** /* * If the function's return type isn't by value, copy the value ! * into upper executor memory context. */ if (!fcinfo->isnull && !func->fn_retbyval) ! { ! Size len; ! void *tmp; ! ! len = datumGetSize(estate.retval, false, func->fn_rettyplen); ! tmp = SPI_palloc(len); ! memcpy(tmp, DatumGetPointer(estate.retval), len); ! estate.retval = PointerGetDatum(tmp); ! } } } --- 493,506 ---- /* * If the function's return type isn't by value, copy the value ! * into upper executor memory context. However, if we have a R/W ! * expanded datum, we can just transfer its ownership out to the ! * upper executor context. */ if (!fcinfo->isnull && !func->fn_retbyval) ! estate.retval = SPI_datumTransfer(estate.retval, ! false, ! func->fn_rettyplen); } } *************** exec_assign_value(PLpgSQL_execstate *est *** 4061,4086 **** /* * If type is by-reference, copy the new value (which is * probably in the eval_econtext) into the procedure's memory ! * context. */ if (!var->datatype->typbyval && !*isNull) ! newvalue = datumCopy(newvalue, ! false, ! var->datatype->typlen); /* ! * Now free the old value. (We can't do this any earlier ! * because of the possibility that we are assigning the var's ! * old value to it, eg "foo := foo". We could optimize out ! * the assignment altogether in such cases, but it's too ! 
* infrequent to be worth testing for.) */ ! free_var(var); var->value = newvalue; var->isnull = *isNull; ! if (!var->datatype->typbyval && !*isNull) ! var->freeval = true; break; } --- 4089,4139 ---- /* * If type is by-reference, copy the new value (which is * probably in the eval_econtext) into the procedure's memory ! * context. But if it's a read/write reference to an expanded ! * object, no physical copy needs to happen; at most we need ! * to reparent the object's memory context. */ if (!var->datatype->typbyval && !*isNull) ! { ! newvalue = datumTransfer(newvalue, ! false, ! var->datatype->typlen); ! ! /* ! * If it's an array, force the value to be stored in ! * expanded form. This wins if the function later does, ! * eg, a lot of array subscripting operations on the ! * variable, and otherwise might lose badly. We might ! * need to use a different heuristic, but it's too soon to ! * tell. Also, what of cases where it'd be useful to ! * force non-array values into expanded form? ! */ ! if (var->datatype->typisarray && ! !VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(newvalue))) ! { ! Datum avalue; ! ! avalue = expand_array(newvalue, CurrentMemoryContext, ! 0, 0, 0); ! pfree(DatumGetPointer(newvalue)); ! newvalue = avalue; ! } ! } /* ! * Now free the old value, unless it's the same as the new ! * value (ie, we're doing "foo := foo"). Note that for ! * expanded objects, this test is necessary and cannot ! * reliably be made any earlier; we have to be looking at the ! * object's standard R/W pointer to be sure pointer equality ! * is meaningful. */ ! if (var->value != newvalue || var->isnull || *isNull) ! free_var(var); var->value = newvalue; var->isnull = *isNull; ! var->freeval = (!var->datatype->typbyval && !*isNull); break; } *************** exec_assign_value(PLpgSQL_execstate *est *** 4277,4283 **** } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM); /* Fetch current value of array datum */ ! 
exec_eval_datum(estate, target, &parenttypoid, &parenttypmod, &oldarraydatum, &oldarrayisnull); --- 4330,4336 ---- } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM); /* Fetch current value of array datum */ ! exec_eval_datum(estate, target, true, &parenttypoid, &parenttypmod, &oldarraydatum, &oldarrayisnull); *************** exec_assign_value(PLpgSQL_execstate *est *** 4423,4438 **** * * The type oid, typmod, value in Datum format, and null flag are returned. * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: caller must not modify the returned value, since it points right ! * at the stored value in the case of pass-by-reference datatypes. In some ! * cases we have to palloc a return value, and in such cases we put it into ! * the estate's short-term memory context. */ static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, Oid *typeid, int32 *typetypmod, Datum *value, --- 4476,4495 ---- * * The type oid, typmod, value in Datum format, and null flag are returned. * + * If getrwpointer is TRUE, we'll return a R/W pointer to any variable that + * is an expanded object; otherwise we return a R/O pointer. + * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: in most cases caller must not modify the returned value, since ! * it points right at the stored value in the case of pass-by-reference ! * datatypes. In some cases we have to palloc a return value, and in such ! * cases we put it into the estate's short-term memory context. */ static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, + bool getrwpointer, Oid *typeid, int32 *typetypmod, Datum *value, *************** exec_eval_datum(PLpgSQL_execstate *estat *** 4448,4454 **** *typeid = var->datatype->typoid; *typetypmod = var->datatype->atttypmod; ! *value = var->value; *isnull = var->isnull; break; } --- 4505,4516 ---- *typeid = var->datatype->typoid; *typetypmod = var->datatype->atttypmod; ! 
if (getrwpointer) ! *value = var->value; ! else ! *value = MakeExpandedObjectReadOnly(var->value, ! var->isnull, ! var->datatype->typlen); *isnull = var->isnull; break; } *************** setup_param_list(PLpgSQL_execstate *esta *** 5284,5290 **** PLpgSQL_var *var = (PLpgSQL_var *) datum; ParamExternData *prm = &paramLI->params[dno]; ! prm->value = var->value; prm->isnull = var->isnull; prm->pflags = PARAM_FLAG_CONST; prm->ptype = var->datatype->typoid; --- 5346,5354 ---- PLpgSQL_var *var = (PLpgSQL_var *) datum; ParamExternData *prm = &paramLI->params[dno]; ! prm->value = MakeExpandedObjectReadOnly(var->value, ! var->isnull, ! var->datatype->typlen); prm->isnull = var->isnull; prm->pflags = PARAM_FLAG_CONST; prm->ptype = var->datatype->typoid; *************** plpgsql_param_fetch(ParamListInfo params *** 5350,5356 **** /* OK, evaluate the value and store into the appropriate paramlist slot */ datum = estate->datums[dno]; prm = &params->params[dno]; ! exec_eval_datum(estate, datum, &prm->ptype, &prmtypmod, &prm->value, &prm->isnull); } --- 5414,5420 ---- /* OK, evaluate the value and store into the appropriate paramlist slot */ datum = estate->datums[dno]; prm = &params->params[dno]; ! exec_eval_datum(estate, datum, false, &prm->ptype, &prmtypmod, &prm->value, &prm->isnull); } *************** make_tuple_from_row(PLpgSQL_execstate *e *** 5542,5548 **** if (row->varnos[i] < 0) /* should not happen */ elog(ERROR, "dropped rowtype entry for non-dropped column"); ! exec_eval_datum(estate, estate->datums[row->varnos[i]], &fieldtypeid, &fieldtypmod, &dvalues[i], &nulls[i]); if (fieldtypeid != tupdesc->attrs[i]->atttypid) --- 5606,5612 ---- if (row->varnos[i] < 0) /* should not happen */ elog(ERROR, "dropped rowtype entry for non-dropped column"); !
exec_eval_datum(estate, estate->datums[row->varnos[i]], false, &fieldtypeid, &fieldtypmod, &dvalues[i], &nulls[i]); if (fieldtypeid != tupdesc->attrs[i]->atttypid) *************** free_var(PLpgSQL_var *var) *** 6335,6341 **** { if (var->freeval) { ! pfree(DatumGetPointer(var->value)); var->freeval = false; } } --- 6399,6410 ---- { if (var->freeval) { ! if (DatumIsReadWriteExpandedObject(var->value, ! var->isnull, ! var->datatype->typlen)) ! DeleteExpandedObject(var->value); ! else ! pfree(DatumGetPointer(var->value)); var->freeval = false; } } *************** format_expr_params(PLpgSQL_execstate *es *** 6542,6549 **** curvar = (PLpgSQL_var *) estate->datums[dno]; ! exec_eval_datum(estate, (PLpgSQL_datum *) curvar, &paramtypeid, ! &paramtypmod, &paramdatum, &paramisnull); appendStringInfo(&paramstr, "%s%s = ", paramno > 0 ? ", " : "", --- 6611,6619 ---- curvar = (PLpgSQL_var *) estate->datums[dno]; ! exec_eval_datum(estate, (PLpgSQL_datum *) curvar, false, ! &paramtypeid, &paramtypmod, ! &paramdatum, &paramisnull); appendStringInfo(&paramstr, "%s%s = ", paramno > 0 ? ", " : "", diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h index 00f2f77..c95087c 100644 *** a/src/pl/plpgsql/src/plpgsql.h --- b/src/pl/plpgsql/src/plpgsql.h *************** typedef struct *** 180,185 **** --- 180,186 ---- bool typbyval; Oid typrelid; Oid typioparam; + bool typisarray; /* is "true" array, or domain over one */ Oid collation; /* from pg_type, but can be overridden */ FmgrInfo typinput; /* lookup info for typinput function */ int32 atttypmod; /* typmod (taken from someplace else) */
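For readers who want to see the flattening protocol from expandeddatum.h in isolation, here is a minimal standalone sketch of the two-step "compute the space needed" / "serialize into this memory" idea. It is a toy model and not PostgreSQL code: ToyHeader, ToyMethods, ToyIntList, toy_flatten and the flat layout (an int count followed by the elements) are all hypothetical names and choices invented for illustration, and plain malloc stands in for palloc and memory contexts.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ExpandedObjectHeader; not a PostgreSQL type. */
typedef struct ToyHeader ToyHeader;

/* Hypothetical stand-ins for the two "methods" described in the header file. */
typedef size_t (*toy_get_flat_size_fn) (ToyHeader *hdr);
typedef void (*toy_flatten_into_fn) (ToyHeader *hdr,
                                     void *result, size_t allocated_size);

typedef struct ToyMethods
{
    toy_get_flat_size_fn get_flat_size;
    toy_flatten_into_fn flatten_into;
} ToyMethods;

struct ToyHeader
{
    const ToyMethods *methods;  /* per-type method table */
};

/* An "expanded" int list: standard header first, type-specific fields after. */
typedef struct ToyIntList
{
    ToyHeader   hdr;
    int         nelems;
    int        *elems;          /* separately stored, so not one contiguous blob */
} ToyIntList;

/* Step 1: report how many bytes the flat form needs (assumed layout). */
static size_t
intlist_get_flat_size(ToyHeader *hdr)
{
    ToyIntList *list = (ToyIntList *) hdr;

    return sizeof(int) + (size_t) list->nelems * sizeof(int);
}

/* Step 2: write the flat form into caller-allocated space. */
static void
intlist_flatten_into(ToyHeader *hdr, void *result, size_t allocated_size)
{
    ToyIntList *list = (ToyIntList *) hdr;
    char       *dst = (char *) result;

    /* the size is passed only for cross-checking, as in the patch's comment */
    assert(allocated_size == intlist_get_flat_size(hdr));

    memcpy(dst, &list->nelems, sizeof(int));
    memcpy(dst + sizeof(int), list->elems,
           (size_t) list->nelems * sizeof(int));
}

static const ToyMethods intlist_methods =
{
    intlist_get_flat_size,
    intlist_flatten_into
};

/*
 * Type-independent caller: it knows nothing about ToyIntList, only the
 * header and its methods.  Returns a malloc'd flat copy, or NULL on OOM.
 */
static void *
toy_flatten(ToyHeader *hdr, size_t *size_out)
{
    size_t      sz = hdr->methods->get_flat_size(hdr);
    void       *buf = malloc(sz);

    if (buf == NULL)
        return NULL;
    hdr->methods->flatten_into(hdr, buf, sz);
    *size_out = sz;
    return buf;
}
```

The point of splitting the operation in two is that the caller decides where the flat bytes land: in the sketch that is a fresh malloc'd buffer, but it could just as well be the middle of a larger buffer, which is roughly the situation when a tuple is being formed.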
Here's an 0.4 version, in which I've written some user docs, refactored the array-specific code into a more reasonable arrangement, and adjusted a lot of the built-in array functions to support expanded arrays directly. This is about as far as I feel a need to take the latter activity, at least for now; there are a few remaining operations that might be worth converting but it's not clear they'd really offer much benefit.

I think this is actually now a serious candidate to be committed as-is, not just a prototype. What we lack though is a clear understanding of the performance characteristics.

regards, tom lane

diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml index d8c5287..e5b7b4b 100644 *** a/doc/src/sgml/storage.sgml --- b/doc/src/sgml/storage.sgml *************** comparison table, in which all the HTML *** 503,510 **** <acronym>TOAST</> pointers can point to data that is not on disk, but is elsewhere in the memory of the current server process. Such pointers obviously cannot be long-lived, but they are nonetheless useful. There ! is currently just one sub-case: ! pointers to <firstterm>indirect</> data. </para> <para> --- 503,511 ---- <acronym>TOAST</> pointers can point to data that is not on disk, but is elsewhere in the memory of the current server process. Such pointers obviously cannot be long-lived, but they are nonetheless useful. There ! are currently two sub-cases: ! pointers to <firstterm>indirect</> data and ! pointers to <firstterm>expanded</> data. </para> <para> *************** and there is no infrastructure to help w *** 519,524 **** --- 520,562 ---- </para> <para> + Expanded <acronym>TOAST</> pointers are useful for complex data types + whose on-disk representation is not especially suited for computational + purposes.
As an example, the standard varlena representation of a + <productname>PostgreSQL</> array includes dimensionality information, a + nulls bitmap if there are any null elements, then the values of all the + elements in order. When the element type itself is variable-length, the + only way to find the <replaceable>N</>'th element is to scan through all the + preceding elements. This representation is appropriate for on-disk storage + because of its compactness, but for computations with the array it's much + nicer to have an <quote>expanded</> or <quote>deconstructed</> + representation in which all the element starting locations have been + identified. The <acronym>TOAST</> pointer mechanism supports this need by + allowing a pass-by-reference Datum to point to either a standard varlena + value (the on-disk representation) or a <acronym>TOAST</> pointer that + points to an expanded representation somewhere in memory. The details of + this expanded representation are up to the data type, though it must have + a standard header and meet the other API requirements given + in <filename>src/include/utils/expandeddatum.h</>. C-level functions + working with the data type can choose to handle either representation. + Functions that do not know about the expanded representation, but simply + apply <function>PG_DETOAST_DATUM</> to their inputs, will automatically + receive the traditional varlena representation; so support for an expanded + representation can be introduced incrementally, one function at a time. + </para> + + <para> + <acronym>TOAST</> pointers to expanded values are further broken down + into <firstterm>read-write</> and <firstterm>read-only</> pointers. + The pointed-to representation is the same either way, but a function that + receives a read-write pointer is allowed to modify the referenced value + in-place, whereas one that receives a read-only pointer must not; it must + first create a copy if it wants to make a modified version of the value. 
+ This distinction and some associated conventions make it possible to avoid + unnecessary copying of expanded values during query execution. + </para> + + <para> For all types of in-memory <acronym>TOAST</> pointer, the <acronym>TOAST</> management code ensures that no such pointer datum can accidentally get stored on disk. In-memory <acronym>TOAST</> pointers are automatically diff --git a/doc/src/sgml/xtypes.sgml b/doc/src/sgml/xtypes.sgml index 2459616..ac0b8a2 100644 *** a/doc/src/sgml/xtypes.sgml --- b/doc/src/sgml/xtypes.sgml *************** CREATE TYPE complex ( *** 300,305 **** --- 300,376 ---- </para> </note> + <para> + Another feature that's enabled by <acronym>TOAST</> support is the + possibility of having an <firstterm>expanded</> in-memory data + representation that is more convenient to work with than the format that + is stored on disk. The regular or <quote>flat</> varlena storage format + is ultimately just a blob of bytes; it cannot for example contain + pointers, since it may get copied to other locations in memory. + For complex data types, the flat format may be quite expensive to work + with, so <productname>PostgreSQL</> provides a way to <quote>expand</> + the flat format into a representation that is more suited to computation, + and then pass that format in-memory between functions of the data type. + </para> + + <para> + To use expanded storage, a data type must define an expanded format that + follows the rules given in <filename>src/include/utils/expandeddatum.h</>, + and provide functions to <quote>expand</> a flat varlena value into + expanded format and <quote>flatten</> the expanded format back to the + regular varlena representation. Then ensure that all C functions for + the data type can accept either representation, possibly by converting + one into the other immediately upon receipt. 
This does not require fixing + all existing functions for the data type at once, because the standard + <function>PG_DETOAST_DATUM</> macro is defined to convert expanded inputs + into regular flat format. Therefore, existing functions that work with + the flat varlena format will continue to work, though slightly + inefficiently, with expanded inputs; they need not be converted until and + unless better performance is important. + </para> + + <para> + C functions that know how to work with an expanded representation + typically fall into two categories: those that can only handle expanded + format, and those that can handle either expanded or flat varlena inputs. + The former are easier to write but may be less efficient overall, because + converting a flat input to expanded form for use by a single function may + cost more than is saved by operating on the expanded format. + When only expanded format need be handled, conversion of flat inputs to + expanded form can be hidden inside an argument-fetching macro, so that + the function appears no more complex than one working with traditional + varlena input. + To handle both types of input, write an argument-fetching function that + will detoast external, short-header, and compressed varlena inputs, but + not expanded inputs. Such a function can be defined as returning a + pointer to a union of the flat varlena format and the expanded format. + Callers can use the <function>VARATT_IS_EXPANDED_HEADER()</> macro to + determine which format they received. + </para> + + <para> + The <acronym>TOAST</> infrastructure not only allows regular varlena + values to be distinguished from expanded values, but also + distinguishes <quote>read-write</> and <quote>read-only</> pointers to + expanded values. C functions that only need to examine an expanded + value, or will only change it in safe and non-semantically-visible ways, + need not care which type of pointer they receive. 
C functions that + produce a modified version of an input value are allowed to modify an + expanded input value in-place if they receive a read-write pointer, but + must not modify the input if they receive a read-only pointer; in that + case they have to copy the value first, producing a new value to modify. + A C function that has constructed a new expanded value should always + return a read-write pointer to it. Also, a C function that is modifying + a read-write expanded value in-place should take care to leave the value + in a sane state if it fails partway through. + </para> + + <para> + For examples of working with expanded values, see the standard array + infrastructure, particularly + <filename>src/backend/utils/adt/array_expanded.c</>. + </para> + </sect2> </sect1> diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c index 867035d..860ad78 100644 *** a/src/backend/access/common/heaptuple.c --- b/src/backend/access/common/heaptuple.c *************** *** 60,65 **** --- 60,66 ---- #include "access/sysattr.h" #include "access/tuptoaster.h" #include "executor/tuptable.h" + #include "utils/expandeddatum.h" /* Does att's datatype allow packing into the 1-byte-header varlena format? */ *************** heap_compute_data_size(TupleDesc tupleDe *** 93,105 **** for (i = 0; i < numberOfAttributes; i++) { Datum val; if (isnull[i]) continue; val = values[i]; ! if (ATT_IS_PACKABLE(att[i]) && VARATT_CAN_MAKE_SHORT(DatumGetPointer(val))) { /* --- 94,108 ---- for (i = 0; i < numberOfAttributes; i++) { Datum val; + Form_pg_attribute atti; if (isnull[i]) continue; val = values[i]; + atti = att[i]; ! if (ATT_IS_PACKABLE(atti) && VARATT_CAN_MAKE_SHORT(DatumGetPointer(val))) { /* *************** heap_compute_data_size(TupleDesc tupleDe *** 108,118 **** */ data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val)); } else { ! data_length = att_align_datum(data_length, att[i]->attalign, ! att[i]->attlen, val); ! 
data_length = att_addlength_datum(data_length, att[i]->attlen, val); } } --- 111,131 ---- */ data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val)); } + else if (atti->attlen == -1 && + VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(val))) + { + /* + * we want to flatten the expanded value so that the constructed + * tuple doesn't depend on it + */ + data_length = att_align_nominal(data_length, atti->attalign); + data_length += EOH_get_flat_size(DatumGetEOHP(val)); + } else { ! data_length = att_align_datum(data_length, atti->attalign, ! atti->attlen, val); ! data_length = att_addlength_datum(data_length, atti->attlen, val); } } *************** heap_fill_tuple(TupleDesc tupleDesc, *** 203,212 **** *infomask |= HEAP_HASVARWIDTH; if (VARATT_IS_EXTERNAL(val)) { ! *infomask |= HEAP_HASEXTERNAL; ! /* no alignment, since it's short by definition */ ! data_length = VARSIZE_EXTERNAL(val); ! memcpy(data, val, data_length); } else if (VARATT_IS_SHORT(val)) { --- 216,241 ---- *infomask |= HEAP_HASVARWIDTH; if (VARATT_IS_EXTERNAL(val)) { ! if (VARATT_IS_EXTERNAL_EXPANDED(val)) ! { ! /* ! * we want to flatten the expanded value so that the ! * constructed tuple doesn't depend on it ! */ ! ExpandedObjectHeader *eoh = DatumGetEOHP(values[i]); ! ! data = (char *) att_align_nominal(data, ! att[i]->attalign); ! data_length = EOH_get_flat_size(eoh); ! EOH_flatten_into(eoh, data, data_length); ! } ! else ! { ! *infomask |= HEAP_HASEXTERNAL; ! /* no alignment, since it's short by definition */ ! data_length = VARSIZE_EXTERNAL(val); ! memcpy(data, val, data_length); ! 
} } else if (VARATT_IS_SHORT(val)) { diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c index f8c1401..ebcbbc4 100644 *** a/src/backend/access/heap/tuptoaster.c --- b/src/backend/access/heap/tuptoaster.c *************** *** 37,42 **** --- 37,43 ---- #include "catalog/catalog.h" #include "common/pg_lzcompress.h" #include "miscadmin.h" + #include "utils/expandeddatum.h" #include "utils/fmgroids.h" #include "utils/rel.h" #include "utils/typcache.h" *************** heap_tuple_fetch_attr(struct varlena * a *** 130,135 **** --- 131,149 ---- result = (struct varlena *) palloc(VARSIZE_ANY(attr)); memcpy(result, attr, VARSIZE_ANY(attr)); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* + * This is an expanded-object pointer --- get flat format + */ + ExpandedObjectHeader *eoh; + Size resultsize; + + eoh = DatumGetEOHP(PointerGetDatum(attr)); + resultsize = EOH_get_flat_size(eoh); + result = (struct varlena *) palloc(resultsize); + EOH_flatten_into(eoh, (void *) result, resultsize); + } else { /* *************** heap_tuple_untoast_attr(struct varlena * *** 196,201 **** --- 210,224 ---- attr = result; } } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* + * This is an expanded-object pointer --- get flat format + */ + attr = heap_tuple_fetch_attr(attr); + /* flatteners are not allowed to produce compressed/short output */ + Assert(!VARATT_IS_EXTENDED(attr)); + } else if (VARATT_IS_COMPRESSED(attr)) { /* *************** heap_tuple_untoast_attr_slice(struct var *** 263,268 **** --- 286,296 ---- return heap_tuple_untoast_attr_slice(redirect.pointer, sliceoffset, slicelength); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* pass it off to heap_tuple_fetch_attr to flatten */ + preslice = heap_tuple_fetch_attr(attr); + } else preslice = attr; *************** toast_raw_datum_size(Datum value) *** 344,349 **** --- 372,381 ---- return toast_raw_datum_size(PointerGetDatum(toast_pointer.pointer)); } + else if 
(VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + result = EOH_get_flat_size(DatumGetEOHP(value)); + } else if (VARATT_IS_COMPRESSED(attr)) { /* here, va_rawsize is just the payload size */ *************** toast_datum_size(Datum value) *** 400,405 **** --- 432,441 ---- return toast_datum_size(PointerGetDatum(toast_pointer.pointer)); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + result = EOH_get_flat_size(DatumGetEOHP(value)); + } else if (VARATT_IS_SHORT(attr)) { result = VARSIZE_SHORT(attr); diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c index fec76d4..7bdc201 100644 *** a/src/backend/executor/execQual.c --- b/src/backend/executor/execQual.c *************** ExecEvalArrayCoerceExpr(ArrayCoerceExprS *** 4246,4252 **** { ArrayCoerceExpr *acoerce = (ArrayCoerceExpr *) astate->xprstate.expr; Datum result; - ArrayType *array; FunctionCallInfoData locfcinfo; result = ExecEvalExpr(astate->arg, econtext, isNull, isDone); --- 4246,4251 ---- *************** ExecEvalArrayCoerceExpr(ArrayCoerceExprS *** 4263,4276 **** if (!OidIsValid(acoerce->elemfuncid)) { /* Detoast input array if necessary, and copy in any case */ ! array = DatumGetArrayTypePCopy(result); ARR_ELEMTYPE(array) = astate->resultelemtype; PG_RETURN_ARRAYTYPE_P(array); } - /* Detoast input array if necessary, but don't make a useless copy */ - array = DatumGetArrayTypeP(result); - /* Initialize function cache if first time through */ if (astate->elemfunc.fn_oid == InvalidOid) { --- 4262,4273 ---- if (!OidIsValid(acoerce->elemfuncid)) { /* Detoast input array if necessary, and copy in any case */ ! ArrayType *array = DatumGetArrayTypePCopy(result); ! 
ARR_ELEMTYPE(array) = astate->resultelemtype; PG_RETURN_ARRAYTYPE_P(array); } /* Initialize function cache if first time through */ if (astate->elemfunc.fn_oid == InvalidOid) { *************** ExecEvalArrayCoerceExpr(ArrayCoerceExprS *** 4300,4314 **** */ InitFunctionCallInfoData(locfcinfo, &(astate->elemfunc), 3, InvalidOid, NULL, NULL); ! locfcinfo.arg[0] = PointerGetDatum(array); locfcinfo.arg[1] = Int32GetDatum(acoerce->resulttypmod); locfcinfo.arg[2] = BoolGetDatum(acoerce->isExplicit); locfcinfo.argnull[0] = false; locfcinfo.argnull[1] = false; locfcinfo.argnull[2] = false; ! return array_map(&locfcinfo, ARR_ELEMTYPE(array), astate->resultelemtype, ! astate->amstate); } /* ---------------------------------------------------------------- --- 4297,4310 ---- */ InitFunctionCallInfoData(locfcinfo, &(astate->elemfunc), 3, InvalidOid, NULL, NULL); ! locfcinfo.arg[0] = result; locfcinfo.arg[1] = Int32GetDatum(acoerce->resulttypmod); locfcinfo.arg[2] = BoolGetDatum(acoerce->isExplicit); locfcinfo.argnull[0] = false; locfcinfo.argnull[1] = false; locfcinfo.argnull[2] = false; ! return array_map(&locfcinfo, astate->resultelemtype, astate->amstate); } /* ---------------------------------------------------------------- diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c index 753754d..a05d8b1 100644 *** a/src/backend/executor/execTuples.c --- b/src/backend/executor/execTuples.c *************** *** 88,93 **** --- 88,94 ---- #include "nodes/nodeFuncs.h" #include "storage/bufmgr.h" #include "utils/builtins.h" + #include "utils/expandeddatum.h" #include "utils/lsyscache.h" #include "utils/typcache.h" *************** ExecCopySlot(TupleTableSlot *dstslot, Tu *** 812,817 **** --- 813,864 ---- return ExecStoreTuple(newTuple, dstslot, InvalidBuffer, true); } + /* -------------------------------- + * ExecMakeSlotContentsReadOnly + * Mark any R/W expanded datums in the slot as read-only. 
+  *
+  * This is needed when a slot that might contain R/W datum references is to be
+  * used as input for general expression evaluation. Since the expression(s)
+  * might contain more than one Var referencing the same R/W datum, we could
+  * get wrong answers if functions acting on those Vars thought they could
+  * modify the expanded value in-place.
+  *
+  * For notational reasons, we return the same slot passed in.
+  * --------------------------------
+  */
+ TupleTableSlot *
+ ExecMakeSlotContentsReadOnly(TupleTableSlot *slot)
+ {
+     /*
+      * sanity checks
+      */
+     Assert(slot != NULL);
+     Assert(slot->tts_tupleDescriptor != NULL);
+     Assert(!slot->tts_isempty);
+
+     /*
+      * If the slot contains a physical tuple, it can't contain any expanded
+      * datums, because we flatten those when making a physical tuple. This
+      * might change later; but for now, we need do nothing unless the slot is
+      * virtual.
+      */
+     if (slot->tts_tuple == NULL)
+     {
+         Form_pg_attribute *att = slot->tts_tupleDescriptor->attrs;
+         int attnum;
+
+         for (attnum = 0; attnum < slot->tts_nvalid; attnum++)
+         {
+             slot->tts_values[attnum] =
+                 MakeExpandedObjectReadOnly(slot->tts_values[attnum],
+                                            slot->tts_isnull[attnum],
+                                            att[attnum]->attlen);
+         }
+     }
+
+     return slot;
+ }
+
  /* ----------------------------------------------------------------
   * convenience initialization routines
diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c
index 3f66e24..e5d1e54 100644
*** a/src/backend/executor/nodeSubqueryscan.c
--- b/src/backend/executor/nodeSubqueryscan.c
*************** SubqueryNext(SubqueryScanState *node)
*** 56,62 ****
--- 56,70 ----
       * We just return the subplan's result slot, rather than expending extra
       * cycles for ExecCopySlot(). (Our own ScanTupleSlot is used only for
       * EvalPlanQual rechecks.)
+      *
+      * We do need to mark the slot contents read-only to prevent interference
+      * between different functions reading the same datum from the slot. It's
+      * a bit hokey to do this to the subplan's slot, but should be safe
+      * enough.
       */
+     if (!TupIsNull(slot))
+         slot = ExecMakeSlotContentsReadOnly(slot);
+
      return slot;
  }
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 4b86e91..daa9f69 100644
*** a/src/backend/executor/spi.c
--- b/src/backend/executor/spi.c
*************** SPI_pfree(void *pointer)
*** 1014,1019 ****
--- 1014,1040 ----
      pfree(pointer);
  }

+ Datum
+ SPI_datumTransfer(Datum value, bool typByVal, int typLen)
+ {
+     MemoryContext oldcxt = NULL;
+     Datum result;
+
+     if (_SPI_curid + 1 == _SPI_connected) /* connected */
+     {
+         if (_SPI_current != &(_SPI_stack[_SPI_curid + 1]))
+             elog(ERROR, "SPI stack corrupted");
+         oldcxt = MemoryContextSwitchTo(_SPI_current->savedcxt);
+     }
+
+     result = datumTransfer(value, typByVal, typLen);
+
+     if (oldcxt)
+         MemoryContextSwitchTo(oldcxt);
+
+     return result;
+ }
+
  void
  SPI_freetuple(HeapTuple tuple)
  {
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 20e5ff1..d1ed33f 100644
*** a/src/backend/utils/adt/Makefile
--- b/src/backend/utils/adt/Makefile
*************** endif
*** 16,25 ****
  endif

  # keep this list arranged alphabetically or it gets to be a mess
! OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \
!     array_userfuncs.o arrayutils.o ascii.o bool.o \
!     cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \
!     encode.o enum.o float.o format_type.o formatting.o genfile.o \
      geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \
      int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \
      jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \
--- 16,26 ----
  endif

  # keep this list arranged alphabetically or it gets to be a mess
! OBJS = acl.o arrayfuncs.o array_expanded.o array_selfuncs.o \
!     array_typanalyze.o array_userfuncs.o arrayutils.o ascii.o \
!     bool.o cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \
!     encode.o enum.o expandeddatum.o \
!     float.o format_type.o formatting.o genfile.o \
      geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \
      int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \
      jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \
diff --git a/src/backend/utils/adt/array_expanded.c b/src/backend/utils/adt/array_expanded.c
index ...f89a533 .
*** a/src/backend/utils/adt/array_expanded.c
--- b/src/backend/utils/adt/array_expanded.c
***************
*** 0 ****
--- 1,356 ----
+ /*-------------------------------------------------------------------------
+  *
+  * array_expanded.c
+  *    Basic functions for manipulating expanded arrays.
+  *
+  * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *
+  * IDENTIFICATION
+  *    src/backend/utils/adt/array_expanded.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ #include "postgres.h"
+
+ #include "access/tupmacs.h"
+ #include "utils/array.h"
+ #include "utils/lsyscache.h"
+ #include "utils/memutils.h"
+
+
+ /* "Methods" required for an expanded object */
+ static Size EA_get_flat_size(ExpandedObjectHeader *eohptr);
+ static void EA_flatten_into(ExpandedObjectHeader *eohptr,
+                 void *result, Size allocated_size);
+
+ static const ExpandedObjectMethods EA_methods =
+ {
+     EA_get_flat_size,
+     EA_flatten_into
+ };
+
+
+ /*
+  * expand_array: convert an array Datum into an expanded array
+  *
+  * The expanded object will be a child of parentcontext.
+  *
+  * Caller can provide element type's representational data; we do that because
+  * caller is often in a position to cache it across repeated calls. If the
+  * caller can't do that, pass zeroes for elmlen/elmbyval/elmalign.
+  */
+ Datum
+ expand_array(Datum arraydatum, MemoryContext parentcontext,
+              int elmlen, bool elmbyval, char elmalign)
+ {
+     ArrayType *array;
+     ExpandedArrayHeader *eah;
+     MemoryContext objcxt;
+     MemoryContext oldcxt;
+
+     /*
+      * Allocate private context for expanded object. We start by assuming
+      * that the array won't be very large; but if it does grow a lot, don't
+      * constrain aset.c's large-context behavior.
+      */
+     objcxt = AllocSetContextCreate(parentcontext,
+                                    "expanded array",
+                                    ALLOCSET_SMALL_MINSIZE,
+                                    ALLOCSET_SMALL_INITSIZE,
+                                    ALLOCSET_DEFAULT_MAXSIZE);
+
+     /* Set up expanded array header */
+     eah = (ExpandedArrayHeader *)
+         MemoryContextAlloc(objcxt, sizeof(ExpandedArrayHeader));
+
+     EOH_init_header(&eah->hdr, &EA_methods, objcxt);
+     eah->ea_magic = EA_MAGIC;
+
+     /*
+      * Detoast and copy original array into private context, as a flat array.
+      * We flatten it even if it's in expanded form; it's not clear that adding
+      * a special-case path for that would be worth the trouble.
+      *
+      * Note that this coding risks leaking some memory in the private context
+      * if we have to fetch data from a TOAST table; however, experimentation
+      * says that the leak is minimal. Doing it this way saves a copy step,
+      * which seems worthwhile, especially if the array is large enough to need
+      * external storage.
+      */
+     oldcxt = MemoryContextSwitchTo(objcxt);
+     array = DatumGetArrayTypePCopy(arraydatum);
+     MemoryContextSwitchTo(oldcxt);
+
+     eah->ndims = ARR_NDIM(array);
+     /* note these pointers point into the fvalue header! */
+     eah->dims = ARR_DIMS(array);
+     eah->lbound = ARR_LBOUND(array);
+
+     /* save array's element-type data for possible use later */
+     eah->element_type = ARR_ELEMTYPE(array);
+     if (elmlen)
+     {
+         /* Caller provided representational data */
+         eah->typlen = elmlen;
+         eah->typbyval = elmbyval;
+         eah->typalign = elmalign;
+     }
+     else
+     {
+         /* No, so look it up */
+         get_typlenbyvalalign(eah->element_type,
+                              &eah->typlen,
+                              &eah->typbyval,
+                              &eah->typalign);
+     }
+
+     /* we don't make a deconstructed representation now */
+     eah->dvalues = NULL;
+     eah->dnulls = NULL;
+     eah->dvalueslen = 0;
+     eah->nelems = 0;
+     eah->flat_size = 0;
+
+     /* remember we have a flat representation */
+     eah->fvalue = array;
+     eah->fstartptr = ARR_DATA_PTR(array);
+     eah->fendptr = ((char *) array) + ARR_SIZE(array);
+
+     /* return a R/W pointer to the expanded array */
+     return EOHPGetRWDatum(&eah->hdr);
+ }
+
+ /*
+  * get_flat_size method for expanded arrays
+  */
+ static Size
+ EA_get_flat_size(ExpandedObjectHeader *eohptr)
+ {
+     ExpandedArrayHeader *eah = (ExpandedArrayHeader *) eohptr;
+     int nelems;
+     int ndims;
+     Datum *dvalues;
+     bool *dnulls;
+     Size nbytes;
+     int i;
+
+     Assert(eah->ea_magic == EA_MAGIC);
+
+     /* Easy if we have a valid flattened value */
+     if (eah->fvalue)
+         return ARR_SIZE(eah->fvalue);
+
+     /* If we have a cached size value, believe that */
+     if (eah->flat_size)
+         return eah->flat_size;
+
+     /*
+      * Compute space needed by examining dvalues/dnulls. Note that the result
+      * array will have a nulls bitmap if dnulls isn't NULL, even if the array
+      * doesn't actually contain any nulls now.
+      */
+     nelems = eah->nelems;
+     ndims = eah->ndims;
+     Assert(nelems == ArrayGetNItems(ndims, eah->dims));
+     dvalues = eah->dvalues;
+     dnulls = eah->dnulls;
+     nbytes = 0;
+     for (i = 0; i < nelems; i++)
+     {
+         if (dnulls && dnulls[i])
+             continue;
+         nbytes = att_addlength_datum(nbytes, eah->typlen, dvalues[i]);
+         nbytes = att_align_nominal(nbytes, eah->typalign);
+         /* check for overflow of total request */
+         if (!AllocSizeIsValid(nbytes))
+             ereport(ERROR,
+                     (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+                      errmsg("array size exceeds the maximum allowed (%d)",
+                             (int) MaxAllocSize)));
+     }
+
+     if (dnulls)
+         nbytes += ARR_OVERHEAD_WITHNULLS(ndims, nelems);
+     else
+         nbytes += ARR_OVERHEAD_NONULLS(ndims);
+
+     /* cache for next time */
+     eah->flat_size = nbytes;
+
+     return nbytes;
+ }
+
+ /*
+  * flatten_into method for expanded arrays
+  */
+ static void
+ EA_flatten_into(ExpandedObjectHeader *eohptr,
+                 void *result, Size allocated_size)
+ {
+     ExpandedArrayHeader *eah = (ExpandedArrayHeader *) eohptr;
+     ArrayType *aresult = (ArrayType *) result;
+     int nelems;
+     int ndims;
+     int32 dataoffset;
+
+     Assert(eah->ea_magic == EA_MAGIC);
+
+     /* Easy if we have a valid flattened value */
+     if (eah->fvalue)
+     {
+         Assert(allocated_size == ARR_SIZE(eah->fvalue));
+         memcpy(result, eah->fvalue, allocated_size);
+         return;
+     }
+
+     /* Else allocation should match previous get_flat_size result */
+     Assert(allocated_size == eah->flat_size);
+
+     /* Fill result array from dvalues/dnulls */
+     nelems = eah->nelems;
+     ndims = eah->ndims;
+
+     if (eah->dnulls)
+         dataoffset = ARR_OVERHEAD_WITHNULLS(ndims, nelems);
+     else
+         dataoffset = 0; /* marker for no null bitmap */
+
+     /* We must ensure that any pad space is zero-filled */
+     memset(aresult, 0, allocated_size);
+
+     SET_VARSIZE(aresult, allocated_size);
+     aresult->ndim = ndims;
+     aresult->dataoffset = dataoffset;
+     aresult->elemtype = eah->element_type;
+     memcpy(ARR_DIMS(aresult), eah->dims, ndims * sizeof(int));
+     memcpy(ARR_LBOUND(aresult), eah->lbound, ndims * sizeof(int));
+
+     CopyArrayEls(aresult,
+                  eah->dvalues, eah->dnulls, nelems,
+                  eah->typlen, eah->typbyval, eah->typalign,
+                  false);
+ }
+
+ /*
+  * Argument fetching support code
+  */
+
+ /*
+  * DatumGetExpandedArray: get a writable expanded array from an input argument
+  */
+ ExpandedArrayHeader *
+ DatumGetExpandedArray(Datum d)
+ {
+     ExpandedArrayHeader *eah;
+
+     /* If it's a writable expanded array already, just return it */
+     if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)))
+     {
+         eah = (ExpandedArrayHeader *) DatumGetEOHP(d);
+         Assert(eah->ea_magic == EA_MAGIC);
+         return eah;
+     }
+
+     /*
+      * If it's a non-writable expanded array, copy it, extracting the element
+      * representational data to save a catalog lookup.
+      */
+     if (VARATT_IS_EXTERNAL_EXPANDED_RO(DatumGetPointer(d)))
+     {
+         eah = (ExpandedArrayHeader *) DatumGetEOHP(d);
+         Assert(eah->ea_magic == EA_MAGIC);
+         d = expand_array(d, CurrentMemoryContext,
+                          eah->typlen, eah->typbyval, eah->typalign);
+         return (ExpandedArrayHeader *) DatumGetEOHP(d);
+     }
+
+     /* Else expand the hard way */
+     d = expand_array(d, CurrentMemoryContext, 0, 0, 0);
+     return (ExpandedArrayHeader *) DatumGetEOHP(d);
+ }
+
+ /*
+  * As above, when caller has the ability to cache element type info
+  */
+ ExpandedArrayHeader *
+ DatumGetExpandedArrayX(Datum d,
+                        int elmlen, bool elmbyval, char elmalign)
+ {
+     ExpandedArrayHeader *eah;
+
+     /* If it's a writable expanded array already, just return it */
+     if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)))
+     {
+         eah = (ExpandedArrayHeader *) DatumGetEOHP(d);
+         Assert(eah->ea_magic == EA_MAGIC);
+         Assert(eah->typlen == elmlen);
+         Assert(eah->typbyval == elmbyval);
+         Assert(eah->typalign == elmalign);
+         return eah;
+     }
+
+     /* Else expand using caller's data */
+     d = expand_array(d, CurrentMemoryContext, elmlen, elmbyval, elmalign);
+     return (ExpandedArrayHeader *) DatumGetEOHP(d);
+ }
+
+ /*
+  * DatumGetAnyArray: return either an expanded array or a detoasted varlena
+  * array. The result must not be modified in-place.
+  */
+ AnyArrayType *
+ DatumGetAnyArray(Datum d)
+ {
+     ExpandedArrayHeader *eah;
+
+     /*
+      * If it's an expanded array (RW or RO), return the header pointer.
+      */
+     if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(d)))
+     {
+         eah = (ExpandedArrayHeader *) DatumGetEOHP(d);
+         Assert(eah->ea_magic == EA_MAGIC);
+         return (AnyArrayType *) eah;
+     }
+
+     /* Else do regular detoasting as needed */
+     return (AnyArrayType *) PG_DETOAST_DATUM(d);
+ }
+
+ /*
+  * Create the Datum/isnull representation of an expanded array object
+  * if we didn't do so previously
+  */
+ void
+ deconstruct_expanded_array(ExpandedArrayHeader *eah)
+ {
+     if (eah->dvalues == NULL)
+     {
+         MemoryContext oldcxt = MemoryContextSwitchTo(eah->hdr.eoh_context);
+         Datum *dvalues;
+         bool *dnulls;
+         int nelems;
+
+         dnulls = NULL;
+         deconstruct_array(eah->fvalue,
+                           eah->element_type,
+                           eah->typlen, eah->typbyval, eah->typalign,
+                           &dvalues,
+                           ARR_HASNULL(eah->fvalue) ? &dnulls : NULL,
+                           &nelems);
+
+         /*
+          * Update header only after successful completion of this step. If
+          * deconstruct_array fails partway through, worst consequence is some
+          * leaked memory in the object's context. If the caller fails at a
+          * later point, that's fine, since the deconstructed representation is
+          * valid anyhow.
+          */
+         eah->dvalues = dvalues;
+         eah->dnulls = dnulls;
+         eah->dvalueslen = eah->nelems = nelems;
+         MemoryContextSwitchTo(oldcxt);
+     }
+ }
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
index 5c20d0c..755c50e 100644
*** a/src/backend/utils/adt/array_userfuncs.c
--- b/src/backend/utils/adt/array_userfuncs.c
***************
*** 20,33 ****
  /*
   * fetch_array_arg_replace_nulls
   *
!  * Fetch an array-valued argument; if it's null, construct an empty array
!  * value of the proper data type. Also cache basic element type information
!  * in fn_extra.
   */
! static ArrayType *
  fetch_array_arg_replace_nulls(FunctionCallInfo fcinfo, int argno)
  {
!     ArrayType *v;
      ArrayMetaState *my_extra;

      my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
--- 20,33 ----
  /*
   * fetch_array_arg_replace_nulls
   *
!  * Fetch an array-valued argument in expanded form; if it's null, construct an
!  * empty array value of the proper data type. Also cache basic element type
!  * information in fn_extra.
   */
! static ExpandedArrayHeader *
  fetch_array_arg_replace_nulls(FunctionCallInfo fcinfo, int argno)
  {
!     ExpandedArrayHeader *eah;
      ArrayMetaState *my_extra;

      my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
*************** fetch_array_arg_replace_nulls(FunctionCa
*** 63,73 ****

      /* Now we can collect the array value */
      if (PG_ARGISNULL(argno))
!         v = construct_empty_array(my_extra->element_type);
      else
!         v = PG_GETARG_ARRAYTYPE_P(argno);

!     return v;
  }

  /*-----------------------------------------------------------------------------
--- 63,80 ----

      /* Now we can collect the array value */
      if (PG_ARGISNULL(argno))
!         eah = construct_empty_expanded_array(my_extra->element_type,
!                                              CurrentMemoryContext,
!                                              my_extra->typlen,
!                                              my_extra->typbyval,
!                                              my_extra->typalign);
      else
!         eah = PG_GETARG_EXPANDED_ARRAYX(argno,
!                                         my_extra->typlen,
!                                         my_extra->typbyval,
!                                         my_extra->typalign);

!     return eah;
  }

  /*-----------------------------------------------------------------------------
*************** fetch_array_arg_replace_nulls(FunctionCa
*** 78,106 ****
  Datum
  array_append(PG_FUNCTION_ARGS)
  {
!     ArrayType *v;
      Datum newelem;
      bool isNull;
!     ArrayType *result;
      int *dimv,
          *lb;
      int indx;
      ArrayMetaState *my_extra;

!     v = fetch_array_arg_replace_nulls(fcinfo, 0);
      isNull = PG_ARGISNULL(1);
      if (isNull)
          newelem = (Datum) 0;
      else
          newelem = PG_GETARG_DATUM(1);

!     if (ARR_NDIM(v) == 1)
      {
          /* append newelem */
          int ub;

!         lb = ARR_LBOUND(v);
!         dimv = ARR_DIMS(v);
          ub = dimv[0] + lb[0] - 1;
          indx = ub + 1;
--- 85,113 ----
  Datum
  array_append(PG_FUNCTION_ARGS)
  {
!     ExpandedArrayHeader *eah;
      Datum newelem;
      bool isNull;
!     Datum result;
      int *dimv,
          *lb;
      int indx;
      ArrayMetaState *my_extra;

!     eah = fetch_array_arg_replace_nulls(fcinfo, 0);
      isNull = PG_ARGISNULL(1);
      if (isNull)
          newelem = (Datum) 0;
      else
          newelem = PG_GETARG_DATUM(1);

!     if (eah->ndims == 1)
      {
          /* append newelem */
          int ub;

!         lb = eah->lbound;
!         dimv = eah->dims;
          ub = dimv[0] + lb[0] - 1;
          indx = ub + 1;
*************** array_append(PG_FUNCTION_ARGS)
*** 110,116 ****
                      (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                       errmsg("integer out of range")));
      }
!     else if (ARR_NDIM(v) == 0)
          indx = 1;
      else
          ereport(ERROR,
--- 117,123 ----
                      (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                       errmsg("integer out of range")));
      }
!     else if (eah->ndims == 0)
          indx = 1;
      else
          ereport(ERROR,
*************** array_append(PG_FUNCTION_ARGS)
*** 120,129 ****
      /* Perform element insertion */
      my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;

!     result = array_set(v, 1, &indx, newelem, isNull,
                         -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign);

!     PG_RETURN_ARRAYTYPE_P(result);
  }

  /*-----------------------------------------------------------------------------
--- 127,137 ----
      /* Perform element insertion */
      my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;

!     result = array_set_element(EOHPGetRWDatum(&eah->hdr),
!                                1, &indx, newelem, isNull,
                         -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign);

!     PG_RETURN_DATUM(result);
  }

  /*-----------------------------------------------------------------------------
*************** array_append(PG_FUNCTION_ARGS)
*** 134,146 ****
  Datum
  array_prepend(PG_FUNCTION_ARGS)
  {
!     ArrayType *v;
      Datum newelem;
      bool isNull;
!     ArrayType *result;
      int *dimv,
          *lb;
      int indx;
      ArrayMetaState *my_extra;

      isNull = PG_ARGISNULL(0);
--- 142,155 ----
  Datum
  array_prepend(PG_FUNCTION_ARGS)
  {
!     ExpandedArrayHeader *eah;
      Datum newelem;
      bool isNull;
!     Datum result;
      int *dimv,
          *lb;
      int indx;
+     int lb0;
      ArrayMetaState *my_extra;

      isNull = PG_ARGISNULL(0);
*************** array_prepend(PG_FUNCTION_ARGS)
*** 148,161 ****
          newelem = (Datum) 0;
      else
          newelem = PG_GETARG_DATUM(0);
!     v = fetch_array_arg_replace_nulls(fcinfo, 1);

!     if (ARR_NDIM(v) == 1)
      {
          /* prepend newelem */
!         lb = ARR_LBOUND(v);
!         dimv = ARR_DIMS(v);
          indx = lb[0] - 1;

          /* overflow? */
          if (indx > lb[0])
--- 157,171 ----
          newelem = (Datum) 0;
      else
          newelem = PG_GETARG_DATUM(0);
!     eah = fetch_array_arg_replace_nulls(fcinfo, 1);

!     if (eah->ndims == 1)
      {
          /* prepend newelem */
!         lb = eah->lbound;
!         dimv = eah->dims;
          indx = lb[0] - 1;
+         lb0 = lb[0];

          /* overflow? */
          if (indx > lb[0])
*************** array_prepend(PG_FUNCTION_ARGS)
*** 163,170 ****
                      (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                       errmsg("integer out of range")));
      }
!     else if (ARR_NDIM(v) == 0)
          indx = 1;
      else
          ereport(ERROR,
                  (errcode(ERRCODE_DATA_EXCEPTION),
--- 173,183 ----
                      (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                       errmsg("integer out of range")));
      }
!     else if (eah->ndims == 0)
!     {
          indx = 1;
+         lb0 = 1;
+     }
      else
          ereport(ERROR,
                  (errcode(ERRCODE_DATA_EXCEPTION),
*************** array_prepend(PG_FUNCTION_ARGS)
*** 173,186 ****
      /* Perform element insertion */
      my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;

!     result = array_set(v, 1, &indx, newelem, isNull,
                         -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign);

      /* Readjust result's LB to match the input's, as expected for prepend */
!     if (ARR_NDIM(v) == 1)
!         ARR_LBOUND(result)[0] = ARR_LBOUND(v)[0];

!     PG_RETURN_ARRAYTYPE_P(result);
  }

  /*-----------------------------------------------------------------------------
--- 186,204 ----
      /* Perform element insertion */
      my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;

!     result = array_set_element(EOHPGetRWDatum(&eah->hdr),
!                                1, &indx, newelem, isNull,
                         -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign);

      /* Readjust result's LB to match the input's, as expected for prepend */
!     Assert(result == EOHPGetRWDatum(&eah->hdr));
!     if (eah->ndims == 1)
!     {
!         /* This is ok whether we've deconstructed or not */
!         eah->lbound[0] = lb0;
!     }

!     PG_RETURN_DATUM(result);
  }

  /*-----------------------------------------------------------------------------
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 79aefaf..9b3d58e 100644
*** a/src/backend/utils/adt/arrayfuncs.c
--- b/src/backend/utils/adt/arrayfuncs.c
*************** bool Array_nulls = true;
*** 42,47 ****
--- 42,53 ----
   */
  #define ASSGN "="

+ #define AARR_FREE_IF_COPY(array,n) \
+     do { \
+         if (!VARATT_IS_EXPANDED_HEADER(array)) \
+             PG_FREE_IF_COPY(array, n); \
+     } while (0)
+
  typedef enum
  {
      ARRAY_NO_LEVEL,
*************** static void ReadArrayBinary(StringInfo b
*** 93,102 ****
                  int typlen, bool typbyval, char typalign,
                  Datum *values, bool *nulls,
                  bool *hasnulls, int32 *nbytes);
! static void CopyArrayEls(ArrayType *array,
!              Datum *values, bool *nulls, int nitems,
!              int typlen, bool typbyval, char typalign,
!              bool freedata);
  static bool array_get_isnull(const bits8 *nullbitmap, int offset);
  static void array_set_isnull(bits8 *nullbitmap, int offset, bool isNull);
  static Datum ArrayCast(char *value, bool byval, int len);
--- 99,114 ----
                  int typlen, bool typbyval, char typalign,
                  Datum *values, bool *nulls,
                  bool *hasnulls, int32 *nbytes);
! static Datum array_get_element_expanded(Datum arraydatum,
!                            int nSubscripts, int *indx,
!                            int arraytyplen,
!                            int elmlen, bool elmbyval, char elmalign,
!                            bool *isNull);
! static Datum array_set_element_expanded(Datum arraydatum,
!                            int nSubscripts, int *indx,
!                            Datum dataValue, bool isNull,
!                            int arraytyplen,
!                            int elmlen, bool elmbyval, char elmalign);
  static bool array_get_isnull(const bits8 *nullbitmap, int offset);
  static void array_set_isnull(bits8 *nullbitmap, int offset, bool isNull);
  static Datum ArrayCast(char *value, bool byval, int len);
*************** ReadArrayStr(char *arrayStr,
*** 939,945 ****
   * the values are not toasted. (Doing it here doesn't work since the
   * caller has already allocated space for the array...)
   */
! static void
  CopyArrayEls(ArrayType *array,
               Datum *values,
               bool *nulls,
--- 951,957 ----
   * the values are not toasted. (Doing it here doesn't work since the
   * caller has already allocated space for the array...)
   */
! void
  CopyArrayEls(ArrayType *array,
               Datum *values,
               bool *nulls,
*************** CopyArrayEls(ArrayType *array,
*** 997,1004 ****
  Datum
  array_out(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
!     Oid element_type = ARR_ELEMTYPE(v);
      int typlen;
      bool typbyval;
      char typalign;
--- 1009,1016 ----
  Datum
  array_out(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
!     Oid element_type = AARR_ELEMTYPE(v);
      int typlen;
      bool typbyval;
      char typalign;
*************** array_out(PG_FUNCTION_ARGS)
*** 1014,1021 ****
       *
       * +2 allows for assignment operator + trailing null
       */
-     bits8 *bitmap;
-     int bitmask;
      bool *needquotes,
          needdims = false;
      int nitems,
--- 1026,1031 ----
*************** array_out(PG_FUNCTION_ARGS)
*** 1027,1032 ****
--- 1037,1043 ----
      int ndim,
          *dims,
          *lb;
+     ARRAY_ITER ARRAY_ITER_VARS(iter);
      ArrayMetaState *my_extra;

      /*
*************** array_out(PG_FUNCTION_ARGS)
*** 1061,1069 ****
      typalign = my_extra->typalign;
      typdelim = my_extra->typdelim;

!     ndim = ARR_NDIM(v);
!     dims = ARR_DIMS(v);
!     lb = ARR_LBOUND(v);
      nitems = ArrayGetNItems(ndim, dims);

      if (nitems == 0)
--- 1072,1080 ----
      typalign = my_extra->typalign;
      typdelim = my_extra->typdelim;

!     ndim = AARR_NDIM(v);
!     dims = AARR_DIMS(v);
!     lb = AARR_LBOUND(v);
      nitems = ArrayGetNItems(ndim, dims);

      if (nitems == 0)
*************** array_out(PG_FUNCTION_ARGS)
*** 1094,1109 ****
      needquotes = (bool *) palloc(nitems * sizeof(bool));
      overall_length = 1; /* don't forget to count \0 at end. */

!     p = ARR_DATA_PTR(v);
!     bitmap = ARR_NULLBITMAP(v);
!     bitmask = 1;

      for (i = 0; i < nitems; i++)
      {
          bool needquote;

          /* Get source element, checking for NULL */
!         if (bitmap && (*bitmap & bitmask) == 0)
          {
              values[i] = pstrdup("NULL");
              overall_length += 4;
--- 1105,1122 ----
      needquotes = (bool *) palloc(nitems * sizeof(bool));
      overall_length = 1; /* don't forget to count \0 at end. */

!     ARRAY_ITER_SETUP(iter, v);

      for (i = 0; i < nitems; i++)
      {
+         Datum itemvalue;
+         bool isnull;
          bool needquote;

          /* Get source element, checking for NULL */
!         ARRAY_ITER_NEXT(iter, i, itemvalue, isnull, typlen, typbyval, typalign);
!
!         if (isnull)
          {
              values[i] = pstrdup("NULL");
              overall_length += 4;
*************** array_out(PG_FUNCTION_ARGS)
*** 1111,1122 ****
          }
          else
          {
-             Datum itemvalue;
-
-             itemvalue = fetch_att(p, typbyval, typlen);
              values[i] = OutputFunctionCall(&my_extra->proc, itemvalue);
-             p = att_addlength_pointer(p, typlen, p);
-             p = (char *) att_align_nominal(p, typalign);

              /* count data plus backslashes; detect chars needing quotes */
              if (values[i][0] == '\0')
--- 1124,1130 ----
*************** array_out(PG_FUNCTION_ARGS)
*** 1149,1165 ****
              overall_length += 2;
          /* and the comma */
          overall_length += 1;
-
-         /* advance bitmap pointer if any */
-         if (bitmap)
-         {
-             bitmask <<= 1;
-             if (bitmask == 0x100)
-             {
-                 bitmap++;
-                 bitmask = 1;
-             }
-         }
      }

      /*
--- 1157,1162 ----
*************** ReadArrayBinary(StringInfo buf,
*** 1534,1552 ****
  Datum
  array_send(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
!     Oid element_type = ARR_ELEMTYPE(v);
      int typlen;
      bool typbyval;
      char typalign;
-     char *p;
-     bits8 *bitmap;
-     int bitmask;
      int nitems,
          i;
      int ndim,
!         *dim;
      StringInfoData buf;
      ArrayMetaState *my_extra;

      /*
--- 1531,1548 ----
  Datum
  array_send(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
!     Oid element_type = AARR_ELEMTYPE(v);
      int typlen;
      bool typbyval;
      char typalign;
      int nitems,
          i;
      int ndim,
!         *dim,
!         *lb;
      StringInfoData buf;
+     ARRAY_ITER ARRAY_ITER_VARS(iter);
      ArrayMetaState *my_extra;

      /*
*************** array_send(PG_FUNCTION_ARGS)
*** 1583,1642 ****
      typbyval = my_extra->typbyval;
      typalign = my_extra->typalign;

!     ndim = ARR_NDIM(v);
!     dim = ARR_DIMS(v);
      nitems = ArrayGetNItems(ndim, dim);

      pq_begintypsend(&buf);

      /* Send the array header information */
      pq_sendint(&buf, ndim, 4);
!     pq_sendint(&buf, ARR_HASNULL(v) ? 1 : 0, 4);
      pq_sendint(&buf, element_type, sizeof(Oid));
      for (i = 0; i < ndim; i++)
      {
!         pq_sendint(&buf, ARR_DIMS(v)[i], 4);
!         pq_sendint(&buf, ARR_LBOUND(v)[i], 4);
      }

      /* Send the array elements using the element's own sendproc */
!     p = ARR_DATA_PTR(v);
!     bitmap = ARR_NULLBITMAP(v);
!     bitmask = 1;

      for (i = 0; i < nitems; i++)
      {
          /* Get source element, checking for NULL */
!         if (bitmap && (*bitmap & bitmask) == 0)
          {
              /* -1 length means a NULL */
              pq_sendint(&buf, -1, 4);
          }
          else
          {
-             Datum itemvalue;
              bytea *outputbytes;

-             itemvalue = fetch_att(p, typbyval, typlen);
              outputbytes = SendFunctionCall(&my_extra->proc, itemvalue);
              pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
              pq_sendbytes(&buf, VARDATA(outputbytes),
                           VARSIZE(outputbytes) - VARHDRSZ);
              pfree(outputbytes);
-
-             p = att_addlength_pointer(p, typlen, p);
-             p = (char *) att_align_nominal(p, typalign);
-         }
-
-         /* advance bitmap pointer if any */
-         if (bitmap)
-         {
-             bitmask <<= 1;
-             if (bitmask == 0x100)
-             {
-                 bitmap++;
-                 bitmask = 1;
-             }
          }
      }

--- 1579,1626 ----
      typbyval = my_extra->typbyval;
      typalign = my_extra->typalign;

!     ndim = AARR_NDIM(v);
!     dim = AARR_DIMS(v);
!     lb = AARR_LBOUND(v);
      nitems = ArrayGetNItems(ndim, dim);

      pq_begintypsend(&buf);

      /* Send the array header information */
      pq_sendint(&buf, ndim, 4);
!     pq_sendint(&buf, AARR_HASNULL(v) ? 1 : 0, 4);
      pq_sendint(&buf, element_type, sizeof(Oid));
      for (i = 0; i < ndim; i++)
      {
!         pq_sendint(&buf, dim[i], 4);
!         pq_sendint(&buf, lb[i], 4);
      }

      /* Send the array elements using the element's own sendproc */
!     ARRAY_ITER_SETUP(iter, v);

      for (i = 0; i < nitems; i++)
      {
+         Datum itemvalue;
+         bool isnull;
+
          /* Get source element, checking for NULL */
!         ARRAY_ITER_NEXT(iter, i, itemvalue, isnull, typlen, typbyval, typalign);
!
!         if (isnull)
          {
              /* -1 length means a NULL */
              pq_sendint(&buf, -1, 4);
          }
          else
          {
              bytea *outputbytes;

              outputbytes = SendFunctionCall(&my_extra->proc, itemvalue);
              pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
              pq_sendbytes(&buf, VARDATA(outputbytes),
                           VARSIZE(outputbytes) - VARHDRSZ);
              pfree(outputbytes);
          }
      }

*************** array_send(PG_FUNCTION_ARGS)
*** 1650,1662 ****
  Datum
  array_ndims(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);

      /* Sanity check: does it look like an array at all? */
!     if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();

!     PG_RETURN_INT32(ARR_NDIM(v));
  }

  /*
--- 1634,1646 ----
  Datum
  array_ndims(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);

      /* Sanity check: does it look like an array at all? */
!     if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();

!     PG_RETURN_INT32(AARR_NDIM(v));
  }

  /*
*************** array_ndims(PG_FUNCTION_ARGS)
*** 1666,1672 ****
  Datum
  array_dims(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
      char *p;
      int i;
      int *dimv,
--- 1650,1656 ----
  Datum
  array_dims(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
      char *p;
      int i;
      int *dimv,
*************** array_dims(PG_FUNCTION_ARGS)
*** 1680,1693 ****
      char buf[MAXDIM * 33 + 1];

      /* Sanity check: does it look like an array at all? */
!     if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();

!     dimv = ARR_DIMS(v);
!     lb = ARR_LBOUND(v);

      p = buf;
!     for (i = 0; i < ARR_NDIM(v); i++)
      {
          sprintf(p, "[%d:%d]", lb[i], dimv[i] + lb[i] - 1);
          p += strlen(p);
--- 1664,1677 ----
      char buf[MAXDIM * 33 + 1];

      /* Sanity check: does it look like an array at all? */
!     if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();

!     dimv = AARR_DIMS(v);
!     lb = AARR_LBOUND(v);

      p = buf;
!     for (i = 0; i < AARR_NDIM(v); i++)
      {
          sprintf(p, "[%d:%d]", lb[i], dimv[i] + lb[i] - 1);
          p += strlen(p);
*************** array_dims(PG_FUNCTION_ARGS)
*** 1704,1723 ****
  Datum
  array_lower(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
      int reqdim = PG_GETARG_INT32(1);
      int *lb;
      int result;

      /* Sanity check: does it look like an array at all? */
!     if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();

      /* Sanity check: was the requested dim valid */
!     if (reqdim <= 0 || reqdim > ARR_NDIM(v))
          PG_RETURN_NULL();

!     lb = ARR_LBOUND(v);
      result = lb[reqdim - 1];

      PG_RETURN_INT32(result);
--- 1688,1707 ----
  Datum
  array_lower(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
      int reqdim = PG_GETARG_INT32(1);
      int *lb;
      int result;

      /* Sanity check: does it look like an array at all? */
!     if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();

      /* Sanity check: was the requested dim valid */
!     if (reqdim <= 0 || reqdim > AARR_NDIM(v))
          PG_RETURN_NULL();

!     lb = AARR_LBOUND(v);
      result = lb[reqdim - 1];

      PG_RETURN_INT32(result);
*************** array_lower(PG_FUNCTION_ARGS)
*** 1731,1752 ****
  Datum
  array_upper(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
      int reqdim = PG_GETARG_INT32(1);
      int *dimv,
          *lb;
      int result;

      /* Sanity check: does it look like an array at all? */
!     if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();

      /* Sanity check: was the requested dim valid */
!     if (reqdim <= 0 || reqdim > ARR_NDIM(v))
          PG_RETURN_NULL();

!     lb = ARR_LBOUND(v);
!     dimv = ARR_DIMS(v);

      result = dimv[reqdim - 1] + lb[reqdim - 1] - 1;

--- 1715,1736 ----
  Datum
  array_upper(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
      int reqdim = PG_GETARG_INT32(1);
      int *dimv,
          *lb;
      int result;

      /* Sanity check: does it look like an array at all? */
!     if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();

      /* Sanity check: was the requested dim valid */
!     if (reqdim <= 0 || reqdim > AARR_NDIM(v))
          PG_RETURN_NULL();

!     lb = AARR_LBOUND(v);
!     dimv = AARR_DIMS(v);

      result = dimv[reqdim - 1] + lb[reqdim - 1] - 1;

*************** array_upper(PG_FUNCTION_ARGS)
*** 1761,1780 ****
  Datum
  array_length(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
      int reqdim = PG_GETARG_INT32(1);
      int *dimv;
      int result;

      /* Sanity check: does it look like an array at all? */
!     if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();

      /* Sanity check: was the requested dim valid */
!     if (reqdim <= 0 || reqdim > ARR_NDIM(v))
          PG_RETURN_NULL();

!     dimv = ARR_DIMS(v);

      result = dimv[reqdim - 1];

--- 1745,1764 ----
  Datum
  array_length(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
      int reqdim = PG_GETARG_INT32(1);
      int *dimv;
      int result;

      /* Sanity check: does it look like an array at all? */
!     if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();

      /* Sanity check: was the requested dim valid */
!     if (reqdim <= 0 || reqdim > AARR_NDIM(v))
          PG_RETURN_NULL();

!     dimv = AARR_DIMS(v);

      result = dimv[reqdim - 1];

*************** array_length(PG_FUNCTION_ARGS)
*** 1788,1796 ****
  Datum
  array_cardinality(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);

!     PG_RETURN_INT32(ArrayGetNItems(ARR_NDIM(v), ARR_DIMS(v)));
  }


--- 1772,1780 ----
  Datum
  array_cardinality(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);

!     PG_RETURN_INT32(ArrayGetNItems(AARR_NDIM(v), AARR_DIMS(v)));
  }


*************** array_get_element(Datum arraydatum,
*** 1825,1831 ****
                    char elmalign,
                    bool *isNull)
  {
-     ArrayType *array;
      int i,
          ndim,
          *dim,
--- 1809,1814 ----
*************** array_get_element(Datum arraydatum,
*** 1850,1859 ****
          arraydataptr = (char *) DatumGetPointer(arraydatum);
          arraynullsptr = NULL;
      }
      else
      {
!         /* detoast input array if necessary */
!         array = DatumGetArrayTypeP(arraydatum);

          ndim = ARR_NDIM(array);
          dim = ARR_DIMS(array);
--- 1833,1854 ----
          arraydataptr = (char *) DatumGetPointer(arraydatum);
          arraynullsptr = NULL;
      }
+     else if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(arraydatum)))
+     {
+         /* expanded array: let's do this in a separate function */
+         return array_get_element_expanded(arraydatum,
+                                           nSubscripts,
+                                           indx,
+                                           arraytyplen,
+                                           elmlen,
+                                           elmbyval,
+                                           elmalign,
+                                           isNull);
+     }
      else
      {
!         /* detoast array if necessary, producing normal varlena input */
!         ArrayType *array = DatumGetArrayTypeP(arraydatum);

          ndim = ARR_NDIM(array);
          dim = ARR_DIMS(array);
*************** array_get_element(Datum arraydatum,
*** 1903,1908 ****
--- 1898,1985 ----
  }

  /*
+  * Implementation of array_get_element() for an expanded array
+  */
+ static Datum
+ array_get_element_expanded(Datum arraydatum,
+                            int nSubscripts, int *indx,
+                            int arraytyplen,
+                            int elmlen, bool elmbyval, char elmalign,
+                            bool *isNull)
+ {
+     ExpandedArrayHeader *eah;
+     int i,
+         ndim,
+         *dim,
+         *lb,
+         offset;
+     Datum *dvalues;
+     bool *dnulls;
+
+     eah = (ExpandedArrayHeader *) DatumGetEOHP(arraydatum);
+     Assert(eah->ea_magic == EA_MAGIC);
+
+     /* sanity-check caller's info against object */
+     Assert(arraytyplen == -1);
+     Assert(elmlen == eah->typlen);
+     Assert(elmbyval == eah->typbyval);
+     Assert(elmalign == eah->typalign);
+
+     ndim = eah->ndims;
+     dim = eah->dims;
+     lb = eah->lbound;
+
+     /*
+      * Return NULL for invalid subscript
+      */
+     if (ndim != nSubscripts || ndim <= 0 || ndim > MAXDIM)
+     {
+         *isNull = true;
+         return (Datum) 0;
+     }
+     for (i = 0; i < ndim; i++)
+     {
+         if (indx[i] < lb[i] || indx[i] >= (dim[i] + lb[i]))
+         {
+             *isNull = true;
+             return (Datum) 0;
+         }
+     }
+
+     /*
+      * Calculate the element number
+      */
+     offset = ArrayGetOffset(nSubscripts, dim, lb, indx);
+
+     /*
+      * Deconstruct array if we didn't already. Note that we apply this even
+      * if the input is nominally read-only: it should be safe enough.
+      */
+     deconstruct_expanded_array(eah);
+
+     dvalues = eah->dvalues;
+     dnulls = eah->dnulls;
+
+     /*
+      * Check for NULL array element
+      */
+     if (dnulls && dnulls[offset])
+     {
+         *isNull = true;
+         return (Datum) 0;
+     }
+
+     /*
+      * OK, get the element. It's OK to return a pass-by-ref value as a
+      * pointer into the expanded array, for the same reason that regular
+      * array_get_element can return a pointer into flat arrays: the value is
+      * assumed not to change for as long as the Datum reference can exist.
+      */
+     *isNull = false;
+     return dvalues[offset];
+ }
+
+ /*
   * array_get_slice :
   *     This routine takes an array and a range of indices (upperIndex and
   *     lowerIndx), creates a new array structure for the referred elements
*************** array_get_slice(Datum arraydatum,
*** 2083,2089 ****
   *
   * Result:
   *     A new array is returned, just like the old except for the one
!  *     modified entry. The original array object is not changed.
   *
   * For one-dimensional arrays only, we allow the array to be extended
   * by assigning to a position outside the existing subscript range; any
--- 2160,2168 ----
   *
   * Result:
   *     A new array is returned, just like the old except for the one
!  *     modified entry. The original array object is not changed,
!  *     unless what is passed is a read-write reference to an expanded
!  *     array object; in that case the expanded array is updated in-place.
   *
   * For one-dimensional arrays only, we allow the array to be extended
   * by assigning to a position outside the existing subscript range; any
*************** array_set_element(Datum arraydatum,
*** 2166,2171 ****
--- 2245,2264 ----
  	if (elmlen == -1 && !isNull)
  		dataValue = PointerGetDatum(PG_DETOAST_DATUM(dataValue));

+ 	if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(arraydatum)))
+ 	{
+ 		/* expanded array: let's do this in a separate function */
+ 		return array_set_element_expanded(arraydatum,
+ 										  nSubscripts,
+ 										  indx,
+ 										  dataValue,
+ 										  isNull,
+ 										  arraytyplen,
+ 										  elmlen,
+ 										  elmbyval,
+ 										  elmalign);
+ 	}
+
  	/* detoast input array if necessary */
  	array = DatumGetArrayTypeP(arraydatum);

*************** array_set_element(Datum arraydatum,
*** 2355,2360 ****
--- 2448,2702 ----
  }

  /*
+  * Implementation of array_set_element() for an expanded array
+  *
+  * Note: as with any operation on a read/write expanded object, we must
+  * take pains not to leave the object in a corrupt state if we fail partway
+  * through.
+  */
+ static Datum
+ array_set_element_expanded(Datum arraydatum,
+ 						   int nSubscripts, int *indx,
+ 						   Datum dataValue, bool isNull,
+ 						   int arraytyplen,
+ 						   int elmlen, bool elmbyval, char elmalign)
+ {
+ 	ExpandedArrayHeader *eah;
+ 	Datum	   *dvalues;
+ 	bool	   *dnulls;
+ 	int			i,
+ 				ndim,
+ 				dim[MAXDIM],
+ 				lb[MAXDIM],
+ 				offset;
+ 	bool		dimschanged,
+ 				newhasnulls;
+ 	int			addedbefore,
+ 				addedafter;
+ 	char	   *oldValue;
+
+ 	/* Convert to R/W object if not so already */
+ 	if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(arraydatum)))
+ 		arraydatum = expand_array(arraydatum, CurrentMemoryContext,
+ 								  elmlen, elmbyval, elmalign);
+
+ 	eah = (ExpandedArrayHeader *) DatumGetEOHP(arraydatum);
+ 	Assert(eah->ea_magic == EA_MAGIC);
+
+ 	/* sanity-check caller's info against object */
+ 	Assert(arraytyplen == -1);
+ 	Assert(elmlen == eah->typlen);
+ 	Assert(elmbyval == eah->typbyval);
+ 	Assert(elmalign == eah->typalign);
+
+ 	/*
+ 	 * Copy dimension info into local storage.
This allows us to modify the
+ 	 * dimensions if needed, while not messing up the expanded value if we
+ 	 * fail partway through.
+ 	 */
+ 	ndim = eah->ndims;
+ 	Assert(ndim >= 0 && ndim <= MAXDIM);
+ 	memcpy(dim, eah->dims, ndim * sizeof(int));
+ 	memcpy(lb, eah->lbound, ndim * sizeof(int));
+ 	dimschanged = false;
+
+ 	/*
+ 	 * if number of dims is zero, i.e. an empty array, create an array with
+ 	 * nSubscripts dimensions, and set the lower bounds to the supplied
+ 	 * subscripts.
+ 	 */
+ 	if (ndim == 0)
+ 	{
+ 		/*
+ 		 * Allocate adequate space for new dimension info. This is harmless
+ 		 * if we fail later.
+ 		 */
+ 		Assert(nSubscripts > 0 && nSubscripts <= MAXDIM);
+ 		eah->dims = (int *) MemoryContextAllocZero(eah->hdr.eoh_context,
+ 												   nSubscripts * sizeof(int));
+ 		eah->lbound = (int *) MemoryContextAllocZero(eah->hdr.eoh_context,
+ 												  nSubscripts * sizeof(int));
+
+ 		/* Update local copies of dimension info */
+ 		ndim = nSubscripts;
+ 		for (i = 0; i < nSubscripts; i++)
+ 		{
+ 			dim[i] = 0;
+ 			lb[i] = indx[i];
+ 		}
+ 		dimschanged = true;
+ 	}
+ 	else if (ndim != nSubscripts)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+ 				 errmsg("wrong number of array subscripts")));
+
+ 	/*
+ 	 * Deconstruct array if we didn't already. (Someday maybe add a special
+ 	 * case path for fixed-length, no-nulls cases, where we can overwrite an
+ 	 * element in place without ever deconstructing. But today is not that
+ 	 * day.)
+ 	 */
+ 	deconstruct_expanded_array(eah);
+
+ 	/*
+ 	 * Copy new element into array's context, if needed (we assume it's
+ 	 * already detoasted, so no junk should be created). If we fail further
+ 	 * down, this memory is leaked, but that's reasonably harmless.
+ 	 */
+ 	if (!eah->typbyval && !isNull)
+ 	{
+ 		MemoryContext oldcxt = MemoryContextSwitchTo(eah->hdr.eoh_context);
+
+ 		dataValue = datumCopy(dataValue, false, eah->typlen);
+ 		MemoryContextSwitchTo(oldcxt);
+ 	}
+
+ 	dvalues = eah->dvalues;
+ 	dnulls = eah->dnulls;
+
+ 	newhasnulls = ((dnulls != NULL) || isNull);
+ 	addedbefore = addedafter = 0;
+
+ 	/*
+ 	 * Check subscripts (this logic matches original array_set_element)
+ 	 */
+ 	if (ndim == 1)
+ 	{
+ 		if (indx[0] < lb[0])
+ 		{
+ 			addedbefore = lb[0] - indx[0];
+ 			dim[0] += addedbefore;
+ 			lb[0] = indx[0];
+ 			dimschanged = true;
+ 			if (addedbefore > 1)
+ 				newhasnulls = true;		/* will insert nulls */
+ 		}
+ 		if (indx[0] >= (dim[0] + lb[0]))
+ 		{
+ 			addedafter = indx[0] - (dim[0] + lb[0]) + 1;
+ 			dim[0] += addedafter;
+ 			dimschanged = true;
+ 			if (addedafter > 1)
+ 				newhasnulls = true;		/* will insert nulls */
+ 		}
+ 	}
+ 	else
+ 	{
+ 		/*
+ 		 * XXX currently we do not support extending multi-dimensional arrays
+ 		 * during assignment
+ 		 */
+ 		for (i = 0; i < ndim; i++)
+ 		{
+ 			if (indx[i] < lb[i] ||
+ 				indx[i] >= (dim[i] + lb[i]))
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+ 						 errmsg("array subscript out of range")));
+ 		}
+ 	}
+
+ 	/* Now we can calculate linear offset of target item in array */
+ 	offset = ArrayGetOffset(nSubscripts, dim, lb, indx);
+
+ 	/* Physically enlarge existing dvalues/dnulls arrays if needed */
+ 	if (dim[0] > eah->dvalueslen)
+ 	{
+ 		/* We want some extra space if we're enlarging */
+ 		int			newlen = dim[0] + dim[0] / 8;
+
+ 		eah->dvalues = dvalues = (Datum *)
+ 			repalloc(dvalues, newlen * sizeof(Datum));
+ 		if (dnulls)
+ 			eah->dnulls = dnulls = (bool *)
+ 				repalloc(dnulls, newlen * sizeof(bool));
+ 		eah->dvalueslen = newlen;
+ 	}
+
+ 	/*
+ 	 * If we need a nulls bitmap and don't already have one, create it, being
+ 	 * sure to mark all existing entries as not null.
+ 	 */
+ 	if (newhasnulls && dnulls == NULL)
+ 		eah->dnulls = dnulls = (bool *)
+ 			MemoryContextAllocZero(eah->hdr.eoh_context,
+ 								   eah->dvalueslen * sizeof(bool));
+
+ 	/*
+ 	 * We now have all the needed space allocated, so we're ready to make
+ 	 * irreversible changes. Be very wary of allowing failure below here.
+ 	 */
+
+ 	/* Flattened value will no longer represent array accurately */
+ 	eah->fvalue = NULL;
+ 	/* And we don't know the flattened size either */
+ 	eah->flat_size = 0;
+
+ 	/* Update dimensionality info if needed */
+ 	if (dimschanged)
+ 	{
+ 		eah->ndims = ndim;
+ 		memcpy(eah->dims, dim, ndim * sizeof(int));
+ 		memcpy(eah->lbound, lb, ndim * sizeof(int));
+ 	}
+
+ 	/* Reposition items if needed, and fill addedbefore items with nulls */
+ 	if (addedbefore > 0)
+ 	{
+ 		memmove(dvalues + addedbefore, dvalues, eah->nelems * sizeof(Datum));
+ 		for (i = 0; i < addedbefore; i++)
+ 			dvalues[i] = (Datum) 0;
+ 		if (dnulls)
+ 		{
+ 			memmove(dnulls + addedbefore, dnulls, eah->nelems * sizeof(bool));
+ 			for (i = 0; i < addedbefore; i++)
+ 				dnulls[i] = true;
+ 		}
+ 		eah->nelems += addedbefore;
+ 	}
+
+ 	/* fill addedafter items with nulls */
+ 	if (addedafter > 0)
+ 	{
+ 		for (i = 0; i < addedafter; i++)
+ 			dvalues[eah->nelems + i] = (Datum) 0;
+ 		if (dnulls)
+ 		{
+ 			for (i = 0; i < addedafter; i++)
+ 				dnulls[eah->nelems + i] = true;
+ 		}
+ 		eah->nelems += addedafter;
+ 	}
+
+ 	/* Grab old element value for pfree'ing, if needed. */
+ 	if (!eah->typbyval && (dnulls == NULL || !dnulls[offset]))
+ 		oldValue = (char *) DatumGetPointer(dvalues[offset]);
+ 	else
+ 		oldValue = NULL;
+
+ 	/* And finally we can insert the new element. */
+ 	dvalues[offset] = dataValue;
+ 	if (dnulls)
+ 		dnulls[offset] = isNull;
+
+ 	/*
+ 	 * Free old element if needed; this keeps repeated element replacements
+ 	 * from bloating the array's storage. If the pfree somehow fails, it
+ 	 * won't corrupt the array.
+ 	 */
+ 	if (oldValue)
+ 	{
+ 		/* Don't try to pfree a part of the original flat array */
+ 		if (oldValue < eah->fstartptr || oldValue >= eah->fendptr)
+ 			pfree(oldValue);
+ 	}
+
+ 	/* Done, return standard TOAST pointer for object */
+ 	return EOHPGetRWDatum(&eah->hdr);
+ }
+
+ /*
   * array_set_slice :
   *		  This routine sets the value of a range of array locations (specified
   *		  by upper and lower subscript values) to new values passed as
*************** array_set(ArrayType *array, int nSubscri
*** 2734,2741 ****
   *	 the function fn(), and if nargs > 1 then argument positions after the
   *	 first must be preset to the additional values to be passed. The
   *	 first argument position initially holds the input array value.
-  * * inpType: OID of element type of input array. This must be the same as,
-  *	 or binary-compatible with, the first argument type of fn().
   * * retType: OID of element type of output array. This must be the same as,
   *	 or binary-compatible with, the result type of fn().
   * * amstate: workspace for array_map. Must be zeroed by caller before
--- 3076,3081 ----
*************** array_set(ArrayType *array, int nSubscri
*** 2749,2762 ****
   * the array are OK however.
   */
  Datum
! array_map(FunctionCallInfo fcinfo, Oid inpType, Oid retType,
! 		  ArrayMapState *amstate)
  {
! 	ArrayType  *v;
  	ArrayType  *result;
  	Datum	   *values;
  	bool	   *nulls;
- 	Datum		elt;
  	int		   *dim;
  	int			ndim;
  	int			nitems;
--- 3089,3100 ----
   * the array are OK however.
   */
  Datum
! array_map(FunctionCallInfo fcinfo, Oid retType, ArrayMapState *amstate)
  {
! 	AnyArrayType *v;
  	ArrayType  *result;
  	Datum	   *values;
  	bool	   *nulls;
  	int		   *dim;
  	int			ndim;
  	int			nitems;
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2764,2778 ****
  	int32		nbytes = 0;
  	int32		dataoffset;
  	bool		hasnulls;
  	int			inp_typlen;
  	bool		inp_typbyval;
  	char		inp_typalign;
  	int			typlen;
  	bool		typbyval;
  	char		typalign;
! 	char	   *s;
! 	bits8	   *bitmap;
!
	int			bitmask;
  	ArrayMetaState *inp_extra;
  	ArrayMetaState *ret_extra;

--- 3102,3115 ----
  	int32		nbytes = 0;
  	int32		dataoffset;
  	bool		hasnulls;
+ 	Oid			inpType;
  	int			inp_typlen;
  	bool		inp_typbyval;
  	char		inp_typalign;
  	int			typlen;
  	bool		typbyval;
  	char		typalign;
! 	ARRAY_ITER	ARRAY_ITER_VARS(iter);
  	ArrayMetaState *inp_extra;
  	ArrayMetaState *ret_extra;

*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2781,2792 ****
  		elog(ERROR, "invalid nargs: %d", fcinfo->nargs);
  	if (PG_ARGISNULL(0))
  		elog(ERROR, "null input array");
! 	v = PG_GETARG_ARRAYTYPE_P(0);
!
! 	Assert(ARR_ELEMTYPE(v) == inpType);

! 	ndim = ARR_NDIM(v);
! 	dim = ARR_DIMS(v);
  	nitems = ArrayGetNItems(ndim, dim);

  	/* Check for empty array */
--- 3118,3128 ----
  		elog(ERROR, "invalid nargs: %d", fcinfo->nargs);
  	if (PG_ARGISNULL(0))
  		elog(ERROR, "null input array");
! 	v = PG_GETARG_ANY_ARRAY(0);
! 	inpType = AARR_ELEMTYPE(v);

! 	ndim = AARR_NDIM(v);
! 	dim = AARR_DIMS(v);
  	nitems = ArrayGetNItems(ndim, dim);

  	/* Check for empty array */
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2833,2841 ****
  	nulls = (bool *) palloc(nitems * sizeof(bool));

  	/* Loop over source data */
! 	s = ARR_DATA_PTR(v);
! 	bitmap = ARR_NULLBITMAP(v);
! 	bitmask = 1;
  	hasnulls = false;

  	for (i = 0; i < nitems; i++)
--- 3169,3175 ----
  	nulls = (bool *) palloc(nitems * sizeof(bool));

  	/* Loop over source data */
! 	ARRAY_ITER_SETUP(iter, v);
  	hasnulls = false;

  	for (i = 0; i < nitems; i++)
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2843,2860 ****
  		bool		callit = true;

  		/* Get source element, checking for NULL */
! 		if (bitmap && (*bitmap & bitmask) == 0)
! 		{
! 			fcinfo->argnull[0] = true;
! 		}
! 		else
! 		{
! 			elt = fetch_att(s, inp_typbyval, inp_typlen);
! 			s = att_addlength_datum(s, inp_typlen, elt);
! 			s = (char *) att_align_nominal(s, inp_typalign);
! 			fcinfo->arg[0] = elt;
! 			fcinfo->argnull[0] = false;
! 		}

  		/*
  		 * Apply the given function to source elt and extra args.
--- 3177,3184 ----
  		bool		callit = true;

  		/* Get source element, checking for NULL */
! 		ARRAY_ITER_NEXT(iter, i, fcinfo->arg[0], fcinfo->argnull[0],
! 						inp_typlen, inp_typbyval, inp_typalign);

  		/*
  		 * Apply the given function to source elt and extra args.
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2899,2915 ****
  						 errmsg("array size exceeds the maximum allowed (%d)",
  								(int) MaxAllocSize)));
  		}
-
- 		/* advance bitmap pointer if any */
- 		if (bitmap)
- 		{
- 			bitmask <<= 1;
- 			if (bitmask == 0x100)
- 			{
- 				bitmap++;
- 				bitmask = 1;
- 			}
- 		}
  	}

  	/* Allocate and initialize the result array */
--- 3223,3228 ----
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2928,2934 ****
  	result->ndim = ndim;
  	result->dataoffset = dataoffset;
  	result->elemtype = retType;
! 	memcpy(ARR_DIMS(result), ARR_DIMS(v), 2 * ndim * sizeof(int));

  	/*
  	 * Note: do not risk trying to pfree the results of the called function
--- 3241,3248 ----
  	result->ndim = ndim;
  	result->dataoffset = dataoffset;
  	result->elemtype = retType;
! 	memcpy(ARR_DIMS(result), AARR_DIMS(v), ndim * sizeof(int));
! 	memcpy(ARR_LBOUND(result), AARR_LBOUND(v), ndim * sizeof(int));

  	/*
  	 * Note: do not risk trying to pfree the results of the called function
*************** construct_empty_array(Oid elmtype)
*** 3092,3097 ****
--- 3406,3429 ----
  }

  /*
+  * construct_empty_expanded_array: make an empty expanded array
+  * given only type information. (elmlen etc can be zeroes if not known.)
+  */
+ ExpandedArrayHeader *
+ construct_empty_expanded_array(Oid element_type,
+ 							   MemoryContext parentcontext,
+ 							   int elmlen, bool elmbyval, char elmalign)
+ {
+ 	ArrayType  *array = construct_empty_array(element_type);
+ 	Datum		d;
+
+ 	d = expand_array(PointerGetDatum(array), parentcontext,
+ 					 elmlen, elmbyval, elmalign);
+ 	pfree(array);
+ 	return (ExpandedArrayHeader *) DatumGetEOHP(d);
+ }
+
+ /*
   * deconstruct_array --- simple method for extracting data from an array
   *
   * array: array object to examine (must not be NULL)
*************** array_contains_nulls(ArrayType *array)
*** 3229,3264 ****
  Datum
  array_eq(PG_FUNCTION_ARGS)
  {
! 	ArrayType  *array1 = PG_GETARG_ARRAYTYPE_P(0);
! 	ArrayType  *array2 = PG_GETARG_ARRAYTYPE_P(1);
  	Oid			collation = PG_GET_COLLATION();
! 	int			ndims1 = ARR_NDIM(array1);
! 	int			ndims2 = ARR_NDIM(array2);
! 	int		   *dims1 = ARR_DIMS(array1);
! 	int		   *dims2 = ARR_DIMS(array2);
! 	Oid			element_type = ARR_ELEMTYPE(array1);
  	bool		result = true;
  	int			nitems;
  	TypeCacheEntry *typentry;
  	int			typlen;
  	bool		typbyval;
  	char		typalign;
! 	char	   *ptr1;
! 	char	   *ptr2;
! 	bits8	   *bitmap1;
! 	bits8	   *bitmap2;
! 	int			bitmask;
  	int			i;
  	FunctionCallInfoData locfcinfo;

! 	if (element_type != ARR_ELEMTYPE(array2))
  		ereport(ERROR,
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("cannot compare arrays of different element types")));

  	/* fast path if the arrays do not have the same dimensionality */
  	if (ndims1 != ndims2 ||
! 		memcmp(dims1, dims2, 2 * ndims1 * sizeof(int)) != 0)
  		result = false;
  	else
  	{
--- 3561,3596 ----
  Datum
  array_eq(PG_FUNCTION_ARGS)
  {
! 	AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
! 	AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
  	Oid			collation = PG_GET_COLLATION();
! 	int			ndims1 = AARR_NDIM(array1);
! 	int			ndims2 = AARR_NDIM(array2);
! 	int		   *dims1 = AARR_DIMS(array1);
! 	int		   *dims2 = AARR_DIMS(array2);
! 	int		   *lbs1 = AARR_LBOUND(array1);
! 	int		   *lbs2 = AARR_LBOUND(array2);
!
	Oid			element_type = AARR_ELEMTYPE(array1);
  	bool		result = true;
  	int			nitems;
  	TypeCacheEntry *typentry;
  	int			typlen;
  	bool		typbyval;
  	char		typalign;
! 	ARRAY_ITER	ARRAY_ITER_VARS(it1);
! 	ARRAY_ITER	ARRAY_ITER_VARS(it2);
  	int			i;
  	FunctionCallInfoData locfcinfo;

! 	if (element_type != AARR_ELEMTYPE(array2))
  		ereport(ERROR,
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("cannot compare arrays of different element types")));

  	/* fast path if the arrays do not have the same dimensionality */
  	if (ndims1 != ndims2 ||
! 		memcmp(dims1, dims2, ndims1 * sizeof(int)) != 0 ||
! 		memcmp(lbs1, lbs2, ndims1 * sizeof(int)) != 0)
  		result = false;
  	else
  	{
*************** array_eq(PG_FUNCTION_ARGS)
*** 3293,3303 ****

  		/* Loop over source data */
  		nitems = ArrayGetNItems(ndims1, dims1);
! 		ptr1 = ARR_DATA_PTR(array1);
! 		ptr2 = ARR_DATA_PTR(array2);
! 		bitmap1 = ARR_NULLBITMAP(array1);
! 		bitmap2 = ARR_NULLBITMAP(array2);
! 		bitmask = 1;			/* use same bitmask for both arrays */

  		for (i = 0; i < nitems; i++)
  		{
--- 3625,3632 ----

  		/* Loop over source data */
  		nitems = ArrayGetNItems(ndims1, dims1);
! 		ARRAY_ITER_SETUP(it1, array1);
! 		ARRAY_ITER_SETUP(it2, array2);

  		for (i = 0; i < nitems; i++)
  		{
*************** array_eq(PG_FUNCTION_ARGS)
*** 3308,3349 ****
  			bool		oprresult;

  			/* Get elements, checking for NULL */
! 			if (bitmap1 && (*bitmap1 & bitmask) == 0)
! 			{
! 				isnull1 = true;
! 				elt1 = (Datum) 0;
! 			}
! 			else
! 			{
! 				isnull1 = false;
! 				elt1 = fetch_att(ptr1, typbyval, typlen);
! 				ptr1 = att_addlength_pointer(ptr1, typlen, ptr1);
! 				ptr1 = (char *) att_align_nominal(ptr1, typalign);
! 			}
!
! 			if (bitmap2 && (*bitmap2 & bitmask) == 0)
! 			{
! 				isnull2 = true;
! 				elt2 = (Datum) 0;
! 			}
! 			else
! 			{
! 				isnull2 = false;
! 				elt2 = fetch_att(ptr2, typbyval, typlen);
! 				ptr2 = att_addlength_pointer(ptr2, typlen, ptr2);
! 				ptr2 = (char *) att_align_nominal(ptr2, typalign);
! 			}
!
! 			/* advance bitmap pointers if any */
! 			bitmask <<= 1;
! 			if (bitmask == 0x100)
! 			{
! 				if (bitmap1)
! 					bitmap1++;
! 				if (bitmap2)
! 					bitmap2++;
! 				bitmask = 1;
!
			}

  			/*
  			 * We consider two NULLs equal; NULL and not-NULL are unequal.
--- 3637,3644 ----
  			bool		oprresult;

  			/* Get elements, checking for NULL */
! 			ARRAY_ITER_NEXT(it1, i, elt1, isnull1, typlen, typbyval, typalign);
! 			ARRAY_ITER_NEXT(it2, i, elt2, isnull2, typlen, typbyval, typalign);

  			/*
  			 * We consider two NULLs equal; NULL and not-NULL are unequal.
*************** array_eq(PG_FUNCTION_ARGS)
*** 3374,3381 ****
  	}

  	/* Avoid leaking memory when handed toasted input. */
! 	PG_FREE_IF_COPY(array1, 0);
! 	PG_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
--- 3669,3676 ----
  	}

  	/* Avoid leaking memory when handed toasted input. */
! 	AARR_FREE_IF_COPY(array1, 0);
! 	AARR_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
*************** btarraycmp(PG_FUNCTION_ARGS)
*** 3435,3465 ****
  static int
  array_cmp(FunctionCallInfo fcinfo)
  {
! 	ArrayType  *array1 = PG_GETARG_ARRAYTYPE_P(0);
! 	ArrayType  *array2 = PG_GETARG_ARRAYTYPE_P(1);
  	Oid			collation = PG_GET_COLLATION();
! 	int			ndims1 = ARR_NDIM(array1);
! 	int			ndims2 = ARR_NDIM(array2);
! 	int		   *dims1 = ARR_DIMS(array1);
! 	int		   *dims2 = ARR_DIMS(array2);
  	int			nitems1 = ArrayGetNItems(ndims1, dims1);
  	int			nitems2 = ArrayGetNItems(ndims2, dims2);
! 	Oid			element_type = ARR_ELEMTYPE(array1);
  	int			result = 0;
  	TypeCacheEntry *typentry;
  	int			typlen;
  	bool		typbyval;
  	char		typalign;
  	int			min_nitems;
! 	char	   *ptr1;
! 	char	   *ptr2;
! 	bits8	   *bitmap1;
! 	bits8	   *bitmap2;
! 	int			bitmask;
  	int			i;
  	FunctionCallInfoData locfcinfo;

! 	if (element_type != ARR_ELEMTYPE(array2))
  		ereport(ERROR,
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("cannot compare arrays of different element types")));
--- 3730,3757 ----
  static int
  array_cmp(FunctionCallInfo fcinfo)
  {
! 	AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
! 	AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
  	Oid			collation = PG_GET_COLLATION();
! 	int			ndims1 = AARR_NDIM(array1);
! 	int			ndims2 = AARR_NDIM(array2);
! 	int		   *dims1 = AARR_DIMS(array1);
!
	int		   *dims2 = AARR_DIMS(array2);
  	int			nitems1 = ArrayGetNItems(ndims1, dims1);
  	int			nitems2 = ArrayGetNItems(ndims2, dims2);
! 	Oid			element_type = AARR_ELEMTYPE(array1);
  	int			result = 0;
  	TypeCacheEntry *typentry;
  	int			typlen;
  	bool		typbyval;
  	char		typalign;
  	int			min_nitems;
! 	ARRAY_ITER	ARRAY_ITER_VARS(it1);
! 	ARRAY_ITER	ARRAY_ITER_VARS(it2);
  	int			i;
  	FunctionCallInfoData locfcinfo;

! 	if (element_type != AARR_ELEMTYPE(array2))
  		ereport(ERROR,
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("cannot compare arrays of different element types")));
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3495,3505 ****

  	/* Loop over source data */
  	min_nitems = Min(nitems1, nitems2);
! 	ptr1 = ARR_DATA_PTR(array1);
! 	ptr2 = ARR_DATA_PTR(array2);
! 	bitmap1 = ARR_NULLBITMAP(array1);
! 	bitmap2 = ARR_NULLBITMAP(array2);
! 	bitmask = 1;				/* use same bitmask for both arrays */

  	for (i = 0; i < min_nitems; i++)
  	{
--- 3787,3794 ----

  	/* Loop over source data */
  	min_nitems = Min(nitems1, nitems2);
! 	ARRAY_ITER_SETUP(it1, array1);
! 	ARRAY_ITER_SETUP(it2, array2);

  	for (i = 0; i < min_nitems; i++)
  	{
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3510,3551 ****
  		int32		cmpresult;

  		/* Get elements, checking for NULL */
! 		if (bitmap1 && (*bitmap1 & bitmask) == 0)
! 		{
! 			isnull1 = true;
! 			elt1 = (Datum) 0;
! 		}
! 		else
! 		{
! 			isnull1 = false;
! 			elt1 = fetch_att(ptr1, typbyval, typlen);
! 			ptr1 = att_addlength_pointer(ptr1, typlen, ptr1);
! 			ptr1 = (char *) att_align_nominal(ptr1, typalign);
! 		}
!
! 		if (bitmap2 && (*bitmap2 & bitmask) == 0)
! 		{
! 			isnull2 = true;
! 			elt2 = (Datum) 0;
! 		}
! 		else
! 		{
! 			isnull2 = false;
! 			elt2 = fetch_att(ptr2, typbyval, typlen);
! 			ptr2 = att_addlength_pointer(ptr2, typlen, ptr2);
! 			ptr2 = (char *) att_align_nominal(ptr2, typalign);
! 		}
!
! 		/* advance bitmap pointers if any */
! 		bitmask <<= 1;
! 		if (bitmask == 0x100)
! 		{
! 			if (bitmap1)
! 				bitmap1++;
! 			if (bitmap2)
! 				bitmap2++;
! 			bitmask = 1;
! 		}

  		/*
  		 * We consider two NULLs equal; NULL > not-NULL.
--- 3799,3806 ----
  		int32		cmpresult;

  		/* Get elements, checking for NULL */
! 		ARRAY_ITER_NEXT(it1, i, elt1, isnull1, typlen, typbyval, typalign);
! 		ARRAY_ITER_NEXT(it2, i, elt2, isnull2, typlen, typbyval, typalign);

  		/*
  		 * We consider two NULLs equal; NULL > not-NULL.
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3604,3611 ****
  			result = (ndims1 < ndims2) ? -1 : 1;
  		else
  		{
! 			/* this relies on LB array immediately following DIMS array */
! 			for (i = 0; i < ndims1 * 2; i++)
  			{
  				if (dims1[i] != dims2[i])
  				{
--- 3859,3865 ----
  			result = (ndims1 < ndims2) ? -1 : 1;
  		else
  		{
! 			for (i = 0; i < ndims1; i++)
  			{
  				if (dims1[i] != dims2[i])
  				{
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3613,3624 ****
  					break;
  				}
  			}
  		}
  	}

  	/* Avoid leaking memory when handed toasted input. */
! 	PG_FREE_IF_COPY(array1, 0);
! 	PG_FREE_IF_COPY(array2, 1);

  	return result;
  }
--- 3867,3892 ----
  					break;
  				}
  			}
+ 			if (result == 0)
+ 			{
+ 				int		   *lbound1 = AARR_LBOUND(array1);
+ 				int		   *lbound2 = AARR_LBOUND(array2);
+
+ 				for (i = 0; i < ndims1; i++)
+ 				{
+ 					if (lbound1[i] != lbound2[i])
+ 					{
+ 						result = (lbound1[i] < lbound2[i]) ? -1 : 1;
+ 						break;
+ 					}
+ 				}
+ 			}
  		}
  	}

  	/* Avoid leaking memory when handed toasted input. */
! 	AARR_FREE_IF_COPY(array1, 0);
! 	AARR_FREE_IF_COPY(array2, 1);

  	return result;
  }
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3633,3652 ****
  Datum
  hash_array(PG_FUNCTION_ARGS)
  {
! 	ArrayType  *array = PG_GETARG_ARRAYTYPE_P(0);
! 	int			ndims = ARR_NDIM(array);
! 	int		   *dims = ARR_DIMS(array);
! 	Oid			element_type = ARR_ELEMTYPE(array);
  	uint32		result = 1;
  	int			nitems;
  	TypeCacheEntry *typentry;
  	int			typlen;
  	bool		typbyval;
  	char		typalign;
- 	char	   *ptr;
- 	bits8	   *bitmap;
- 	int			bitmask;
  	int			i;
  	FunctionCallInfoData locfcinfo;

  	/*
--- 3901,3918 ----
  Datum
  hash_array(PG_FUNCTION_ARGS)
  {
! 	AnyArrayType *array = PG_GETARG_ANY_ARRAY(0);
! 	int			ndims = AARR_NDIM(array);
! 	int		   *dims = AARR_DIMS(array);
!
	Oid			element_type = AARR_ELEMTYPE(array);
  	uint32		result = 1;
  	int			nitems;
  	TypeCacheEntry *typentry;
  	int			typlen;
  	bool		typbyval;
  	char		typalign;
  	int			i;
+ 	ARRAY_ITER	ARRAY_ITER_VARS(iter);
  	FunctionCallInfoData locfcinfo;

  	/*
*************** hash_array(PG_FUNCTION_ARGS)
*** 3680,3707 ****

  	/* Loop over source data */
  	nitems = ArrayGetNItems(ndims, dims);
! 	ptr = ARR_DATA_PTR(array);
! 	bitmap = ARR_NULLBITMAP(array);
! 	bitmask = 1;

  	for (i = 0; i < nitems; i++)
  	{
  		uint32		elthash;

  		/* Get element, checking for NULL */
! 		if (bitmap && (*bitmap & bitmask) == 0)
  		{
  			/* Treat nulls as having hashvalue 0 */
  			elthash = 0;
  		}
  		else
  		{
- 			Datum		elt;
-
- 			elt = fetch_att(ptr, typbyval, typlen);
- 			ptr = att_addlength_pointer(ptr, typlen, ptr);
- 			ptr = (char *) att_align_nominal(ptr, typalign);
-
  			/* Apply the hash function */
  			locfcinfo.arg[0] = elt;
  			locfcinfo.argnull[0] = false;
--- 3946,3969 ----

  	/* Loop over source data */
  	nitems = ArrayGetNItems(ndims, dims);
! 	ARRAY_ITER_SETUP(iter, array);

  	for (i = 0; i < nitems; i++)
  	{
+ 		Datum		elt;
+ 		bool		isnull;
  		uint32		elthash;

  		/* Get element, checking for NULL */
! 		ARRAY_ITER_NEXT(iter, i, elt, isnull, typlen, typbyval, typalign);
!
! 		if (isnull)
  		{
  			/* Treat nulls as having hashvalue 0 */
  			elthash = 0;
  		}
  		else
  		{
  			/* Apply the hash function */
  			locfcinfo.arg[0] = elt;
  			locfcinfo.argnull[0] = false;
*************** hash_array(PG_FUNCTION_ARGS)
*** 3709,3725 ****
  			elthash = DatumGetUInt32(FunctionCallInvoke(&locfcinfo));
  		}

- 		/* advance bitmap pointer if any */
- 		if (bitmap)
- 		{
- 			bitmask <<= 1;
- 			if (bitmask == 0x100)
- 			{
- 				bitmap++;
- 				bitmask = 1;
- 			}
- 		}
-
  		/*
  		 * Combine hash values of successive elements by multiplying the
  		 * current value by 31 and adding on the new element's hash value.
--- 3971,3976 ----
*************** hash_array(PG_FUNCTION_ARGS)
*** 3735,3741 ****
  	}

  	/* Avoid leaking memory when handed toasted input. */
! 	PG_FREE_IF_COPY(array, 0);

  	PG_RETURN_UINT32(result);
  }
--- 3986,3992 ----
  	}

  	/* Avoid leaking memory when handed toasted input. */
!
	AARR_FREE_IF_COPY(array, 0);

  	PG_RETURN_UINT32(result);
  }
*************** hash_array(PG_FUNCTION_ARGS)
*** 3756,3766 ****
   * When matchall is false, return true if any members of array1 are in array2.
   */
  static bool
! array_contain_compare(ArrayType *array1, ArrayType *array2, Oid collation,
  					  bool matchall, void **fn_extra)
  {
  	bool		result = matchall;
! 	Oid			element_type = ARR_ELEMTYPE(array1);
  	TypeCacheEntry *typentry;
  	int			nelems1;
  	Datum	   *values2;
--- 4007,4017 ----
   * When matchall is false, return true if any members of array1 are in array2.
   */
  static bool
! array_contain_compare(AnyArrayType *array1, AnyArrayType *array2, Oid collation,
  					  bool matchall, void **fn_extra)
  {
  	bool		result = matchall;
! 	Oid			element_type = AARR_ELEMTYPE(array1);
  	TypeCacheEntry *typentry;
  	int			nelems1;
  	Datum	   *values2;
*************** array_contain_compare(ArrayType *array1,
*** 3769,3782 ****
  	int			typlen;
  	bool		typbyval;
  	char		typalign;
- 	char	   *ptr1;
- 	bits8	   *bitmap1;
- 	int			bitmask;
  	int			i;
  	int			j;
  	FunctionCallInfoData locfcinfo;

! 	if (element_type != ARR_ELEMTYPE(array2))
  		ereport(ERROR,
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("cannot compare arrays of different element types")));
--- 4020,4031 ----
  	int			typlen;
  	bool		typbyval;
  	char		typalign;
  	int			i;
  	int			j;
+ 	ARRAY_ITER	ARRAY_ITER_VARS(it1);
  	FunctionCallInfoData locfcinfo;

! 	if (element_type != AARR_ELEMTYPE(array2))
  		ereport(ERROR,
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("cannot compare arrays of different element types")));
*************** array_contain_compare(ArrayType *array1,
*** 3809,3816 ****
  	 * worthwhile to use deconstruct_array on it. We scan array1 the hard way
  	 * however, since we very likely won't need to look at all of it.
  	 */
! 	deconstruct_array(array2, element_type, typlen, typbyval, typalign,
! 					  &values2, &nulls2, &nelems2);

  	/*
  	 * Apply the comparison operator to each pair of array elements.
--- 4058,4075 ----
  	 * worthwhile to use deconstruct_array on it.
We scan array1 the hard way
  	 * however, since we very likely won't need to look at all of it.
  	 */
! 	if (VARATT_IS_EXPANDED_HEADER(array2))
! 	{
! 		/* This should be safe even if input is read-only */
! 		deconstruct_expanded_array(&(array2->xpn));
! 		values2 = array2->xpn.dvalues;
! 		nulls2 = array2->xpn.dnulls;
! 		nelems2 = array2->xpn.nelems;
! 	}
! 	else
! 		deconstruct_array(&(array2->flt),
! 						  element_type, typlen, typbyval, typalign,
! 						  &values2, &nulls2, &nelems2);

  	/*
  	 * Apply the comparison operator to each pair of array elements.
*************** array_contain_compare(ArrayType *array1,
*** 3819,3828 ****
  							 collation, NULL, NULL);

  	/* Loop over source data */
! 	nelems1 = ArrayGetNItems(ARR_NDIM(array1), ARR_DIMS(array1));
! 	ptr1 = ARR_DATA_PTR(array1);
! 	bitmap1 = ARR_NULLBITMAP(array1);
! 	bitmask = 1;

  	for (i = 0; i < nelems1; i++)
  	{
--- 4078,4085 ----
  							 collation, NULL, NULL);

  	/* Loop over source data */
! 	nelems1 = ArrayGetNItems(AARR_NDIM(array1), AARR_DIMS(array1));
! 	ARRAY_ITER_SETUP(it1, array1);

  	for (i = 0; i < nelems1; i++)
  	{
*************** array_contain_compare(ArrayType *array1,
*** 3830,3856 ****
  		bool		isnull1;

  		/* Get element, checking for NULL */
! 		if (bitmap1 && (*bitmap1 & bitmask) == 0)
! 		{
! 			isnull1 = true;
! 			elt1 = (Datum) 0;
! 		}
! 		else
! 		{
! 			isnull1 = false;
! 			elt1 = fetch_att(ptr1, typbyval, typlen);
! 			ptr1 = att_addlength_pointer(ptr1, typlen, ptr1);
! 			ptr1 = (char *) att_align_nominal(ptr1, typalign);
! 		}
!
! 		/* advance bitmap pointer if any */
! 		bitmask <<= 1;
! 		if (bitmask == 0x100)
! 		{
! 			if (bitmap1)
! 				bitmap1++;
! 			bitmask = 1;
! 		}

  		/*
  		 * We assume that the comparison operator is strict, so a NULL can't
--- 4087,4093 ----
  		bool		isnull1;

  		/* Get element, checking for NULL */
!
		ARRAY_ITER_NEXT(it1, i, elt1, isnull1, typlen, typbyval, typalign);

  		/*
  		 * We assume that the comparison operator is strict, so a NULL can't
*************** array_contain_compare(ArrayType *array1,
*** 3909,3925 ****
  		}
  	}

- 	pfree(values2);
- 	pfree(nulls2);
-
  	return result;
  }

  Datum
  arrayoverlap(PG_FUNCTION_ARGS)
  {
! 	ArrayType  *array1 = PG_GETARG_ARRAYTYPE_P(0);
! 	ArrayType  *array2 = PG_GETARG_ARRAYTYPE_P(1);
  	Oid			collation = PG_GET_COLLATION();
  	bool		result;

--- 4146,4159 ----
  		}
  	}

  	return result;
  }

  Datum
  arrayoverlap(PG_FUNCTION_ARGS)
  {
! 	AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
! 	AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
  	Oid			collation = PG_GET_COLLATION();
  	bool		result;

*************** arrayoverlap(PG_FUNCTION_ARGS)
*** 3927,3934 ****
  								   &fcinfo->flinfo->fn_extra);

  	/* Avoid leaking memory when handed toasted input. */
! 	PG_FREE_IF_COPY(array1, 0);
! 	PG_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
--- 4161,4168 ----
  								   &fcinfo->flinfo->fn_extra);

  	/* Avoid leaking memory when handed toasted input. */
! 	AARR_FREE_IF_COPY(array1, 0);
! 	AARR_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
*************** arrayoverlap(PG_FUNCTION_ARGS)
*** 3936,3943 ****
  Datum
  arraycontains(PG_FUNCTION_ARGS)
  {
! 	ArrayType  *array1 = PG_GETARG_ARRAYTYPE_P(0);
! 	ArrayType  *array2 = PG_GETARG_ARRAYTYPE_P(1);
  	Oid			collation = PG_GET_COLLATION();
  	bool		result;

--- 4170,4177 ----
  Datum
  arraycontains(PG_FUNCTION_ARGS)
  {
! 	AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
! 	AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
  	Oid			collation = PG_GET_COLLATION();
  	bool		result;

*************** arraycontains(PG_FUNCTION_ARGS)
*** 3945,3952 ****
  								   &fcinfo->flinfo->fn_extra);

  	/* Avoid leaking memory when handed toasted input. */
! 	PG_FREE_IF_COPY(array1, 0);
! 	PG_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
--- 4179,4186 ----
  								   &fcinfo->flinfo->fn_extra);

  	/* Avoid leaking memory when handed toasted input. */
! 	AARR_FREE_IF_COPY(array1, 0);
!
	AARR_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
*************** arraycontains(PG_FUNCTION_ARGS)
*** 3954,3961 ****
  Datum
  arraycontained(PG_FUNCTION_ARGS)
  {
! 	ArrayType  *array1 = PG_GETARG_ARRAYTYPE_P(0);
! 	ArrayType  *array2 = PG_GETARG_ARRAYTYPE_P(1);
  	Oid			collation = PG_GET_COLLATION();
  	bool		result;

--- 4188,4195 ----
  Datum
  arraycontained(PG_FUNCTION_ARGS)
  {
! 	AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
! 	AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
  	Oid			collation = PG_GET_COLLATION();
  	bool		result;

*************** arraycontained(PG_FUNCTION_ARGS)
*** 3963,3970 ****
  								   &fcinfo->flinfo->fn_extra);

  	/* Avoid leaking memory when handed toasted input. */
! 	PG_FREE_IF_COPY(array1, 0);
! 	PG_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
--- 4197,4204 ----
  								   &fcinfo->flinfo->fn_extra);

  	/* Avoid leaking memory when handed toasted input. */
! 	AARR_FREE_IF_COPY(array1, 0);
! 	AARR_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
*************** makeArrayResultAny(ArrayBuildStateAny *a
*** 5213,5243 ****
  Datum
  array_larger(PG_FUNCTION_ARGS)
  {
! 	ArrayType  *v1,
! 			   *v2,
! 			   *result;
!
! 	v1 = PG_GETARG_ARRAYTYPE_P(0);
! 	v2 = PG_GETARG_ARRAYTYPE_P(1);
!
! 	result = ((array_cmp(fcinfo) > 0) ? v1 : v2);
!
! 	PG_RETURN_ARRAYTYPE_P(result);
  }

  Datum
  array_smaller(PG_FUNCTION_ARGS)
  {
! 	ArrayType  *v1,
! 			   *v2,
! 			   *result;
!
! 	v1 = PG_GETARG_ARRAYTYPE_P(0);
! 	v2 = PG_GETARG_ARRAYTYPE_P(1);
!
! 	result = ((array_cmp(fcinfo) < 0) ? v1 : v2);
!
! 	PG_RETURN_ARRAYTYPE_P(result);
  }


--- 5447,5465 ----
  Datum
  array_larger(PG_FUNCTION_ARGS)
  {
! 	if (array_cmp(fcinfo) > 0)
! 		PG_RETURN_DATUM(PG_GETARG_DATUM(0));
! 	else
! 		PG_RETURN_DATUM(PG_GETARG_DATUM(1));
  }

  Datum
  array_smaller(PG_FUNCTION_ARGS)
  {
! 	if (array_cmp(fcinfo) < 0)
! 		PG_RETURN_DATUM(PG_GETARG_DATUM(0));
! 	else
! 		PG_RETURN_DATUM(PG_GETARG_DATUM(1));
  }


*************** generate_subscripts(PG_FUNCTION_ARGS)
*** 5262,5268 ****
  	/* stuff done only on the first call of the function */
  	if (SRF_IS_FIRSTCALL())
  	{
! 		ArrayType  *v = PG_GETARG_ARRAYTYPE_P(0);
  		int			reqdim = PG_GETARG_INT32(1);
  		int		   *lb,
  				   *dimv;
--- 5484,5490 ----
  	/* stuff done only on the first call of the function */
  	if (SRF_IS_FIRSTCALL())
  	{
!
ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); int reqdim = PG_GETARG_INT32(1); int *lb, *dimv; --- 5484,5490 ---- /* stuff done only on the first call of the function */ if (SRF_IS_FIRSTCALL()) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); int reqdim = PG_GETARG_INT32(1); int *lb, *dimv; *************** generate_subscripts(PG_FUNCTION_ARGS) *** 5271,5281 **** funcctx = SRF_FIRSTCALL_INIT(); /* Sanity check: does it look like an array at all? */ ! if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM) SRF_RETURN_DONE(funcctx); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > ARR_NDIM(v)) SRF_RETURN_DONE(funcctx); /* --- 5493,5503 ---- funcctx = SRF_FIRSTCALL_INIT(); /* Sanity check: does it look like an array at all? */ ! if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) SRF_RETURN_DONE(funcctx); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > AARR_NDIM(v)) SRF_RETURN_DONE(funcctx); /* *************** generate_subscripts(PG_FUNCTION_ARGS) *** 5284,5291 **** oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); fctx = (generate_subscripts_fctx *) palloc(sizeof(generate_subscripts_fctx)); ! lb = ARR_LBOUND(v); ! dimv = ARR_DIMS(v); fctx->lower = lb[reqdim - 1]; fctx->upper = dimv[reqdim - 1] + lb[reqdim - 1] - 1; --- 5506,5513 ---- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); fctx = (generate_subscripts_fctx *) palloc(sizeof(generate_subscripts_fctx)); ! lb = AARR_LBOUND(v); ! dimv = AARR_DIMS(v); fctx->lower = lb[reqdim - 1]; fctx->upper = dimv[reqdim - 1] + lb[reqdim - 1] - 1; *************** array_unnest(PG_FUNCTION_ARGS) *** 5604,5614 **** { typedef struct { ! ArrayType *arr; int nextelem; int numelems; - char *elemdataptr; /* this moves with nextelem */ - bits8 *arraynullsptr; /* this does not */ int16 elmlen; bool elmbyval; char elmalign; --- 5826,5834 ---- { typedef struct { ! 
ARRAY_ITER ARRAY_ITER_VARS(iter); int nextelem; int numelems; int16 elmlen; bool elmbyval; char elmalign; *************** array_unnest(PG_FUNCTION_ARGS) *** 5621,5627 **** /* stuff done only on the first call of the function */ if (SRF_IS_FIRSTCALL()) { ! ArrayType *arr; /* create a function context for cross-call persistence */ funcctx = SRF_FIRSTCALL_INIT(); --- 5841,5847 ---- /* stuff done only on the first call of the function */ if (SRF_IS_FIRSTCALL()) { ! AnyArrayType *arr; /* create a function context for cross-call persistence */ funcctx = SRF_FIRSTCALL_INIT(); *************** array_unnest(PG_FUNCTION_ARGS) *** 5638,5660 **** * and not before. (If no detoast happens, we assume the originally * passed array will stick around till then.) */ ! arr = PG_GETARG_ARRAYTYPE_P(0); /* allocate memory for user context */ fctx = (array_unnest_fctx *) palloc(sizeof(array_unnest_fctx)); /* initialize state */ ! fctx->arr = arr; fctx->nextelem = 0; ! fctx->numelems = ArrayGetNItems(ARR_NDIM(arr), ARR_DIMS(arr)); ! ! fctx->elemdataptr = ARR_DATA_PTR(arr); ! fctx->arraynullsptr = ARR_NULLBITMAP(arr); ! get_typlenbyvalalign(ARR_ELEMTYPE(arr), ! &fctx->elmlen, ! &fctx->elmbyval, ! &fctx->elmalign); funcctx->user_fctx = fctx; MemoryContextSwitchTo(oldcontext); --- 5858,5885 ---- * and not before. (If no detoast happens, we assume the originally * passed array will stick around till then.) */ ! arr = PG_GETARG_ANY_ARRAY(0); /* allocate memory for user context */ fctx = (array_unnest_fctx *) palloc(sizeof(array_unnest_fctx)); /* initialize state */ ! ARRAY_ITER_SETUP(fctx->iter, arr); fctx->nextelem = 0; ! fctx->numelems = ArrayGetNItems(AARR_NDIM(arr), AARR_DIMS(arr)); ! if (VARATT_IS_EXPANDED_HEADER(arr)) ! { ! /* we can just grab the type data from expanded array */ ! fctx->elmlen = arr->xpn.typlen; ! fctx->elmbyval = arr->xpn.typbyval; ! fctx->elmalign = arr->xpn.typalign; ! } ! else ! get_typlenbyvalalign(AARR_ELEMTYPE(arr), ! &fctx->elmlen, ! &fctx->elmbyval, ! 
&fctx->elmalign); funcctx->user_fctx = fctx; MemoryContextSwitchTo(oldcontext); *************** array_unnest(PG_FUNCTION_ARGS) *** 5669,5700 **** int offset = fctx->nextelem++; Datum elem; ! /* ! * Check for NULL array element ! */ ! if (array_get_isnull(fctx->arraynullsptr, offset)) ! { ! fcinfo->isnull = true; ! elem = (Datum) 0; ! /* elemdataptr does not move */ ! } ! else ! { ! /* ! * OK, get the element ! */ ! char *ptr = fctx->elemdataptr; ! ! fcinfo->isnull = false; ! elem = ArrayCast(ptr, fctx->elmbyval, fctx->elmlen); ! ! /* ! * Advance elemdataptr over it ! */ ! ptr = att_addlength_pointer(ptr, fctx->elmlen, ptr); ! ptr = (char *) att_align_nominal(ptr, fctx->elmalign); ! fctx->elemdataptr = ptr; ! } SRF_RETURN_NEXT(funcctx, elem); } --- 5894,5901 ---- int offset = fctx->nextelem++; Datum elem; ! ARRAY_ITER_NEXT(fctx->iter, offset, elem, fcinfo->isnull, ! fctx->elmlen, fctx->elmbyval, fctx->elmalign); SRF_RETURN_NEXT(funcctx, elem); } *************** array_replace_internal(ArrayType *array, *** 5946,5952 **** result->ndim = ndim; result->dataoffset = dataoffset; result->elemtype = element_type; ! memcpy(ARR_DIMS(result), ARR_DIMS(array), 2 * ndim * sizeof(int)); if (remove) { --- 6147,6154 ---- result->ndim = ndim; result->dataoffset = dataoffset; result->elemtype = element_type; ! memcpy(ARR_DIMS(result), ARR_DIMS(array), ndim * sizeof(int)); ! memcpy(ARR_LBOUND(result), ARR_LBOUND(array), ndim * sizeof(int)); if (remove) { diff --git a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c index 014eca5..e8af030 100644 *** a/src/backend/utils/adt/datum.c --- b/src/backend/utils/adt/datum.c *************** *** 12,19 **** * *------------------------------------------------------------------------- */ /* ! * In the implementation of the next routines we assume the following: * * A) if a type is "byVal" then all the information is stored in the * Datum itself (i.e. no pointers involved!). 
In this case the --- 12,20 ---- * *------------------------------------------------------------------------- */ + /* ! * In the implementation of these routines we assume the following: * * A) if a type is "byVal" then all the information is stored in the * Datum itself (i.e. no pointers involved!). In this case the *************** *** 34,44 **** --- 35,49 ---- * * Note that we do not treat "toasted" datums specially; therefore what * will be copied or compared is the compressed data or toast reference. + * An exception is made for datumCopy() of an expanded object, however, + * because most callers expect to get a simple contiguous (and pfree'able) + * result from datumCopy(). See also datumTransfer(). */ #include "postgres.h" #include "utils/datum.h" + #include "utils/expandeddatum.h" /*------------------------------------------------------------------------- *************** *** 46,51 **** --- 51,57 ---- * * Find the "real" size of a datum, given the datum value, * whether it is a "by value", and the declared type length. + * (For TOAST pointer datums, this is the size of the pointer datum.) * * This is essentially an out-of-line version of the att_addlength_datum() * macro in access/tupmacs.h. We do a tad more error checking though. *************** datumGetSize(Datum value, bool typByVal, *** 106,114 **** /*------------------------------------------------------------------------- * datumCopy * ! * make a copy of a datum * * If the datatype is pass-by-reference, memory is obtained with palloc(). *------------------------------------------------------------------------- */ Datum --- 112,127 ---- /*------------------------------------------------------------------------- * datumCopy * ! * Make a copy of a non-NULL datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). + * + * If the value is a reference to an expanded object, we flatten into memory + * obtained with palloc(). 
We need to copy because one of the main uses of + * this function is to copy a datum out of a transient memory context that's + * about to be destroyed, and the expanded object is probably in a child + * context that will also go away. Moreover, many callers assume that the + * result is a single pfree-able chunk. *------------------------------------------------------------------------- */ Datum *************** datumCopy(Datum value, bool typByVal, in *** 118,161 **** if (typByVal) res = value; else { Size realSize; ! char *s; ! ! if (DatumGetPointer(value) == NULL) ! return PointerGetDatum(NULL); realSize = datumGetSize(value, typByVal, typLen); ! s = (char *) palloc(realSize); ! memcpy(s, DatumGetPointer(value), realSize); ! res = PointerGetDatum(s); } return res; } /*------------------------------------------------------------------------- ! * datumFree * ! * Free the space occupied by a datum CREATED BY "datumCopy" * ! * NOTE: DO NOT USE THIS ROUTINE with datums returned by heap_getattr() etc. ! * ONLY datums created by "datumCopy" can be freed! *------------------------------------------------------------------------- */ ! #ifdef NOT_USED ! void ! datumFree(Datum value, bool typByVal, int typLen) { ! if (!typByVal) ! { ! Pointer s = DatumGetPointer(value); ! ! pfree(s); ! 
} } - #endif /*------------------------------------------------------------------------- * datumIsEqual --- 131,201 ---- if (typByVal) res = value; + else if (typLen == -1) + { + /* It is a varlena datatype */ + struct varlena *vl = (struct varlena *) DatumGetPointer(value); + + if (VARATT_IS_EXTERNAL_EXPANDED(vl)) + { + /* Flatten into the caller's memory context */ + ExpandedObjectHeader *eoh = DatumGetEOHP(value); + Size resultsize; + char *resultptr; + + resultsize = EOH_get_flat_size(eoh); + resultptr = (char *) palloc(resultsize); + EOH_flatten_into(eoh, (void *) resultptr, resultsize); + res = PointerGetDatum(resultptr); + } + else + { + /* Otherwise, just copy the varlena datum verbatim */ + Size realSize; + char *resultptr; + + realSize = (Size) VARSIZE_ANY(vl); + resultptr = (char *) palloc(realSize); + memcpy(resultptr, vl, realSize); + res = PointerGetDatum(resultptr); + } + } else { + /* Pass by reference, but not varlena, so not toasted */ Size realSize; ! char *resultptr; realSize = datumGetSize(value, typByVal, typLen); ! resultptr = (char *) palloc(realSize); ! memcpy(resultptr, DatumGetPointer(value), realSize); ! res = PointerGetDatum(resultptr); } return res; } /*------------------------------------------------------------------------- ! * datumTransfer * ! * Transfer a non-NULL datum into the current memory context. * ! * This is equivalent to datumCopy() except when the datum is a read-write ! * pointer to an expanded object. In that case we merely reparent the object ! * into the current context, and return its standard R/W pointer (in case the ! * given one is a transient pointer of shorter lifespan). *------------------------------------------------------------------------- */ ! Datum ! datumTransfer(Datum value, bool typByVal, int typLen) { ! if (!typByVal && typLen == -1 && ! VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(value))) ! value = TransferExpandedObject(value, CurrentMemoryContext); ! else ! 
value = datumCopy(value, typByVal, typLen); ! return value; } /*------------------------------------------------------------------------- * datumIsEqual diff --git a/src/backend/utils/adt/expandeddatum.c b/src/backend/utils/adt/expandeddatum.c index ...039671b . *** a/src/backend/utils/adt/expandeddatum.c --- b/src/backend/utils/adt/expandeddatum.c *************** *** 0 **** --- 1,163 ---- + /*------------------------------------------------------------------------- + * + * expandeddatum.c + * Support functions for "expanded" value representations. + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * + * IDENTIFICATION + * src/backend/utils/adt/expandeddatum.c + * + *------------------------------------------------------------------------- + */ + #include "postgres.h" + + #include "utils/expandeddatum.h" + #include "utils/memutils.h" + + /* + * DatumGetEOHP + * + * Given a Datum that is an expanded-object reference, extract the pointer. + * + * This is a bit tedious since the pointer may not be properly aligned; + * compare VARATT_EXTERNAL_GET_POINTER(). + */ + ExpandedObjectHeader * + DatumGetEOHP(Datum d) + { + varattrib_1b_e *datum = (varattrib_1b_e *) DatumGetPointer(d); + varatt_expanded ptr; + + Assert(VARATT_IS_EXTERNAL_EXPANDED(datum)); + memcpy(&ptr, VARDATA_EXTERNAL(datum), sizeof(ptr)); + Assert(VARATT_IS_EXPANDED_HEADER(ptr.eohptr)); + return ptr.eohptr; + } + + /* + * EOH_init_header + * + * Initialize the common header of an expanded object. + * + * The main thing this encapsulates is initializing the TOAST pointers. 
+  */
+ void
+ EOH_init_header(ExpandedObjectHeader *eohptr,
+ 				const ExpandedObjectMethods *methods,
+ 				MemoryContext obj_context)
+ {
+ 	varatt_expanded ptr;
+
+ 	eohptr->vl_len_ = EOH_HEADER_MAGIC;
+ 	eohptr->eoh_methods = methods;
+ 	eohptr->eoh_context = obj_context;
+
+ 	ptr.eohptr = eohptr;
+
+ 	SET_VARTAG_EXTERNAL(eohptr->eoh_rw_ptr, VARTAG_EXPANDED_RW);
+ 	memcpy(VARDATA_EXTERNAL(eohptr->eoh_rw_ptr), &ptr, sizeof(ptr));
+
+ 	SET_VARTAG_EXTERNAL(eohptr->eoh_ro_ptr, VARTAG_EXPANDED_RO);
+ 	memcpy(VARDATA_EXTERNAL(eohptr->eoh_ro_ptr), &ptr, sizeof(ptr));
+ }
+
+ /*
+  * EOH_get_flat_size
+  * EOH_flatten_into
+  *
+  * Convenience functions for invoking the "methods" of an expanded object.
+  */
+
+ Size
+ EOH_get_flat_size(ExpandedObjectHeader *eohptr)
+ {
+ 	return (*eohptr->eoh_methods->get_flat_size) (eohptr);
+ }
+
+ void
+ EOH_flatten_into(ExpandedObjectHeader *eohptr,
+ 				 void *result, Size allocated_size)
+ {
+ 	(*eohptr->eoh_methods->flatten_into) (eohptr, result, allocated_size);
+ }
+
+ /*
+  * Does the Datum represent a writable expanded object?
+  */
+ bool
+ DatumIsReadWriteExpandedObject(Datum d, bool isnull, int16 typlen)
+ {
+ 	/* Reject if it's NULL or not a varlena type */
+ 	if (isnull || typlen != -1)
+ 		return false;
+
+ 	/* Reject if not a read-write expanded-object pointer */
+ 	if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)))
+ 		return false;
+
+ 	return true;
+ }
+
+ /*
+  * If the Datum represents a R/W expanded object, change it to R/O.
+  * Otherwise return the original Datum.
+  */
+ Datum
+ MakeExpandedObjectReadOnly(Datum d, bool isnull, int16 typlen)
+ {
+ 	ExpandedObjectHeader *eohptr;
+
+ 	/* Nothing to do if it's NULL or not a varlena type */
+ 	if (isnull || typlen != -1)
+ 		return d;
+
+ 	/* Nothing to do if not a read-write expanded-object pointer */
+ 	if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)))
+ 		return d;
+
+ 	/* Now safe to extract the object pointer */
+ 	eohptr = DatumGetEOHP(d);
+
+ 	/* Return the built-in read-only pointer instead of given pointer */
+ 	return EOHPGetRODatum(eohptr);
+ }
+
+ /*
+  * Transfer ownership of an expanded object to a new parent memory context.
+  * The object must be referenced by a R/W pointer, and what we return is
+  * always its "standard" R/W pointer, which is certain to have the same
+  * lifespan as the object itself. (The passed-in pointer might not, and
+  * in any case wouldn't provide a unique identifier if it's not that one.)
+  */
+ Datum
+ TransferExpandedObject(Datum d, MemoryContext new_parent)
+ {
+ 	ExpandedObjectHeader *eohptr = DatumGetEOHP(d);
+
+ 	/* Assert caller gave a R/W pointer */
+ 	Assert(VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)));
+
+ 	/* Transfer ownership */
+ 	MemoryContextSetParent(eohptr->eoh_context, new_parent);
+
+ 	/* Return the object's standard read-write pointer */
+ 	return EOHPGetRWDatum(eohptr);
+ }
+
+ /*
+  * Delete an expanded object (must be referenced by a R/W pointer).
+  */
+ void
+ DeleteExpandedObject(Datum d)
+ {
+ 	ExpandedObjectHeader *eohptr = DatumGetEOHP(d);
+
+ 	/* Assert caller gave a R/W pointer */
+ 	Assert(VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)));
+
+ 	/* Kill it */
+ 	MemoryContextDelete(eohptr->eoh_context);
+ }
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 202bc78..4b24066 100644
*** a/src/backend/utils/mmgr/mcxt.c
--- b/src/backend/utils/mmgr/mcxt.c
*************** MemoryContextSetParent(MemoryContext con
*** 266,271 ****
--- 266,275 ----
  	AssertArg(MemoryContextIsValid(context));
  	AssertArg(context != new_parent);

+ 	/* Fast path if it's got correct parent already */
+ 	if (new_parent == context->parent)
+ 		return;
+
  	/* Delink from existing parent, if any */
  	if (context->parent)
  	{
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 40fde83..a98a7af 100644
*** a/src/include/executor/executor.h
--- b/src/include/executor/executor.h
*************** extern void FreeExprContext(ExprContext
*** 312,318 ****
  extern void ReScanExprContext(ExprContext *econtext);

  #define ResetExprContext(econtext) \
! 	MemoryContextReset((econtext)->ecxt_per_tuple_memory)

  extern ExprContext *MakePerTupleExprContext(EState *estate);

--- 312,318 ----
  extern void ReScanExprContext(ExprContext *econtext);

  #define ResetExprContext(econtext) \
! 	MemoryContextResetAndDeleteChildren((econtext)->ecxt_per_tuple_memory)

  extern ExprContext *MakePerTupleExprContext(EState *estate);

diff --git a/src/include/executor/spi.h b/src/include/executor/spi.h
index 9e912ba..fbcae0c 100644
*** a/src/include/executor/spi.h
--- b/src/include/executor/spi.h
*************** extern char *SPI_getnspname(Relation rel
*** 124,129 ****
--- 124,130 ----
  extern void *SPI_palloc(Size size);
  extern void *SPI_repalloc(void *pointer, Size size);
  extern void SPI_pfree(void *pointer);
+ extern Datum SPI_datumTransfer(Datum value, bool typByVal, int typLen);
  extern void SPI_freetuple(HeapTuple pointer);
  extern void SPI_freetuptable(SPITupleTable *tuptable);

diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index 48f84bf..00686b0 100644
*** a/src/include/executor/tuptable.h
--- b/src/include/executor/tuptable.h
*************** extern Datum ExecFetchSlotTupleDatum(Tup
*** 163,168 ****
--- 163,169 ----
  extern HeapTuple ExecMaterializeSlot(TupleTableSlot *slot);
  extern TupleTableSlot *ExecCopySlot(TupleTableSlot *dstslot,
  			 TupleTableSlot *srcslot);
+ extern TupleTableSlot *ExecMakeSlotContentsReadOnly(TupleTableSlot *slot);

  /* in access/common/heaptuple.c */
  extern Datum slot_getattr(TupleTableSlot *slot, int attnum, bool *isnull);
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 1d06f42..932a96b 100644
*** a/src/include/nodes/primnodes.h
--- b/src/include/nodes/primnodes.h
*************** typedef struct WindowFunc
*** 310,315 ****
--- 310,319 ----
   * Note: the result datatype is the element type when fetching a single
   * element; but it is the array type when doing subarray fetch or either
   * type of store.
+  *
+  * Note: for the cases where an array is returned, if refexpr yields a R/W
+  * expanded array, then the implementation is allowed to modify that object
+  * in-place and return the same object.)
   * ----------------
   */
  typedef struct ArrayRef
diff --git a/src/include/postgres.h b/src/include/postgres.h
index 082c75b..5dd897a 100644
*** a/src/include/postgres.h
--- b/src/include/postgres.h
*************** typedef struct varatt_indirect
*** 88,93 ****
--- 88,110 ----
  } varatt_indirect;

  /*
+  * struct varatt_expanded is a "TOAST pointer" representing an out-of-line
+  * Datum that is stored in memory, in some type-specific, not necessarily
+  * physically contiguous format that is convenient for computation not
+  * storage. APIs for this, in particular the definition of struct
+  * ExpandedObjectHeader, are in src/include/utils/expandeddatum.h.
+  *
+  * Note that just as for struct varatt_external, this struct is stored
+  * unaligned within any containing tuple.
+  */
+ typedef struct ExpandedObjectHeader ExpandedObjectHeader;
+
+ typedef struct varatt_expanded
+ {
+ 	ExpandedObjectHeader *eohptr;
+ } varatt_expanded;
+
+ /*
   * Type tag for the various sorts of "TOAST pointer" datums. The peculiar
   * value for VARTAG_ONDISK comes from a requirement for on-disk compatibility
   * with a previous notion that the tag field was the pointer datum's length.
*************** typedef struct varatt_indirect
*** 95,105 ****
--- 112,129 ----
  typedef enum vartag_external
  {
  	VARTAG_INDIRECT = 1,
+ 	VARTAG_EXPANDED_RO = 2,
+ 	VARTAG_EXPANDED_RW = 3,
  	VARTAG_ONDISK = 18
  } vartag_external;

+ /* this test relies on the specific tag values above */
+ #define VARTAG_IS_EXPANDED(tag) \
+ 	(((tag) & ~1) == VARTAG_EXPANDED_RO)
+
  #define VARTAG_SIZE(tag) \
  	((tag) == VARTAG_INDIRECT ? sizeof(varatt_indirect) : \
+ 	 VARTAG_IS_EXPANDED(tag) ? sizeof(varatt_expanded) : \
  	 (tag) == VARTAG_ONDISK ? sizeof(varatt_external) : \
  	 TrapMacro(true, "unrecognized TOAST vartag"))

*************** typedef struct
*** 294,299 ****
--- 318,329 ----
  	(VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_ONDISK)
  #define VARATT_IS_EXTERNAL_INDIRECT(PTR) \
  	(VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_INDIRECT)
+ #define VARATT_IS_EXTERNAL_EXPANDED_RO(PTR) \
+ 	(VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_EXPANDED_RO)
+ #define VARATT_IS_EXTERNAL_EXPANDED_RW(PTR) \
+ 	(VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_EXPANDED_RW)
+ #define VARATT_IS_EXTERNAL_EXPANDED(PTR) \
+ 	(VARATT_IS_EXTERNAL(PTR) && VARTAG_IS_EXPANDED(VARTAG_EXTERNAL(PTR)))
  #define VARATT_IS_SHORT(PTR) VARATT_IS_1B(PTR)
  #define VARATT_IS_EXTENDED(PTR) (!VARATT_IS_4B_U(PTR))

diff --git a/src/include/utils/array.h b/src/include/utils/array.h
index d9fac80..93bee4d 100644
*** a/src/include/utils/array.h
--- b/src/include/utils/array.h
***************
*** 45,50 ****
--- 45,55 ----
   * We support subscripting on these types, but array_in() and array_out()
   * only work with varlena arrays.
   *
+  * In addition, arrays are a major user of the "expanded object" TOAST
+  * infrastructure. This allows a varlena array to be converted to a
+  * separate representation that may include "deconstructed" Datum/isnull
+  * arrays holding the elements.
+  *
   *
   * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
***************
*** 57,62 ****
--- 62,69 ----
  #define ARRAY_H

  #include "fmgr.h"
+ #include "utils/expandeddatum.h"
+

  /*
   * Arrays are varlena objects, so must meet the varlena convention that
*************** typedef struct
*** 75,80 ****
--- 82,167 ----
  } ArrayType;

  /*
+  * An expanded array is contained within a private memory context (as
+  * all expanded objects must be) and has a control structure as below.
+  *
+  * The expanded array might contain a regular "flat" array if that was the
+  * original input and we've not modified it significantly. Otherwise, the
+  * contents are represented by Datum/isnull arrays plus dimensionality and
+  * type information. We could also have both forms, if we've deconstructed
+  * the original array for access purposes but not yet changed it. For pass-
+  * by-reference element types, the Datums would point into the flat array in
+  * this situation. Once we start modifying array elements, new pass-by-ref
+  * elements are separately palloc'd within the memory context.
+  */
+ #define EA_MAGIC 689375833 /* ID for debugging crosschecks */
+
+ typedef struct ExpandedArrayHeader
+ {
+ 	/* Standard header for expanded objects */
+ 	ExpandedObjectHeader hdr;
+
+ 	/* Magic value identifying an expanded array (for debugging only) */
+ 	int ea_magic;
+
+ 	/* Dimensionality info (always valid) */
+ 	int ndims; /* # of dimensions */
+ 	int *dims; /* array dimensions */
+ 	int *lbound; /* index lower bounds for each dimension */
+
+ 	/* Element type info (always valid) */
+ 	Oid element_type; /* element type OID */
+ 	int16 typlen; /* needed info about element datatype */
+ 	bool typbyval;
+ 	char typalign;
+
+ 	/*
+ 	 * If we have a Datum-array representation of the array, it's kept here;
+ 	 * else dvalues/dnulls are NULL. The dvalues and dnulls arrays are always
+ 	 * palloc'd within the object private context, but may change size from
+ 	 * time to time. For pass-by-ref element types, dvalues entries might
+ 	 * point either into the fstartptr..fendptr area, or to separately
+ 	 * palloc'd chunks. Elements should always be fully detoasted, as they
+ 	 * are in the standard flat representation.
+ 	 *
+ 	 * Even when dvalues is valid, dnulls can be NULL if there are no null
+ 	 * elements.
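+ 	 */
+ 	Datum *dvalues; /* array of Datums */
+ 	bool *dnulls; /* array of is-null flags for Datums */
+ 	int dvalueslen; /* allocated length of above arrays */
+ 	int nelems; /* number of valid entries in above arrays */
+
+ 	/*
+ 	 * flat_size is the current space requirement for the flat equivalent of
+ 	 * the expanded array, if known; otherwise it's 0. We store this to make
+ 	 * consecutive calls of get_flat_size cheap.
+ 	 */
+ 	Size flat_size;
+
+ 	/*
+ 	 * fvalue points to the flat representation if it is valid, else it is
+ 	 * NULL. If we have or ever had a flat representation then
+ 	 * fstartptr/fendptr point to the start and end+1 of its data area; this
+ 	 * is so that we can tell which Datum pointers point into the flat
+ 	 * representation rather than being pointers to separately palloc'd data.
+ 	 */
+ 	ArrayType *fvalue; /* must be a fully detoasted array */
+ 	char *fstartptr; /* start of its data area */
+ 	char *fendptr; /* end+1 of its data area */
+ } ExpandedArrayHeader;
+
+ /*
+  * Functions that can handle either a "flat" varlena array or an expanded
+  * array use this union to work with their input.
+  */
+ typedef union AnyArrayType
+ {
+ 	ArrayType flt;
+ 	ExpandedArrayHeader xpn;
+ } AnyArrayType;
+
+ /*
   * working state for accumArrayResult() and friends
   * note that the input must be scalars (legal array elements)
   */
*************** typedef struct ArrayMapState
*** 149,165 ****
  /* ArrayIteratorData is private in arrayfuncs.c */
  typedef struct ArrayIteratorData *ArrayIterator;

! /*
!  * fmgr macros for array objects
!  */
  #define DatumGetArrayTypeP(X) ((ArrayType *) PG_DETOAST_DATUM(X))
  #define DatumGetArrayTypePCopy(X) ((ArrayType *) PG_DETOAST_DATUM_COPY(X))
  #define PG_GETARG_ARRAYTYPE_P(n) DatumGetArrayTypeP(PG_GETARG_DATUM(n))
  #define PG_GETARG_ARRAYTYPE_P_COPY(n) DatumGetArrayTypePCopy(PG_GETARG_DATUM(n))
  #define PG_RETURN_ARRAYTYPE_P(x) PG_RETURN_POINTER(x)

  /*
!  * Access macros for array header fields.
   *
   * ARR_DIMS returns a pointer to an array of array dimensions (number of
   * elements along the various array axes).
--- 236,259 ----
  /* ArrayIteratorData is private in arrayfuncs.c */
  typedef struct ArrayIteratorData *ArrayIterator;

! /* fmgr macros for regular varlena array objects */
  #define DatumGetArrayTypeP(X) ((ArrayType *) PG_DETOAST_DATUM(X))
  #define DatumGetArrayTypePCopy(X) ((ArrayType *) PG_DETOAST_DATUM_COPY(X))
  #define PG_GETARG_ARRAYTYPE_P(n) DatumGetArrayTypeP(PG_GETARG_DATUM(n))
  #define PG_GETARG_ARRAYTYPE_P_COPY(n) DatumGetArrayTypePCopy(PG_GETARG_DATUM(n))
  #define PG_RETURN_ARRAYTYPE_P(x) PG_RETURN_POINTER(x)

+ /* fmgr macros for expanded array objects */
+ #define PG_GETARG_EXPANDED_ARRAY(n) DatumGetExpandedArray(PG_GETARG_DATUM(n))
+ #define PG_GETARG_EXPANDED_ARRAYX(n, elmlen, elmbyval, elmalign) \
+ 	DatumGetExpandedArrayX(PG_GETARG_DATUM(n), elmlen, elmbyval, elmalign)
+ #define PG_RETURN_EXPANDED_ARRAY(x) PG_RETURN_DATUM(EOHPGetRWDatum(&(x)->hdr))
+
+ /* fmgr macros for AnyArrayType (ie, get either varlena or expanded form) */
+ #define PG_GETARG_ANY_ARRAY(n) DatumGetAnyArray(PG_GETARG_DATUM(n))
+
  /*
!  * Access macros for varlena array header fields.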
   *
   * ARR_DIMS returns a pointer to an array of array dimensions (number of
   * elements along the various array axes).
*************** typedef struct ArrayIteratorData *ArrayI
*** 207,212 ****
--- 301,402 ----
  #define ARR_DATA_PTR(a) \
  		(((char *) (a)) + ARR_DATA_OFFSET(a))

+ /*
+  * Macros for working with AnyArrayType inputs. Beware multiple references!
+  */
+ #define AARR_NDIM(a) \
+ 	(VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.ndims : ARR_NDIM(&(a)->flt))
+ #define AARR_HASNULL(a) \
+ 	(VARATT_IS_EXPANDED_HEADER(a) ? \
+ 	 ((a)->xpn.dvalues != NULL ? (a)->xpn.dnulls != NULL : ARR_HASNULL((a)->xpn.fvalue)) : \
+ 	 ARR_HASNULL(&(a)->flt))
+ #define AARR_ELEMTYPE(a) \
+ 	(VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.element_type : ARR_ELEMTYPE(&(a)->flt))
+ #define AARR_DIMS(a) \
+ 	(VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.dims : ARR_DIMS(&(a)->flt))
+ #define AARR_LBOUND(a) \
+ 	(VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.lbound : ARR_LBOUND(&(a)->flt))
+
+ /*
+  * Macros for iterating through elements of a flat or expanded array.
+  * Use "ARRAY_ITER ARRAY_ITER_VARS(name);" to declare the local variables
+  * needed for an iterator (more than one set can be used in the same function,
+  * if they have different names).
+  * Use "ARRAY_ITER_SETUP(name, arrayptr);" to prepare to iterate, and
+  * "ARRAY_ITER_NEXT(name, index, datumvar, isnullvar, ...);" to fetch the
+  * next element into datumvar/isnullvar. "index" must be the zero-origin
+  * element number; we make caller provide this since caller is generally
+  * counting the elements anyway.
+  */
+ #define ARRAY_ITER /* dummy type name to keep pgindent happy */
+
+ #define ARRAY_ITER_VARS(iter) \
+ 	Datum *iter##datumptr; \
+ 	bool *iter##isnullptr; \
+ 	char *iter##dataptr; \
+ 	bits8 *iter##bitmapptr; \
+ 	int iter##bitmask
+
+ #define ARRAY_ITER_SETUP(iter, arrayptr) \
+ 	do { \
+ 		if (VARATT_IS_EXPANDED_HEADER(arrayptr)) \
+ 		{ \
+ 			if ((arrayptr)->xpn.dvalues) \
+ 			{ \
+ 				(iter##datumptr) = (arrayptr)->xpn.dvalues; \
+ 				(iter##isnullptr) = (arrayptr)->xpn.dnulls; \
+ 				(iter##dataptr) = NULL; \
+ 				(iter##bitmapptr) = NULL; \
+ 			} \
+ 			else \
+ 			{ \
+ 				(iter##datumptr) = NULL; \
+ 				(iter##isnullptr) = NULL; \
+ 				(iter##dataptr) = ARR_DATA_PTR((arrayptr)->xpn.fvalue); \
+ 				(iter##bitmapptr) = ARR_NULLBITMAP((arrayptr)->xpn.fvalue); \
+ 			} \
+ 		} \
+ 		else \
+ 		{ \
+ 			(iter##datumptr) = NULL; \
+ 			(iter##isnullptr) = NULL; \
+ 			(iter##dataptr) = ARR_DATA_PTR(&(arrayptr)->flt); \
+ 			(iter##bitmapptr) = ARR_NULLBITMAP(&(arrayptr)->flt); \
+ 		} \
+ 		(iter##bitmask) = 1; \
+ 	} while (0)
+
+ #define ARRAY_ITER_NEXT(iter,i, datumvar,isnullvar, elmlen,elmbyval,elmalign) \
+ 	do { \
+ 		if (iter##datumptr) \
+ 		{ \
+ 			(datumvar) = (iter##datumptr)[i]; \
+ 			(isnullvar) = (iter##isnullptr) ? (iter##isnullptr)[i] : false; \
+ 		} \
+ 		else \
+ 		{ \
+ 			if ((iter##bitmapptr) && (*(iter##bitmapptr) & (iter##bitmask)) == 0) \
+ 			{ \
+ 				(isnullvar) = true; \
+ 				(datumvar) = (Datum) 0; \
+ 			} \
+ 			else \
+ 			{ \
+ 				(isnullvar) = false; \
+ 				(datumvar) = fetch_att(iter##dataptr, elmbyval, elmlen); \
+ 				(iter##dataptr) = att_addlength_pointer(iter##dataptr, elmlen, iter##dataptr); \
+ 				(iter##dataptr) = (char *) att_align_nominal(iter##dataptr, elmalign); \
+ 			} \
+ 			(iter##bitmask) <<= 1; \
+ 			if ((iter##bitmask) == 0x100) \
+ 			{ \
+ 				if (iter##bitmapptr) \
+ 					(iter##bitmapptr)++; \
+ 				(iter##bitmask) = 1; \
+ 			} \
+ 		} \
+ 	} while (0)
+

  /*
   * GUC parameter
*************** extern Datum array_remove(PG_FUNCTION_AR
*** 248,253 ****
--- 438,452 ----
  extern Datum array_replace(PG_FUNCTION_ARGS);
  extern Datum width_bucket_array(PG_FUNCTION_ARGS);

+ extern void CopyArrayEls(ArrayType *array,
+ 			 Datum *values,
+ 			 bool *nulls,
+ 			 int nitems,
+ 			 int typlen,
+ 			 bool typbyval,
+ 			 char typalign,
+ 			 bool freedata);
+
  extern Datum array_get_element(Datum arraydatum, int nSubscripts, int *indx,
  				  int arraytyplen, int elmlen, bool elmbyval, char elmalign,
  				  bool *isNull);
*************** extern ArrayType *array_set(ArrayType *a
*** 269,275 ****
  		  Datum dataValue, bool isNull,
  		  int arraytyplen, int elmlen, bool elmbyval, char elmalign);

! extern Datum array_map(FunctionCallInfo fcinfo, Oid inpType, Oid retType,
  		  ArrayMapState *amstate);

  extern void array_bitmap_copy(bits8 *destbitmap, int destoffset,
--- 468,474 ----
  		  Datum dataValue, bool isNull,
  		  int arraytyplen, int elmlen, bool elmbyval, char elmalign);

! extern Datum array_map(FunctionCallInfo fcinfo, Oid retType,
  		  ArrayMapState *amstate);

  extern void array_bitmap_copy(bits8 *destbitmap, int destoffset,
*************** extern ArrayType *construct_md_array(Dat
*** 286,291 ****
--- 485,493 ----
  				   int *lbs,
  				   Oid elmtype, int elmlen, bool elmbyval, char elmalign);
  extern ArrayType *construct_empty_array(Oid elmtype);
+ extern ExpandedArrayHeader *construct_empty_expanded_array(Oid element_type,
+ 							   MemoryContext parentcontext,
+ 							   int elmlen, bool elmbyval, char elmalign);
  extern void deconstruct_array(ArrayType *array,
  				  Oid elmtype,
  				  int elmlen, bool elmbyval, char elmalign,
*************** extern int mda_next_tuple(int n, int *cu
*** 339,344 ****
--- 541,557 ----
  extern int32 *ArrayGetIntegerTypmods(ArrayType *arr, int *n);

  /*
+  * prototypes for functions defined in array_expanded.c
+  */
+ extern Datum expand_array(Datum arraydatum, MemoryContext parentcontext,
+ 			 int elmlen, bool elmbyval, char elmalign);
+ extern ExpandedArrayHeader *DatumGetExpandedArray(Datum d);
+ extern ExpandedArrayHeader *DatumGetExpandedArrayX(Datum d,
+ 					   int elmlen, bool elmbyval, char elmalign);
+ extern AnyArrayType *DatumGetAnyArray(Datum d);
+ extern void deconstruct_expanded_array(ExpandedArrayHeader *eah);
+
+ /*
   * prototypes for functions defined in array_userfuncs.c
   */
  extern Datum array_append(PG_FUNCTION_ARGS);
diff --git a/src/include/utils/datum.h b/src/include/utils/datum.h
index 663414b..c572f79 100644
*** a/src/include/utils/datum.h
--- b/src/include/utils/datum.h
***************
*** 24,41 ****
  extern Size datumGetSize(Datum value, bool typByVal, int typLen);

  /*
!  * datumCopy - make a copy of a datum.
   *
   * If the datatype is pass-by-reference, memory is obtained with palloc().
   */
  extern Datum datumCopy(Datum value, bool typByVal, int typLen);

  /*
!  * datumFree - free a datum previously allocated by datumCopy, if any.
   *
!  * Does nothing if datatype is pass-by-value.
   */
! extern void datumFree(Datum value, bool typByVal, int typLen);

  /*
   * datumIsEqual
--- 24,41 ----
  extern Size datumGetSize(Datum value, bool typByVal, int typLen);

  /*
!  * datumCopy - make a copy of a non-NULL datum.
   *
   * If the datatype is pass-by-reference, memory is obtained with palloc().
   */
  extern Datum datumCopy(Datum value, bool typByVal, int typLen);

  /*
!  * datumTransfer - transfer a non-NULL datum into the current memory context.
   *
!  * Differs from datumCopy() in its handling of read-write expanded objects.
   */
! extern Datum datumTransfer(Datum value, bool typByVal, int typLen);

  /*
   * datumIsEqual
diff --git a/src/include/utils/expandeddatum.h b/src/include/utils/expandeddatum.h
index ...3a8336e .
*** a/src/include/utils/expandeddatum.h
--- b/src/include/utils/expandeddatum.h
***************
*** 0 ****
--- 1,148 ----
+ /*-------------------------------------------------------------------------
+  *
+  * expandeddatum.h
+  *	  Declarations for access to "expanded" value representations.
+  *
+  * Complex data types, particularly container types such as arrays and
+  * records, usually have on-disk representations that are compact but not
+  * especially convenient to modify. What's more, when we do modify them,
+  * having to recopy all the rest of the value can be extremely inefficient.
+  * Therefore, we provide a notion of an "expanded" representation that is used
+  * only in memory and is optimized more for computation than storage.
+  * The format appearing on disk is called the data type's "flattened"
+  * representation, since it is required to be a contiguous blob of bytes --
+  * but the type can have an expanded representation that is not. Data types
+  * must provide means to translate an expanded representation back to
+  * flattened form.
+  *
+  * An expanded object is meant to survive across multiple operations, but
+  * not to be enormously long-lived; for example it might be a local variable
+  * in a PL/pgSQL procedure. So its extra bulk compared to the on-disk format
+  * is a worthwhile trade-off.
+  *
+  * References to expanded objects are a type of TOAST pointer.
+  * Because of longstanding conventions in Postgres, this means that the
+  * flattened form of such an object must always be a varlena object.
+  * Fortunately that's no restriction in practice.
+  *
+  * There are actually two kinds of TOAST pointers for expanded objects:
+  * read-only and read-write pointers. Possession of one of the latter
+  * authorizes a function to modify the value in-place rather than copying it
+  * as would normally be required. Functions should always return a read-write
+  * pointer to any new expanded object they create. Functions that modify an
+  * argument value in-place must take care that they do not corrupt the old
+  * value if they fail partway through.
+  *
+  *
+  * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  * src/include/utils/expandeddatum.h
+  *
+  *-------------------------------------------------------------------------
+  */
+ #ifndef EXPANDEDDATUM_H
+ #define EXPANDEDDATUM_H
+
+ /* Size of an EXTERNAL datum that contains a pointer to an expanded object */
+ #define EXPANDED_POINTER_SIZE (VARHDRSZ_EXTERNAL + sizeof(varatt_expanded))
+
+ /*
+  * "Methods" that must be provided for any expanded object.
+  *
+  * get_flat_size: compute space needed for flattened representation (which
+  * must be a valid in-line, non-compressed, 4-byte-header varlena object).
+  *
+  * flatten_into: construct flattened representation in the caller-allocated
+  * space at *result, of size allocated_size (which will always be the result
+  * of a preceding get_flat_size call; it's passed for cross-checking).
+  *
+  * Note: construction of a heap tuple from an expanded datum calls
+  * get_flat_size twice, so it's worthwhile to make sure that that doesn't
+  * incur too much overhead.
+ */ + typedef Size (*EOM_get_flat_size_method) (ExpandedObjectHeader *eohptr); + typedef void (*EOM_flatten_into_method) (ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + + /* Struct of function pointers for an expanded object's methods */ + typedef struct ExpandedObjectMethods + { + EOM_get_flat_size_method get_flat_size; + EOM_flatten_into_method flatten_into; + } ExpandedObjectMethods; + + /* + * Every expanded object must contain this header; typically the header + * is embedded in some larger struct that adds type-specific fields. + * + * It is presumed that the header object and all subsidiary data are stored + * in eoh_context, so that the object can be freed by deleting that context, + * or its storage lifespan can be altered by reparenting the context. + * (In principle the object could own additional resources, such as malloc'd + * storage, and use a memory context reset callback to free them upon reset or + * deletion of eoh_context.) + * + * We set up two TOAST pointers within the standard header, one read-write + * and one read-only. This allows functions to return either kind of pointer + * without making an additional allocation, and in particular without worrying + * whether a separately palloc'd object would have sufficient lifespan. + * But note that these pointers are just a convenience; a pointer object + * appearing somewhere else would still be legal. + * + * The typedef declaration for this appears in postgres.h. 
+ */ + struct ExpandedObjectHeader + { + /* Phony varlena header */ + int32 vl_len_; /* always EOH_HEADER_MAGIC, see below */ + + /* Pointer to methods required for object type */ + const ExpandedObjectMethods *eoh_methods; + + /* Memory context containing this header and subsidiary data */ + MemoryContext eoh_context; + + /* Standard R/W TOAST pointer for this object is kept here */ + char eoh_rw_ptr[EXPANDED_POINTER_SIZE]; + + /* Standard R/O TOAST pointer for this object is kept here */ + char eoh_ro_ptr[EXPANDED_POINTER_SIZE]; + }; + + /* + * Particularly for read-only functions, it is handy to be able to work with + * either regular "flat" varlena inputs or expanded inputs of the same data + * type. To allow determining which case an argument-fetching function has + * returned, the first int32 of an ExpandedObjectHeader always contains -1 + * (EOH_HEADER_MAGIC to the code). This works since no 4-byte-header varlena + * could have that as its first 4 bytes. Caution: we could not reliably tell + * the difference between an ExpandedObjectHeader and a short-header object + * with this trick. However, it works fine if the argument fetching code + * always returns either a 4-byte-header flat object or an expanded object. + */ + #define EOH_HEADER_MAGIC (-1) + #define VARATT_IS_EXPANDED_HEADER(PTR) \ + (((ExpandedObjectHeader *) (PTR))->vl_len_ == EOH_HEADER_MAGIC) + + /* + * Generic support functions for expanded objects. + * (More of these might be worth inlining later.) 
+ */ + + #define EOHPGetRWDatum(eohptr) PointerGetDatum((eohptr)->eoh_rw_ptr) + #define EOHPGetRODatum(eohptr) PointerGetDatum((eohptr)->eoh_ro_ptr) + + extern ExpandedObjectHeader *DatumGetEOHP(Datum d); + extern void EOH_init_header(ExpandedObjectHeader *eohptr, + const ExpandedObjectMethods *methods, + MemoryContext obj_context); + extern Size EOH_get_flat_size(ExpandedObjectHeader *eohptr); + extern void EOH_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + extern bool DatumIsReadWriteExpandedObject(Datum d, bool isnull, int16 typlen); + extern Datum MakeExpandedObjectReadOnly(Datum d, bool isnull, int16 typlen); + extern Datum TransferExpandedObject(Datum d, MemoryContext new_parent); + extern void DeleteExpandedObject(Datum d); + + #endif /* EXPANDEDDATUM_H */ diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c index f364ce4..d021145 100644 *** a/src/pl/plpgsql/src/pl_comp.c --- b/src/pl/plpgsql/src/pl_comp.c *************** build_datatype(HeapTuple typeTup, int32 *** 2202,2207 **** --- 2202,2223 ---- typ->typbyval = typeStruct->typbyval; typ->typrelid = typeStruct->typrelid; typ->typioparam = getTypeIOParam(typeTup); + /* Detect if type is true array, or domain thereof */ + /* NB: this is only used to decide whether to apply expand_array */ + if (typeStruct->typtype == TYPTYPE_BASE) + { + /* this test should match what get_element_type() checks */ + typ->typisarray = (typeStruct->typlen == -1 && + OidIsValid(typeStruct->typelem)); + } + else if (typeStruct->typtype == TYPTYPE_DOMAIN) + { + /* we can short-circuit looking up base types if it's not varlena */ + typ->typisarray = (typeStruct->typlen == -1 && + OidIsValid(get_base_element_type(typeStruct->typbasetype))); + } + else + typ->typisarray = false; typ->collation = typeStruct->typcollation; if (OidIsValid(collation) && OidIsValid(typ->collation)) typ->collation = collation; diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c 
index b7e3bc4..7d98ce1 100644 *** a/src/pl/plpgsql/src/pl_exec.c --- b/src/pl/plpgsql/src/pl_exec.c *************** static void exec_assign_value(PLpgSQL_ex *** 171,176 **** --- 171,177 ---- Datum value, Oid valtype, bool *isNull); static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, + bool getrwpointer, Oid *typeid, int32 *typetypmod, Datum *value, *************** plpgsql_exec_function(PLpgSQL_function * *** 295,300 **** --- 296,339 ---- var->value = fcinfo->arg[i]; var->isnull = fcinfo->argnull[i]; var->freeval = false; + + /* + * Force any array-valued parameter to be stored in + * expanded form in our local variable, in hopes of + * improving efficiency of uses of the variable. (This is + * a hack, really: why only arrays? Need more thought + * about which cases are likely to win. See also + * typisarray-specific heuristic in exec_assign_value.) + * + * Special cases: If passed a R/W expanded pointer, assume + * we can commandeer the object rather than having to copy + * it. If passed a R/O expanded pointer, just keep it as + * the value of the variable for the moment. (We'll force + * it to R/W if the variable gets modified, but that may + * very well never happen.) + */ + if (!var->isnull && var->datatype->typisarray) + { + if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(var->value))) + { + /* take ownership of R/W object */ + var->value = TransferExpandedObject(var->value, + CurrentMemoryContext); + var->freeval = true; + } + else if (VARATT_IS_EXTERNAL_EXPANDED_RO(DatumGetPointer(var->value))) + { + /* R/O pointer, keep it as-is until assigned to */ + } + else + { + /* flat array, so force to expanded form */ + var->value = expand_array(var->value, + CurrentMemoryContext, + 0, 0, 0); + var->freeval = true; + } + } } break; *************** plpgsql_exec_function(PLpgSQL_function * *** 461,478 **** /* * If the function's return type isn't by value, copy the value ! * into upper executor memory context. 
*/ if (!fcinfo->isnull && !func->fn_retbyval) ! { ! Size len; ! void *tmp; ! ! len = datumGetSize(estate.retval, false, func->fn_rettyplen); ! tmp = SPI_palloc(len); ! memcpy(tmp, DatumGetPointer(estate.retval), len); ! estate.retval = PointerGetDatum(tmp); ! } } } --- 500,513 ---- /* * If the function's return type isn't by value, copy the value ! * into upper executor memory context. However, if we have a R/W ! * expanded datum, we can just transfer its ownership out to the ! * upper executor context. */ if (!fcinfo->isnull && !func->fn_retbyval) ! estate.retval = SPI_datumTransfer(estate.retval, ! false, ! func->fn_rettyplen); } } *************** exec_stmt_return(PLpgSQL_execstate *esta *** 2455,2460 **** --- 2490,2502 ---- * Special case path when the RETURN expression is a simple variable * reference; in particular, this path is always taken in functions with * one or more OUT parameters. + * + * This special case is especially efficient for returning variables that + * have R/W expanded values: we can put the R/W pointer directly into + * estate->retval, leading to transferring the value to the caller's + * context cheaply. If we went through exec_eval_expr we'd end up with a + * R/O pointer. It's okay to skip MakeExpandedObjectReadOnly here since + * we know we won't need the variable's value within the function anymore. */ if (stmt->retvarno >= 0) { *************** exec_stmt_return_next(PLpgSQL_execstate *** 2580,2585 **** --- 2622,2632 ---- * Special case path when the RETURN NEXT expression is a simple variable * reference; in particular, this path is always taken in functions with * one or more OUT parameters. + * + * Unlike exec_statement_return, there's no special win here for R/W + * expanded values, since they'll have to get flattened to go into the + * tuplestore. Indeed, we'd better make them R/O to avoid any risk of the + * casting step changing them in-place. 
*/ if (stmt->retvarno >= 0) { *************** exec_stmt_return_next(PLpgSQL_execstate *** 2598,2603 **** --- 2645,2655 ---- (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("wrong result type supplied in RETURN NEXT"))); + /* let's be very paranoid about the cast step */ + retval = MakeExpandedObjectReadOnly(retval, + isNull, + var->datatype->typlen); + /* coerce type if needed */ retval = exec_simple_cast_value(estate, retval, *************** exec_assign_value(PLpgSQL_execstate *est *** 4061,4086 **** /* * If type is by-reference, copy the new value (which is * probably in the eval_econtext) into the procedure's memory ! * context. */ if (!var->datatype->typbyval && !*isNull) ! newvalue = datumCopy(newvalue, ! false, ! var->datatype->typlen); /* ! * Now free the old value. (We can't do this any earlier ! * because of the possibility that we are assigning the var's ! * old value to it, eg "foo := foo". We could optimize out ! * the assignment altogether in such cases, but it's too ! * infrequent to be worth testing for.) */ ! free_var(var); var->value = newvalue; var->isnull = *isNull; ! if (!var->datatype->typbyval && !*isNull) ! var->freeval = true; break; } --- 4113,4162 ---- /* * If type is by-reference, copy the new value (which is * probably in the eval_econtext) into the procedure's memory ! * context. But if it's a read/write reference to an expanded ! * object, no physical copy needs to happen; at most we need ! * to reparent the object's memory context. ! * ! * If it's an array, we force the value to be stored in R/W ! * expanded form. This wins if the function later does, say, ! * a lot of array subscripting operations on the variable, and ! * otherwise might lose. We might need to use a different ! * heuristic, but it's too soon to tell. Also, are there ! * cases where it'd be useful to force non-array values into ! * expanded form? */ if (!var->datatype->typbyval && !*isNull) ! { ! if (var->datatype->typisarray && ! 
!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(newvalue))) ! { ! /* array and not already R/W, so apply expand_array */ ! newvalue = expand_array(newvalue, CurrentMemoryContext, ! 0, 0, 0); ! } ! else ! { ! /* else transfer value if R/W, else just datumCopy */ ! newvalue = datumTransfer(newvalue, ! false, ! var->datatype->typlen); ! } ! } /* ! * Now free the old value, unless it's the same as the new ! * value (ie, we're doing "foo := foo"). Note that for ! * expanded objects, this test is necessary and cannot ! * reliably be made any earlier; we have to be looking at the ! * object's standard R/W pointer to be sure pointer equality ! * is meaningful. */ ! if (var->value != newvalue || var->isnull || *isNull) ! free_var(var); var->value = newvalue; var->isnull = *isNull; ! var->freeval = (!var->datatype->typbyval && !*isNull); break; } *************** exec_assign_value(PLpgSQL_execstate *est *** 4277,4283 **** } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM); /* Fetch current value of array datum */ ! exec_eval_datum(estate, target, &parenttypoid, &parenttypmod, &oldarraydatum, &oldarrayisnull); --- 4353,4359 ---- } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM); /* Fetch current value of array datum */ ! exec_eval_datum(estate, target, true, &parenttypoid, &parenttypmod, &oldarraydatum, &oldarrayisnull); *************** exec_assign_value(PLpgSQL_execstate *est *** 4423,4438 **** * * The type oid, typmod, value in Datum format, and null flag are returned. * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: caller must not modify the returned value, since it points right ! * at the stored value in the case of pass-by-reference datatypes. In some ! * cases we have to palloc a return value, and in such cases we put it into ! * the estate's short-term memory context. 
*/ static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, Oid *typeid, int32 *typetypmod, Datum *value, --- 4499,4522 ---- * * The type oid, typmod, value in Datum format, and null flag are returned. * + * If getrwpointer is TRUE, we'll return a R/W pointer to any variable that is + * an expanded object; otherwise we return a R/O pointer to such variables. + * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: the returned Datum points right at the stored value in the case of ! * pass-by-reference datatypes. Generally callers should take care not to ! * modify the stored value. Some callers intentionally manipulate variables ! * referenced by R/W expanded pointers, though; it is those callers' ! * responsibility that the results are semantically OK. ! * ! * In some cases we have to palloc a return value, and in such cases we put ! * it into the estate's short-term memory context. */ static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, + bool getrwpointer, Oid *typeid, int32 *typetypmod, Datum *value, *************** exec_eval_datum(PLpgSQL_execstate *estat *** 4448,4454 **** *typeid = var->datatype->typoid; *typetypmod = var->datatype->atttypmod; ! *value = var->value; *isnull = var->isnull; break; } --- 4532,4543 ---- *typeid = var->datatype->typoid; *typetypmod = var->datatype->atttypmod; ! if (getrwpointer) ! *value = var->value; ! else ! *value = MakeExpandedObjectReadOnly(var->value, ! var->isnull, ! var->datatype->typlen); *isnull = var->isnull; break; } *************** setup_param_list(PLpgSQL_execstate *esta *** 5284,5290 **** PLpgSQL_var *var = (PLpgSQL_var *) datum; ParamExternData *prm = &paramLI->params[dno]; ! prm->value = var->value; prm->isnull = var->isnull; prm->pflags = PARAM_FLAG_CONST; prm->ptype = var->datatype->typoid; --- 5373,5381 ---- PLpgSQL_var *var = (PLpgSQL_var *) datum; ParamExternData *prm = &paramLI->params[dno]; !
prm->value = MakeExpandedObjectReadOnly(var->value, ! var->isnull, ! var->datatype->typlen); prm->isnull = var->isnull; prm->pflags = PARAM_FLAG_CONST; prm->ptype = var->datatype->typoid; *************** plpgsql_param_fetch(ParamListInfo params *** 5350,5356 **** /* OK, evaluate the value and store into the appropriate paramlist slot */ datum = estate->datums[dno]; prm = &params->params[dno]; ! exec_eval_datum(estate, datum, &prm->ptype, &prmtypmod, &prm->value, &prm->isnull); } --- 5441,5447 ---- /* OK, evaluate the value and store into the appropriate paramlist slot */ datum = estate->datums[dno]; prm = &params->params[dno]; ! exec_eval_datum(estate, datum, false, &prm->ptype, &prmtypmod, &prm->value, &prm->isnull); } *************** make_tuple_from_row(PLpgSQL_execstate *e *** 5542,5548 **** if (row->varnos[i] < 0) /* should not happen */ elog(ERROR, "dropped rowtype entry for non-dropped column"); ! exec_eval_datum(estate, estate->datums[row->varnos[i]], &fieldtypeid, &fieldtypmod, &dvalues[i], &nulls[i]); if (fieldtypeid != tupdesc->attrs[i]->atttypid) --- 5633,5639 ---- if (row->varnos[i] < 0) /* should not happen */ elog(ERROR, "dropped rowtype entry for non-dropped column"); ! exec_eval_datum(estate, estate->datums[row->varnos[i]], false, &fieldtypeid, &fieldtypmod, &dvalues[i], &nulls[i]); if (fieldtypeid != tupdesc->attrs[i]->atttypid) *************** free_var(PLpgSQL_var *var) *** 6335,6341 **** { if (var->freeval) { ! pfree(DatumGetPointer(var->value)); var->freeval = false; } } --- 6426,6437 ---- { if (var->freeval) { ! if (DatumIsReadWriteExpandedObject(var->value, ! var->isnull, ! var->datatype->typlen)) ! DeleteExpandedObject(var->value); ! else ! pfree(DatumGetPointer(var->value)); var->freeval = false; } } *************** format_expr_params(PLpgSQL_execstate *es *** 6542,6549 **** curvar = (PLpgSQL_var *) estate->datums[dno]; ! exec_eval_datum(estate, (PLpgSQL_datum *) curvar, &paramtypeid, !
&paramtypmod, &paramdatum, &paramisnull); appendStringInfo(&paramstr, "%s%s = ", paramno > 0 ? ", " : "", --- 6638,6646 ---- curvar = (PLpgSQL_var *) estate->datums[dno]; ! exec_eval_datum(estate, (PLpgSQL_datum *) curvar, false, ! &paramtypeid, &paramtypmod, ! &paramdatum, &paramisnull); appendStringInfo(&paramstr, "%s%s = ", paramno > 0 ? ", " : "", diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h index 00f2f77..c95087c 100644 *** a/src/pl/plpgsql/src/plpgsql.h --- b/src/pl/plpgsql/src/plpgsql.h *************** typedef struct *** 180,185 **** --- 180,186 ---- bool typbyval; Oid typrelid; Oid typioparam; + bool typisarray; /* is "true" array, or domain over one */ Oid collation; /* from pg_type, but can be overridden */ FmgrInfo typinput; /* lookup info for typinput function */ int32 atttypmod; /* typmod (taken from someplace else) */
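For readers skimming the patch, the two-step flattening protocol from expandeddatum.h (ask for the flat size, then flatten into caller-supplied space) and the magic-header test can be sketched outside the backend roughly as follows. This is only an illustration under stated assumptions: every Demo*/demo_* name is hypothetical, malloc stands in for palloc, and memory contexts and the embedded R/W and R/O TOAST pointers are left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical stand-ins, not PostgreSQL definitions. */
#define DEMO_HEADER_MAGIC (-1)

typedef struct DemoObjectHeader DemoObjectHeader;

typedef struct DemoObjectMethods
{
    size_t (*get_flat_size) (DemoObjectHeader *hdr);
    void   (*flatten_into) (DemoObjectHeader *hdr, void *result,
                            size_t allocated_size);
} DemoObjectMethods;

struct DemoObjectHeader
{
    int32_t magic;                      /* always DEMO_HEADER_MAGIC */
    const DemoObjectMethods *methods;   /* type-specific methods */
};

/* Expanded form of an int array: header plus separately allocated elements */
typedef struct DemoExpandedInts
{
    DemoObjectHeader hdr;               /* must be the first member */
    int32_t  nelems;
    int32_t *elems;
} DemoExpandedInts;

/* Flat form: int32 total byte length, int32 element count, then elements */
static size_t
demo_ints_get_flat_size(DemoObjectHeader *hdr)
{
    DemoExpandedInts *obj = (DemoExpandedInts *) hdr;

    return (2 + (size_t) obj->nelems) * sizeof(int32_t);
}

static void
demo_ints_flatten_into(DemoObjectHeader *hdr, void *result,
                       size_t allocated_size)
{
    DemoExpandedInts *obj = (DemoExpandedInts *) hdr;
    int32_t total = (int32_t) demo_ints_get_flat_size(hdr);
    char   *dst = (char *) result;

    /* the size is passed only for cross-checking */
    assert(allocated_size == (size_t) total);
    memcpy(dst, &total, sizeof(int32_t));
    memcpy(dst + sizeof(int32_t), &obj->nelems, sizeof(int32_t));
    if (obj->nelems > 0)
        memcpy(dst + 2 * sizeof(int32_t), obj->elems,
               (size_t) obj->nelems * sizeof(int32_t));
}

static const DemoObjectMethods demo_ints_methods = {
    demo_ints_get_flat_size,
    demo_ints_flatten_into
};

/* Build an expanded object from a plain C array; returns NULL on OOM */
DemoExpandedInts *
demo_make_expanded(const int32_t *vals, int32_t n)
{
    DemoExpandedInts *obj = malloc(sizeof(DemoExpandedInts));

    if (obj == NULL)
        return NULL;
    obj->hdr.magic = DEMO_HEADER_MAGIC;
    obj->hdr.methods = &demo_ints_methods;
    obj->nelems = n;
    obj->elems = NULL;
    if (n > 0)
    {
        obj->elems = malloc((size_t) n * sizeof(int32_t));
        if (obj->elems == NULL)
        {
            free(obj);
            return NULL;
        }
        memcpy(obj->elems, vals, (size_t) n * sizeof(int32_t));
    }
    return obj;
}

/* Type-independent caller: compute size, allocate, then flatten in place */
void *
demo_flatten(DemoObjectHeader *hdr, size_t *size_out)
{
    size_t  sz = hdr->methods->get_flat_size(hdr);
    void   *buf = malloc(sz);

    if (buf == NULL)
        return NULL;
    hdr->methods->flatten_into(hdr, buf, sz);
    *size_out = sz;
    return buf;
}

/* A flat value starts with a non-negative length, so -1 marks a header */
int
demo_is_expanded_header(const void *ptr)
{
    int32_t first;

    memcpy(&first, ptr, sizeof(first));
    return first == DEMO_HEADER_MAGIC;
}
```

A caller that only knows the generic header can flatten any such object with demo_flatten(), and a reader that may receive either form can branch on demo_is_expanded_header(). In the patch the same roles are played by EOH_get_flat_size, EOH_flatten_into and VARATT_IS_EXPANDED_HEADER.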
I wrote:
> Here's an 0.4 version, in which I've written some user docs, refactored
> the array-specific code into a more reasonable arrangement, and adjusted
> a lot of the built-in array functions to support expanded arrays directly.
> This is about as far as I feel a need to take the latter activity, at
> least for now; there are a few remaining operations that might be worth
> converting but it's not clear they'd really offer much benefit.

Attached is an updated version. Aside from rebasing over some recent commits that touched the same areas, this improves one more case, which is plpgsql arrays with typmods. I noticed that while

    create or replace function arraysetnum(n int) returns numeric[] as $$
    declare res numeric[] := '{}';
    begin
      for i in 1 .. n loop
        res[i] := i;
      end loop;
      return res;
    end
    $$ language plpgsql strict;

was nicely speedy, performance went back in the toilet again as soon as you stuck a typmod onto the array, for example

    declare res numeric(20,0)[] := '{}';

The reason is that exec_cast_value would then insist on feeding the whole array through I/O conversion to apply the typmod :-(. This was in fact completely useless activity because we had already carefully applied the typmod to the new array element; but exec_cast_value didn't know that.

In the attached patch, I've dealt with this by teaching exec_eval_expr, exec_assign_value, exec_cast_value, etc to track typmods not just type OIDs. In this way we avoid a useless conversion whenever a value is known to match the desired typmod already. This is probably something that should've been done to plpgsql a very long time ago; the overhead is really minimal and the potential savings when dealing with length-constrained variables is significant.
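To illustrate the idea of tracking typmods alongside type OIDs, here is a rough standalone sketch of the kind of check that lets a cast step return early. It is not the patch's code: DemoValue, demo_cast_value and demo_cast_calls are hypothetical names, the exact condition is an assumption, and the only PostgreSQL convention relied on is that a typmod of -1 means "no constraint".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not plpgsql's real types. */
typedef uint32_t DemoOid;

typedef struct DemoValue
{
    DemoOid typoid;     /* type OID the value is known to have */
    int32_t typmod;     /* typmod the value is known to satisfy, or -1 */
    long    payload;    /* placeholder for the actual datum */
} DemoValue;

/* Counts how often the (pretend) expensive conversion path is taken */
int demo_cast_calls = 0;

/*
 * Return v unchanged when it already matches the requested type and typmod;
 * otherwise take the conversion path.  Assumed rule: a requested typmod of
 * -1 imposes no constraint, so only the type OID has to match.
 */
DemoValue
demo_cast_value(DemoValue v, DemoOid reqtype, int32_t reqtypmod)
{
    if (v.typoid == reqtype &&
        (reqtypmod == -1 || v.typmod == reqtypmod))
        return v;               /* nothing to do, skip the conversion */

    /* stand-in for an I/O conversion pass over the whole value */
    demo_cast_calls++;
    v.typoid = reqtype;
    v.typmod = reqtypmod;
    return v;
}
```

With only the type OID tracked, the second comparison is impossible and every assignment to a typmod-constrained variable has to take the conversion path; carrying the typmod with the value is what makes the early return possible.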
regards, tom lane diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml index d8c5287..e5b7b4b 100644 *** a/doc/src/sgml/storage.sgml --- b/doc/src/sgml/storage.sgml *************** comparison table, in which all the HTML *** 503,510 **** <acronym>TOAST</> pointers can point to data that is not on disk, but is elsewhere in the memory of the current server process. Such pointers obviously cannot be long-lived, but they are nonetheless useful. There ! is currently just one sub-case: ! pointers to <firstterm>indirect</> data. </para> <para> --- 503,511 ---- <acronym>TOAST</> pointers can point to data that is not on disk, but is elsewhere in the memory of the current server process. Such pointers obviously cannot be long-lived, but they are nonetheless useful. There ! are currently two sub-cases: ! pointers to <firstterm>indirect</> data and ! pointers to <firstterm>expanded</> data. </para> <para> *************** and there is no infrastructure to help w *** 519,524 **** --- 520,562 ---- </para> <para> + Expanded <acronym>TOAST</> pointers are useful for complex data types + whose on-disk representation is not especially suited for computational + purposes. As an example, the standard varlena representation of a + <productname>PostgreSQL</> array includes dimensionality information, a + nulls bitmap if there are any null elements, then the values of all the + elements in order. When the element type itself is variable-length, the + only way to find the <replaceable>N</>'th element is to scan through all the + preceding elements. This representation is appropriate for on-disk storage + because of its compactness, but for computations with the array it's much + nicer to have an <quote>expanded</> or <quote>deconstructed</> + representation in which all the element starting locations have been + identified. 
The <acronym>TOAST</> pointer mechanism supports this need by + allowing a pass-by-reference Datum to point to either a standard varlena + value (the on-disk representation) or a <acronym>TOAST</> pointer that + points to an expanded representation somewhere in memory. The details of + this expanded representation are up to the data type, though it must have + a standard header and meet the other API requirements given + in <filename>src/include/utils/expandeddatum.h</>. C-level functions + working with the data type can choose to handle either representation. + Functions that do not know about the expanded representation, but simply + apply <function>PG_DETOAST_DATUM</> to their inputs, will automatically + receive the traditional varlena representation; so support for an expanded + representation can be introduced incrementally, one function at a time. + </para> + + <para> + <acronym>TOAST</> pointers to expanded values are further broken down + into <firstterm>read-write</> and <firstterm>read-only</> pointers. + The pointed-to representation is the same either way, but a function that + receives a read-write pointer is allowed to modify the referenced value + in-place, whereas one that receives a read-only pointer must not; it must + first create a copy if it wants to make a modified version of the value. + This distinction and some associated conventions make it possible to avoid + unnecessary copying of expanded values during query execution. + </para> + + <para> For all types of in-memory <acronym>TOAST</> pointer, the <acronym>TOAST</> management code ensures that no such pointer datum can accidentally get stored on disk. 
In-memory <acronym>TOAST</> pointers are automatically diff --git a/doc/src/sgml/xtypes.sgml b/doc/src/sgml/xtypes.sgml index 2459616..ac0b8a2 100644 *** a/doc/src/sgml/xtypes.sgml --- b/doc/src/sgml/xtypes.sgml *************** CREATE TYPE complex ( *** 300,305 **** --- 300,376 ---- </para> </note> + <para> + Another feature that's enabled by <acronym>TOAST</> support is the + possibility of having an <firstterm>expanded</> in-memory data + representation that is more convenient to work with than the format that + is stored on disk. The regular or <quote>flat</> varlena storage format + is ultimately just a blob of bytes; it cannot for example contain + pointers, since it may get copied to other locations in memory. + For complex data types, the flat format may be quite expensive to work + with, so <productname>PostgreSQL</> provides a way to <quote>expand</> + the flat format into a representation that is more suited to computation, + and then pass that format in-memory between functions of the data type. + </para> + + <para> + To use expanded storage, a data type must define an expanded format that + follows the rules given in <filename>src/include/utils/expandeddatum.h</>, + and provide functions to <quote>expand</> a flat varlena value into + expanded format and <quote>flatten</> the expanded format back to the + regular varlena representation. Then ensure that all C functions for + the data type can accept either representation, possibly by converting + one into the other immediately upon receipt. This does not require fixing + all existing functions for the data type at once, because the standard + <function>PG_DETOAST_DATUM</> macro is defined to convert expanded inputs + into regular flat format. Therefore, existing functions that work with + the flat varlena format will continue to work, though slightly + inefficiently, with expanded inputs; they need not be converted until and + unless better performance is important. 
+ </para> + + <para> + C functions that know how to work with an expanded representation + typically fall into two categories: those that can only handle expanded + format, and those that can handle either expanded or flat varlena inputs. + The former are easier to write but may be less efficient overall, because + converting a flat input to expanded form for use by a single function may + cost more than is saved by operating on the expanded format. + When only expanded format need be handled, conversion of flat inputs to + expanded form can be hidden inside an argument-fetching macro, so that + the function appears no more complex than one working with traditional + varlena input. + To handle both types of input, write an argument-fetching function that + will detoast external, short-header, and compressed varlena inputs, but + not expanded inputs. Such a function can be defined as returning a + pointer to a union of the flat varlena format and the expanded format. + Callers can use the <function>VARATT_IS_EXPANDED_HEADER()</> macro to + determine which format they received. + </para> + + <para> + The <acronym>TOAST</> infrastructure not only allows regular varlena + values to be distinguished from expanded values, but also + distinguishes <quote>read-write</> and <quote>read-only</> pointers to + expanded values. C functions that only need to examine an expanded + value, or will only change it in safe and non-semantically-visible ways, + need not care which type of pointer they receive. C functions that + produce a modified version of an input value are allowed to modify an + expanded input value in-place if they receive a read-write pointer, but + must not modify the input if they receive a read-only pointer; in that + case they have to copy the value first, producing a new value to modify. + A C function that has constructed a new expanded value should always + return a read-write pointer to it. 
Also, a C function that is modifying + a read-write expanded value in-place should take care to leave the value + in a sane state if it fails partway through. + </para> + + <para> + For examples of working with expanded values, see the standard array + infrastructure, particularly + <filename>src/backend/utils/adt/array_expanded.c</>. + </para> + </sect2> </sect1> diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c index 6cd4e8e..de7f02f 100644 *** a/src/backend/access/common/heaptuple.c --- b/src/backend/access/common/heaptuple.c *************** *** 60,65 **** --- 60,66 ---- #include "access/sysattr.h" #include "access/tuptoaster.h" #include "executor/tuptable.h" + #include "utils/expandeddatum.h" /* Does att's datatype allow packing into the 1-byte-header varlena format? */ *************** heap_compute_data_size(TupleDesc tupleDe *** 93,105 **** for (i = 0; i < numberOfAttributes; i++) { Datum val; if (isnull[i]) continue; val = values[i]; ! if (ATT_IS_PACKABLE(att[i]) && VARATT_CAN_MAKE_SHORT(DatumGetPointer(val))) { /* --- 94,108 ---- for (i = 0; i < numberOfAttributes; i++) { Datum val; + Form_pg_attribute atti; if (isnull[i]) continue; val = values[i]; + atti = att[i]; ! if (ATT_IS_PACKABLE(atti) && VARATT_CAN_MAKE_SHORT(DatumGetPointer(val))) { /* *************** heap_compute_data_size(TupleDesc tupleDe *** 108,118 **** */ data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val)); } else { ! data_length = att_align_datum(data_length, att[i]->attalign, ! att[i]->attlen, val); ! 
data_length = att_addlength_datum(data_length, att[i]->attlen, val); } } --- 111,131 ---- */ data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val)); } + else if (atti->attlen == -1 && + VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(val))) + { + /* + * we want to flatten the expanded value so that the constructed + * tuple doesn't depend on it + */ + data_length = att_align_nominal(data_length, atti->attalign); + data_length += EOH_get_flat_size(DatumGetEOHP(val)); + } else { ! data_length = att_align_datum(data_length, atti->attalign, ! atti->attlen, val); ! data_length = att_addlength_datum(data_length, atti->attlen, val); } } *************** heap_fill_tuple(TupleDesc tupleDesc, *** 203,212 **** *infomask |= HEAP_HASVARWIDTH; if (VARATT_IS_EXTERNAL(val)) { ! *infomask |= HEAP_HASEXTERNAL; ! /* no alignment, since it's short by definition */ ! data_length = VARSIZE_EXTERNAL(val); ! memcpy(data, val, data_length); } else if (VARATT_IS_SHORT(val)) { --- 216,241 ---- *infomask |= HEAP_HASVARWIDTH; if (VARATT_IS_EXTERNAL(val)) { ! if (VARATT_IS_EXTERNAL_EXPANDED(val)) ! { ! /* ! * we want to flatten the expanded value so that the ! * constructed tuple doesn't depend on it ! */ ! ExpandedObjectHeader *eoh = DatumGetEOHP(values[i]); ! ! data = (char *) att_align_nominal(data, ! att[i]->attalign); ! data_length = EOH_get_flat_size(eoh); ! EOH_flatten_into(eoh, data, data_length); ! } ! else ! { ! *infomask |= HEAP_HASEXTERNAL; ! /* no alignment, since it's short by definition */ ! data_length = VARSIZE_EXTERNAL(val); ! memcpy(data, val, data_length); ! 
} } else if (VARATT_IS_SHORT(val)) { diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c index 8464e87..c3ebbef 100644 *** a/src/backend/access/heap/tuptoaster.c --- b/src/backend/access/heap/tuptoaster.c *************** *** 37,42 **** --- 37,43 ---- #include "catalog/catalog.h" #include "common/pg_lzcompress.h" #include "miscadmin.h" + #include "utils/expandeddatum.h" #include "utils/fmgroids.h" #include "utils/rel.h" #include "utils/typcache.h" *************** heap_tuple_fetch_attr(struct varlena * a *** 130,135 **** --- 131,149 ---- result = (struct varlena *) palloc(VARSIZE_ANY(attr)); memcpy(result, attr, VARSIZE_ANY(attr)); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* + * This is an expanded-object pointer --- get flat format + */ + ExpandedObjectHeader *eoh; + Size resultsize; + + eoh = DatumGetEOHP(PointerGetDatum(attr)); + resultsize = EOH_get_flat_size(eoh); + result = (struct varlena *) palloc(resultsize); + EOH_flatten_into(eoh, (void *) result, resultsize); + } else { /* *************** heap_tuple_untoast_attr(struct varlena * *** 196,201 **** --- 210,224 ---- attr = result; } } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* + * This is an expanded-object pointer --- get flat format + */ + attr = heap_tuple_fetch_attr(attr); + /* flatteners are not allowed to produce compressed/short output */ + Assert(!VARATT_IS_EXTENDED(attr)); + } else if (VARATT_IS_COMPRESSED(attr)) { /* *************** heap_tuple_untoast_attr_slice(struct var *** 263,268 **** --- 286,296 ---- return heap_tuple_untoast_attr_slice(redirect.pointer, sliceoffset, slicelength); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* pass it off to heap_tuple_fetch_attr to flatten */ + preslice = heap_tuple_fetch_attr(attr); + } else preslice = attr; *************** toast_raw_datum_size(Datum value) *** 344,349 **** --- 372,381 ---- return toast_raw_datum_size(PointerGetDatum(toast_pointer.pointer)); } + else if 
(VARATT_IS_EXTERNAL_EXPANDED(attr))
+ 	{
+ 		result = EOH_get_flat_size(DatumGetEOHP(value));
+ 	}
  	else if (VARATT_IS_COMPRESSED(attr))
  	{
  		/* here, va_rawsize is just the payload size */
*************** toast_datum_size(Datum value)
*** 400,405 ****
--- 432,441 ----

  		return toast_datum_size(PointerGetDatum(toast_pointer.pointer));
  	}
+ 	else if (VARATT_IS_EXTERNAL_EXPANDED(attr))
+ 	{
+ 		result = EOH_get_flat_size(DatumGetEOHP(value));
+ 	}
  	else if (VARATT_IS_SHORT(attr))
  	{
  		result = VARSIZE_SHORT(attr);
diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index fec76d4..7bdc201 100644
*** a/src/backend/executor/execQual.c
--- b/src/backend/executor/execQual.c
*************** ExecEvalArrayCoerceExpr(ArrayCoerceExprS
*** 4246,4252 ****
  {
  	ArrayCoerceExpr *acoerce = (ArrayCoerceExpr *) astate->xprstate.expr;
  	Datum result;
- 	ArrayType *array;
  	FunctionCallInfoData locfcinfo;

  	result = ExecEvalExpr(astate->arg, econtext, isNull, isDone);
--- 4246,4251 ----
*************** ExecEvalArrayCoerceExpr(ArrayCoerceExprS
*** 4263,4276 ****
  	if (!OidIsValid(acoerce->elemfuncid))
  	{
  		/* Detoast input array if necessary, and copy in any case */
! 		array = DatumGetArrayTypePCopy(result);
  		ARR_ELEMTYPE(array) = astate->resultelemtype;
  		PG_RETURN_ARRAYTYPE_P(array);
  	}

- 	/* Detoast input array if necessary, but don't make a useless copy */
- 	array = DatumGetArrayTypeP(result);
-
  	/* Initialize function cache if first time through */
  	if (astate->elemfunc.fn_oid == InvalidOid)
  	{
--- 4262,4273 ----
  	if (!OidIsValid(acoerce->elemfuncid))
  	{
  		/* Detoast input array if necessary, and copy in any case */
! 		ArrayType *array = DatumGetArrayTypePCopy(result);
!
  		ARR_ELEMTYPE(array) = astate->resultelemtype;
  		PG_RETURN_ARRAYTYPE_P(array);
  	}

  	/* Initialize function cache if first time through */
  	if (astate->elemfunc.fn_oid == InvalidOid)
  	{
*************** ExecEvalArrayCoerceExpr(ArrayCoerceExprS
*** 4300,4314 ****
  	 */
  	InitFunctionCallInfoData(locfcinfo, &(astate->elemfunc), 3,
  							 InvalidOid, NULL, NULL);
! 	locfcinfo.arg[0] = PointerGetDatum(array);
  	locfcinfo.arg[1] = Int32GetDatum(acoerce->resulttypmod);
  	locfcinfo.arg[2] = BoolGetDatum(acoerce->isExplicit);
  	locfcinfo.argnull[0] = false;
  	locfcinfo.argnull[1] = false;
  	locfcinfo.argnull[2] = false;

! 	return array_map(&locfcinfo, ARR_ELEMTYPE(array), astate->resultelemtype,
! 					 astate->amstate);
  }

  /* ----------------------------------------------------------------
--- 4297,4310 ----
  	 */
  	InitFunctionCallInfoData(locfcinfo, &(astate->elemfunc), 3,
  							 InvalidOid, NULL, NULL);
! 	locfcinfo.arg[0] = result;
  	locfcinfo.arg[1] = Int32GetDatum(acoerce->resulttypmod);
  	locfcinfo.arg[2] = BoolGetDatum(acoerce->isExplicit);
  	locfcinfo.argnull[0] = false;
  	locfcinfo.argnull[1] = false;
  	locfcinfo.argnull[2] = false;

! 	return array_map(&locfcinfo, astate->resultelemtype, astate->amstate);
  }

  /* ----------------------------------------------------------------
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 753754d..a05d8b1 100644
*** a/src/backend/executor/execTuples.c
--- b/src/backend/executor/execTuples.c
***************
*** 88,93 ****
--- 88,94 ----
  #include "nodes/nodeFuncs.h"
  #include "storage/bufmgr.h"
  #include "utils/builtins.h"
+ #include "utils/expandeddatum.h"
  #include "utils/lsyscache.h"
  #include "utils/typcache.h"

*************** ExecCopySlot(TupleTableSlot *dstslot, Tu
*** 812,817 ****
--- 813,864 ----
  	return ExecStoreTuple(newTuple, dstslot, InvalidBuffer, true);
  }

+ /* --------------------------------
+  * ExecMakeSlotContentsReadOnly
+  * Mark any R/W expanded datums in the slot as read-only.
+  *
+  * This is needed when a slot that might contain R/W datum references is to be
+  * used as input for general expression evaluation. Since the expression(s)
+  * might contain more than one Var referencing the same R/W datum, we could
+  * get wrong answers if functions acting on those Vars thought they could
+  * modify the expanded value in-place.
+  *
+  * For notational reasons, we return the same slot passed in.
+  * --------------------------------
+  */
+ TupleTableSlot *
+ ExecMakeSlotContentsReadOnly(TupleTableSlot *slot)
+ {
+ 	/*
+ 	 * sanity checks
+ 	 */
+ 	Assert(slot != NULL);
+ 	Assert(slot->tts_tupleDescriptor != NULL);
+ 	Assert(!slot->tts_isempty);
+
+ 	/*
+ 	 * If the slot contains a physical tuple, it can't contain any expanded
+ 	 * datums, because we flatten those when making a physical tuple. This
+ 	 * might change later; but for now, we need do nothing unless the slot is
+ 	 * virtual.
+ 	 */
+ 	if (slot->tts_tuple == NULL)
+ 	{
+ 		Form_pg_attribute *att = slot->tts_tupleDescriptor->attrs;
+ 		int attnum;
+
+ 		for (attnum = 0; attnum < slot->tts_nvalid; attnum++)
+ 		{
+ 			slot->tts_values[attnum] =
+ 				MakeExpandedObjectReadOnly(slot->tts_values[attnum],
+ 										   slot->tts_isnull[attnum],
+ 										   att[attnum]->attlen);
+ 		}
+ 	}
+
+ 	return slot;
+ }
+
  /* ----------------------------------------------------------------
   * convenience initialization routines
diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c
index 3f66e24..e5d1e54 100644
*** a/src/backend/executor/nodeSubqueryscan.c
--- b/src/backend/executor/nodeSubqueryscan.c
*************** SubqueryNext(SubqueryScanState *node)
*** 56,62 ****
--- 56,70 ----
  	 * We just return the subplan's result slot, rather than expending extra
  	 * cycles for ExecCopySlot(). (Our own ScanTupleSlot is used only for
  	 * EvalPlanQual rechecks.)
+ 	 *
+ 	 * We do need to mark the slot contents read-only to prevent interference
+ 	 * between different functions reading the same datum from the slot.
It's
+ 	 * a bit hokey to do this to the subplan's slot, but should be safe
+ 	 * enough.
  	 */
+ 	if (!TupIsNull(slot))
+ 		slot = ExecMakeSlotContentsReadOnly(slot);
+
  	return slot;
  }

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index b3c0502..54b5f2c 100644
*** a/src/backend/executor/spi.c
--- b/src/backend/executor/spi.c
*************** SPI_pfree(void *pointer)
*** 1014,1019 ****
--- 1014,1040 ----
  	pfree(pointer);
  }

+ Datum
+ SPI_datumTransfer(Datum value, bool typByVal, int typLen)
+ {
+ 	MemoryContext oldcxt = NULL;
+ 	Datum result;
+
+ 	if (_SPI_curid + 1 == _SPI_connected) /* connected */
+ 	{
+ 		if (_SPI_current != &(_SPI_stack[_SPI_curid + 1]))
+ 			elog(ERROR, "SPI stack corrupted");
+ 		oldcxt = MemoryContextSwitchTo(_SPI_current->savedcxt);
+ 	}
+
+ 	result = datumTransfer(value, typByVal, typLen);
+
+ 	if (oldcxt)
+ 		MemoryContextSwitchTo(oldcxt);
+
+ 	return result;
+ }
+
  void
  SPI_freetuple(HeapTuple tuple)
  {
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 20e5ff1..d1ed33f 100644
*** a/src/backend/utils/adt/Makefile
--- b/src/backend/utils/adt/Makefile
*************** endif
*** 16,25 ****
  endif

  # keep this list arranged alphabetically or it gets to be a mess
! OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \
! 	array_userfuncs.o arrayutils.o ascii.o bool.o \
! 	cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \
! 	encode.o enum.o float.o format_type.o formatting.o genfile.o \
  	geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \
  	int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \
  	jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \
--- 16,26 ----
  endif

  # keep this list arranged alphabetically or it gets to be a mess
! OBJS = acl.o arrayfuncs.o array_expanded.o array_selfuncs.o \
! 	array_typanalyze.o array_userfuncs.o arrayutils.o ascii.o \
! 	bool.o cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \
! 	encode.o enum.o expandeddatum.o \
!
	float.o format_type.o formatting.o genfile.o \
  	geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \
  	int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \
  	jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \
diff --git a/src/backend/utils/adt/array_expanded.c b/src/backend/utils/adt/array_expanded.c
index ...6d3b724 .
*** a/src/backend/utils/adt/array_expanded.c
--- b/src/backend/utils/adt/array_expanded.c
***************
*** 0 ****
--- 1,374 ----
+ /*-------------------------------------------------------------------------
+  *
+  * array_expanded.c
+  *	  Basic functions for manipulating expanded arrays.
+  *
+  * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *
+  * IDENTIFICATION
+  *	  src/backend/utils/adt/array_expanded.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ #include "postgres.h"
+
+ #include "access/tupmacs.h"
+ #include "utils/array.h"
+ #include "utils/lsyscache.h"
+ #include "utils/memutils.h"
+
+
+ /* "Methods" required for an expanded object */
+ static Size EA_get_flat_size(ExpandedObjectHeader *eohptr);
+ static void EA_flatten_into(ExpandedObjectHeader *eohptr,
+ 				void *result, Size allocated_size);
+
+ static const ExpandedObjectMethods EA_methods =
+ {
+ 	EA_get_flat_size,
+ 	EA_flatten_into
+ };
+
+
+ /*
+  * expand_array: convert an array Datum into an expanded array
+  *
+  * The expanded object will be a child of parentcontext.
+  *
+  * Some callers can provide cache space to avoid repeated lookups of element
+  * type data across calls; if so, pass a metacache pointer, making sure that
+  * metacache->element_type is initialized to InvalidOid before first call.
+  * If no cross-call caching is required, pass NULL for metacache.
+  */
+ Datum
+ expand_array(Datum arraydatum, MemoryContext parentcontext,
+ 			 ArrayMetaState *metacache)
+ {
+ 	ArrayType *array;
+ 	ExpandedArrayHeader *eah;
+ 	MemoryContext objcxt;
+ 	MemoryContext oldcxt;
+
+ 	/*
+ 	 * Allocate private context for expanded object. We start by assuming
+ 	 * that the array won't be very large; but if it does grow a lot, don't
+ 	 * constrain aset.c's large-context behavior.
+ 	 */
+ 	objcxt = AllocSetContextCreate(parentcontext,
+ 								   "expanded array",
+ 								   ALLOCSET_SMALL_MINSIZE,
+ 								   ALLOCSET_SMALL_INITSIZE,
+ 								   ALLOCSET_DEFAULT_MAXSIZE);
+
+ 	/* Set up expanded array header */
+ 	eah = (ExpandedArrayHeader *)
+ 		MemoryContextAlloc(objcxt, sizeof(ExpandedArrayHeader));
+
+ 	EOH_init_header(&eah->hdr, &EA_methods, objcxt);
+ 	eah->ea_magic = EA_MAGIC;
+
+ 	/*
+ 	 * Detoast and copy original array into private context, as a flat array.
+ 	 * We flatten it even if it's in expanded form; it's not clear that adding
+ 	 * a special-case path for that would be worth the trouble.
+ 	 *
+ 	 * Note that this coding risks leaking some memory in the private context
+ 	 * if we have to fetch data from a TOAST table; however, experimentation
+ 	 * says that the leak is minimal. Doing it this way saves a copy step,
+ 	 * which seems worthwhile, especially if the array is large enough to need
+ 	 * external storage.
+ 	 */
+ 	oldcxt = MemoryContextSwitchTo(objcxt);
+ 	array = DatumGetArrayTypePCopy(arraydatum);
+ 	MemoryContextSwitchTo(oldcxt);
+
+ 	eah->ndims = ARR_NDIM(array);
+ 	/* note these pointers point into the fvalue header!
*/
+ 	eah->dims = ARR_DIMS(array);
+ 	eah->lbound = ARR_LBOUND(array);
+
+ 	/* Save array's element-type data for possible use later */
+ 	eah->element_type = ARR_ELEMTYPE(array);
+ 	if (metacache && metacache->element_type == eah->element_type)
+ 	{
+ 		/* Caller provided valid cache of representational data */
+ 		eah->typlen = metacache->typlen;
+ 		eah->typbyval = metacache->typbyval;
+ 		eah->typalign = metacache->typalign;
+ 	}
+ 	else
+ 	{
+ 		/* No, so look it up */
+ 		get_typlenbyvalalign(eah->element_type,
+ 							 &eah->typlen,
+ 							 &eah->typbyval,
+ 							 &eah->typalign);
+ 		/* Update cache if provided */
+ 		if (metacache)
+ 		{
+ 			metacache->element_type = eah->element_type;
+ 			metacache->typlen = eah->typlen;
+ 			metacache->typbyval = eah->typbyval;
+ 			metacache->typalign = eah->typalign;
+ 		}
+ 	}
+
+ 	/* we don't make a deconstructed representation now */
+ 	eah->dvalues = NULL;
+ 	eah->dnulls = NULL;
+ 	eah->dvalueslen = 0;
+ 	eah->nelems = 0;
+ 	eah->flat_size = 0;
+
+ 	/* remember we have a flat representation */
+ 	eah->fvalue = array;
+ 	eah->fstartptr = ARR_DATA_PTR(array);
+ 	eah->fendptr = ((char *) array) + ARR_SIZE(array);
+
+ 	/* return a R/W pointer to the expanded array */
+ 	return EOHPGetRWDatum(&eah->hdr);
+ }
+
+ /*
+  * get_flat_size method for expanded arrays
+  */
+ static Size
+ EA_get_flat_size(ExpandedObjectHeader *eohptr)
+ {
+ 	ExpandedArrayHeader *eah = (ExpandedArrayHeader *) eohptr;
+ 	int nelems;
+ 	int ndims;
+ 	Datum *dvalues;
+ 	bool *dnulls;
+ 	Size nbytes;
+ 	int i;
+
+ 	Assert(eah->ea_magic == EA_MAGIC);
+
+ 	/* Easy if we have a valid flattened value */
+ 	if (eah->fvalue)
+ 		return ARR_SIZE(eah->fvalue);
+
+ 	/* If we have a cached size value, believe that */
+ 	if (eah->flat_size)
+ 		return eah->flat_size;
+
+ 	/*
+ 	 * Compute space needed by examining dvalues/dnulls. Note that the result
+ 	 * array will have a nulls bitmap if dnulls isn't NULL, even if the array
+ 	 * doesn't actually contain any nulls now.
+ 	 */
+ 	nelems = eah->nelems;
+ 	ndims = eah->ndims;
+ 	Assert(nelems == ArrayGetNItems(ndims, eah->dims));
+ 	dvalues = eah->dvalues;
+ 	dnulls = eah->dnulls;
+ 	nbytes = 0;
+ 	for (i = 0; i < nelems; i++)
+ 	{
+ 		if (dnulls && dnulls[i])
+ 			continue;
+ 		nbytes = att_addlength_datum(nbytes, eah->typlen, dvalues[i]);
+ 		nbytes = att_align_nominal(nbytes, eah->typalign);
+ 		/* check for overflow of total request */
+ 		if (!AllocSizeIsValid(nbytes))
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ 					 errmsg("array size exceeds the maximum allowed (%d)",
+ 							(int) MaxAllocSize)));
+ 	}
+
+ 	if (dnulls)
+ 		nbytes += ARR_OVERHEAD_WITHNULLS(ndims, nelems);
+ 	else
+ 		nbytes += ARR_OVERHEAD_NONULLS(ndims);
+
+ 	/* cache for next time */
+ 	eah->flat_size = nbytes;
+
+ 	return nbytes;
+ }
+
+ /*
+  * flatten_into method for expanded arrays
+  */
+ static void
+ EA_flatten_into(ExpandedObjectHeader *eohptr,
+ 				void *result, Size allocated_size)
+ {
+ 	ExpandedArrayHeader *eah = (ExpandedArrayHeader *) eohptr;
+ 	ArrayType *aresult = (ArrayType *) result;
+ 	int nelems;
+ 	int ndims;
+ 	int32 dataoffset;
+
+ 	Assert(eah->ea_magic == EA_MAGIC);
+
+ 	/* Easy if we have a valid flattened value */
+ 	if (eah->fvalue)
+ 	{
+ 		Assert(allocated_size == ARR_SIZE(eah->fvalue));
+ 		memcpy(result, eah->fvalue, allocated_size);
+ 		return;
+ 	}
+
+ 	/* Else allocation should match previous get_flat_size result */
+ 	Assert(allocated_size == eah->flat_size);
+
+ 	/* Fill result array from dvalues/dnulls */
+ 	nelems = eah->nelems;
+ 	ndims = eah->ndims;
+
+ 	if (eah->dnulls)
+ 		dataoffset = ARR_OVERHEAD_WITHNULLS(ndims, nelems);
+ 	else
+ 		dataoffset = 0; /* marker for no null bitmap */
+
+ 	/* We must ensure that any pad space is zero-filled */
+ 	memset(aresult, 0, allocated_size);
+
+ 	SET_VARSIZE(aresult, allocated_size);
+ 	aresult->ndim = ndims;
+ 	aresult->dataoffset = dataoffset;
+ 	aresult->elemtype = eah->element_type;
+ 	memcpy(ARR_DIMS(aresult), eah->dims, ndims * sizeof(int));
+ 	memcpy(ARR_LBOUND(aresult), eah->lbound,
ndims * sizeof(int));
+
+ 	CopyArrayEls(aresult,
+ 				 eah->dvalues, eah->dnulls, nelems,
+ 				 eah->typlen, eah->typbyval, eah->typalign,
+ 				 false);
+ }
+
+ /*
+  * Argument fetching support code
+  */
+
+ /*
+  * DatumGetExpandedArray: get a writable expanded array from an input argument
+  */
+ ExpandedArrayHeader *
+ DatumGetExpandedArray(Datum d)
+ {
+ 	ExpandedArrayHeader *eah;
+
+ 	/* If it's a writable expanded array already, just return it */
+ 	if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)))
+ 	{
+ 		eah = (ExpandedArrayHeader *) DatumGetEOHP(d);
+ 		Assert(eah->ea_magic == EA_MAGIC);
+ 		return eah;
+ 	}
+
+ 	/*
+ 	 * If it's a non-writable expanded array, copy it, extracting the element
+ 	 * representational data to save a catalog lookup.
+ 	 */
+ 	if (VARATT_IS_EXTERNAL_EXPANDED_RO(DatumGetPointer(d)))
+ 	{
+ 		ArrayMetaState fakecache;
+
+ 		eah = (ExpandedArrayHeader *) DatumGetEOHP(d);
+ 		Assert(eah->ea_magic == EA_MAGIC);
+ 		fakecache.element_type = eah->element_type;
+ 		fakecache.typlen = eah->typlen;
+ 		fakecache.typbyval = eah->typbyval;
+ 		fakecache.typalign = eah->typalign;
+ 		d = expand_array(d, CurrentMemoryContext, &fakecache);
+ 		return (ExpandedArrayHeader *) DatumGetEOHP(d);
+ 	}
+
+ 	/* Else expand the hard way */
+ 	d = expand_array(d, CurrentMemoryContext, NULL);
+ 	return (ExpandedArrayHeader *) DatumGetEOHP(d);
+ }
+
+ /*
+  * As above, when caller has the ability to cache element type info
+  */
+ ExpandedArrayHeader *
+ DatumGetExpandedArrayX(Datum d, ArrayMetaState *metacache)
+ {
+ 	ExpandedArrayHeader *eah;
+
+ 	/* If it's a writable expanded array already, just return it */
+ 	if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)))
+ 	{
+ 		eah = (ExpandedArrayHeader *) DatumGetEOHP(d);
+ 		Assert(eah->ea_magic == EA_MAGIC);
+ 		/* Update cache if provided */
+ 		if (metacache)
+ 		{
+ 			metacache->element_type = eah->element_type;
+ 			metacache->typlen = eah->typlen;
+ 			metacache->typbyval = eah->typbyval;
+ 			metacache->typalign = eah->typalign;
+ 		}
+ 		return eah;
+ 	}
+
+ 	/* Else expand
using caller's cache if any */
+ 	d = expand_array(d, CurrentMemoryContext, metacache);
+ 	return (ExpandedArrayHeader *) DatumGetEOHP(d);
+ }
+
+ /*
+  * DatumGetAnyArray: return either an expanded array or a detoasted varlena
+  * array. The result must not be modified in-place.
+  */
+ AnyArrayType *
+ DatumGetAnyArray(Datum d)
+ {
+ 	ExpandedArrayHeader *eah;
+
+ 	/*
+ 	 * If it's an expanded array (RW or RO), return the header pointer.
+ 	 */
+ 	if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(d)))
+ 	{
+ 		eah = (ExpandedArrayHeader *) DatumGetEOHP(d);
+ 		Assert(eah->ea_magic == EA_MAGIC);
+ 		return (AnyArrayType *) eah;
+ 	}
+
+ 	/* Else do regular detoasting as needed */
+ 	return (AnyArrayType *) PG_DETOAST_DATUM(d);
+ }
+
+ /*
+  * Create the Datum/isnull representation of an expanded array object
+  * if we didn't do so previously
+  */
+ void
+ deconstruct_expanded_array(ExpandedArrayHeader *eah)
+ {
+ 	if (eah->dvalues == NULL)
+ 	{
+ 		MemoryContext oldcxt = MemoryContextSwitchTo(eah->hdr.eoh_context);
+ 		Datum *dvalues;
+ 		bool *dnulls;
+ 		int nelems;
+
+ 		dnulls = NULL;
+ 		deconstruct_array(eah->fvalue,
+ 						  eah->element_type,
+ 						  eah->typlen, eah->typbyval, eah->typalign,
+ 						  &dvalues,
+ 						  ARR_HASNULL(eah->fvalue) ? &dnulls : NULL,
+ 						  &nelems);
+
+ 		/*
+ 		 * Update header only after successful completion of this step. If
+ 		 * deconstruct_array fails partway through, worst consequence is some
+ 		 * leaked memory in the object's context. If the caller fails at a
+ 		 * later point, that's fine, since the deconstructed representation is
+ 		 * valid anyhow.
+ 		 */
+ 		eah->dvalues = dvalues;
+ 		eah->dnulls = dnulls;
+ 		eah->dvalueslen = eah->nelems = nelems;
+ 		MemoryContextSwitchTo(oldcxt);
+ 	}
+ }
diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c
index 6679333..1777d0d 100644
*** a/src/backend/utils/adt/array_userfuncs.c
--- b/src/backend/utils/adt/array_userfuncs.c
***************
*** 20,41 ****
  /*
   * fetch_array_arg_replace_nulls
   *
!
 * Fetch an array-valued argument; if it's null, construct an empty array
!  * value of the proper data type. Also cache basic element type information
!  * in fn_extra.
   */
! static ArrayType *
  fetch_array_arg_replace_nulls(FunctionCallInfo fcinfo, int argno)
  {
! 	ArrayType *v;
  	Oid element_type;
  	ArrayMetaState *my_extra;

! 	/* First collect the array value */
  	if (!PG_ARGISNULL(argno))
  	{
! 		v = PG_GETARG_ARRAYTYPE_P(argno);
! 		element_type = ARR_ELEMTYPE(v);
  	}
  	else
  	{
--- 20,51 ----
  /*
   * fetch_array_arg_replace_nulls
   *
!  * Fetch an array-valued argument in expanded form; if it's null, construct an
!  * empty array value of the proper data type. Also cache basic element type
!  * information in fn_extra.
   */
! static ExpandedArrayHeader *
  fetch_array_arg_replace_nulls(FunctionCallInfo fcinfo, int argno)
  {
! 	ExpandedArrayHeader *eah;
  	Oid element_type;
  	ArrayMetaState *my_extra;

! 	/* If first time through, create datatype cache struct */
! 	my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
! 	if (my_extra == NULL)
! 	{
! 		my_extra = (ArrayMetaState *)
! 			MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
! 							   sizeof(ArrayMetaState));
! 		my_extra->element_type = InvalidOid;
! 		fcinfo->flinfo->fn_extra = my_extra;
! 	}
!
! 	/* Now collect the array value */
  	if (!PG_ARGISNULL(argno))
  	{
! 		eah = PG_GETARG_EXPANDED_ARRAYX(argno, my_extra);
  	}
  	else
  	{
*************** fetch_array_arg_replace_nulls(FunctionCa
*** 52,81 ****
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("input data type is not an array")));

! 		v = construct_empty_array(element_type);
! 	}
!
! 	/* Now cache required info, which might change from call to call */
! 	my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
! 	if (my_extra == NULL)
! 	{
! 		my_extra = (ArrayMetaState *)
! 			MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
! 							   sizeof(ArrayMetaState));
! 		my_extra->element_type = InvalidOid;
! 		fcinfo->flinfo->fn_extra = my_extra;
! 	}
!
! 	if (my_extra->element_type != element_type)
! 	{
! 		get_typlenbyvalalign(element_type,
! 							 &my_extra->typlen,
!
							 &my_extra->typbyval,
! 							 &my_extra->typalign);
! 		my_extra->element_type = element_type;
  	}

! 	return v;
  }

  /*-----------------------------------------------------------------------------
--- 62,73 ----
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("input data type is not an array")));

! 		eah = construct_empty_expanded_array(element_type,
! 											 CurrentMemoryContext,
! 											 my_extra);
  	}

! 	return eah;
  }

  /*-----------------------------------------------------------------------------
*************** fetch_array_arg_replace_nulls(FunctionCa
*** 86,114 ****
  Datum
  array_append(PG_FUNCTION_ARGS)
  {
! 	ArrayType *v;
  	Datum newelem;
  	bool isNull;
! 	ArrayType *result;
  	int *dimv,
  		*lb;
  	int indx;
  	ArrayMetaState *my_extra;

! 	v = fetch_array_arg_replace_nulls(fcinfo, 0);
  	isNull = PG_ARGISNULL(1);
  	if (isNull)
  		newelem = (Datum) 0;
  	else
  		newelem = PG_GETARG_DATUM(1);

! 	if (ARR_NDIM(v) == 1)
  	{
  		/* append newelem */
  		int ub;

! 		lb = ARR_LBOUND(v);
! 		dimv = ARR_DIMS(v);
  		ub = dimv[0] + lb[0] - 1;
  		indx = ub + 1;

--- 78,106 ----
  Datum
  array_append(PG_FUNCTION_ARGS)
  {
! 	ExpandedArrayHeader *eah;
  	Datum newelem;
  	bool isNull;
! 	Datum result;
  	int *dimv,
  		*lb;
  	int indx;
  	ArrayMetaState *my_extra;

! 	eah = fetch_array_arg_replace_nulls(fcinfo, 0);
  	isNull = PG_ARGISNULL(1);
  	if (isNull)
  		newelem = (Datum) 0;
  	else
  		newelem = PG_GETARG_DATUM(1);

! 	if (eah->ndims == 1)
  	{
  		/* append newelem */
  		int ub;

! 		lb = eah->lbound;
! 		dimv = eah->dims;
  		ub = dimv[0] + lb[0] - 1;
  		indx = ub + 1;

*************** array_append(PG_FUNCTION_ARGS)
*** 118,124 ****
  					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
  					 errmsg("integer out of range")));
  	}
! 	else if (ARR_NDIM(v) == 0)
  		indx = 1;
  	else
  		ereport(ERROR,
--- 110,116 ----
  					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
  					 errmsg("integer out of range")));
  	}
! 	else if (eah->ndims == 0)
  		indx = 1;
  	else
  		ereport(ERROR,
*************** array_append(PG_FUNCTION_ARGS)
*** 128,137 ****
  	/* Perform element insertion */
  	my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;

!
	result = array_set(v, 1, &indx, newelem, isNull,
  			   -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign);

! 	PG_RETURN_ARRAYTYPE_P(result);
  }

  /*-----------------------------------------------------------------------------
--- 120,130 ----
  	/* Perform element insertion */
  	my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;

! 	result = array_set_element(EOHPGetRWDatum(&eah->hdr),
! 							   1, &indx, newelem, isNull,
  			   -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign);

! 	PG_RETURN_DATUM(result);
  }

  /*-----------------------------------------------------------------------------
*************** array_append(PG_FUNCTION_ARGS)
*** 142,153 ****
  Datum
  array_prepend(PG_FUNCTION_ARGS)
  {
! 	ArrayType *v;
  	Datum newelem;
  	bool isNull;
! 	ArrayType *result;
! 	int *lb;
  	int indx;
  	ArrayMetaState *my_extra;

  	isNull = PG_ARGISNULL(0);
--- 135,148 ----
  Datum
  array_prepend(PG_FUNCTION_ARGS)
  {
! 	ExpandedArrayHeader *eah;
  	Datum newelem;
  	bool isNull;
! 	Datum result;
! 	int *dimv,
! 		*lb;
  	int indx;
+ 	int lb0;
  	ArrayMetaState *my_extra;

  	isNull = PG_ARGISNULL(0);
*************** array_prepend(PG_FUNCTION_ARGS)
*** 155,167 ****
  		newelem = (Datum) 0;
  	else
  		newelem = PG_GETARG_DATUM(0);
! 	v = fetch_array_arg_replace_nulls(fcinfo, 1);

! 	if (ARR_NDIM(v) == 1)
  	{
  		/* prepend newelem */
! 		lb = ARR_LBOUND(v);
  		indx = lb[0] - 1;

  		/* overflow? */
  		if (indx > lb[0])
--- 150,164 ----
  		newelem = (Datum) 0;
  	else
  		newelem = PG_GETARG_DATUM(0);
! 	eah = fetch_array_arg_replace_nulls(fcinfo, 1);

! 	if (eah->ndims == 1)
  	{
  		/* prepend newelem */
! 		lb = eah->lbound;
! 		dimv = eah->dims;
  		indx = lb[0] - 1;
+ 		lb0 = lb[0];

  		/* overflow? */
  		if (indx > lb[0])
*************** array_prepend(PG_FUNCTION_ARGS)
*** 169,176 ****
  					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
  					 errmsg("integer out of range")));
  	}
! 	else if (ARR_NDIM(v) == 0)
  		indx = 1;
  	else
  		ereport(ERROR,
  				(errcode(ERRCODE_DATA_EXCEPTION),
--- 166,176 ----
  					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
  					 errmsg("integer out of range")));
  	}
! 	else if (eah->ndims == 0)
!
	{
  		indx = 1;
+ 		lb0 = 1;
+ 	}
  	else
  		ereport(ERROR,
  				(errcode(ERRCODE_DATA_EXCEPTION),
*************** array_prepend(PG_FUNCTION_ARGS)
*** 179,192 ****
  	/* Perform element insertion */
  	my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;

! 	result = array_set(v, 1, &indx, newelem, isNull,
  			   -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign);

  	/* Readjust result's LB to match the input's, as expected for prepend */
! 	if (ARR_NDIM(v) == 1)
! 		ARR_LBOUND(result)[0] = ARR_LBOUND(v)[0];

! 	PG_RETURN_ARRAYTYPE_P(result);
  }

  /*-----------------------------------------------------------------------------
--- 179,197 ----
  	/* Perform element insertion */
  	my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;

! 	result = array_set_element(EOHPGetRWDatum(&eah->hdr),
! 							   1, &indx, newelem, isNull,
  			   -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign);

  	/* Readjust result's LB to match the input's, as expected for prepend */
! 	Assert(result == EOHPGetRWDatum(&eah->hdr));
! 	if (eah->ndims == 1)
! 	{
! 		/* This is ok whether we've deconstructed or not */
! 		eah->lbound[0] = lb0;
! 	}

! 	PG_RETURN_DATUM(result);
  }

  /*-----------------------------------------------------------------------------
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 54979fa..0ede54a 100644
*** a/src/backend/utils/adt/arrayfuncs.c
--- b/src/backend/utils/adt/arrayfuncs.c
*************** bool Array_nulls = true;
*** 42,47 ****
--- 42,53 ----
   */
  #define ASSGN "="

+ #define AARR_FREE_IF_COPY(array,n) \
+ 	do { \
+ 		if (!VARATT_IS_EXPANDED_HEADER(array)) \
+ 			PG_FREE_IF_COPY(array, n); \
+ 	} while (0)
+
  typedef enum
  {
  	ARRAY_NO_LEVEL,
*************** static void ReadArrayBinary(StringInfo b
*** 93,102 ****
  				int typlen, bool typbyval, char typalign,
  				Datum *values, bool *nulls,
  				bool *hasnulls, int32 *nbytes);
! static void CopyArrayEls(ArrayType *array,
! 			 Datum *values, bool *nulls, int nitems,
! 			 int typlen, bool typbyval, char typalign,
!
			 bool freedata);
  static bool array_get_isnull(const bits8 *nullbitmap, int offset);
  static void array_set_isnull(bits8 *nullbitmap, int offset, bool isNull);
  static Datum ArrayCast(char *value, bool byval, int len);
--- 99,114 ----
  				int typlen, bool typbyval, char typalign,
  				Datum *values, bool *nulls,
  				bool *hasnulls, int32 *nbytes);
! static Datum array_get_element_expanded(Datum arraydatum,
! 						   int nSubscripts, int *indx,
! 						   int arraytyplen,
! 						   int elmlen, bool elmbyval, char elmalign,
! 						   bool *isNull);
! static Datum array_set_element_expanded(Datum arraydatum,
! 						   int nSubscripts, int *indx,
! 						   Datum dataValue, bool isNull,
! 						   int arraytyplen,
! 						   int elmlen, bool elmbyval, char elmalign);
  static bool array_get_isnull(const bits8 *nullbitmap, int offset);
  static void array_set_isnull(bits8 *nullbitmap, int offset, bool isNull);
  static Datum ArrayCast(char *value, bool byval, int len);
*************** ReadArrayStr(char *arrayStr,
*** 939,945 ****
   * the values are not toasted. (Doing it here doesn't work since the
   * caller has already allocated space for the array...)
   */
! static void
  CopyArrayEls(ArrayType *array,
  			 Datum *values,
  			 bool *nulls,
--- 951,957 ----
   * the values are not toasted. (Doing it here doesn't work since the
   * caller has already allocated space for the array...)
   */
! void
  CopyArrayEls(ArrayType *array,
  			 Datum *values,
  			 bool *nulls,
*************** CopyArrayEls(ArrayType *array,
*** 997,1004 ****
  Datum
  array_out(PG_FUNCTION_ARGS)
  {
! 	ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
! 	Oid element_type = ARR_ELEMTYPE(v);
  	int typlen;
  	bool typbyval;
  	char typalign;
--- 1009,1016 ----
  Datum
  array_out(PG_FUNCTION_ARGS)
  {
! 	AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
!
	Oid element_type = AARR_ELEMTYPE(v);
  	int typlen;
  	bool typbyval;
  	char typalign;
*************** array_out(PG_FUNCTION_ARGS)
*** 1014,1021 ****
  	 *
  	 * +2 allows for assignment operator + trailing null
  	 */
- 	bits8 *bitmap;
- 	int bitmask;
  	bool *needquotes,
  		needdims = false;
  	int nitems,
--- 1026,1031 ----
*************** array_out(PG_FUNCTION_ARGS)
*** 1027,1032 ****
--- 1037,1043 ----
  	int ndim,
  		*dims,
  		*lb;
+ 	ARRAY_ITER ARRAY_ITER_VARS(iter);
  	ArrayMetaState *my_extra;

  	/*
*************** array_out(PG_FUNCTION_ARGS)
*** 1061,1069 ****
  	typalign = my_extra->typalign;
  	typdelim = my_extra->typdelim;

! 	ndim = ARR_NDIM(v);
! 	dims = ARR_DIMS(v);
! 	lb = ARR_LBOUND(v);
  	nitems = ArrayGetNItems(ndim, dims);

  	if (nitems == 0)
--- 1072,1080 ----
  	typalign = my_extra->typalign;
  	typdelim = my_extra->typdelim;

! 	ndim = AARR_NDIM(v);
! 	dims = AARR_DIMS(v);
! 	lb = AARR_LBOUND(v);
  	nitems = ArrayGetNItems(ndim, dims);

  	if (nitems == 0)
*************** array_out(PG_FUNCTION_ARGS)
*** 1094,1109 ****
  	needquotes = (bool *) palloc(nitems * sizeof(bool));
  	overall_length = 1; /* don't forget to count \0 at end. */

! 	p = ARR_DATA_PTR(v);
! 	bitmap = ARR_NULLBITMAP(v);
! 	bitmask = 1;

  	for (i = 0; i < nitems; i++)
  	{
  		bool needquote;

  		/* Get source element, checking for NULL */
! 		if (bitmap && (*bitmap & bitmask) == 0)
  		{
  			values[i] = pstrdup("NULL");
  			overall_length += 4;
--- 1105,1122 ----
  	needquotes = (bool *) palloc(nitems * sizeof(bool));
  	overall_length = 1; /* don't forget to count \0 at end. */

! 	ARRAY_ITER_SETUP(iter, v);

  	for (i = 0; i < nitems; i++)
  	{
+ 		Datum itemvalue;
+ 		bool isnull;
  		bool needquote;

  		/* Get source element, checking for NULL */
! 		ARRAY_ITER_NEXT(iter, i, itemvalue, isnull, typlen, typbyval, typalign);
!
!
		if (isnull)
  		{
  			values[i] = pstrdup("NULL");
  			overall_length += 4;
*************** array_out(PG_FUNCTION_ARGS)
*** 1111,1122 ****
  		}
  		else
  		{
- 			Datum itemvalue;
-
- 			itemvalue = fetch_att(p, typbyval, typlen);
  			values[i] = OutputFunctionCall(&my_extra->proc, itemvalue);
- 			p = att_addlength_pointer(p, typlen, p);
- 			p = (char *) att_align_nominal(p, typalign);

  			/* count data plus backslashes; detect chars needing quotes */
  			if (values[i][0] == '\0')
--- 1124,1130 ----
*************** array_out(PG_FUNCTION_ARGS)
*** 1149,1165 ****
  			overall_length += 2;
  		/* and the comma */
  		overall_length += 1;
-
- 		/* advance bitmap pointer if any */
- 		if (bitmap)
- 		{
- 			bitmask <<= 1;
- 			if (bitmask == 0x100)
- 			{
- 				bitmap++;
- 				bitmask = 1;
- 			}
- 		}
  	}

  	/*
--- 1157,1162 ----
*************** ReadArrayBinary(StringInfo buf,
*** 1534,1552 ****
  Datum
  array_send(PG_FUNCTION_ARGS)
  {
! 	ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
! 	Oid element_type = ARR_ELEMTYPE(v);
  	int typlen;
  	bool typbyval;
  	char typalign;
- 	char *p;
- 	bits8 *bitmap;
- 	int bitmask;
  	int nitems,
  		i;
  	int ndim,
! 		*dim;
  	StringInfoData buf;
  	ArrayMetaState *my_extra;

  	/*
--- 1531,1548 ----
  Datum
  array_send(PG_FUNCTION_ARGS)
  {
! 	AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
! 	Oid element_type = AARR_ELEMTYPE(v);
  	int typlen;
  	bool typbyval;
  	char typalign;
  	int nitems,
  		i;
  	int ndim,
! 		*dim,
! 		*lb;
  	StringInfoData buf;
+ 	ARRAY_ITER ARRAY_ITER_VARS(iter);
  	ArrayMetaState *my_extra;

  	/*
*************** array_send(PG_FUNCTION_ARGS)
*** 1583,1642 ****
  	typbyval = my_extra->typbyval;
  	typalign = my_extra->typalign;

! 	ndim = ARR_NDIM(v);
! 	dim = ARR_DIMS(v);
  	nitems = ArrayGetNItems(ndim, dim);

  	pq_begintypsend(&buf);

  	/* Send the array header information */
  	pq_sendint(&buf, ndim, 4);
! 	pq_sendint(&buf, ARR_HASNULL(v) ? 1 : 0, 4);
  	pq_sendint(&buf, element_type, sizeof(Oid));
  	for (i = 0; i < ndim; i++)
  	{
! 		pq_sendint(&buf, ARR_DIMS(v)[i], 4);
! 		pq_sendint(&buf, ARR_LBOUND(v)[i], 4);
  	}

  	/* Send the array elements using the element's own sendproc */
! 	p = ARR_DATA_PTR(v);
!
bitmap = ARR_NULLBITMAP(v); ! bitmask = 1; for (i = 0; i < nitems; i++) { /* Get source element, checking for NULL */ ! if (bitmap && (*bitmap & bitmask) == 0) { /* -1 length means a NULL */ pq_sendint(&buf, -1, 4); } else { - Datum itemvalue; bytea *outputbytes; - itemvalue = fetch_att(p, typbyval, typlen); outputbytes = SendFunctionCall(&my_extra->proc, itemvalue); pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4); pq_sendbytes(&buf, VARDATA(outputbytes), VARSIZE(outputbytes) - VARHDRSZ); pfree(outputbytes); - - p = att_addlength_pointer(p, typlen, p); - p = (char *) att_align_nominal(p, typalign); - } - - /* advance bitmap pointer if any */ - if (bitmap) - { - bitmask <<= 1; - if (bitmask == 0x100) - { - bitmap++; - bitmask = 1; - } } } --- 1579,1626 ---- typbyval = my_extra->typbyval; typalign = my_extra->typalign; ! ndim = AARR_NDIM(v); ! dim = AARR_DIMS(v); ! lb = AARR_LBOUND(v); nitems = ArrayGetNItems(ndim, dim); pq_begintypsend(&buf); /* Send the array header information */ pq_sendint(&buf, ndim, 4); ! pq_sendint(&buf, AARR_HASNULL(v) ? 1 : 0, 4); pq_sendint(&buf, element_type, sizeof(Oid)); for (i = 0; i < ndim; i++) { ! pq_sendint(&buf, dim[i], 4); ! pq_sendint(&buf, lb[i], 4); } /* Send the array elements using the element's own sendproc */ ! ARRAY_ITER_SETUP(iter, v); for (i = 0; i < nitems; i++) { + Datum itemvalue; + bool isnull; + /* Get source element, checking for NULL */ ! ARRAY_ITER_NEXT(iter, i, itemvalue, isnull, typlen, typbyval, typalign); ! ! if (isnull) { /* -1 length means a NULL */ pq_sendint(&buf, -1, 4); } else { bytea *outputbytes; outputbytes = SendFunctionCall(&my_extra->proc, itemvalue); pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4); pq_sendbytes(&buf, VARDATA(outputbytes), VARSIZE(outputbytes) - VARHDRSZ); pfree(outputbytes); } } *************** array_send(PG_FUNCTION_ARGS) *** 1650,1662 **** Datum array_ndims(PG_FUNCTION_ARGS) { ! 
ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); /* Sanity check: does it look like an array at all? */ ! if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); ! PG_RETURN_INT32(ARR_NDIM(v)); } /* --- 1634,1646 ---- Datum array_ndims(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); /* Sanity check: does it look like an array at all? */ ! if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); ! PG_RETURN_INT32(AARR_NDIM(v)); } /* *************** array_ndims(PG_FUNCTION_ARGS) *** 1666,1672 **** Datum array_dims(PG_FUNCTION_ARGS) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); char *p; int i; int *dimv, --- 1650,1656 ---- Datum array_dims(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); char *p; int i; int *dimv, *************** array_dims(PG_FUNCTION_ARGS) *** 1680,1693 **** char buf[MAXDIM * 33 + 1]; /* Sanity check: does it look like an array at all? */ ! if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); ! dimv = ARR_DIMS(v); ! lb = ARR_LBOUND(v); p = buf; ! for (i = 0; i < ARR_NDIM(v); i++) { sprintf(p, "[%d:%d]", lb[i], dimv[i] + lb[i] - 1); p += strlen(p); --- 1664,1677 ---- char buf[MAXDIM * 33 + 1]; /* Sanity check: does it look like an array at all? */ ! if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); ! dimv = AARR_DIMS(v); ! lb = AARR_LBOUND(v); p = buf; ! for (i = 0; i < AARR_NDIM(v); i++) { sprintf(p, "[%d:%d]", lb[i], dimv[i] + lb[i] - 1); p += strlen(p); *************** array_dims(PG_FUNCTION_ARGS) *** 1704,1723 **** Datum array_lower(PG_FUNCTION_ARGS) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); int reqdim = PG_GETARG_INT32(1); int *lb; int result; /* Sanity check: does it look like an array at all? */ ! if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > ARR_NDIM(v)) PG_RETURN_NULL(); ! 
lb = ARR_LBOUND(v); result = lb[reqdim - 1]; PG_RETURN_INT32(result); --- 1688,1707 ---- Datum array_lower(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); int reqdim = PG_GETARG_INT32(1); int *lb; int result; /* Sanity check: does it look like an array at all? */ ! if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > AARR_NDIM(v)) PG_RETURN_NULL(); ! lb = AARR_LBOUND(v); result = lb[reqdim - 1]; PG_RETURN_INT32(result); *************** array_lower(PG_FUNCTION_ARGS) *** 1731,1752 **** Datum array_upper(PG_FUNCTION_ARGS) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); int reqdim = PG_GETARG_INT32(1); int *dimv, *lb; int result; /* Sanity check: does it look like an array at all? */ ! if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > ARR_NDIM(v)) PG_RETURN_NULL(); ! lb = ARR_LBOUND(v); ! dimv = ARR_DIMS(v); result = dimv[reqdim - 1] + lb[reqdim - 1] - 1; --- 1715,1736 ---- Datum array_upper(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); int reqdim = PG_GETARG_INT32(1); int *dimv, *lb; int result; /* Sanity check: does it look like an array at all? */ ! if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > AARR_NDIM(v)) PG_RETURN_NULL(); ! lb = AARR_LBOUND(v); ! dimv = AARR_DIMS(v); result = dimv[reqdim - 1] + lb[reqdim - 1] - 1; *************** array_upper(PG_FUNCTION_ARGS) *** 1761,1780 **** Datum array_length(PG_FUNCTION_ARGS) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); int reqdim = PG_GETARG_INT32(1); int *dimv; int result; /* Sanity check: does it look like an array at all? */ ! if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > ARR_NDIM(v)) PG_RETURN_NULL(); ! 
dimv = ARR_DIMS(v); result = dimv[reqdim - 1]; --- 1745,1764 ---- Datum array_length(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); int reqdim = PG_GETARG_INT32(1); int *dimv; int result; /* Sanity check: does it look like an array at all? */ ! if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > AARR_NDIM(v)) PG_RETURN_NULL(); ! dimv = AARR_DIMS(v); result = dimv[reqdim - 1]; *************** array_length(PG_FUNCTION_ARGS) *** 1788,1796 **** Datum array_cardinality(PG_FUNCTION_ARGS) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); ! PG_RETURN_INT32(ArrayGetNItems(ARR_NDIM(v), ARR_DIMS(v))); } --- 1772,1780 ---- Datum array_cardinality(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); ! PG_RETURN_INT32(ArrayGetNItems(AARR_NDIM(v), AARR_DIMS(v))); } *************** array_get_element(Datum arraydatum, *** 1825,1831 **** char elmalign, bool *isNull) { - ArrayType *array; int i, ndim, *dim, --- 1809,1814 ---- *************** array_get_element(Datum arraydatum, *** 1850,1859 **** arraydataptr = (char *) DatumGetPointer(arraydatum); arraynullsptr = NULL; } else { ! /* detoast input array if necessary */ ! array = DatumGetArrayTypeP(arraydatum); ndim = ARR_NDIM(array); dim = ARR_DIMS(array); --- 1833,1854 ---- arraydataptr = (char *) DatumGetPointer(arraydatum); arraynullsptr = NULL; } + else if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(arraydatum))) + { + /* expanded array: let's do this in a separate function */ + return array_get_element_expanded(arraydatum, + nSubscripts, + indx, + arraytyplen, + elmlen, + elmbyval, + elmalign, + isNull); + } else { ! /* detoast array if necessary, producing normal varlena input */ ! 
ArrayType *array = DatumGetArrayTypeP(arraydatum); ndim = ARR_NDIM(array); dim = ARR_DIMS(array); *************** array_get_element(Datum arraydatum, *** 1903,1908 **** --- 1898,1985 ---- } /* + * Implementation of array_get_element() for an expanded array + */ + static Datum + array_get_element_expanded(Datum arraydatum, + int nSubscripts, int *indx, + int arraytyplen, + int elmlen, bool elmbyval, char elmalign, + bool *isNull) + { + ExpandedArrayHeader *eah; + int i, + ndim, + *dim, + *lb, + offset; + Datum *dvalues; + bool *dnulls; + + eah = (ExpandedArrayHeader *) DatumGetEOHP(arraydatum); + Assert(eah->ea_magic == EA_MAGIC); + + /* sanity-check caller's info against object */ + Assert(arraytyplen == -1); + Assert(elmlen == eah->typlen); + Assert(elmbyval == eah->typbyval); + Assert(elmalign == eah->typalign); + + ndim = eah->ndims; + dim = eah->dims; + lb = eah->lbound; + + /* + * Return NULL for invalid subscript + */ + if (ndim != nSubscripts || ndim <= 0 || ndim > MAXDIM) + { + *isNull = true; + return (Datum) 0; + } + for (i = 0; i < ndim; i++) + { + if (indx[i] < lb[i] || indx[i] >= (dim[i] + lb[i])) + { + *isNull = true; + return (Datum) 0; + } + } + + /* + * Calculate the element number + */ + offset = ArrayGetOffset(nSubscripts, dim, lb, indx); + + /* + * Deconstruct array if we didn't already. Note that we apply this even + * if the input is nominally read-only: it should be safe enough. + */ + deconstruct_expanded_array(eah); + + dvalues = eah->dvalues; + dnulls = eah->dnulls; + + /* + * Check for NULL array element + */ + if (dnulls && dnulls[offset]) + { + *isNull = true; + return (Datum) 0; + } + + /* + * OK, get the element. It's OK to return a pass-by-ref value as a + * pointer into the expanded array, for the same reason that regular + * array_get_element can return a pointer into flat arrays: the value is + * assumed not to change for as long as the Datum reference can exist. 
+      */
+     *isNull = false;
+     return dvalues[offset];
+ }
+
+ /*
   * array_get_slice :
   *   This routine takes an array and a range of indices (upperIndex and
   *   lowerIndx), creates a new array structure for the referred elements
*************** array_get_slice(Datum arraydatum,
*** 2083,2089 ****
   *
   * Result:
   *   A new array is returned, just like the old except for the one
!  *   modified entry. The original array object is not changed.
   *
   * For one-dimensional arrays only, we allow the array to be extended
   * by assigning to a position outside the existing subscript range; any
--- 2160,2168 ----
   *
   * Result:
   *   A new array is returned, just like the old except for the one
!  *   modified entry. The original array object is not changed,
!  *   unless what is passed is a read-write reference to an expanded
!  *   array object; in that case the expanded array is updated in-place.
   *
   * For one-dimensional arrays only, we allow the array to be extended
   * by assigning to a position outside the existing subscript range; any
*************** array_set_element(Datum arraydatum,
*** 2166,2171 ****
--- 2245,2264 ----
      if (elmlen == -1 && !isNull)
          dataValue = PointerGetDatum(PG_DETOAST_DATUM(dataValue));

+     if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(arraydatum)))
+     {
+         /* expanded array: let's do this in a separate function */
+         return array_set_element_expanded(arraydatum,
+                                           nSubscripts,
+                                           indx,
+                                           dataValue,
+                                           isNull,
+                                           arraytyplen,
+                                           elmlen,
+                                           elmbyval,
+                                           elmalign);
+     }
+
      /* detoast input array if necessary */
      array = DatumGetArrayTypeP(arraydatum);

*************** array_set_element(Datum arraydatum,
*** 2355,2360 ****
--- 2448,2697 ----
  }

  /*
+  * Implementation of array_set_element() for an expanded array
+  *
+  * Note: as with any operation on a read/write expanded object, we must
+  * take pains not to leave the object in a corrupt state if we fail partway
+  * through.
+  */
+ static Datum
+ array_set_element_expanded(Datum arraydatum,
+                            int nSubscripts, int *indx,
+                            Datum dataValue, bool isNull,
+                            int arraytyplen,
+                            int elmlen, bool elmbyval, char elmalign)
+ {
+     ExpandedArrayHeader *eah;
+     Datum *dvalues;
+     bool *dnulls;
+     int i,
+         ndim,
+         dim[MAXDIM],
+         lb[MAXDIM],
+         offset;
+     bool dimschanged,
+         newhasnulls;
+     int addedbefore,
+         addedafter;
+     char *oldValue;
+
+     /* Convert to R/W object if not so already */
+     eah = DatumGetExpandedArray(arraydatum);
+
+     /* Sanity-check caller's info against object; we don't use it otherwise */
+     Assert(arraytyplen == -1);
+     Assert(elmlen == eah->typlen);
+     Assert(elmbyval == eah->typbyval);
+     Assert(elmalign == eah->typalign);
+
+     /*
+      * Copy dimension info into local storage. This allows us to modify the
+      * dimensions if needed, while not messing up the expanded value if we
+      * fail partway through.
+      */
+     ndim = eah->ndims;
+     Assert(ndim >= 0 && ndim <= MAXDIM);
+     memcpy(dim, eah->dims, ndim * sizeof(int));
+     memcpy(lb, eah->lbound, ndim * sizeof(int));
+     dimschanged = false;
+
+     /*
+      * if number of dims is zero, i.e. an empty array, create an array with
+      * nSubscripts dimensions, and set the lower bounds to the supplied
+      * subscripts.
+      */
+     if (ndim == 0)
+     {
+         /*
+          * Allocate adequate space for new dimension info. This is harmless
+          * if we fail later.
+          */
+         Assert(nSubscripts > 0 && nSubscripts <= MAXDIM);
+         eah->dims = (int *) MemoryContextAllocZero(eah->hdr.eoh_context,
+                                                    nSubscripts * sizeof(int));
+         eah->lbound = (int *) MemoryContextAllocZero(eah->hdr.eoh_context,
+                                                      nSubscripts * sizeof(int));
+
+         /* Update local copies of dimension info */
+         ndim = nSubscripts;
+         for (i = 0; i < nSubscripts; i++)
+         {
+             dim[i] = 0;
+             lb[i] = indx[i];
+         }
+         dimschanged = true;
+     }
+     else if (ndim != nSubscripts)
+         ereport(ERROR,
+                 (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+                  errmsg("wrong number of array subscripts")));
+
+     /*
+      * Deconstruct array if we didn't already. (Someday maybe add a special
+      * case path for fixed-length, no-nulls cases, where we can overwrite an
+      * element in place without ever deconstructing. But today is not that
+      * day.)
+      */
+     deconstruct_expanded_array(eah);
+
+     /*
+      * Copy new element into array's context, if needed (we assume it's
+      * already detoasted, so no junk should be created). If we fail further
+      * down, this memory is leaked, but that's reasonably harmless.
+      */
+     if (!eah->typbyval && !isNull)
+     {
+         MemoryContext oldcxt = MemoryContextSwitchTo(eah->hdr.eoh_context);
+
+         dataValue = datumCopy(dataValue, false, eah->typlen);
+         MemoryContextSwitchTo(oldcxt);
+     }
+
+     dvalues = eah->dvalues;
+     dnulls = eah->dnulls;
+
+     newhasnulls = ((dnulls != NULL) || isNull);
+     addedbefore = addedafter = 0;
+
+     /*
+      * Check subscripts (this logic matches original array_set_element)
+      */
+     if (ndim == 1)
+     {
+         if (indx[0] < lb[0])
+         {
+             addedbefore = lb[0] - indx[0];
+             dim[0] += addedbefore;
+             lb[0] = indx[0];
+             dimschanged = true;
+             if (addedbefore > 1)
+                 newhasnulls = true; /* will insert nulls */
+         }
+         if (indx[0] >= (dim[0] + lb[0]))
+         {
+             addedafter = indx[0] - (dim[0] + lb[0]) + 1;
+             dim[0] += addedafter;
+             dimschanged = true;
+             if (addedafter > 1)
+                 newhasnulls = true; /* will insert nulls */
+         }
+     }
+     else
+     {
+         /*
+          * XXX currently we do not support extending multi-dimensional arrays
+          * during assignment
+          */
+         for (i = 0; i < ndim; i++)
+         {
+             if (indx[i] < lb[i] ||
+                 indx[i] >= (dim[i] + lb[i]))
+                 ereport(ERROR,
+                         (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+                          errmsg("array subscript out of range")));
+         }
+     }
+
+     /* Now we can calculate linear offset of target item in array */
+     offset = ArrayGetOffset(nSubscripts, dim, lb, indx);
+
+     /* Physically enlarge existing dvalues/dnulls arrays if needed */
+     if (dim[0] > eah->dvalueslen)
+     {
+         /* We want some extra space if we're enlarging */
+         int newlen = dim[0] + dim[0] / 8;
+
+         eah->dvalues = dvalues = (Datum *)
+             repalloc(dvalues, newlen * sizeof(Datum));
+         if (dnulls)
+             eah->dnulls = dnulls = (bool *)
+                 repalloc(dnulls, newlen * sizeof(bool));
+         eah->dvalueslen = newlen;
+     }
+
+     /*
+      * If we need a nulls bitmap and don't already have one, create it, being
+      * sure to mark all existing entries as not null.
+      */
+     if (newhasnulls && dnulls == NULL)
+         eah->dnulls = dnulls = (bool *)
+             MemoryContextAllocZero(eah->hdr.eoh_context,
+                                    eah->dvalueslen * sizeof(bool));
+
+     /*
+      * We now have all the needed space allocated, so we're ready to make
+      * irreversible changes. Be very wary of allowing failure below here.
+      */
+
+     /* Flattened value will no longer represent array accurately */
+     eah->fvalue = NULL;
+     /* And we don't know the flattened size either */
+     eah->flat_size = 0;
+
+     /* Update dimensionality info if needed */
+     if (dimschanged)
+     {
+         eah->ndims = ndim;
+         memcpy(eah->dims, dim, ndim * sizeof(int));
+         memcpy(eah->lbound, lb, ndim * sizeof(int));
+     }
+
+     /* Reposition items if needed, and fill addedbefore items with nulls */
+     if (addedbefore > 0)
+     {
+         memmove(dvalues + addedbefore, dvalues, eah->nelems * sizeof(Datum));
+         for (i = 0; i < addedbefore; i++)
+             dvalues[i] = (Datum) 0;
+         if (dnulls)
+         {
+             memmove(dnulls + addedbefore, dnulls, eah->nelems * sizeof(bool));
+             for (i = 0; i < addedbefore; i++)
+                 dnulls[i] = true;
+         }
+         eah->nelems += addedbefore;
+     }
+
+     /* fill addedafter items with nulls */
+     if (addedafter > 0)
+     {
+         for (i = 0; i < addedafter; i++)
+             dvalues[eah->nelems + i] = (Datum) 0;
+         if (dnulls)
+         {
+             for (i = 0; i < addedafter; i++)
+                 dnulls[eah->nelems + i] = true;
+         }
+         eah->nelems += addedafter;
+     }
+
+     /* Grab old element value for pfree'ing, if needed. */
+     if (!eah->typbyval && (dnulls == NULL || !dnulls[offset]))
+         oldValue = (char *) DatumGetPointer(dvalues[offset]);
+     else
+         oldValue = NULL;
+
+     /* And finally we can insert the new element. */
+     dvalues[offset] = dataValue;
+     if (dnulls)
+         dnulls[offset] = isNull;
+
+     /*
+      * Free old element if needed; this keeps repeated element replacements
+      * from bloating the array's storage. If the pfree somehow fails, it
+      * won't corrupt the array.
+      */
+     if (oldValue)
+     {
+         /* Don't try to pfree a part of the original flat array */
+         if (oldValue < eah->fstartptr || oldValue >= eah->fendptr)
+             pfree(oldValue);
+     }
+
+     /* Done, return standard TOAST pointer for object */
+     return EOHPGetRWDatum(&eah->hdr);
+ }
+
+ /*
   * array_set_slice :
   *   This routine sets the value of a range of array locations (specified
   *   by upper and lower subscript values) to new values passed as
*************** array_set(ArrayType *array, int nSubscri
*** 2734,2741 ****
   *   the function fn(), and if nargs > 1 then argument positions after the
   *   first must be preset to the additional values to be passed. The
   *   first argument position initially holds the input array value.
-  * * inpType: OID of element type of input array. This must be the same as,
-  *   or binary-compatible with, the first argument type of fn().
   * * retType: OID of element type of output array. This must be the same as,
   *   or binary-compatible with, the result type of fn().
   * * amstate: workspace for array_map. Must be zeroed by caller before
--- 3071,3076 ----
*************** array_set(ArrayType *array, int nSubscri
*** 2749,2762 ****
   * the array are OK however.
   */
  Datum
! array_map(FunctionCallInfo fcinfo, Oid inpType, Oid retType,
!           ArrayMapState *amstate)
  {
!     ArrayType *v;
      ArrayType *result;
      Datum *values;
      bool *nulls;
-     Datum elt;
      int *dim;
      int ndim;
      int nitems;
--- 3084,3095 ----
   * the array are OK however.
   */
  Datum
! array_map(FunctionCallInfo fcinfo, Oid retType, ArrayMapState *amstate)
  {
!     AnyArrayType *v;
      ArrayType *result;
      Datum *values;
      bool *nulls;
      int *dim;
      int ndim;
      int nitems;
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2764,2778 ****
      int32 nbytes = 0;
      int32 dataoffset;
      bool hasnulls;
      int inp_typlen;
      bool inp_typbyval;
      char inp_typalign;
      int typlen;
      bool typbyval;
      char typalign;
!     char *s;
!     bits8 *bitmap;
!     int bitmask;
      ArrayMetaState *inp_extra;
      ArrayMetaState *ret_extra;

--- 3097,3110 ----
      int32 nbytes = 0;
      int32 dataoffset;
      bool hasnulls;
+     Oid inpType;
      int inp_typlen;
      bool inp_typbyval;
      char inp_typalign;
      int typlen;
      bool typbyval;
      char typalign;
!     ARRAY_ITER ARRAY_ITER_VARS(iter);
      ArrayMetaState *inp_extra;
      ArrayMetaState *ret_extra;

*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2781,2792 ****
          elog(ERROR, "invalid nargs: %d", fcinfo->nargs);
      if (PG_ARGISNULL(0))
          elog(ERROR, "null input array");
!     v = PG_GETARG_ARRAYTYPE_P(0);
!
!     Assert(ARR_ELEMTYPE(v) == inpType);

!     ndim = ARR_NDIM(v);
!     dim = ARR_DIMS(v);
      nitems = ArrayGetNItems(ndim, dim);

      /* Check for empty array */
--- 3113,3123 ----
          elog(ERROR, "invalid nargs: %d", fcinfo->nargs);
      if (PG_ARGISNULL(0))
          elog(ERROR, "null input array");
!     v = PG_GETARG_ANY_ARRAY(0);
!     inpType = AARR_ELEMTYPE(v);

!     ndim = AARR_NDIM(v);
!     dim = AARR_DIMS(v);
      nitems = ArrayGetNItems(ndim, dim);

      /* Check for empty array */
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2833,2841 ****
      nulls = (bool *) palloc(nitems * sizeof(bool));

      /* Loop over source data */
!     s = ARR_DATA_PTR(v);
!     bitmap = ARR_NULLBITMAP(v);
!     bitmask = 1;
      hasnulls = false;

      for (i = 0; i < nitems; i++)
--- 3164,3170 ----
      nulls = (bool *) palloc(nitems * sizeof(bool));

      /* Loop over source data */
!     ARRAY_ITER_SETUP(iter, v);
      hasnulls = false;

      for (i = 0; i < nitems; i++)
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2843,2860 ****
          bool callit = true;

          /* Get source element, checking for NULL */
!         if (bitmap && (*bitmap & bitmask) == 0)
!         {
!             fcinfo->argnull[0] = true;
!         }
!         else
!         {
!             elt = fetch_att(s, inp_typbyval, inp_typlen);
!             s = att_addlength_datum(s, inp_typlen, elt);
!             s = (char *) att_align_nominal(s, inp_typalign);
!             fcinfo->arg[0] = elt;
!             fcinfo->argnull[0] = false;
!         }

          /*
           * Apply the given function to source elt and extra args.
--- 3172,3179 ----
          bool callit = true;

          /* Get source element, checking for NULL */
!         ARRAY_ITER_NEXT(iter, i, fcinfo->arg[0], fcinfo->argnull[0],
!                         inp_typlen, inp_typbyval, inp_typalign);

          /*
           * Apply the given function to source elt and extra args.
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2899,2915 ****
                           errmsg("array size exceeds the maximum allowed (%d)",
                                  (int) MaxAllocSize)));
          }
-
-         /* advance bitmap pointer if any */
-         if (bitmap)
-         {
-             bitmask <<= 1;
-             if (bitmask == 0x100)
-             {
-                 bitmap++;
-                 bitmask = 1;
-             }
-         }
      }

      /* Allocate and initialize the result array */
--- 3218,3223 ----
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2928,2934 ****
      result->ndim = ndim;
      result->dataoffset = dataoffset;
      result->elemtype = retType;
!     memcpy(ARR_DIMS(result), ARR_DIMS(v), 2 * ndim * sizeof(int));

      /*
       * Note: do not risk trying to pfree the results of the called function
--- 3236,3243 ----
      result->ndim = ndim;
      result->dataoffset = dataoffset;
      result->elemtype = retType;
!     memcpy(ARR_DIMS(result), AARR_DIMS(v), ndim * sizeof(int));
!     memcpy(ARR_LBOUND(result), AARR_LBOUND(v), ndim * sizeof(int));

      /*
       * Note: do not risk trying to pfree the results of the called function
*************** construct_empty_array(Oid elmtype)
*** 3092,3097 ****
--- 3401,3423 ----
  }

  /*
+  * construct_empty_expanded_array: make an empty expanded array
+  * given only type information. (metacache can be NULL if not needed.)
+  */
+ ExpandedArrayHeader *
+ construct_empty_expanded_array(Oid element_type,
+                                MemoryContext parentcontext,
+                                ArrayMetaState *metacache)
+ {
+     ArrayType *array = construct_empty_array(element_type);
+     Datum d;
+
+     d = expand_array(PointerGetDatum(array), parentcontext, metacache);
+     pfree(array);
+     return (ExpandedArrayHeader *) DatumGetEOHP(d);
+ }
+
+ /*
   * deconstruct_array --- simple method for extracting data from an array
   *
   * array: array object to examine (must not be NULL)
*************** array_contains_nulls(ArrayType *array)
*** 3229,3264 ****
  Datum
  array_eq(PG_FUNCTION_ARGS)
  {
!     ArrayType *array1 = PG_GETARG_ARRAYTYPE_P(0);
!     ArrayType *array2 = PG_GETARG_ARRAYTYPE_P(1);
      Oid collation = PG_GET_COLLATION();
!     int ndims1 = ARR_NDIM(array1);
!     int ndims2 = ARR_NDIM(array2);
!     int *dims1 = ARR_DIMS(array1);
!     int *dims2 = ARR_DIMS(array2);
!     Oid element_type = ARR_ELEMTYPE(array1);
      bool result = true;
      int nitems;
      TypeCacheEntry *typentry;
      int typlen;
      bool typbyval;
      char typalign;
!     char *ptr1;
!     char *ptr2;
!     bits8 *bitmap1;
!     bits8 *bitmap2;
!     int bitmask;
      int i;
      FunctionCallInfoData locfcinfo;

!     if (element_type != ARR_ELEMTYPE(array2))
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
                   errmsg("cannot compare arrays of different element types")));

      /* fast path if the arrays do not have the same dimensionality */
      if (ndims1 != ndims2 ||
!         memcmp(dims1, dims2, 2 * ndims1 * sizeof(int)) != 0)
          result = false;
      else
      {
--- 3555,3590 ----
  Datum
  array_eq(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
!     AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
      Oid collation = PG_GET_COLLATION();
!     int ndims1 = AARR_NDIM(array1);
!     int ndims2 = AARR_NDIM(array2);
!     int *dims1 = AARR_DIMS(array1);
!     int *dims2 = AARR_DIMS(array2);
!     int *lbs1 = AARR_LBOUND(array1);
!     int *lbs2 = AARR_LBOUND(array2);
!     Oid element_type = AARR_ELEMTYPE(array1);
      bool result = true;
      int nitems;
      TypeCacheEntry *typentry;
      int typlen;
      bool typbyval;
      char typalign;
!     ARRAY_ITER ARRAY_ITER_VARS(it1);
!     ARRAY_ITER ARRAY_ITER_VARS(it2);
      int i;
      FunctionCallInfoData locfcinfo;

!     if (element_type != AARR_ELEMTYPE(array2))
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
                   errmsg("cannot compare arrays of different element types")));

      /* fast path if the arrays do not have the same dimensionality */
      if (ndims1 != ndims2 ||
!         memcmp(dims1, dims2, ndims1 * sizeof(int)) != 0 ||
!         memcmp(lbs1, lbs2, ndims1 * sizeof(int)) != 0)
          result = false;
      else
      {
*************** array_eq(PG_FUNCTION_ARGS)
*** 3293,3303 ****

          /* Loop over source data */
          nitems = ArrayGetNItems(ndims1, dims1);
!         ptr1 = ARR_DATA_PTR(array1);
!         ptr2 = ARR_DATA_PTR(array2);
!         bitmap1 = ARR_NULLBITMAP(array1);
!         bitmap2 = ARR_NULLBITMAP(array2);
!         bitmask = 1; /* use same bitmask for both arrays */

          for (i = 0; i < nitems; i++)
          {
--- 3619,3626 ----

          /* Loop over source data */
          nitems = ArrayGetNItems(ndims1, dims1);
!         ARRAY_ITER_SETUP(it1, array1);
!         ARRAY_ITER_SETUP(it2, array2);

          for (i = 0; i < nitems; i++)
          {
*************** array_eq(PG_FUNCTION_ARGS)
*** 3308,3349 ****
              bool oprresult;

              /* Get elements, checking for NULL */
!             if (bitmap1 && (*bitmap1 & bitmask) == 0)
!             {
!                 isnull1 = true;
!                 elt1 = (Datum) 0;
!             }
!             else
!             {
!                 isnull1 = false;
!                 elt1 = fetch_att(ptr1, typbyval, typlen);
!                 ptr1 = att_addlength_pointer(ptr1, typlen, ptr1);
!                 ptr1 = (char *) att_align_nominal(ptr1, typalign);
!             }
!
!             if (bitmap2 && (*bitmap2 & bitmask) == 0)
!             {
!                 isnull2 = true;
!                 elt2 = (Datum) 0;
!             }
!             else
!             {
!                 isnull2 = false;
!                 elt2 = fetch_att(ptr2, typbyval, typlen);
!                 ptr2 = att_addlength_pointer(ptr2, typlen, ptr2);
!                 ptr2 = (char *) att_align_nominal(ptr2, typalign);
!             }
!
!             /* advance bitmap pointers if any */
!             bitmask <<= 1;
!             if (bitmask == 0x100)
!             {
!                 if (bitmap1)
!                     bitmap1++;
!                 if (bitmap2)
!                     bitmap2++;
!                 bitmask = 1;
!             }

              /*
               * We consider two NULLs equal; NULL and not-NULL are unequal.
--- 3631,3638 ----
              bool oprresult;

              /* Get elements, checking for NULL */
!             ARRAY_ITER_NEXT(it1, i, elt1, isnull1, typlen, typbyval, typalign);
!             ARRAY_ITER_NEXT(it2, i, elt2, isnull2, typlen, typbyval, typalign);

              /*
               * We consider two NULLs equal; NULL and not-NULL are unequal.
*************** array_eq(PG_FUNCTION_ARGS)
*** 3374,3381 ****
      }

      /* Avoid leaking memory when handed toasted input. */
!     PG_FREE_IF_COPY(array1, 0);
!     PG_FREE_IF_COPY(array2, 1);

      PG_RETURN_BOOL(result);
  }
--- 3663,3670 ----
      }

      /* Avoid leaking memory when handed toasted input. */
!     AARR_FREE_IF_COPY(array1, 0);
!     AARR_FREE_IF_COPY(array2, 1);

      PG_RETURN_BOOL(result);
  }
*************** btarraycmp(PG_FUNCTION_ARGS)
*** 3435,3465 ****
  static int
  array_cmp(FunctionCallInfo fcinfo)
  {
!     ArrayType *array1 = PG_GETARG_ARRAYTYPE_P(0);
!     ArrayType *array2 = PG_GETARG_ARRAYTYPE_P(1);
      Oid collation = PG_GET_COLLATION();
!     int ndims1 = ARR_NDIM(array1);
!     int ndims2 = ARR_NDIM(array2);
!     int *dims1 = ARR_DIMS(array1);
!     int *dims2 = ARR_DIMS(array2);
      int nitems1 = ArrayGetNItems(ndims1, dims1);
      int nitems2 = ArrayGetNItems(ndims2, dims2);
!     Oid element_type = ARR_ELEMTYPE(array1);
      int result = 0;
      TypeCacheEntry *typentry;
      int typlen;
      bool typbyval;
      char typalign;
      int min_nitems;
!     char *ptr1;
!     char *ptr2;
!     bits8 *bitmap1;
!     bits8 *bitmap2;
!     int bitmask;
      int i;
      FunctionCallInfoData locfcinfo;

!     if (element_type != ARR_ELEMTYPE(array2))
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
                   errmsg("cannot compare arrays of different element types")));
--- 3724,3751 ----
  static int
  array_cmp(FunctionCallInfo fcinfo)
  {
!     AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
!     AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
      Oid collation = PG_GET_COLLATION();
!     int ndims1 = AARR_NDIM(array1);
!     int ndims2 = AARR_NDIM(array2);
!     int *dims1 = AARR_DIMS(array1);
!     int *dims2 = AARR_DIMS(array2);
      int nitems1 = ArrayGetNItems(ndims1, dims1);
      int nitems2 = ArrayGetNItems(ndims2, dims2);
!     Oid element_type = AARR_ELEMTYPE(array1);
      int result = 0;
      TypeCacheEntry *typentry;
      int typlen;
      bool typbyval;
      char typalign;
      int min_nitems;
!     ARRAY_ITER ARRAY_ITER_VARS(it1);
!     ARRAY_ITER ARRAY_ITER_VARS(it2);
      int i;
      FunctionCallInfoData locfcinfo;

!     if (element_type != AARR_ELEMTYPE(array2))
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
                   errmsg("cannot compare arrays of different element types")));
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3495,3505 ****

      /* Loop over source data */
      min_nitems = Min(nitems1, nitems2);
!     ptr1 = ARR_DATA_PTR(array1);
!     ptr2 = ARR_DATA_PTR(array2);
!     bitmap1 = ARR_NULLBITMAP(array1);
!     bitmap2 = ARR_NULLBITMAP(array2);
!     bitmask = 1; /* use same bitmask for both arrays */

      for (i = 0; i < min_nitems; i++)
      {
--- 3781,3788 ----

      /* Loop over source data */
      min_nitems = Min(nitems1, nitems2);
!     ARRAY_ITER_SETUP(it1, array1);
!     ARRAY_ITER_SETUP(it2, array2);

      for (i = 0; i < min_nitems; i++)
      {
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3510,3551 ****
          int32 cmpresult;

          /* Get elements, checking for NULL */
!         if (bitmap1 && (*bitmap1 & bitmask) == 0)
!         {
!             isnull1 = true;
!             elt1 = (Datum) 0;
!         }
!         else
!         {
!             isnull1 = false;
!             elt1 = fetch_att(ptr1, typbyval, typlen);
!             ptr1 = att_addlength_pointer(ptr1, typlen, ptr1);
!             ptr1 = (char *) att_align_nominal(ptr1, typalign);
!         }
!
!         if (bitmap2 && (*bitmap2 & bitmask) == 0)
!         {
!             isnull2 = true;
!             elt2 = (Datum) 0;
!         }
!         else
!         {
!             isnull2 = false;
!             elt2 = fetch_att(ptr2, typbyval, typlen);
!             ptr2 = att_addlength_pointer(ptr2, typlen, ptr2);
!             ptr2 = (char *) att_align_nominal(ptr2, typalign);
!         }
!
!         /* advance bitmap pointers if any */
!         bitmask <<= 1;
!         if (bitmask == 0x100)
!         {
!             if (bitmap1)
!                 bitmap1++;
!             if (bitmap2)
!                 bitmap2++;
!             bitmask = 1;
!         }

          /*
           * We consider two NULLs equal; NULL > not-NULL.
--- 3793,3800 ----
          int32 cmpresult;

          /* Get elements, checking for NULL */
!         ARRAY_ITER_NEXT(it1, i, elt1, isnull1, typlen, typbyval, typalign);
!         ARRAY_ITER_NEXT(it2, i, elt2, isnull2, typlen, typbyval, typalign);

          /*
           * We consider two NULLs equal; NULL > not-NULL.
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3604,3611 ****
              result = (ndims1 < ndims2) ? -1 : 1;
          else
          {
!             /* this relies on LB array immediately following DIMS array */
!             for (i = 0; i < ndims1 * 2; i++)
              {
                  if (dims1[i] != dims2[i])
                  {
--- 3853,3859 ----
              result = (ndims1 < ndims2) ? -1 : 1;
          else
          {
!             for (i = 0; i < ndims1; i++)
              {
                  if (dims1[i] != dims2[i])
                  {
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3613,3624 ****
                      break;
                  }
              }
          }
      }

      /* Avoid leaking memory when handed toasted input. */
!     PG_FREE_IF_COPY(array1, 0);
!     PG_FREE_IF_COPY(array2, 1);

      return result;
  }
--- 3861,3886 ----
                      break;
                  }
              }
+             if (result == 0)
+             {
+                 int *lbound1 = AARR_LBOUND(array1);
+                 int *lbound2 = AARR_LBOUND(array2);
+
+                 for (i = 0; i < ndims1; i++)
+                 {
+                     if (lbound1[i] != lbound2[i])
+                     {
+                         result = (lbound1[i] < lbound2[i]) ? -1 : 1;
+                         break;
+                     }
+                 }
+             }
          }
      }

      /* Avoid leaking memory when handed toasted input. */
!     AARR_FREE_IF_COPY(array1, 0);
!     AARR_FREE_IF_COPY(array2, 1);

      return result;
  }
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3633,3652 ****
  Datum
  hash_array(PG_FUNCTION_ARGS)
  {
!     ArrayType *array = PG_GETARG_ARRAYTYPE_P(0);
!     int ndims = ARR_NDIM(array);
!     int *dims = ARR_DIMS(array);
!     Oid element_type = ARR_ELEMTYPE(array);
      uint32 result = 1;
      int nitems;
      TypeCacheEntry *typentry;
      int typlen;
      bool typbyval;
      char typalign;
-     char *ptr;
-     bits8 *bitmap;
-     int bitmask;
      int i;
      FunctionCallInfoData locfcinfo;

      /*
--- 3895,3912 ----
  Datum
  hash_array(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *array = PG_GETARG_ANY_ARRAY(0);
!     int ndims = AARR_NDIM(array);
!     int *dims = AARR_DIMS(array);
!     Oid element_type = AARR_ELEMTYPE(array);
      uint32 result = 1;
      int nitems;
      TypeCacheEntry *typentry;
      int typlen;
      bool typbyval;
      char typalign;
      int i;
+     ARRAY_ITER ARRAY_ITER_VARS(iter);
      FunctionCallInfoData locfcinfo;

      /*
*************** hash_array(PG_FUNCTION_ARGS)
*** 3680,3707 ****

      /* Loop over source data */
      nitems = ArrayGetNItems(ndims, dims);
!     ptr = ARR_DATA_PTR(array);
!     bitmap = ARR_NULLBITMAP(array);
!     bitmask = 1;

      for (i = 0; i < nitems; i++)
      {
          uint32 elthash;

          /* Get element, checking for NULL */
!         if (bitmap && (*bitmap & bitmask) == 0)
          {
              /* Treat nulls as having hashvalue 0 */
              elthash = 0;
          }
          else
          {
-             Datum elt;
-
-             elt = fetch_att(ptr, typbyval, typlen);
-             ptr = att_addlength_pointer(ptr, typlen, ptr);
-             ptr = (char *) att_align_nominal(ptr, typalign);
-
              /* Apply the hash function */
              locfcinfo.arg[0] = elt;
              locfcinfo.argnull[0] = false;
--- 3940,3963 ----

      /* Loop over source data */
      nitems = ArrayGetNItems(ndims, dims);
!     ARRAY_ITER_SETUP(iter, array);

      for (i = 0; i < nitems; i++)
      {
+         Datum elt;
+         bool isnull;
          uint32 elthash;

          /* Get element, checking for NULL */
!         ARRAY_ITER_NEXT(iter, i, elt, isnull, typlen, typbyval, typalign);
!
!         if (isnull)
          {
              /* Treat nulls as having hashvalue 0 */
              elthash = 0;
          }
          else
          {
              /* Apply the hash function */
              locfcinfo.arg[0] = elt;
              locfcinfo.argnull[0] = false;
*************** hash_array(PG_FUNCTION_ARGS)
*** 3709,3725 ****
              elthash = DatumGetUInt32(FunctionCallInvoke(&locfcinfo));
          }

-         /* advance bitmap pointer if any */
-         if (bitmap)
-         {
-             bitmask <<= 1;
-             if (bitmask == 0x100)
-             {
-                 bitmap++;
-                 bitmask = 1;
-             }
-         }
-
          /*
           * Combine hash values of successive elements by multiplying the
           * current value by 31 and adding on the new element's hash value.
--- 3965,3970 ----
*************** hash_array(PG_FUNCTION_ARGS)
*** 3735,3741 ****
      }

      /* Avoid leaking memory when handed toasted input. */
!     PG_FREE_IF_COPY(array, 0);

      PG_RETURN_UINT32(result);
  }
--- 3980,3986 ----
      }

      /* Avoid leaking memory when handed toasted input. */
!     AARR_FREE_IF_COPY(array, 0);

      PG_RETURN_UINT32(result);
  }
*************** hash_array(PG_FUNCTION_ARGS)
*** 3756,3766 ****
   * When matchall is false, return true if any members of array1 are in array2.
   */
  static bool
! array_contain_compare(ArrayType *array1, ArrayType *array2, Oid collation,
                        bool matchall, void **fn_extra)
  {
      bool result = matchall;
!     Oid element_type = ARR_ELEMTYPE(array1);
      TypeCacheEntry *typentry;
      int nelems1;
      Datum *values2;
--- 4001,4011 ----
   * When matchall is false, return true if any members of array1 are in array2.
   */
  static bool
! array_contain_compare(AnyArrayType *array1, AnyArrayType *array2, Oid collation,
                        bool matchall, void **fn_extra)
  {
      bool result = matchall;
!     Oid element_type = AARR_ELEMTYPE(array1);
      TypeCacheEntry *typentry;
      int nelems1;
      Datum *values2;
*************** array_contain_compare(ArrayType *array1,
*** 3769,3782 ****
      int typlen;
      bool typbyval;
      char typalign;
-     char *ptr1;
-     bits8 *bitmap1;
-     int bitmask;
      int i;
      int j;
      FunctionCallInfoData locfcinfo;

!     if (element_type != ARR_ELEMTYPE(array2))
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
                   errmsg("cannot compare arrays of different element types")));
--- 4014,4025 ----
      int typlen;
      bool typbyval;
      char typalign;
      int i;
      int j;
+     ARRAY_ITER ARRAY_ITER_VARS(it1);
      FunctionCallInfoData locfcinfo;

!     if (element_type != AARR_ELEMTYPE(array2))
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
                   errmsg("cannot compare arrays of different element types")));
*************** array_contain_compare(ArrayType *array1,
*** 3809,3816 ****
       * worthwhile to use deconstruct_array on it. We scan array1 the hard way
       * however, since we very likely won't need to look at all of it.
       */
!     deconstruct_array(array2, element_type, typlen, typbyval, typalign,
!                       &values2, &nulls2, &nelems2);

      /*
       * Apply the comparison operator to each pair of array elements.
--- 4052,4069 ----
       * worthwhile to use deconstruct_array on it.
We scan array1 the hard way * however, since we very likely won't need to look at all of it. */ ! if (VARATT_IS_EXPANDED_HEADER(array2)) ! { ! /* This should be safe even if input is read-only */ ! deconstruct_expanded_array(&(array2->xpn)); ! values2 = array2->xpn.dvalues; ! nulls2 = array2->xpn.dnulls; ! nelems2 = array2->xpn.nelems; ! } ! else ! deconstruct_array(&(array2->flt), ! element_type, typlen, typbyval, typalign, ! &values2, &nulls2, &nelems2); /* * Apply the comparison operator to each pair of array elements. *************** array_contain_compare(ArrayType *array1, *** 3819,3828 **** collation, NULL, NULL); /* Loop over source data */ ! nelems1 = ArrayGetNItems(ARR_NDIM(array1), ARR_DIMS(array1)); ! ptr1 = ARR_DATA_PTR(array1); ! bitmap1 = ARR_NULLBITMAP(array1); ! bitmask = 1; for (i = 0; i < nelems1; i++) { --- 4072,4079 ---- collation, NULL, NULL); /* Loop over source data */ ! nelems1 = ArrayGetNItems(AARR_NDIM(array1), AARR_DIMS(array1)); ! ARRAY_ITER_SETUP(it1, array1); for (i = 0; i < nelems1; i++) { *************** array_contain_compare(ArrayType *array1, *** 3830,3856 **** bool isnull1; /* Get element, checking for NULL */ ! if (bitmap1 && (*bitmap1 & bitmask) == 0) ! { ! isnull1 = true; ! elt1 = (Datum) 0; ! } ! else ! { ! isnull1 = false; ! elt1 = fetch_att(ptr1, typbyval, typlen); ! ptr1 = att_addlength_pointer(ptr1, typlen, ptr1); ! ptr1 = (char *) att_align_nominal(ptr1, typalign); ! } ! ! /* advance bitmap pointer if any */ ! bitmask <<= 1; ! if (bitmask == 0x100) ! { ! if (bitmap1) ! bitmap1++; ! bitmask = 1; ! } /* * We assume that the comparison operator is strict, so a NULL can't --- 4081,4087 ---- bool isnull1; /* Get element, checking for NULL */ ! 
ARRAY_ITER_NEXT(it1, i, elt1, isnull1, typlen, typbyval, typalign); /* * We assume that the comparison operator is strict, so a NULL can't *************** array_contain_compare(ArrayType *array1, *** 3909,3925 **** } } - pfree(values2); - pfree(nulls2); - return result; } Datum arrayoverlap(PG_FUNCTION_ARGS) { ! ArrayType *array1 = PG_GETARG_ARRAYTYPE_P(0); ! ArrayType *array2 = PG_GETARG_ARRAYTYPE_P(1); Oid collation = PG_GET_COLLATION(); bool result; --- 4140,4153 ---- } } return result; } Datum arrayoverlap(PG_FUNCTION_ARGS) { ! AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0); ! AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1); Oid collation = PG_GET_COLLATION(); bool result; *************** arrayoverlap(PG_FUNCTION_ARGS) *** 3927,3934 **** &fcinfo->flinfo->fn_extra); /* Avoid leaking memory when handed toasted input. */ ! PG_FREE_IF_COPY(array1, 0); ! PG_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } --- 4155,4162 ---- &fcinfo->flinfo->fn_extra); /* Avoid leaking memory when handed toasted input. */ ! AARR_FREE_IF_COPY(array1, 0); ! AARR_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } *************** arrayoverlap(PG_FUNCTION_ARGS) *** 3936,3943 **** Datum arraycontains(PG_FUNCTION_ARGS) { ! ArrayType *array1 = PG_GETARG_ARRAYTYPE_P(0); ! ArrayType *array2 = PG_GETARG_ARRAYTYPE_P(1); Oid collation = PG_GET_COLLATION(); bool result; --- 4164,4171 ---- Datum arraycontains(PG_FUNCTION_ARGS) { ! AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0); ! AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1); Oid collation = PG_GET_COLLATION(); bool result; *************** arraycontains(PG_FUNCTION_ARGS) *** 3945,3952 **** &fcinfo->flinfo->fn_extra); /* Avoid leaking memory when handed toasted input. */ ! PG_FREE_IF_COPY(array1, 0); ! PG_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } --- 4173,4180 ---- &fcinfo->flinfo->fn_extra); /* Avoid leaking memory when handed toasted input. */ ! AARR_FREE_IF_COPY(array1, 0); ! 
AARR_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } *************** arraycontains(PG_FUNCTION_ARGS) *** 3954,3961 **** Datum arraycontained(PG_FUNCTION_ARGS) { ! ArrayType *array1 = PG_GETARG_ARRAYTYPE_P(0); ! ArrayType *array2 = PG_GETARG_ARRAYTYPE_P(1); Oid collation = PG_GET_COLLATION(); bool result; --- 4182,4189 ---- Datum arraycontained(PG_FUNCTION_ARGS) { ! AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0); ! AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1); Oid collation = PG_GET_COLLATION(); bool result; *************** arraycontained(PG_FUNCTION_ARGS) *** 3963,3970 **** &fcinfo->flinfo->fn_extra); /* Avoid leaking memory when handed toasted input. */ ! PG_FREE_IF_COPY(array1, 0); ! PG_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } --- 4191,4198 ---- &fcinfo->flinfo->fn_extra); /* Avoid leaking memory when handed toasted input. */ ! AARR_FREE_IF_COPY(array1, 0); ! AARR_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } *************** initArrayResult(Oid element_type, Memory *** 4692,4698 **** MemoryContextAlloc(arr_context, sizeof(ArrayBuildState)); astate->mcontext = arr_context; astate->private_cxt = subcontext; ! astate->alen = (subcontext ? 64 : 8); /* arbitrary starting array size */ astate->dvalues = (Datum *) MemoryContextAlloc(arr_context, astate->alen * sizeof(Datum)); astate->dnulls = (bool *) --- 4920,4927 ---- MemoryContextAlloc(arr_context, sizeof(ArrayBuildState)); astate->mcontext = arr_context; astate->private_cxt = subcontext; ! astate->alen = (subcontext ? 64 : 8); /* arbitrary starting array ! * size */ astate->dvalues = (Datum *) MemoryContextAlloc(arr_context, astate->alen * sizeof(Datum)); astate->dnulls = (bool *) *************** initArrayResultArr(Oid array_type, Oid e *** 4868,4877 **** bool subcontext) { ArrayBuildStateArr *astate; ! MemoryContext arr_context = rcontext; /* by default use the parent ctx */ /* Lookup element type, unless element_type already provided */ ! if (! 
OidIsValid(element_type)) { element_type = get_element_type(array_type); --- 5097,5107 ---- bool subcontext) { ArrayBuildStateArr *astate; ! MemoryContext arr_context = rcontext; /* by default use the parent ! * ctx */ /* Lookup element type, unless element_type already provided */ ! if (!OidIsValid(element_type)) { element_type = get_element_type(array_type); *************** makeArrayResultAny(ArrayBuildStateAny *a *** 5249,5279 **** Datum array_larger(PG_FUNCTION_ARGS) { ! ArrayType *v1, ! *v2, ! *result; ! ! v1 = PG_GETARG_ARRAYTYPE_P(0); ! v2 = PG_GETARG_ARRAYTYPE_P(1); ! ! result = ((array_cmp(fcinfo) > 0) ? v1 : v2); ! ! PG_RETURN_ARRAYTYPE_P(result); } Datum array_smaller(PG_FUNCTION_ARGS) { ! ArrayType *v1, ! *v2, ! *result; ! ! v1 = PG_GETARG_ARRAYTYPE_P(0); ! v2 = PG_GETARG_ARRAYTYPE_P(1); ! ! result = ((array_cmp(fcinfo) < 0) ? v1 : v2); ! ! PG_RETURN_ARRAYTYPE_P(result); } --- 5479,5497 ---- Datum array_larger(PG_FUNCTION_ARGS) { ! if (array_cmp(fcinfo) > 0) ! PG_RETURN_DATUM(PG_GETARG_DATUM(0)); ! else ! PG_RETURN_DATUM(PG_GETARG_DATUM(1)); } Datum array_smaller(PG_FUNCTION_ARGS) { ! if (array_cmp(fcinfo) < 0) ! PG_RETURN_DATUM(PG_GETARG_DATUM(0)); ! else ! PG_RETURN_DATUM(PG_GETARG_DATUM(1)); } *************** generate_subscripts(PG_FUNCTION_ARGS) *** 5298,5304 **** /* stuff done only on the first call of the function */ if (SRF_IS_FIRSTCALL()) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); int reqdim = PG_GETARG_INT32(1); int *lb, *dimv; --- 5516,5522 ---- /* stuff done only on the first call of the function */ if (SRF_IS_FIRSTCALL()) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); int reqdim = PG_GETARG_INT32(1); int *lb, *dimv; *************** generate_subscripts(PG_FUNCTION_ARGS) *** 5307,5317 **** funcctx = SRF_FIRSTCALL_INIT(); /* Sanity check: does it look like an array at all? */ ! if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM) SRF_RETURN_DONE(funcctx); /* Sanity check: was the requested dim valid */ ! 
if (reqdim <= 0 || reqdim > ARR_NDIM(v)) SRF_RETURN_DONE(funcctx); /* --- 5525,5535 ---- funcctx = SRF_FIRSTCALL_INIT(); /* Sanity check: does it look like an array at all? */ ! if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) SRF_RETURN_DONE(funcctx); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > AARR_NDIM(v)) SRF_RETURN_DONE(funcctx); /* *************** generate_subscripts(PG_FUNCTION_ARGS) *** 5320,5327 **** oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); fctx = (generate_subscripts_fctx *) palloc(sizeof(generate_subscripts_fctx)); ! lb = ARR_LBOUND(v); ! dimv = ARR_DIMS(v); fctx->lower = lb[reqdim - 1]; fctx->upper = dimv[reqdim - 1] + lb[reqdim - 1] - 1; --- 5538,5545 ---- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); fctx = (generate_subscripts_fctx *) palloc(sizeof(generate_subscripts_fctx)); ! lb = AARR_LBOUND(v); ! dimv = AARR_DIMS(v); fctx->lower = lb[reqdim - 1]; fctx->upper = dimv[reqdim - 1] + lb[reqdim - 1] - 1; *************** array_unnest(PG_FUNCTION_ARGS) *** 5640,5650 **** { typedef struct { ! ArrayType *arr; int nextelem; int numelems; - char *elemdataptr; /* this moves with nextelem */ - bits8 *arraynullsptr; /* this does not */ int16 elmlen; bool elmbyval; char elmalign; --- 5858,5866 ---- { typedef struct { ! ARRAY_ITER ARRAY_ITER_VARS(iter); int nextelem; int numelems; int16 elmlen; bool elmbyval; char elmalign; *************** array_unnest(PG_FUNCTION_ARGS) *** 5657,5663 **** /* stuff done only on the first call of the function */ if (SRF_IS_FIRSTCALL()) { ! ArrayType *arr; /* create a function context for cross-call persistence */ funcctx = SRF_FIRSTCALL_INIT(); --- 5873,5879 ---- /* stuff done only on the first call of the function */ if (SRF_IS_FIRSTCALL()) { ! AnyArrayType *arr; /* create a function context for cross-call persistence */ funcctx = SRF_FIRSTCALL_INIT(); *************** array_unnest(PG_FUNCTION_ARGS) *** 5674,5696 **** * and not before. 
(If no detoast happens, we assume the originally * passed array will stick around till then.) */ ! arr = PG_GETARG_ARRAYTYPE_P(0); /* allocate memory for user context */ fctx = (array_unnest_fctx *) palloc(sizeof(array_unnest_fctx)); /* initialize state */ ! fctx->arr = arr; fctx->nextelem = 0; ! fctx->numelems = ArrayGetNItems(ARR_NDIM(arr), ARR_DIMS(arr)); ! ! fctx->elemdataptr = ARR_DATA_PTR(arr); ! fctx->arraynullsptr = ARR_NULLBITMAP(arr); ! get_typlenbyvalalign(ARR_ELEMTYPE(arr), ! &fctx->elmlen, ! &fctx->elmbyval, ! &fctx->elmalign); funcctx->user_fctx = fctx; MemoryContextSwitchTo(oldcontext); --- 5890,5917 ---- * and not before. (If no detoast happens, we assume the originally * passed array will stick around till then.) */ ! arr = PG_GETARG_ANY_ARRAY(0); /* allocate memory for user context */ fctx = (array_unnest_fctx *) palloc(sizeof(array_unnest_fctx)); /* initialize state */ ! ARRAY_ITER_SETUP(fctx->iter, arr); fctx->nextelem = 0; ! fctx->numelems = ArrayGetNItems(AARR_NDIM(arr), AARR_DIMS(arr)); ! if (VARATT_IS_EXPANDED_HEADER(arr)) ! { ! /* we can just grab the type data from expanded array */ ! fctx->elmlen = arr->xpn.typlen; ! fctx->elmbyval = arr->xpn.typbyval; ! fctx->elmalign = arr->xpn.typalign; ! } ! else ! get_typlenbyvalalign(AARR_ELEMTYPE(arr), ! &fctx->elmlen, ! &fctx->elmbyval, ! &fctx->elmalign); funcctx->user_fctx = fctx; MemoryContextSwitchTo(oldcontext); *************** array_unnest(PG_FUNCTION_ARGS) *** 5705,5736 **** int offset = fctx->nextelem++; Datum elem; ! /* ! * Check for NULL array element ! */ ! if (array_get_isnull(fctx->arraynullsptr, offset)) ! { ! fcinfo->isnull = true; ! elem = (Datum) 0; ! /* elemdataptr does not move */ ! } ! else ! { ! /* ! * OK, get the element ! */ ! char *ptr = fctx->elemdataptr; ! ! fcinfo->isnull = false; ! elem = ArrayCast(ptr, fctx->elmbyval, fctx->elmlen); ! ! /* ! * Advance elemdataptr over it ! */ ! ptr = att_addlength_pointer(ptr, fctx->elmlen, ptr); ! 
ptr = (char *) att_align_nominal(ptr, fctx->elmalign); ! fctx->elemdataptr = ptr; ! } SRF_RETURN_NEXT(funcctx, elem); } --- 5926,5933 ---- int offset = fctx->nextelem++; Datum elem; ! ARRAY_ITER_NEXT(fctx->iter, offset, elem, fcinfo->isnull, ! fctx->elmlen, fctx->elmbyval, fctx->elmalign); SRF_RETURN_NEXT(funcctx, elem); } *************** array_replace_internal(ArrayType *array, *** 5982,5988 **** result->ndim = ndim; result->dataoffset = dataoffset; result->elemtype = element_type; ! memcpy(ARR_DIMS(result), ARR_DIMS(array), 2 * ndim * sizeof(int)); if (remove) { --- 6179,6186 ---- result->ndim = ndim; result->dataoffset = dataoffset; result->elemtype = element_type; ! memcpy(ARR_DIMS(result), ARR_DIMS(array), ndim * sizeof(int)); ! memcpy(ARR_LBOUND(result), ARR_LBOUND(array), ndim * sizeof(int)); if (remove) { diff --git a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c index 014eca5..e8af030 100644 *** a/src/backend/utils/adt/datum.c --- b/src/backend/utils/adt/datum.c *************** *** 12,19 **** * *------------------------------------------------------------------------- */ /* ! * In the implementation of the next routines we assume the following: * * A) if a type is "byVal" then all the information is stored in the * Datum itself (i.e. no pointers involved!). In this case the --- 12,20 ---- * *------------------------------------------------------------------------- */ + /* ! * In the implementation of these routines we assume the following: * * A) if a type is "byVal" then all the information is stored in the * Datum itself (i.e. no pointers involved!). In this case the *************** *** 34,44 **** --- 35,49 ---- * * Note that we do not treat "toasted" datums specially; therefore what * will be copied or compared is the compressed data or toast reference. + * An exception is made for datumCopy() of an expanded object, however, + * because most callers expect to get a simple contiguous (and pfree'able) + * result from datumCopy(). 
See also datumTransfer(). */ #include "postgres.h" #include "utils/datum.h" + #include "utils/expandeddatum.h" /*------------------------------------------------------------------------- *************** *** 46,51 **** --- 51,57 ---- * * Find the "real" size of a datum, given the datum value, * whether it is a "by value", and the declared type length. + * (For TOAST pointer datums, this is the size of the pointer datum.) * * This is essentially an out-of-line version of the att_addlength_datum() * macro in access/tupmacs.h. We do a tad more error checking though. *************** datumGetSize(Datum value, bool typByVal, *** 106,114 **** /*------------------------------------------------------------------------- * datumCopy * ! * make a copy of a datum * * If the datatype is pass-by-reference, memory is obtained with palloc(). *------------------------------------------------------------------------- */ Datum --- 112,127 ---- /*------------------------------------------------------------------------- * datumCopy * ! * Make a copy of a non-NULL datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). + * + * If the value is a reference to an expanded object, we flatten into memory + * obtained with palloc(). We need to copy because one of the main uses of + * this function is to copy a datum out of a transient memory context that's + * about to be destroyed, and the expanded object is probably in a child + * context that will also go away. Moreover, many callers assume that the + * result is a single pfree-able chunk. *------------------------------------------------------------------------- */ Datum *************** datumCopy(Datum value, bool typByVal, in *** 118,161 **** if (typByVal) res = value; else { Size realSize; ! char *s; ! ! if (DatumGetPointer(value) == NULL) ! return PointerGetDatum(NULL); realSize = datumGetSize(value, typByVal, typLen); ! s = (char *) palloc(realSize); ! memcpy(s, DatumGetPointer(value), realSize); ! 
res = PointerGetDatum(s); } return res; } /*------------------------------------------------------------------------- ! * datumFree * ! * Free the space occupied by a datum CREATED BY "datumCopy" * ! * NOTE: DO NOT USE THIS ROUTINE with datums returned by heap_getattr() etc. ! * ONLY datums created by "datumCopy" can be freed! *------------------------------------------------------------------------- */ ! #ifdef NOT_USED ! void ! datumFree(Datum value, bool typByVal, int typLen) { ! if (!typByVal) ! { ! Pointer s = DatumGetPointer(value); ! ! pfree(s); ! } } - #endif /*------------------------------------------------------------------------- * datumIsEqual --- 131,201 ---- if (typByVal) res = value; + else if (typLen == -1) + { + /* It is a varlena datatype */ + struct varlena *vl = (struct varlena *) DatumGetPointer(value); + + if (VARATT_IS_EXTERNAL_EXPANDED(vl)) + { + /* Flatten into the caller's memory context */ + ExpandedObjectHeader *eoh = DatumGetEOHP(value); + Size resultsize; + char *resultptr; + + resultsize = EOH_get_flat_size(eoh); + resultptr = (char *) palloc(resultsize); + EOH_flatten_into(eoh, (void *) resultptr, resultsize); + res = PointerGetDatum(resultptr); + } + else + { + /* Otherwise, just copy the varlena datum verbatim */ + Size realSize; + char *resultptr; + + realSize = (Size) VARSIZE_ANY(vl); + resultptr = (char *) palloc(realSize); + memcpy(resultptr, vl, realSize); + res = PointerGetDatum(resultptr); + } + } else { + /* Pass by reference, but not varlena, so not toasted */ Size realSize; ! char *resultptr; realSize = datumGetSize(value, typByVal, typLen); ! resultptr = (char *) palloc(realSize); ! memcpy(resultptr, DatumGetPointer(value), realSize); ! res = PointerGetDatum(resultptr); } return res; } /*------------------------------------------------------------------------- ! * datumTransfer * ! * Transfer a non-NULL datum into the current memory context. * ! * This is equivalent to datumCopy() except when the datum is a read-write ! 
* pointer to an expanded object. In that case we merely reparent the object ! * into the current context, and return its standard R/W pointer (in case the ! * given one is a transient pointer of shorter lifespan). *------------------------------------------------------------------------- */ ! Datum ! datumTransfer(Datum value, bool typByVal, int typLen) { ! if (!typByVal && typLen == -1 && ! VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(value))) ! value = TransferExpandedObject(value, CurrentMemoryContext); ! else ! value = datumCopy(value, typByVal, typLen); ! return value; } /*------------------------------------------------------------------------- * datumIsEqual diff --git a/src/backend/utils/adt/expandeddatum.c b/src/backend/utils/adt/expandeddatum.c index ...039671b . *** a/src/backend/utils/adt/expandeddatum.c --- b/src/backend/utils/adt/expandeddatum.c *************** *** 0 **** --- 1,163 ---- + /*------------------------------------------------------------------------- + * + * expandeddatum.c + * Support functions for "expanded" value representations. + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * + * IDENTIFICATION + * src/backend/utils/adt/expandeddatum.c + * + *------------------------------------------------------------------------- + */ + #include "postgres.h" + + #include "utils/expandeddatum.h" + #include "utils/memutils.h" + + /* + * DatumGetEOHP + * + * Given a Datum that is an expanded-object reference, extract the pointer. + * + * This is a bit tedious since the pointer may not be properly aligned; + * compare VARATT_EXTERNAL_GET_POINTER(). 
+ */ + ExpandedObjectHeader * + DatumGetEOHP(Datum d) + { + varattrib_1b_e *datum = (varattrib_1b_e *) DatumGetPointer(d); + varatt_expanded ptr; + + Assert(VARATT_IS_EXTERNAL_EXPANDED(datum)); + memcpy(&ptr, VARDATA_EXTERNAL(datum), sizeof(ptr)); + Assert(VARATT_IS_EXPANDED_HEADER(ptr.eohptr)); + return ptr.eohptr; + } + + /* + * EOH_init_header + * + * Initialize the common header of an expanded object. + * + * The main thing this encapsulates is initializing the TOAST pointers. + */ + void + EOH_init_header(ExpandedObjectHeader *eohptr, + const ExpandedObjectMethods *methods, + MemoryContext obj_context) + { + varatt_expanded ptr; + + eohptr->vl_len_ = EOH_HEADER_MAGIC; + eohptr->eoh_methods = methods; + eohptr->eoh_context = obj_context; + + ptr.eohptr = eohptr; + + SET_VARTAG_EXTERNAL(eohptr->eoh_rw_ptr, VARTAG_EXPANDED_RW); + memcpy(VARDATA_EXTERNAL(eohptr->eoh_rw_ptr), &ptr, sizeof(ptr)); + + SET_VARTAG_EXTERNAL(eohptr->eoh_ro_ptr, VARTAG_EXPANDED_RO); + memcpy(VARDATA_EXTERNAL(eohptr->eoh_ro_ptr), &ptr, sizeof(ptr)); + } + + /* + * EOH_get_flat_size + * EOH_flatten_into + * + * Convenience functions for invoking the "methods" of an expanded object. + */ + + Size + EOH_get_flat_size(ExpandedObjectHeader *eohptr) + { + return (*eohptr->eoh_methods->get_flat_size) (eohptr); + } + + void + EOH_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size) + { + (*eohptr->eoh_methods->flatten_into) (eohptr, result, allocated_size); + } + + /* + * Does the Datum represent a writable expanded object? + */ + bool + DatumIsReadWriteExpandedObject(Datum d, bool isnull, int16 typlen) + { + /* Reject if it's NULL or not a varlena type */ + if (isnull || typlen != -1) + return false; + + /* Reject if not a read-write expanded-object pointer */ + if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + return false; + + return true; + } + + /* + * If the Datum represents a R/W expanded object, change it to R/O. + * Otherwise return the original Datum. 
+ */ + Datum + MakeExpandedObjectReadOnly(Datum d, bool isnull, int16 typlen) + { + ExpandedObjectHeader *eohptr; + + /* Nothing to do if it's NULL or not a varlena type */ + if (isnull || typlen != -1) + return d; + + /* Nothing to do if not a read-write expanded-object pointer */ + if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + return d; + + /* Now safe to extract the object pointer */ + eohptr = DatumGetEOHP(d); + + /* Return the built-in read-only pointer instead of given pointer */ + return EOHPGetRODatum(eohptr); + } + + /* + * Transfer ownership of an expanded object to a new parent memory context. + * The object must be referenced by a R/W pointer, and what we return is + * always its "standard" R/W pointer, which is certain to have the same + * lifespan as the object itself. (The passed-in pointer might not, and + * in any case wouldn't provide a unique identifier if it's not that one.) + */ + Datum + TransferExpandedObject(Datum d, MemoryContext new_parent) + { + ExpandedObjectHeader *eohptr = DatumGetEOHP(d); + + /* Assert caller gave a R/W pointer */ + Assert(VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))); + + /* Transfer ownership */ + MemoryContextSetParent(eohptr->eoh_context, new_parent); + + /* Return the object's standard read-write pointer */ + return EOHPGetRWDatum(eohptr); + } + + /* + * Delete an expanded object (must be referenced by a R/W pointer). 
+ */ + void + DeleteExpandedObject(Datum d) + { + ExpandedObjectHeader *eohptr = DatumGetEOHP(d); + + /* Assert caller gave a R/W pointer */ + Assert(VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))); + + /* Kill it */ + MemoryContextDelete(eohptr->eoh_context); + } diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c index 202bc78..4b24066 100644 *** a/src/backend/utils/mmgr/mcxt.c --- b/src/backend/utils/mmgr/mcxt.c *************** MemoryContextSetParent(MemoryContext con *** 266,271 **** --- 266,275 ---- AssertArg(MemoryContextIsValid(context)); AssertArg(context != new_parent); + /* Fast path if it's got correct parent already */ + if (new_parent == context->parent) + return; + /* Delink from existing parent, if any */ if (context->parent) { diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h index 40fde83..a98a7af 100644 *** a/src/include/executor/executor.h --- b/src/include/executor/executor.h *************** extern void FreeExprContext(ExprContext *** 312,318 **** extern void ReScanExprContext(ExprContext *econtext); #define ResetExprContext(econtext) \ ! MemoryContextReset((econtext)->ecxt_per_tuple_memory) extern ExprContext *MakePerTupleExprContext(EState *estate); --- 312,318 ---- extern void ReScanExprContext(ExprContext *econtext); #define ResetExprContext(econtext) \ ! 
MemoryContextResetAndDeleteChildren((econtext)->ecxt_per_tuple_memory) extern ExprContext *MakePerTupleExprContext(EState *estate); diff --git a/src/include/executor/spi.h b/src/include/executor/spi.h index 9e912ba..fbcae0c 100644 *** a/src/include/executor/spi.h --- b/src/include/executor/spi.h *************** extern char *SPI_getnspname(Relation rel *** 124,129 **** --- 124,130 ---- extern void *SPI_palloc(Size size); extern void *SPI_repalloc(void *pointer, Size size); extern void SPI_pfree(void *pointer); + extern Datum SPI_datumTransfer(Datum value, bool typByVal, int typLen); extern void SPI_freetuple(HeapTuple pointer); extern void SPI_freetuptable(SPITupleTable *tuptable); diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h index 48f84bf..00686b0 100644 *** a/src/include/executor/tuptable.h --- b/src/include/executor/tuptable.h *************** extern Datum ExecFetchSlotTupleDatum(Tup *** 163,168 **** --- 163,169 ---- extern HeapTuple ExecMaterializeSlot(TupleTableSlot *slot); extern TupleTableSlot *ExecCopySlot(TupleTableSlot *dstslot, TupleTableSlot *srcslot); + extern TupleTableSlot *ExecMakeSlotContentsReadOnly(TupleTableSlot *slot); /* in access/common/heaptuple.c */ extern Datum slot_getattr(TupleTableSlot *slot, int attnum, bool *isnull); diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h index dbc5a35..4f5cc8b 100644 *** a/src/include/nodes/primnodes.h --- b/src/include/nodes/primnodes.h *************** typedef struct WindowFunc *** 310,315 **** --- 310,319 ---- * Note: the result datatype is the element type when fetching a single * element; but it is the array type when doing subarray fetch or either * type of store. + * + * Note: for the cases where an array is returned, if refexpr yields a R/W + * expanded array, then the implementation is allowed to modify that object + * in-place and return the same object.
* ---------------- */ typedef struct ArrayRef diff --git a/src/include/postgres.h b/src/include/postgres.h index cbb7f79..08713bc 100644 *** a/src/include/postgres.h --- b/src/include/postgres.h *************** typedef struct varatt_indirect *** 88,93 **** --- 88,110 ---- } varatt_indirect; /* + * struct varatt_expanded is a "TOAST pointer" representing an out-of-line + * Datum that is stored in memory, in some type-specific, not necessarily + * physically contiguous format that is convenient for computation not + * storage. APIs for this, in particular the definition of struct + * ExpandedObjectHeader, are in src/include/utils/expandeddatum.h. + * + * Note that just as for struct varatt_external, this struct is stored + * unaligned within any containing tuple. + */ + typedef struct ExpandedObjectHeader ExpandedObjectHeader; + + typedef struct varatt_expanded + { + ExpandedObjectHeader *eohptr; + } varatt_expanded; + + /* * Type tag for the various sorts of "TOAST pointer" datums. The peculiar * value for VARTAG_ONDISK comes from a requirement for on-disk compatibility * with a previous notion that the tag field was the pointer datum's length. *************** typedef struct varatt_indirect *** 95,105 **** --- 112,129 ---- typedef enum vartag_external { VARTAG_INDIRECT = 1, + VARTAG_EXPANDED_RO = 2, + VARTAG_EXPANDED_RW = 3, VARTAG_ONDISK = 18 } vartag_external; + /* this test relies on the specific tag values above */ + #define VARTAG_IS_EXPANDED(tag) \ + (((tag) & ~1) == VARTAG_EXPANDED_RO) + #define VARTAG_SIZE(tag) \ ((tag) == VARTAG_INDIRECT ? sizeof(varatt_indirect) : \ + VARTAG_IS_EXPANDED(tag) ? sizeof(varatt_expanded) : \ (tag) == VARTAG_ONDISK ? 
sizeof(varatt_external) : \ TrapMacro(true, "unrecognized TOAST vartag")) *************** typedef struct *** 294,299 **** --- 318,329 ---- (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_ONDISK) #define VARATT_IS_EXTERNAL_INDIRECT(PTR) \ (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_INDIRECT) + #define VARATT_IS_EXTERNAL_EXPANDED_RO(PTR) \ + (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_EXPANDED_RO) + #define VARATT_IS_EXTERNAL_EXPANDED_RW(PTR) \ + (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_EXPANDED_RW) + #define VARATT_IS_EXTERNAL_EXPANDED(PTR) \ + (VARATT_IS_EXTERNAL(PTR) && VARTAG_IS_EXPANDED(VARTAG_EXTERNAL(PTR))) #define VARATT_IS_SHORT(PTR) VARATT_IS_1B(PTR) #define VARATT_IS_EXTENDED(PTR) (!VARATT_IS_4B_U(PTR)) diff --git a/src/include/utils/array.h b/src/include/utils/array.h index 649688c..e4e3a3d 100644 *** a/src/include/utils/array.h --- b/src/include/utils/array.h *************** *** 45,50 **** --- 45,55 ---- * We support subscripting on these types, but array_in() and array_out() * only work with varlena arrays. * + * In addition, arrays are a major user of the "expanded object" TOAST + * infrastructure. This allows a varlena array to be converted to a + * separate representation that may include "deconstructed" Datum/isnull + * arrays holding the elements. + * * * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California *************** *** 57,62 **** --- 62,69 ---- #define ARRAY_H #include "fmgr.h" + #include "utils/expandeddatum.h" + /* * Arrays are varlena objects, so must meet the varlena convention that *************** typedef struct *** 75,80 **** --- 82,167 ---- } ArrayType; /* + * An expanded array is contained within a private memory context (as + * all expanded objects must be) and has a control structure as below. 
+ * + * The expanded array might contain a regular "flat" array if that was the + * original input and we've not modified it significantly. Otherwise, the + * contents are represented by Datum/isnull arrays plus dimensionality and + * type information. We could also have both forms, if we've deconstructed + * the original array for access purposes but not yet changed it. For pass- + * by-reference element types, the Datums would point into the flat array in + * this situation. Once we start modifying array elements, new pass-by-ref + * elements are separately palloc'd within the memory context. + */ + #define EA_MAGIC 689375833 /* ID for debugging crosschecks */ + + typedef struct ExpandedArrayHeader + { + /* Standard header for expanded objects */ + ExpandedObjectHeader hdr; + + /* Magic value identifying an expanded array (for debugging only) */ + int ea_magic; + + /* Dimensionality info (always valid) */ + int ndims; /* # of dimensions */ + int *dims; /* array dimensions */ + int *lbound; /* index lower bounds for each dimension */ + + /* Element type info (always valid) */ + Oid element_type; /* element type OID */ + int16 typlen; /* needed info about element datatype */ + bool typbyval; + char typalign; + + /* + * If we have a Datum-array representation of the array, it's kept here; + * else dvalues/dnulls are NULL. The dvalues and dnulls arrays are always + * palloc'd within the object private context, but may change size from + * time to time. For pass-by-ref element types, dvalues entries might + * point either into the fstartptr..fendptr area, or to separately + * palloc'd chunks. Elements should always be fully detoasted, as they + * are in the standard flat representation. + * + * Even when dvalues is valid, dnulls can be NULL if there are no null + * elements. 
+ */ + Datum *dvalues; /* array of Datums */ + bool *dnulls; /* array of is-null flags for Datums */ + int dvalueslen; /* allocated length of above arrays */ + int nelems; /* number of valid entries in above arrays */ + + /* + * flat_size is the current space requirement for the flat equivalent of + * the expanded array, if known; otherwise it's 0. We store this to make + * consecutive calls of get_flat_size cheap. + */ + Size flat_size; + + /* + * fvalue points to the flat representation if it is valid, else it is + * NULL. If we have or ever had a flat representation then + * fstartptr/fendptr point to the start and end+1 of its data area; this + * is so that we can tell which Datum pointers point into the flat + * representation rather than being pointers to separately palloc'd data. + */ + ArrayType *fvalue; /* must be a fully detoasted array */ + char *fstartptr; /* start of its data area */ + char *fendptr; /* end+1 of its data area */ + } ExpandedArrayHeader; + + /* + * Functions that can handle either a "flat" varlena array or an expanded + * array use this union to work with their input. + */ + typedef union AnyArrayType + { + ArrayType flt; + ExpandedArrayHeader xpn; + } AnyArrayType; + + /* * working state for accumArrayResult() and friends * note that the input must be scalars (legal array elements) */ *************** typedef struct ArrayMapState *** 151,167 **** /* ArrayIteratorData is private in arrayfuncs.c */ typedef struct ArrayIteratorData *ArrayIterator; ! /* ! * fmgr macros for array objects ! */ #define DatumGetArrayTypeP(X) ((ArrayType *) PG_DETOAST_DATUM(X)) #define DatumGetArrayTypePCopy(X) ((ArrayType *) PG_DETOAST_DATUM_COPY(X)) #define PG_GETARG_ARRAYTYPE_P(n) DatumGetArrayTypeP(PG_GETARG_DATUM(n)) #define PG_GETARG_ARRAYTYPE_P_COPY(n) DatumGetArrayTypePCopy(PG_GETARG_DATUM(n)) #define PG_RETURN_ARRAYTYPE_P(x) PG_RETURN_POINTER(x) /* ! * Access macros for array header fields. 
* * ARR_DIMS returns a pointer to an array of array dimensions (number of * elements along the various array axes). --- 238,261 ---- /* ArrayIteratorData is private in arrayfuncs.c */ typedef struct ArrayIteratorData *ArrayIterator; ! /* fmgr macros for regular varlena array objects */ #define DatumGetArrayTypeP(X) ((ArrayType *) PG_DETOAST_DATUM(X)) #define DatumGetArrayTypePCopy(X) ((ArrayType *) PG_DETOAST_DATUM_COPY(X)) #define PG_GETARG_ARRAYTYPE_P(n) DatumGetArrayTypeP(PG_GETARG_DATUM(n)) #define PG_GETARG_ARRAYTYPE_P_COPY(n) DatumGetArrayTypePCopy(PG_GETARG_DATUM(n)) #define PG_RETURN_ARRAYTYPE_P(x) PG_RETURN_POINTER(x) + /* fmgr macros for expanded array objects */ + #define PG_GETARG_EXPANDED_ARRAY(n) DatumGetExpandedArray(PG_GETARG_DATUM(n)) + #define PG_GETARG_EXPANDED_ARRAYX(n, metacache) \ + DatumGetExpandedArrayX(PG_GETARG_DATUM(n), metacache) + #define PG_RETURN_EXPANDED_ARRAY(x) PG_RETURN_DATUM(EOHPGetRWDatum(&(x)->hdr)) + + /* fmgr macros for AnyArrayType (ie, get either varlena or expanded form) */ + #define PG_GETARG_ANY_ARRAY(n) DatumGetAnyArray(PG_GETARG_DATUM(n)) + /* ! * Access macros for varlena array header fields. * * ARR_DIMS returns a pointer to an array of array dimensions (number of * elements along the various array axes). *************** typedef struct ArrayIteratorData *ArrayI *** 209,214 **** --- 303,404 ---- #define ARR_DATA_PTR(a) \ (((char *) (a)) + ARR_DATA_OFFSET(a)) + /* + * Macros for working with AnyArrayType inputs. Beware multiple references! + */ + #define AARR_NDIM(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.ndims : ARR_NDIM(&(a)->flt)) + #define AARR_HASNULL(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? \ + ((a)->xpn.dvalues != NULL ? (a)->xpn.dnulls != NULL : ARR_HASNULL((a)->xpn.fvalue)) : \ + ARR_HASNULL(&(a)->flt)) + #define AARR_ELEMTYPE(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.element_type : ARR_ELEMTYPE(&(a)->flt)) + #define AARR_DIMS(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? 
(a)->xpn.dims : ARR_DIMS(&(a)->flt)) + #define AARR_LBOUND(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.lbound : ARR_LBOUND(&(a)->flt)) + + /* + * Macros for iterating through elements of a flat or expanded array. + * Use "ARRAY_ITER ARRAY_ITER_VARS(name);" to declare the local variables + * needed for an iterator (more than one set can be used in the same function, + * if they have different names). + * Use "ARRAY_ITER_SETUP(name, arrayptr);" to prepare to iterate, and + * "ARRAY_ITER_NEXT(name, index, datumvar, isnullvar, ...);" to fetch the + * next element into datumvar/isnullvar. "index" must be the zero-origin + * element number; we make caller provide this since caller is generally + * counting the elements anyway. + */ + #define ARRAY_ITER /* dummy type name to keep pgindent happy */ + + #define ARRAY_ITER_VARS(iter) \ + Datum *iter##datumptr; \ + bool *iter##isnullptr; \ + char *iter##dataptr; \ + bits8 *iter##bitmapptr; \ + int iter##bitmask + + #define ARRAY_ITER_SETUP(iter, arrayptr) \ + do { \ + if (VARATT_IS_EXPANDED_HEADER(arrayptr)) \ + { \ + if ((arrayptr)->xpn.dvalues) \ + { \ + (iter##datumptr) = (arrayptr)->xpn.dvalues; \ + (iter##isnullptr) = (arrayptr)->xpn.dnulls; \ + (iter##dataptr) = NULL; \ + (iter##bitmapptr) = NULL; \ + } \ + else \ + { \ + (iter##datumptr) = NULL; \ + (iter##isnullptr) = NULL; \ + (iter##dataptr) = ARR_DATA_PTR((arrayptr)->xpn.fvalue); \ + (iter##bitmapptr) = ARR_NULLBITMAP((arrayptr)->xpn.fvalue); \ + } \ + } \ + else \ + { \ + (iter##datumptr) = NULL; \ + (iter##isnullptr) = NULL; \ + (iter##dataptr) = ARR_DATA_PTR(&(arrayptr)->flt); \ + (iter##bitmapptr) = ARR_NULLBITMAP(&(arrayptr)->flt); \ + } \ + (iter##bitmask) = 1; \ + } while (0) + + #define ARRAY_ITER_NEXT(iter,i, datumvar,isnullvar, elmlen,elmbyval,elmalign) \ + do { \ + if (iter##datumptr) \ + { \ + (datumvar) = (iter##datumptr)[i]; \ + (isnullvar) = (iter##isnullptr) ? 
(iter##isnullptr)[i] : false; \ + } \ + else \ + { \ + if ((iter##bitmapptr) && (*(iter##bitmapptr) & (iter##bitmask)) == 0) \ + { \ + (isnullvar) = true; \ + (datumvar) = (Datum) 0; \ + } \ + else \ + { \ + (isnullvar) = false; \ + (datumvar) = fetch_att(iter##dataptr, elmbyval, elmlen); \ + (iter##dataptr) = att_addlength_pointer(iter##dataptr, elmlen, iter##dataptr); \ + (iter##dataptr) = (char *) att_align_nominal(iter##dataptr, elmalign); \ + } \ + (iter##bitmask) <<= 1; \ + if ((iter##bitmask) == 0x100) \ + { \ + if (iter##bitmapptr) \ + (iter##bitmapptr)++; \ + (iter##bitmask) = 1; \ + } \ + } \ + } while (0) + /* * GUC parameter *************** extern Datum array_remove(PG_FUNCTION_AR *** 250,255 **** --- 440,454 ---- extern Datum array_replace(PG_FUNCTION_ARGS); extern Datum width_bucket_array(PG_FUNCTION_ARGS); + extern void CopyArrayEls(ArrayType *array, + Datum *values, + bool *nulls, + int nitems, + int typlen, + bool typbyval, + char typalign, + bool freedata); + extern Datum array_get_element(Datum arraydatum, int nSubscripts, int *indx, int arraytyplen, int elmlen, bool elmbyval, char elmalign, bool *isNull); *************** extern ArrayType *array_set(ArrayType *a *** 271,277 **** Datum dataValue, bool isNull, int arraytyplen, int elmlen, bool elmbyval, char elmalign); ! extern Datum array_map(FunctionCallInfo fcinfo, Oid inpType, Oid retType, ArrayMapState *amstate); extern void array_bitmap_copy(bits8 *destbitmap, int destoffset, --- 470,476 ---- Datum dataValue, bool isNull, int arraytyplen, int elmlen, bool elmbyval, char elmalign); ! 
extern Datum array_map(FunctionCallInfo fcinfo, Oid retType, ArrayMapState *amstate); extern void array_bitmap_copy(bits8 *destbitmap, int destoffset, *************** extern ArrayType *construct_md_array(Dat *** 288,293 **** --- 487,495 ---- int *lbs, Oid elmtype, int elmlen, bool elmbyval, char elmalign); extern ArrayType *construct_empty_array(Oid elmtype); + extern ExpandedArrayHeader *construct_empty_expanded_array(Oid element_type, + MemoryContext parentcontext, + ArrayMetaState *metacache); extern void deconstruct_array(ArrayType *array, Oid elmtype, int elmlen, bool elmbyval, char elmalign, *************** extern int mda_next_tuple(int n, int *cu *** 341,346 **** --- 543,559 ---- extern int32 *ArrayGetIntegerTypmods(ArrayType *arr, int *n); /* + * prototypes for functions defined in array_expanded.c + */ + extern Datum expand_array(Datum arraydatum, MemoryContext parentcontext, + ArrayMetaState *metacache); + extern ExpandedArrayHeader *DatumGetExpandedArray(Datum d); + extern ExpandedArrayHeader *DatumGetExpandedArrayX(Datum d, + ArrayMetaState *metacache); + extern AnyArrayType *DatumGetAnyArray(Datum d); + extern void deconstruct_expanded_array(ExpandedArrayHeader *eah); + + /* * prototypes for functions defined in array_userfuncs.c */ extern Datum array_append(PG_FUNCTION_ARGS); diff --git a/src/include/utils/datum.h b/src/include/utils/datum.h index 663414b..c572f79 100644 *** a/src/include/utils/datum.h --- b/src/include/utils/datum.h *************** *** 24,41 **** extern Size datumGetSize(Datum value, bool typByVal, int typLen); /* ! * datumCopy - make a copy of a datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). */ extern Datum datumCopy(Datum value, bool typByVal, int typLen); /* ! * datumFree - free a datum previously allocated by datumCopy, if any. * ! * Does nothing if datatype is pass-by-value. */ ! 
extern void datumFree(Datum value, bool typByVal, int typLen); /* * datumIsEqual --- 24,41 ---- extern Size datumGetSize(Datum value, bool typByVal, int typLen); /* ! * datumCopy - make a copy of a non-NULL datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). */ extern Datum datumCopy(Datum value, bool typByVal, int typLen); /* ! * datumTransfer - transfer a non-NULL datum into the current memory context. * ! * Differs from datumCopy() in its handling of read-write expanded objects. */ ! extern Datum datumTransfer(Datum value, bool typByVal, int typLen); /* * datumIsEqual diff --git a/src/include/utils/expandeddatum.h b/src/include/utils/expandeddatum.h index ...3a8336e . *** a/src/include/utils/expandeddatum.h --- b/src/include/utils/expandeddatum.h *************** *** 0 **** --- 1,148 ---- + /*------------------------------------------------------------------------- + * + * expandeddatum.h + * Declarations for access to "expanded" value representations. + * + * Complex data types, particularly container types such as arrays and + * records, usually have on-disk representations that are compact but not + * especially convenient to modify. What's more, when we do modify them, + * having to recopy all the rest of the value can be extremely inefficient. + * Therefore, we provide a notion of an "expanded" representation that is used + * only in memory and is optimized more for computation than storage. + * The format appearing on disk is called the data type's "flattened" + * representation, since it is required to be a contiguous blob of bytes -- + * but the type can have an expanded representation that is not. Data types + * must provide means to translate an expanded representation back to + * flattened form. + * + * An expanded object is meant to survive across multiple operations, but + * not to be enormously long-lived; for example it might be a local variable + * in a PL/pgSQL procedure. 
So its extra bulk compared to the on-disk format + * is a worthwhile trade-off. + * + * References to expanded objects are a type of TOAST pointer. + * Because of longstanding conventions in Postgres, this means that the + * flattened form of such an object must always be a varlena object. + * Fortunately that's no restriction in practice. + * + * There are actually two kinds of TOAST pointers for expanded objects: + * read-only and read-write pointers. Possession of one of the latter + * authorizes a function to modify the value in-place rather than copying it + * as would normally be required. Functions should always return a read-write + * pointer to any new expanded object they create. Functions that modify an + * argument value in-place must take care that they do not corrupt the old + * value if they fail partway through. + * + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * src/include/utils/expandeddatum.h + * + *------------------------------------------------------------------------- + */ + #ifndef EXPANDEDDATUM_H + #define EXPANDEDDATUM_H + + /* Size of an EXTERNAL datum that contains a pointer to an expanded object */ + #define EXPANDED_POINTER_SIZE (VARHDRSZ_EXTERNAL + sizeof(varatt_expanded)) + + /* + * "Methods" that must be provided for any expanded object. + * + * get_flat_size: compute space needed for flattened representation (which + * must be a valid in-line, non-compressed, 4-byte-header varlena object). + * + * flatten_into: construct flattened representation in the caller-allocated + * space at *result, of size allocated_size (which will always be the result + * of a preceding get_flat_size call; it's passed for cross-checking). + * + * Note: construction of a heap tuple from an expanded datum calls + * get_flat_size twice, so it's worthwhile to make sure that that doesn't + * incur too much overhead. 
+ */ + typedef Size (*EOM_get_flat_size_method) (ExpandedObjectHeader *eohptr); + typedef void (*EOM_flatten_into_method) (ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + + /* Struct of function pointers for an expanded object's methods */ + typedef struct ExpandedObjectMethods + { + EOM_get_flat_size_method get_flat_size; + EOM_flatten_into_method flatten_into; + } ExpandedObjectMethods; + + /* + * Every expanded object must contain this header; typically the header + * is embedded in some larger struct that adds type-specific fields. + * + * It is presumed that the header object and all subsidiary data are stored + * in eoh_context, so that the object can be freed by deleting that context, + * or its storage lifespan can be altered by reparenting the context. + * (In principle the object could own additional resources, such as malloc'd + * storage, and use a memory context reset callback to free them upon reset or + * deletion of eoh_context.) + * + * We set up two TOAST pointers within the standard header, one read-write + * and one read-only. This allows functions to return either kind of pointer + * without making an additional allocation, and in particular without worrying + * whether a separately palloc'd object would have sufficient lifespan. + * But note that these pointers are just a convenience; a pointer object + * appearing somewhere else would still be legal. + * + * The typedef declaration for this appears in postgres.h. 
+ */ + struct ExpandedObjectHeader + { + /* Phony varlena header */ + int32 vl_len_; /* always EOH_HEADER_MAGIC, see below */ + + /* Pointer to methods required for object type */ + const ExpandedObjectMethods *eoh_methods; + + /* Memory context containing this header and subsidiary data */ + MemoryContext eoh_context; + + /* Standard R/W TOAST pointer for this object is kept here */ + char eoh_rw_ptr[EXPANDED_POINTER_SIZE]; + + /* Standard R/O TOAST pointer for this object is kept here */ + char eoh_ro_ptr[EXPANDED_POINTER_SIZE]; + }; + + /* + * Particularly for read-only functions, it is handy to be able to work with + * either regular "flat" varlena inputs or expanded inputs of the same data + * type. To allow determining which case an argument-fetching function has + * returned, the first int32 of an ExpandedObjectHeader always contains -1 + * (EOH_HEADER_MAGIC to the code). This works since no 4-byte-header varlena + * could have that as its first 4 bytes. Caution: we could not reliably tell + * the difference between an ExpandedObjectHeader and a short-header object + * with this trick. However, it works fine if the argument fetching code + * always returns either a 4-byte-header flat object or an expanded object. + */ + #define EOH_HEADER_MAGIC (-1) + #define VARATT_IS_EXPANDED_HEADER(PTR) \ + (((ExpandedObjectHeader *) (PTR))->vl_len_ == EOH_HEADER_MAGIC) + + /* + * Generic support functions for expanded objects. + * (More of these might be worth inlining later.) 
+ */ + + #define EOHPGetRWDatum(eohptr) PointerGetDatum((eohptr)->eoh_rw_ptr) + #define EOHPGetRODatum(eohptr) PointerGetDatum((eohptr)->eoh_ro_ptr) + + extern ExpandedObjectHeader *DatumGetEOHP(Datum d); + extern void EOH_init_header(ExpandedObjectHeader *eohptr, + const ExpandedObjectMethods *methods, + MemoryContext obj_context); + extern Size EOH_get_flat_size(ExpandedObjectHeader *eohptr); + extern void EOH_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + extern bool DatumIsReadWriteExpandedObject(Datum d, bool isnull, int16 typlen); + extern Datum MakeExpandedObjectReadOnly(Datum d, bool isnull, int16 typlen); + extern Datum TransferExpandedObject(Datum d, MemoryContext new_parent); + extern void DeleteExpandedObject(Datum d); + + #endif /* EXPANDEDDATUM_H */ diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c index f364ce4..d021145 100644 *** a/src/pl/plpgsql/src/pl_comp.c --- b/src/pl/plpgsql/src/pl_comp.c *************** build_datatype(HeapTuple typeTup, int32 *** 2202,2207 **** --- 2202,2223 ---- typ->typbyval = typeStruct->typbyval; typ->typrelid = typeStruct->typrelid; typ->typioparam = getTypeIOParam(typeTup); + /* Detect if type is true array, or domain thereof */ + /* NB: this is only used to decide whether to apply expand_array */ + if (typeStruct->typtype == TYPTYPE_BASE) + { + /* this test should match what get_element_type() checks */ + typ->typisarray = (typeStruct->typlen == -1 && + OidIsValid(typeStruct->typelem)); + } + else if (typeStruct->typtype == TYPTYPE_DOMAIN) + { + /* we can short-circuit looking up base types if it's not varlena */ + typ->typisarray = (typeStruct->typlen == -1 && + OidIsValid(get_base_element_type(typeStruct->typbasetype))); + } + else + typ->typisarray = false; typ->collation = typeStruct->typcollation; if (OidIsValid(collation) && OidIsValid(typ->collation)) typ->collation = collation; diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c 
index edcb230..c242eea 100644 *** a/src/pl/plpgsql/src/pl_exec.c --- b/src/pl/plpgsql/src/pl_exec.c *************** static bool exec_eval_simple_expr(PLpgSQ *** 158,164 **** PLpgSQL_expr *expr, Datum *result, bool *isNull, ! Oid *rettype); static void exec_assign_expr(PLpgSQL_execstate *estate, PLpgSQL_datum *target, --- 158,165 ---- PLpgSQL_expr *expr, Datum *result, bool *isNull, ! Oid *rettype, ! int32 *rettypmod); static void exec_assign_expr(PLpgSQL_execstate *estate, PLpgSQL_datum *target, *************** static void exec_assign_c_string(PLpgSQL *** 168,176 **** const char *str); static void exec_assign_value(PLpgSQL_execstate *estate, PLpgSQL_datum *target, ! Datum value, Oid valtype, bool *isNull); static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, Oid *typeid, int32 *typetypmod, Datum *value, --- 169,179 ---- const char *str); static void exec_assign_value(PLpgSQL_execstate *estate, PLpgSQL_datum *target, ! Datum value, Oid valtype, int32 valtypmod, ! bool *isNull); static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, + bool getrwpointer, Oid *typeid, int32 *typetypmod, Datum *value, *************** static bool exec_eval_boolean(PLpgSQL_ex *** 184,190 **** static Datum exec_eval_expr(PLpgSQL_execstate *estate, PLpgSQL_expr *expr, bool *isNull, ! Oid *rettype); static int exec_run_select(PLpgSQL_execstate *estate, PLpgSQL_expr *expr, long maxtuples, Portal *portalP); static int exec_for_query(PLpgSQL_execstate *estate, PLpgSQL_stmt_forq *stmt, --- 187,194 ---- static Datum exec_eval_expr(PLpgSQL_execstate *estate, PLpgSQL_expr *expr, bool *isNull, ! Oid *rettype, ! 
int32 *rettypmod); static int exec_run_select(PLpgSQL_execstate *estate, PLpgSQL_expr *expr, long maxtuples, Portal *portalP); static int exec_for_query(PLpgSQL_execstate *estate, PLpgSQL_stmt_forq *stmt, *************** static void exec_move_row_from_datum(PLp *** 208,221 **** static char *convert_value_to_string(PLpgSQL_execstate *estate, Datum value, Oid valtype); static Datum exec_cast_value(PLpgSQL_execstate *estate, ! Datum value, Oid valtype, Oid reqtype, FmgrInfo *reqinput, Oid reqtypioparam, int32 reqtypmod, bool isnull); static Datum exec_simple_cast_value(PLpgSQL_execstate *estate, ! Datum value, Oid valtype, Oid reqtype, int32 reqtypmod, bool isnull); static void exec_init_tuple_store(PLpgSQL_execstate *estate); --- 212,225 ---- static char *convert_value_to_string(PLpgSQL_execstate *estate, Datum value, Oid valtype); static Datum exec_cast_value(PLpgSQL_execstate *estate, ! Datum value, Oid valtype, int32 valtypmod, Oid reqtype, FmgrInfo *reqinput, Oid reqtypioparam, int32 reqtypmod, bool isnull); static Datum exec_simple_cast_value(PLpgSQL_execstate *estate, ! Datum value, Oid valtype, int32 valtypmod, Oid reqtype, int32 reqtypmod, bool isnull); static void exec_init_tuple_store(PLpgSQL_execstate *estate); *************** plpgsql_exec_function(PLpgSQL_function * *** 295,300 **** --- 299,342 ---- var->value = fcinfo->arg[i]; var->isnull = fcinfo->argnull[i]; var->freeval = false; + + /* + * Force any array-valued parameter to be stored in + * expanded form in our local variable, in hopes of + * improving efficiency of uses of the variable. (This is + * a hack, really: why only arrays? Need more thought + * about which cases are likely to win. See also + * typisarray-specific heuristic in exec_assign_value.) + * + * Special cases: If passed a R/W expanded pointer, assume + * we can commandeer the object rather than having to copy + * it. If passed a R/O expanded pointer, just keep it as + * the value of the variable for the moment. 
(We'll force + * it to R/W if the variable gets modified, but that may + * very well never happen.) + */ + if (!var->isnull && var->datatype->typisarray) + { + if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(var->value))) + { + /* take ownership of R/W object */ + var->value = TransferExpandedObject(var->value, + CurrentMemoryContext); + var->freeval = true; + } + else if (VARATT_IS_EXTERNAL_EXPANDED_RO(DatumGetPointer(var->value))) + { + /* R/O pointer, keep it as-is until assigned to */ + } + else + { + /* flat array, so force to expanded form */ + var->value = expand_array(var->value, + CurrentMemoryContext, + NULL); + var->freeval = true; + } + } } break; *************** plpgsql_exec_function(PLpgSQL_function * *** 453,458 **** --- 495,501 ---- estate.retval = exec_cast_value(&estate, estate.retval, estate.rettype, + -1, func->fn_rettype, &(func->fn_retinput), func->fn_rettypioparam, *************** plpgsql_exec_function(PLpgSQL_function * *** 461,478 **** /* * If the function's return type isn't by value, copy the value ! * into upper executor memory context. */ if (!fcinfo->isnull && !func->fn_retbyval) ! { ! Size len; ! void *tmp; ! ! len = datumGetSize(estate.retval, false, func->fn_rettyplen); ! tmp = SPI_palloc(len); ! memcpy(tmp, DatumGetPointer(estate.retval), len); ! estate.retval = PointerGetDatum(tmp); ! } } } --- 504,517 ---- /* * If the function's return type isn't by value, copy the value ! * into upper executor memory context. However, if we have a R/W ! * expanded datum, we can just transfer its ownership out to the ! * upper executor context. */ if (!fcinfo->isnull && !func->fn_retbyval) ! estate.retval = SPI_datumTransfer(estate.retval, ! false, ! 
func->fn_rettyplen); } } *************** exec_stmt_block(PLpgSQL_execstate *estat *** 1084,1089 **** --- 1123,1129 ---- (PLpgSQL_datum *) var, (Datum) 0, UNKNOWNOID, + -1, &valIsNull); } if (var->notnull) *************** exec_stmt_getdiag(PLpgSQL_execstate *est *** 1563,1575 **** case PLPGSQL_GETDIAG_ROW_COUNT: exec_assign_value(estate, var, UInt32GetDatum(estate->eval_processed), ! INT4OID, &isnull); break; case PLPGSQL_GETDIAG_RESULT_OID: exec_assign_value(estate, var, ObjectIdGetDatum(estate->eval_lastoid), ! OIDOID, &isnull); break; case PLPGSQL_GETDIAG_ERROR_CONTEXT: --- 1603,1615 ---- case PLPGSQL_GETDIAG_ROW_COUNT: exec_assign_value(estate, var, UInt32GetDatum(estate->eval_processed), ! INT4OID, -1, &isnull); break; case PLPGSQL_GETDIAG_RESULT_OID: exec_assign_value(estate, var, ObjectIdGetDatum(estate->eval_lastoid), ! OIDOID, -1, &isnull); break; case PLPGSQL_GETDIAG_ERROR_CONTEXT: *************** exec_stmt_case(PLpgSQL_execstate *estate *** 1688,1696 **** { /* simple case */ Datum t_val; ! Oid t_oid; ! t_val = exec_eval_expr(estate, stmt->t_expr, &isnull, &t_oid); t_var = (PLpgSQL_var *) estate->datums[stmt->t_varno]; --- 1728,1738 ---- { /* simple case */ Datum t_val; ! Oid t_typoid; ! int32 t_typmod; ! t_val = exec_eval_expr(estate, stmt->t_expr, ! &isnull, &t_typoid, &t_typmod); t_var = (PLpgSQL_var *) estate->datums[stmt->t_varno]; *************** exec_stmt_case(PLpgSQL_execstate *estate *** 1699,1714 **** * what we're modifying here is an execution copy of the datum, so * this doesn't affect the originally stored function parse tree. */ ! if (t_var->datatype->typoid != t_oid) ! t_var->datatype = plpgsql_build_datatype(t_oid, ! -1, estate->func->fn_input_collation); /* now we can assign to the variable */ exec_assign_value(estate, (PLpgSQL_datum *) t_var, t_val, ! 
t_oid, &isnull); exec_eval_cleanup(estate); --- 1741,1758 ---- * what we're modifying here is an execution copy of the datum, so * this doesn't affect the originally stored function parse tree. */ ! if (t_var->datatype->typoid != t_typoid || ! t_var->datatype->atttypmod != t_typmod) ! t_var->datatype = plpgsql_build_datatype(t_typoid, ! t_typmod, estate->func->fn_input_collation); /* now we can assign to the variable */ exec_assign_value(estate, (PLpgSQL_datum *) t_var, t_val, ! t_typoid, ! t_typmod, &isnull); exec_eval_cleanup(estate); *************** exec_stmt_fori(PLpgSQL_execstate *estate *** 1885,1890 **** --- 1929,1935 ---- Datum value; bool isnull; Oid valtype; + int32 valtypmod; int32 loop_value; int32 end_value; int32 step_value; *************** exec_stmt_fori(PLpgSQL_execstate *estate *** 1896,1903 **** /* * Get the value of the lower bound */ ! value = exec_eval_expr(estate, stmt->lower, &isnull, &valtype); ! value = exec_cast_value(estate, value, valtype, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); --- 1941,1950 ---- /* * Get the value of the lower bound */ ! value = exec_eval_expr(estate, stmt->lower, ! &isnull, &valtype, &valtypmod); ! value = exec_cast_value(estate, value, valtype, valtypmod, ! var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); *************** exec_stmt_fori(PLpgSQL_execstate *estate *** 1911,1918 **** /* * Get the value of the upper bound */ ! value = exec_eval_expr(estate, stmt->upper, &isnull, &valtype); ! value = exec_cast_value(estate, value, valtype, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); --- 1958,1967 ---- /* * Get the value of the upper bound */ ! value = exec_eval_expr(estate, stmt->upper, ! &isnull, &valtype, &valtypmod); ! value = exec_cast_value(estate, value, valtype, valtypmod, ! 
var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); *************** exec_stmt_fori(PLpgSQL_execstate *estate *** 1928,1935 **** */ if (stmt->step) { ! value = exec_eval_expr(estate, stmt->step, &isnull, &valtype); ! value = exec_cast_value(estate, value, valtype, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); --- 1977,1986 ---- */ if (stmt->step) { ! value = exec_eval_expr(estate, stmt->step, ! &isnull, &valtype, &valtypmod); ! value = exec_cast_value(estate, value, valtype, valtypmod, ! var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, var->datatype->atttypmod, isnull); *************** exec_stmt_foreach_a(PLpgSQL_execstate *e *** 2227,2243 **** { ArrayType *arr; Oid arrtype; PLpgSQL_datum *loop_var; Oid loop_var_elem_type; bool found = false; int rc = PLPGSQL_RC_OK; ArrayIterator array_iterator; Oid iterator_result_type; Datum value; bool isnull; /* get the value of the array expression */ ! value = exec_eval_expr(estate, stmt->expr, &isnull, &arrtype); if (isnull) ereport(ERROR, (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), --- 2278,2296 ---- { ArrayType *arr; Oid arrtype; + int32 arrtypmod; PLpgSQL_datum *loop_var; Oid loop_var_elem_type; bool found = false; int rc = PLPGSQL_RC_OK; ArrayIterator array_iterator; Oid iterator_result_type; + int32 iterator_result_typmod; Datum value; bool isnull; /* get the value of the array expression */ ! 
value = exec_eval_expr(estate, stmt->expr, &isnull, &arrtype, &arrtypmod); if (isnull) ereport(ERROR, (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), *************** exec_stmt_foreach_a(PLpgSQL_execstate *e *** 2305,2315 **** --- 2358,2370 ---- { /* When slicing, nominal type of result is same as array type */ iterator_result_type = arrtype; + iterator_result_typmod = arrtypmod; } else { /* Without slicing, results are individual array elements */ iterator_result_type = ARR_ELEMTYPE(arr); + iterator_result_typmod = arrtypmod; } /* Iterate over the array elements or slices */ *************** exec_stmt_foreach_a(PLpgSQL_execstate *e *** 2318,2324 **** found = true; /* looped at least once */ /* Assign current element/slice to the loop variable */ ! exec_assign_value(estate, loop_var, value, iterator_result_type, &isnull); /* In slice case, value is temporary; must free it to avoid leakage */ --- 2373,2380 ---- found = true; /* looped at least once */ /* Assign current element/slice to the loop variable */ ! exec_assign_value(estate, loop_var, value, ! iterator_result_type, iterator_result_typmod, &isnull); /* In slice case, value is temporary; must free it to avoid leakage */ *************** exec_stmt_return(PLpgSQL_execstate *esta *** 2449,2454 **** --- 2505,2517 ---- * Special case path when the RETURN expression is a simple variable * reference; in particular, this path is always taken in functions with * one or more OUT parameters. + * + * This special case is especially efficient for returning variables that + * have R/W expanded values: we can put the R/W pointer directly into + * estate->retval, leading to transferring the value to the caller's + * context cheaply. If we went through exec_eval_expr we'd end up with a + * R/O pointer. It's okay to skip MakeExpandedObjectReadOnly here since + * we know we won't need the variable's value within the function anymore. 
*/ if (stmt->retvarno >= 0) { *************** exec_stmt_return(PLpgSQL_execstate *esta *** 2503,2511 **** if (stmt->expr != NULL) { estate->retval = exec_eval_expr(estate, stmt->expr, &(estate->retisnull), ! &(estate->rettype)); if (estate->retistuple && !estate->retisnull) { --- 2566,2577 ---- if (stmt->expr != NULL) { + int32 rettypmod; + estate->retval = exec_eval_expr(estate, stmt->expr, &(estate->retisnull), ! &(estate->rettype), ! &rettypmod); if (estate->retistuple && !estate->retisnull) { *************** exec_stmt_return_next(PLpgSQL_execstate *** 2574,2579 **** --- 2640,2650 ---- * Special case path when the RETURN NEXT expression is a simple variable * reference; in particular, this path is always taken in functions with * one or more OUT parameters. + * + * Unlike exec_statement_return, there's no special win here for R/W + * expanded values, since they'll have to get flattened to go into the + * tuplestore. Indeed, we'd better make them R/O to avoid any risk of the + * casting step changing them in-place. */ if (stmt->retvarno >= 0) { *************** exec_stmt_return_next(PLpgSQL_execstate *** 2592,2601 **** --- 2663,2678 ---- (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("wrong result type supplied in RETURN NEXT"))); + /* let's be very paranoid about the cast step */ + retval = MakeExpandedObjectReadOnly(retval, + isNull, + var->datatype->typlen); + /* coerce type if needed */ retval = exec_simple_cast_value(estate, retval, var->datatype->typoid, + var->datatype->atttypmod, tupdesc->attrs[0]->atttypid, tupdesc->attrs[0]->atttypmod, isNull); *************** exec_stmt_return_next(PLpgSQL_execstate *** 2654,2664 **** Datum retval; bool isNull; Oid rettype; retval = exec_eval_expr(estate, stmt->expr, &isNull, ! &rettype); if (estate->retistuple) { --- 2731,2743 ---- Datum retval; bool isNull; Oid rettype; + int32 rettypmod; retval = exec_eval_expr(estate, stmt->expr, &isNull, ! &rettype, ! 
&rettypmod); if (estate->retistuple) { *************** exec_stmt_return_next(PLpgSQL_execstate *** 2719,2724 **** --- 2798,2804 ---- retval = exec_simple_cast_value(estate, retval, rettype, + rettypmod, tupdesc->attrs[0]->atttypid, tupdesc->attrs[0]->atttypmod, isNull); *************** exec_stmt_raise(PLpgSQL_execstate *estat *** 2924,2929 **** --- 3004,3010 ---- if (cp[0] == '%') { Oid paramtypeid; + int32 paramtypmod; Datum paramvalue; bool paramisnull; char *extval; *************** exec_stmt_raise(PLpgSQL_execstate *estat *** 2942,2948 **** paramvalue = exec_eval_expr(estate, (PLpgSQL_expr *) lfirst(current_param), &paramisnull, ! &paramtypeid); if (paramisnull) extval = "<NULL>"; --- 3023,3030 ---- paramvalue = exec_eval_expr(estate, (PLpgSQL_expr *) lfirst(current_param), &paramisnull, ! &paramtypeid, ! &paramtypmod); if (paramisnull) extval = "<NULL>"; *************** exec_stmt_raise(PLpgSQL_execstate *estat *** 2972,2982 **** Datum optionvalue; bool optionisnull; Oid optiontypeid; char *extval; optionvalue = exec_eval_expr(estate, opt->expr, &optionisnull, ! &optiontypeid); if (optionisnull) ereport(ERROR, (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), --- 3054,3066 ---- Datum optionvalue; bool optionisnull; Oid optiontypeid; + int32 optiontypmod; char *extval; optionvalue = exec_eval_expr(estate, opt->expr, &optionisnull, ! &optiontypeid, ! &optiontypmod); if (optionisnull) ereport(ERROR, (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), *************** exec_stmt_dynexecute(PLpgSQL_execstate * *** 3480,3485 **** --- 3564,3570 ---- Datum query; bool isnull = false; Oid restype; + int32 restypmod; char *querystr; int exec_res; PreparedParamsData *ppd = NULL; *************** exec_stmt_dynexecute(PLpgSQL_execstate * *** 3488,3494 **** * First we evaluate the string expression after the EXECUTE keyword. Its * result is the querystring we have to execute. */ !
query = exec_eval_expr(estate, stmt->query, &isnull, &restype); if (isnull) ereport(ERROR, (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), --- 3573,3579 ---- * First we evaluate the string expression after the EXECUTE keyword. Its * result is the querystring we have to execute. */ ! query = exec_eval_expr(estate, stmt->query, &isnull, &restype, &restypmod); if (isnull) ereport(ERROR, (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), *************** exec_assign_expr(PLpgSQL_execstate *esta *** 3983,3992 **** { Datum value; Oid valtype; bool isnull = false; ! value = exec_eval_expr(estate, expr, &isnull, &valtype); ! exec_assign_value(estate, target, value, valtype, &isnull); exec_eval_cleanup(estate); } --- 4068,4078 ---- { Datum value; Oid valtype; + int32 valtypmod; bool isnull = false; ! value = exec_eval_expr(estate, expr, &isnull, &valtype, &valtypmod); ! exec_assign_value(estate, target, value, valtype, valtypmod, &isnull); exec_eval_cleanup(estate); } *************** exec_assign_c_string(PLpgSQL_execstate * *** 4009,4015 **** else value = cstring_to_text(""); exec_assign_value(estate, target, PointerGetDatum(value), ! TEXTOID, &isnull); pfree(value); } --- 4095,4101 ---- else value = cstring_to_text(""); exec_assign_value(estate, target, PointerGetDatum(value), ! TEXTOID, -1, &isnull); pfree(value); } *************** exec_assign_c_string(PLpgSQL_execstate * *** 4025,4031 **** static void exec_assign_value(PLpgSQL_execstate *estate, PLpgSQL_datum *target, ! Datum value, Oid valtype, bool *isNull) { switch (target->dtype) { --- 4111,4118 ---- static void exec_assign_value(PLpgSQL_execstate *estate, PLpgSQL_datum *target, ! Datum value, Oid valtype, int32 valtypmod, ! 
bool *isNull) { switch (target->dtype) { *************** exec_assign_value(PLpgSQL_execstate *est *** 4040,4045 **** --- 4127,4133 ---- newvalue = exec_cast_value(estate, value, valtype, + valtypmod, var->datatype->typoid, &(var->datatype->typinput), var->datatype->typioparam, *************** exec_assign_value(PLpgSQL_execstate *est *** 4055,4080 **** /* * If type is by-reference, copy the new value (which is * probably in the eval_econtext) into the procedure's memory ! * context. */ if (!var->datatype->typbyval && !*isNull) ! newvalue = datumCopy(newvalue, ! false, ! var->datatype->typlen); /* ! * Now free the old value. (We can't do this any earlier ! * because of the possibility that we are assigning the var's ! * old value to it, eg "foo := foo". We could optimize out ! * the assignment altogether in such cases, but it's too ! * infrequent to be worth testing for.) */ ! free_var(var); var->value = newvalue; var->isnull = *isNull; ! if (!var->datatype->typbyval && !*isNull) ! var->freeval = true; break; } --- 4143,4193 ---- /* * If type is by-reference, copy the new value (which is * probably in the eval_econtext) into the procedure's memory ! * context. But if it's a read/write reference to an expanded ! * object, no physical copy needs to happen; at most we need ! * to reparent the object's memory context. ! * ! * If it's an array, we force the value to be stored in R/W ! * expanded form. This wins if the function later does, say, ! * a lot of array subscripting operations on the variable, and ! * otherwise might lose. We might need to use a different ! * heuristic, but it's too soon to tell. Also, are there ! * cases where it'd be useful to force non-array values into ! * expanded form? */ if (!var->datatype->typbyval && !*isNull) ! { ! if (var->datatype->typisarray && ! !VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(newvalue))) ! { ! /* array and not already R/W, so apply expand_array */ ! newvalue = expand_array(newvalue, ! CurrentMemoryContext, ! 
NULL); ! } ! else ! { ! /* else transfer value if R/W, else just datumCopy */ ! newvalue = datumTransfer(newvalue, ! false, ! var->datatype->typlen); ! } ! } /* ! * Now free the old value, unless it's the same as the new ! * value (ie, we're doing "foo := foo"). Note that for ! * expanded objects, this test is necessary and cannot ! * reliably be made any earlier; we have to be looking at the ! * object's standard R/W pointer to be sure pointer equality ! * is meaningful. */ ! if (var->value != newvalue || var->isnull || *isNull) ! free_var(var); var->value = newvalue; var->isnull = *isNull; ! var->freeval = (!var->datatype->typbyval && !*isNull); break; } *************** exec_assign_value(PLpgSQL_execstate *est *** 4193,4198 **** --- 4306,4312 ---- values[fno] = exec_simple_cast_value(estate, value, valtype, + valtypmod, atttype, atttypmod, attisnull); *************** exec_assign_value(PLpgSQL_execstate *est *** 4271,4277 **** } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM); /* Fetch current value of array datum */ ! exec_eval_datum(estate, target, &parenttypoid, &parenttypmod, &oldarraydatum, &oldarrayisnull); --- 4385,4391 ---- } while (target->dtype == PLPGSQL_DTYPE_ARRAYELEM); /* Fetch current value of array datum */ ! exec_eval_datum(estate, target, true, &parenttypoid, &parenttypmod, &oldarraydatum, &oldarrayisnull); *************** exec_assign_value(PLpgSQL_execstate *est *** 4355,4360 **** --- 4469,4475 ---- coerced_value = exec_simple_cast_value(estate, value, valtype, + valtypmod, arrayelem->elemtypoid, arrayelem->arraytypmod, *isNull); *************** exec_assign_value(PLpgSQL_execstate *est *** 4403,4409 **** *isNull = false; exec_assign_value(estate, target, newarraydatum, ! arrayelem->arraytypoid, isNull); break; } --- 4518,4526 ---- *isNull = false; exec_assign_value(estate, target, newarraydatum, ! arrayelem->arraytypoid, ! arrayelem->arraytypmod, ! 
isNull); break; } *************** exec_assign_value(PLpgSQL_execstate *est *** 4417,4432 **** * * The type oid, typmod, value in Datum format, and null flag are returned. * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: caller must not modify the returned value, since it points right ! * at the stored value in the case of pass-by-reference datatypes. In some ! * cases we have to palloc a return value, and in such cases we put it into ! * the estate's short-term memory context. */ static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, Oid *typeid, int32 *typetypmod, Datum *value, --- 4534,4557 ---- * * The type oid, typmod, value in Datum format, and null flag are returned. * + * If getrwpointer is TRUE, we'll return a R/W pointer to any variable that is + * an expanded object; otherwise we return a R/O pointer to such variables. + * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: the returned Datum points right at the stored value in the case of ! * pass-by-reference datatypes. Generally callers should take care not to ! * modify the stored value. Some callers intentionally manipulate variables ! * referenced by R/W expanded pointers, though; it is those callers' ! * responsibility that the results are semantically OK. ! * ! * In some cases we have to palloc a return value, and in such cases we put ! * it into the estate's short-term memory context. */ static void exec_eval_datum(PLpgSQL_execstate *estate, PLpgSQL_datum *datum, + bool getrwpointer, Oid *typeid, int32 *typetypmod, Datum *value, *************** exec_eval_datum(PLpgSQL_execstate *estat *** 4442,4448 **** *typeid = var->datatype->typoid; *typetypmod = var->datatype->atttypmod; ! *value = var->value; *isnull = var->isnull; break; } --- 4567,4578 ---- *typeid = var->datatype->typoid; *typetypmod = var->datatype->atttypmod; ! if (getrwpointer) ! *value = var->value; ! else ! 
*value = MakeExpandedObjectReadOnly(var->value, ! var->isnull, ! var->datatype->typlen); *isnull = var->isnull; break; } *************** exec_eval_integer(PLpgSQL_execstate *est *** 4724,4732 **** { Datum exprdatum; Oid exprtypeid; ! exprdatum = exec_eval_expr(estate, expr, isNull, &exprtypeid); ! exprdatum = exec_simple_cast_value(estate, exprdatum, exprtypeid, INT4OID, -1, *isNull); return DatumGetInt32(exprdatum); --- 4854,4864 ---- { Datum exprdatum; Oid exprtypeid; + int32 exprtypmod; ! exprdatum = exec_eval_expr(estate, expr, isNull, &exprtypeid, &exprtypmod); ! exprdatum = exec_simple_cast_value(estate, exprdatum, ! exprtypeid, exprtypmod, INT4OID, -1, *isNull); return DatumGetInt32(exprdatum); *************** exec_eval_boolean(PLpgSQL_execstate *est *** 4746,4754 **** { Datum exprdatum; Oid exprtypeid; ! exprdatum = exec_eval_expr(estate, expr, isNull, &exprtypeid); ! exprdatum = exec_simple_cast_value(estate, exprdatum, exprtypeid, BOOLOID, -1, *isNull); return DatumGetBool(exprdatum); --- 4878,4888 ---- { Datum exprdatum; Oid exprtypeid; + int32 exprtypmod; ! exprdatum = exec_eval_expr(estate, expr, isNull, &exprtypeid, &exprtypmod); ! exprdatum = exec_simple_cast_value(estate, exprdatum, ! exprtypeid, exprtypmod, BOOLOID, -1, *isNull); return DatumGetBool(exprdatum); *************** exec_eval_boolean(PLpgSQL_execstate *est *** 4756,4762 **** /* ---------- * exec_eval_expr Evaluate an expression and return ! * the result Datum. * * NOTE: caller must do exec_eval_cleanup when done with the Datum. * ---------- --- 4890,4896 ---- /* ---------- * exec_eval_expr Evaluate an expression and return ! * the result Datum, along with data type/typmod. * * NOTE: caller must do exec_eval_cleanup when done with the Datum. * ---------- *************** static Datum *** 4765,4771 **** exec_eval_expr(PLpgSQL_execstate *estate, PLpgSQL_expr *expr, bool *isNull, ! 
Oid *rettype) { Datum result = 0; int rc; --- 4899,4906 ---- exec_eval_expr(PLpgSQL_execstate *estate, PLpgSQL_expr *expr, bool *isNull, ! Oid *rettype, ! int32 *rettypmod) { Datum result = 0; int rc; *************** exec_eval_expr(PLpgSQL_execstate *estate *** 4780,4786 **** * If this is a simple expression, bypass SPI and use the executor * directly */ ! if (exec_eval_simple_expr(estate, expr, &result, isNull, rettype)) return result; /* --- 4915,4922 ---- * If this is a simple expression, bypass SPI and use the executor * directly */ ! if (exec_eval_simple_expr(estate, expr, ! &result, isNull, rettype, rettypmod)) return result; /* *************** exec_eval_expr(PLpgSQL_execstate *estate *** 4807,4813 **** /* * ... and get the column's datatype. */ ! *rettype = SPI_gettypeid(estate->eval_tuptable->tupdesc, 1); /* * If there are no rows selected, the result is a NULL of that type. --- 4943,4950 ---- /* * ... and get the column's datatype. */ ! *rettype = estate->eval_tuptable->tupdesc->attrs[0]->atttypid; ! *rettypmod = estate->eval_tuptable->tupdesc->attrs[0]->atttypmod; /* * If there are no rows selected, the result is a NULL of that type. *************** loop_exit: *** 5060,5067 **** * exec_eval_simple_expr - Evaluate a simple expression returning * a Datum by directly calling ExecEvalExpr(). * ! * If successful, store results into *result, *isNull, *rettype and return ! * TRUE. If the expression cannot be handled by simple evaluation, * return FALSE. * * Because we only store one execution tree for a simple expression, we --- 5197,5204 ---- * exec_eval_simple_expr - Evaluate a simple expression returning * a Datum by directly calling ExecEvalExpr(). * ! * If successful, store results into *result, *isNull, *rettype, *rettypmod ! * and return TRUE. If the expression cannot be handled by simple evaluation, * return FALSE. 
* * Because we only store one execution tree for a simple expression, we *************** exec_eval_simple_expr(PLpgSQL_execstate *** 5092,5098 **** PLpgSQL_expr *expr, Datum *result, bool *isNull, ! Oid *rettype) { ExprContext *econtext = estate->eval_econtext; LocalTransactionId curlxid = MyProc->lxid; --- 5229,5236 ---- PLpgSQL_expr *expr, Datum *result, bool *isNull, ! Oid *rettype, ! int32 *rettypmod) { ExprContext *econtext = estate->eval_econtext; LocalTransactionId curlxid = MyProc->lxid; *************** exec_eval_simple_expr(PLpgSQL_execstate *** 5142,5147 **** --- 5280,5286 ---- * Pass back previously-determined result type. */ *rettype = expr->expr_simple_type; + *rettypmod = expr->expr_simple_typmod; /* * Prepare the expression for execution, if it's not been done already in *************** setup_param_list(PLpgSQL_execstate *esta *** 5278,5284 **** PLpgSQL_var *var = (PLpgSQL_var *) datum; ParamExternData *prm = &paramLI->params[dno]; ! prm->value = var->value; prm->isnull = var->isnull; prm->pflags = PARAM_FLAG_CONST; prm->ptype = var->datatype->typoid; --- 5417,5425 ---- PLpgSQL_var *var = (PLpgSQL_var *) datum; ParamExternData *prm = &paramLI->params[dno]; ! prm->value = MakeExpandedObjectReadOnly(var->value, ! var->isnull, ! var->datatype->typlen); prm->isnull = var->isnull; prm->pflags = PARAM_FLAG_CONST; prm->ptype = var->datatype->typoid; *************** plpgsql_param_fetch(ParamListInfo params *** 5344,5350 **** /* OK, evaluate the value and store into the appropriate paramlist slot */ datum = estate->datums[dno]; prm = &params->params[dno]; ! exec_eval_datum(estate, datum, &prm->ptype, &prmtypmod, &prm->value, &prm->isnull); } --- 5485,5491 ---- /* OK, evaluate the value and store into the appropriate paramlist slot */ datum = estate->datums[dno]; prm = &params->params[dno]; !
exec_eval_datum(estate, datum, false, &prm->ptype, &prmtypmod, &prm->value, &prm->isnull); } *************** exec_move_row(PLpgSQL_execstate *estate, *** 5457,5462 **** --- 5598,5604 ---- Datum value; bool isnull; Oid valtype; + int32 valtypmod; if (row->varnos[fnum] < 0) continue; /* skip dropped column in row struct */ *************** exec_move_row(PLpgSQL_execstate *estate, *** 5475,5481 **** value = (Datum) 0; isnull = true; } ! valtype = SPI_gettypeid(tupdesc, anum + 1); anum++; } else --- 5617,5624 ---- value = (Datum) 0; isnull = true; } ! valtype = tupdesc->attrs[anum]->atttypid; ! valtypmod = tupdesc->attrs[anum]->atttypmod; anum++; } else *************** exec_move_row(PLpgSQL_execstate *estate, *** 5488,5497 **** * about the type of a source NULL */ valtype = InvalidOid; } exec_assign_value(estate, (PLpgSQL_datum *) var, ! value, valtype, &isnull); } return; --- 5631,5641 ---- * about the type of a source NULL */ valtype = InvalidOid; + valtypmod = -1; } exec_assign_value(estate, (PLpgSQL_datum *) var, ! value, valtype, valtypmod, &isnull); } return; *************** make_tuple_from_row(PLpgSQL_execstate *e *** 5536,5542 **** if (row->varnos[i] < 0) /* should not happen */ elog(ERROR, "dropped rowtype entry for non-dropped column"); ! exec_eval_datum(estate, estate->datums[row->varnos[i]], &fieldtypeid, &fieldtypmod, &dvalues[i], &nulls[i]); if (fieldtypeid != tupdesc->attrs[i]->atttypid) --- 5680,5686 ---- if (row->varnos[i] < 0) /* should not happen */ elog(ERROR, "dropped rowtype entry for non-dropped column"); ! exec_eval_datum(estate, estate->datums[row->varnos[i]], false, &fieldtypeid, &fieldtypmod, &dvalues[i], &nulls[i]); if (fieldtypeid != tupdesc->attrs[i]->atttypid) *************** convert_value_to_string(PLpgSQL_execstat *** 5675,5681 **** */ static Datum exec_cast_value(PLpgSQL_execstate *estate, ! 
Datum value, Oid valtype, Oid reqtype, FmgrInfo *reqinput, Oid reqtypioparam, --- 5819,5825 ---- */ static Datum exec_cast_value(PLpgSQL_execstate *estate, ! Datum value, Oid valtype, int32 valtypmod, Oid reqtype, FmgrInfo *reqinput, Oid reqtypioparam, *************** exec_cast_value(PLpgSQL_execstate *estat *** 5685,5691 **** /* * If the type of the given value isn't what's requested, convert it. */ ! if (valtype != reqtype || reqtypmod != -1) { MemoryContext oldcontext; --- 5829,5836 ---- /* * If the type of the given value isn't what's requested, convert it. */ ! if (valtype != reqtype || ! (valtypmod != reqtypmod && reqtypmod != -1)) { MemoryContext oldcontext; *************** exec_cast_value(PLpgSQL_execstate *estat *** 5719,5729 **** */ static Datum exec_simple_cast_value(PLpgSQL_execstate *estate, ! Datum value, Oid valtype, Oid reqtype, int32 reqtypmod, bool isnull) { ! if (valtype != reqtype || reqtypmod != -1) { Oid typinput; Oid typioparam; --- 5864,5875 ---- */ static Datum exec_simple_cast_value(PLpgSQL_execstate *estate, ! Datum value, Oid valtype, int32 valtypmod, Oid reqtype, int32 reqtypmod, bool isnull) { ! if (valtype != reqtype || ! (valtypmod != reqtypmod && reqtypmod != -1)) { Oid typinput; Oid typioparam; *************** exec_simple_cast_value(PLpgSQL_execstate *** 5736,5741 **** --- 5882,5888 ---- value = exec_cast_value(estate, value, valtype, + valtypmod, reqtype, &finfo_input, typioparam, *************** exec_simple_recheck_plan(PLpgSQL_expr *e *** 6171,6176 **** --- 6318,6324 ---- expr->expr_simple_lxid = InvalidLocalTransactionId; /* Also stash away the expression result type */ expr->expr_simple_type = exprType((Node *) tle->expr); + expr->expr_simple_typmod = exprTypmod((Node *) tle->expr); } /* ---------- *************** free_var(PLpgSQL_var *var) *** 6329,6335 **** { if (var->freeval) { ! pfree(DatumGetPointer(var->value)); var->freeval = false; } } --- 6477,6488 ---- { if (var->freeval) { ! 
if (DatumIsReadWriteExpandedObject(var->value, ! var->isnull, ! var->datatype->typlen)) ! DeleteExpandedObject(var->value); ! else ! pfree(DatumGetPointer(var->value)); var->freeval = false; } } *************** exec_eval_using_params(PLpgSQL_execstate *** 6371,6380 **** { PLpgSQL_expr *param = (PLpgSQL_expr *) lfirst(lc); bool isnull; ppd->values[i] = exec_eval_expr(estate, param, &isnull, ! &ppd->types[i]); ppd->nulls[i] = isnull ? 'n' : ' '; ppd->freevals[i] = false; --- 6524,6535 ---- { PLpgSQL_expr *param = (PLpgSQL_expr *) lfirst(lc); bool isnull; + int32 ppdtypmod; ppd->values[i] = exec_eval_expr(estate, param, &isnull, ! &ppd->types[i], ! &ppdtypmod); ppd->nulls[i] = isnull ? 'n' : ' '; ppd->freevals[i] = false; *************** exec_dynquery_with_params(PLpgSQL_execst *** 6452,6464 **** Datum query; bool isnull; Oid restype; char *querystr; /* * Evaluate the string expression after the EXECUTE keyword. Its result is * the querystring we have to execute. */ ! query = exec_eval_expr(estate, dynquery, &isnull, &restype); if (isnull) ereport(ERROR, (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), --- 6607,6620 ---- Datum query; bool isnull; Oid restype; + int32 restypmod; char *querystr; /* * Evaluate the string expression after the EXECUTE keyword. Its result is * the querystring we have to execute. */ ! query = exec_eval_expr(estate, dynquery, &isnull, &restype, &restypmod); if (isnull) ereport(ERROR, (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), *************** format_expr_params(PLpgSQL_execstate *es *** 6536,6543 **** curvar = (PLpgSQL_var *) estate->datums[dno]; ! exec_eval_datum(estate, (PLpgSQL_datum *) curvar, &paramtypeid, ! &paramtypmod, &paramdatum, &paramisnull); appendStringInfo(&paramstr, "%s%s = ", paramno > 0 ? ", " : "", --- 6692,6700 ---- curvar = (PLpgSQL_var *) estate->datums[dno]; ! exec_eval_datum(estate, (PLpgSQL_datum *) curvar, false, ! &paramtypeid, &paramtypmod, ! &paramdatum, &paramisnull); appendStringInfo(&paramstr, "%s%s = ", paramno > 0 ?
", " : "", diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h index 337b989..43e4037 100644 *** a/src/pl/plpgsql/src/plpgsql.h --- b/src/pl/plpgsql/src/plpgsql.h *************** typedef struct *** 180,185 **** --- 180,186 ---- bool typbyval; Oid typrelid; Oid typioparam; + bool typisarray; /* is "true" array, or domain over one */ Oid collation; /* from pg_type, but can be overridden */ FmgrInfo typinput; /* lookup info for typinput function */ int32 atttypmod; /* typmod (taken from someplace else) */ *************** typedef struct PLpgSQL_expr *** 226,231 **** --- 227,233 ---- Expr *expr_simple_expr; /* NULL means not a simple expr */ int expr_simple_generation; /* plancache generation we checked */ Oid expr_simple_type; /* result type Oid, if simple */ + int32 expr_simple_typmod; /* result typmod, if simple */ /* * if expr is simple AND prepared in current transaction,
I wrote: > [ expanded-arrays-1.0.patch ] This is overdue for a rebase; attached. No functional changes, but some of what was in the original patch has already been merged, and other parts were superseded. regards, tom lane diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml index d8c5287..e5b7b4b 100644 *** a/doc/src/sgml/storage.sgml --- b/doc/src/sgml/storage.sgml *************** comparison table, in which all the HTML *** 503,510 **** <acronym>TOAST</> pointers can point to data that is not on disk, but is elsewhere in the memory of the current server process. Such pointers obviously cannot be long-lived, but they are nonetheless useful. There ! is currently just one sub-case: ! pointers to <firstterm>indirect</> data. </para> <para> --- 503,511 ---- <acronym>TOAST</> pointers can point to data that is not on disk, but is elsewhere in the memory of the current server process. Such pointers obviously cannot be long-lived, but they are nonetheless useful. There ! are currently two sub-cases: ! pointers to <firstterm>indirect</> data and ! pointers to <firstterm>expanded</> data. </para> <para> *************** and there is no infrastructure to help w *** 519,524 **** --- 520,562 ---- </para> <para> + Expanded <acronym>TOAST</> pointers are useful for complex data types + whose on-disk representation is not especially suited for computational + purposes. As an example, the standard varlena representation of a + <productname>PostgreSQL</> array includes dimensionality information, a + nulls bitmap if there are any null elements, then the values of all the + elements in order. When the element type itself is variable-length, the + only way to find the <replaceable>N</>'th element is to scan through all the + preceding elements. 
This representation is appropriate for on-disk storage + because of its compactness, but for computations with the array it's much + nicer to have an <quote>expanded</> or <quote>deconstructed</> + representation in which all the element starting locations have been + identified. The <acronym>TOAST</> pointer mechanism supports this need by + allowing a pass-by-reference Datum to point to either a standard varlena + value (the on-disk representation) or a <acronym>TOAST</> pointer that + points to an expanded representation somewhere in memory. The details of + this expanded representation are up to the data type, though it must have + a standard header and meet the other API requirements given + in <filename>src/include/utils/expandeddatum.h</>. C-level functions + working with the data type can choose to handle either representation. + Functions that do not know about the expanded representation, but simply + apply <function>PG_DETOAST_DATUM</> to their inputs, will automatically + receive the traditional varlena representation; so support for an expanded + representation can be introduced incrementally, one function at a time. + </para> + + <para> + <acronym>TOAST</> pointers to expanded values are further broken down + into <firstterm>read-write</> and <firstterm>read-only</> pointers. + The pointed-to representation is the same either way, but a function that + receives a read-write pointer is allowed to modify the referenced value + in-place, whereas one that receives a read-only pointer must not; it must + first create a copy if it wants to make a modified version of the value. + This distinction and some associated conventions make it possible to avoid + unnecessary copying of expanded values during query execution. + </para> + + <para> For all types of in-memory <acronym>TOAST</> pointer, the <acronym>TOAST</> management code ensures that no such pointer datum can accidentally get stored on disk. 
In-memory <acronym>TOAST</> pointers are automatically diff --git a/doc/src/sgml/xtypes.sgml b/doc/src/sgml/xtypes.sgml index 2459616..ac0b8a2 100644 *** a/doc/src/sgml/xtypes.sgml --- b/doc/src/sgml/xtypes.sgml *************** CREATE TYPE complex ( *** 300,305 **** --- 300,376 ---- </para> </note> + <para> + Another feature that's enabled by <acronym>TOAST</> support is the + possibility of having an <firstterm>expanded</> in-memory data + representation that is more convenient to work with than the format that + is stored on disk. The regular or <quote>flat</> varlena storage format + is ultimately just a blob of bytes; it cannot for example contain + pointers, since it may get copied to other locations in memory. + For complex data types, the flat format may be quite expensive to work + with, so <productname>PostgreSQL</> provides a way to <quote>expand</> + the flat format into a representation that is more suited to computation, + and then pass that format in-memory between functions of the data type. + </para> + + <para> + To use expanded storage, a data type must define an expanded format that + follows the rules given in <filename>src/include/utils/expandeddatum.h</>, + and provide functions to <quote>expand</> a flat varlena value into + expanded format and <quote>flatten</> the expanded format back to the + regular varlena representation. Then ensure that all C functions for + the data type can accept either representation, possibly by converting + one into the other immediately upon receipt. This does not require fixing + all existing functions for the data type at once, because the standard + <function>PG_DETOAST_DATUM</> macro is defined to convert expanded inputs + into regular flat format. Therefore, existing functions that work with + the flat varlena format will continue to work, though slightly + inefficiently, with expanded inputs; they need not be converted until and + unless better performance is important. 
+ </para> + + <para> + C functions that know how to work with an expanded representation + typically fall into two categories: those that can only handle expanded + format, and those that can handle either expanded or flat varlena inputs. + The former are easier to write but may be less efficient overall, because + converting a flat input to expanded form for use by a single function may + cost more than is saved by operating on the expanded format. + When only expanded format need be handled, conversion of flat inputs to + expanded form can be hidden inside an argument-fetching macro, so that + the function appears no more complex than one working with traditional + varlena input. + To handle both types of input, write an argument-fetching function that + will detoast external, short-header, and compressed varlena inputs, but + not expanded inputs. Such a function can be defined as returning a + pointer to a union of the flat varlena format and the expanded format. + Callers can use the <function>VARATT_IS_EXPANDED_HEADER()</> macro to + determine which format they received. + </para> + + <para> + The <acronym>TOAST</> infrastructure not only allows regular varlena + values to be distinguished from expanded values, but also + distinguishes <quote>read-write</> and <quote>read-only</> pointers to + expanded values. C functions that only need to examine an expanded + value, or will only change it in safe and non-semantically-visible ways, + need not care which type of pointer they receive. C functions that + produce a modified version of an input value are allowed to modify an + expanded input value in-place if they receive a read-write pointer, but + must not modify the input if they receive a read-only pointer; in that + case they have to copy the value first, producing a new value to modify. + A C function that has constructed a new expanded value should always + return a read-write pointer to it. 
Also, a C function that is modifying + a read-write expanded value in-place should take care to leave the value + in a sane state if it fails partway through. + </para> + + <para> + For examples of working with expanded values, see the standard array + infrastructure, particularly + <filename>src/backend/utils/adt/array_expanded.c</>. + </para> + </sect2> </sect1> diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c index 6cd4e8e..de7f02f 100644 *** a/src/backend/access/common/heaptuple.c --- b/src/backend/access/common/heaptuple.c *************** *** 60,65 **** --- 60,66 ---- #include "access/sysattr.h" #include "access/tuptoaster.h" #include "executor/tuptable.h" + #include "utils/expandeddatum.h" /* Does att's datatype allow packing into the 1-byte-header varlena format? */ *************** heap_compute_data_size(TupleDesc tupleDe *** 93,105 **** for (i = 0; i < numberOfAttributes; i++) { Datum val; if (isnull[i]) continue; val = values[i]; ! if (ATT_IS_PACKABLE(att[i]) && VARATT_CAN_MAKE_SHORT(DatumGetPointer(val))) { /* --- 94,108 ---- for (i = 0; i < numberOfAttributes; i++) { Datum val; + Form_pg_attribute atti; if (isnull[i]) continue; val = values[i]; + atti = att[i]; ! if (ATT_IS_PACKABLE(atti) && VARATT_CAN_MAKE_SHORT(DatumGetPointer(val))) { /* *************** heap_compute_data_size(TupleDesc tupleDe *** 108,118 **** */ data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val)); } else { ! data_length = att_align_datum(data_length, att[i]->attalign, ! att[i]->attlen, val); ! 
data_length = att_addlength_datum(data_length, att[i]->attlen, val); } } --- 111,131 ---- */ data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val)); } + else if (atti->attlen == -1 && + VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(val))) + { + /* + * we want to flatten the expanded value so that the constructed + * tuple doesn't depend on it + */ + data_length = att_align_nominal(data_length, atti->attalign); + data_length += EOH_get_flat_size(DatumGetEOHP(val)); + } else { ! data_length = att_align_datum(data_length, atti->attalign, ! atti->attlen, val); ! data_length = att_addlength_datum(data_length, atti->attlen, val); } } *************** heap_fill_tuple(TupleDesc tupleDesc, *** 203,212 **** *infomask |= HEAP_HASVARWIDTH; if (VARATT_IS_EXTERNAL(val)) { ! *infomask |= HEAP_HASEXTERNAL; ! /* no alignment, since it's short by definition */ ! data_length = VARSIZE_EXTERNAL(val); ! memcpy(data, val, data_length); } else if (VARATT_IS_SHORT(val)) { --- 216,241 ---- *infomask |= HEAP_HASVARWIDTH; if (VARATT_IS_EXTERNAL(val)) { ! if (VARATT_IS_EXTERNAL_EXPANDED(val)) ! { ! /* ! * we want to flatten the expanded value so that the ! * constructed tuple doesn't depend on it ! */ ! ExpandedObjectHeader *eoh = DatumGetEOHP(values[i]); ! ! data = (char *) att_align_nominal(data, ! att[i]->attalign); ! data_length = EOH_get_flat_size(eoh); ! EOH_flatten_into(eoh, data, data_length); ! } ! else ! { ! *infomask |= HEAP_HASEXTERNAL; ! /* no alignment, since it's short by definition */ ! data_length = VARSIZE_EXTERNAL(val); ! memcpy(data, val, data_length); ! 
} } else if (VARATT_IS_SHORT(val)) { diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c index 8464e87..c3ebbef 100644 *** a/src/backend/access/heap/tuptoaster.c --- b/src/backend/access/heap/tuptoaster.c *************** *** 37,42 **** --- 37,43 ---- #include "catalog/catalog.h" #include "common/pg_lzcompress.h" #include "miscadmin.h" + #include "utils/expandeddatum.h" #include "utils/fmgroids.h" #include "utils/rel.h" #include "utils/typcache.h" *************** heap_tuple_fetch_attr(struct varlena * a *** 130,135 **** --- 131,149 ---- result = (struct varlena *) palloc(VARSIZE_ANY(attr)); memcpy(result, attr, VARSIZE_ANY(attr)); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* + * This is an expanded-object pointer --- get flat format + */ + ExpandedObjectHeader *eoh; + Size resultsize; + + eoh = DatumGetEOHP(PointerGetDatum(attr)); + resultsize = EOH_get_flat_size(eoh); + result = (struct varlena *) palloc(resultsize); + EOH_flatten_into(eoh, (void *) result, resultsize); + } else { /* *************** heap_tuple_untoast_attr(struct varlena * *** 196,201 **** --- 210,224 ---- attr = result; } } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* + * This is an expanded-object pointer --- get flat format + */ + attr = heap_tuple_fetch_attr(attr); + /* flatteners are not allowed to produce compressed/short output */ + Assert(!VARATT_IS_EXTENDED(attr)); + } else if (VARATT_IS_COMPRESSED(attr)) { /* *************** heap_tuple_untoast_attr_slice(struct var *** 263,268 **** --- 286,296 ---- return heap_tuple_untoast_attr_slice(redirect.pointer, sliceoffset, slicelength); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* pass it off to heap_tuple_fetch_attr to flatten */ + preslice = heap_tuple_fetch_attr(attr); + } else preslice = attr; *************** toast_raw_datum_size(Datum value) *** 344,349 **** --- 372,381 ---- return toast_raw_datum_size(PointerGetDatum(toast_pointer.pointer)); } + else if 
(VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + result = EOH_get_flat_size(DatumGetEOHP(value)); + } else if (VARATT_IS_COMPRESSED(attr)) { /* here, va_rawsize is just the payload size */ *************** toast_datum_size(Datum value) *** 400,405 **** --- 432,441 ---- return toast_datum_size(PointerGetDatum(toast_pointer.pointer)); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + result = EOH_get_flat_size(DatumGetEOHP(value)); + } else if (VARATT_IS_SHORT(attr)) { result = VARSIZE_SHORT(attr); diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c index d94fe58..e599411 100644 *** a/src/backend/executor/execQual.c --- b/src/backend/executor/execQual.c *************** ExecEvalArrayCoerceExpr(ArrayCoerceExprS *** 4248,4254 **** { ArrayCoerceExpr *acoerce = (ArrayCoerceExpr *) astate->xprstate.expr; Datum result; - ArrayType *array; FunctionCallInfoData locfcinfo; result = ExecEvalExpr(astate->arg, econtext, isNull, isDone); --- 4248,4253 ---- *************** ExecEvalArrayCoerceExpr(ArrayCoerceExprS *** 4265,4278 **** if (!OidIsValid(acoerce->elemfuncid)) { /* Detoast input array if necessary, and copy in any case */ ! array = DatumGetArrayTypePCopy(result); ARR_ELEMTYPE(array) = astate->resultelemtype; PG_RETURN_ARRAYTYPE_P(array); } - /* Detoast input array if necessary, but don't make a useless copy */ - array = DatumGetArrayTypeP(result); - /* Initialize function cache if first time through */ if (astate->elemfunc.fn_oid == InvalidOid) { --- 4264,4275 ---- if (!OidIsValid(acoerce->elemfuncid)) { /* Detoast input array if necessary, and copy in any case */ ! ArrayType *array = DatumGetArrayTypePCopy(result); ! 
ARR_ELEMTYPE(array) = astate->resultelemtype; PG_RETURN_ARRAYTYPE_P(array); } /* Initialize function cache if first time through */ if (astate->elemfunc.fn_oid == InvalidOid) { *************** ExecEvalArrayCoerceExpr(ArrayCoerceExprS *** 4302,4316 **** */ InitFunctionCallInfoData(locfcinfo, &(astate->elemfunc), 3, InvalidOid, NULL, NULL); ! locfcinfo.arg[0] = PointerGetDatum(array); locfcinfo.arg[1] = Int32GetDatum(acoerce->resulttypmod); locfcinfo.arg[2] = BoolGetDatum(acoerce->isExplicit); locfcinfo.argnull[0] = false; locfcinfo.argnull[1] = false; locfcinfo.argnull[2] = false; ! return array_map(&locfcinfo, ARR_ELEMTYPE(array), astate->resultelemtype, ! astate->amstate); } /* ---------------------------------------------------------------- --- 4299,4312 ---- */ InitFunctionCallInfoData(locfcinfo, &(astate->elemfunc), 3, InvalidOid, NULL, NULL); ! locfcinfo.arg[0] = result; locfcinfo.arg[1] = Int32GetDatum(acoerce->resulttypmod); locfcinfo.arg[2] = BoolGetDatum(acoerce->isExplicit); locfcinfo.argnull[0] = false; locfcinfo.argnull[1] = false; locfcinfo.argnull[2] = false; ! return array_map(&locfcinfo, astate->resultelemtype, astate->amstate); } /* ---------------------------------------------------------------- diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c index 753754d..a05d8b1 100644 *** a/src/backend/executor/execTuples.c --- b/src/backend/executor/execTuples.c *************** *** 88,93 **** --- 88,94 ---- #include "nodes/nodeFuncs.h" #include "storage/bufmgr.h" #include "utils/builtins.h" + #include "utils/expandeddatum.h" #include "utils/lsyscache.h" #include "utils/typcache.h" *************** ExecCopySlot(TupleTableSlot *dstslot, Tu *** 812,817 **** --- 813,864 ---- return ExecStoreTuple(newTuple, dstslot, InvalidBuffer, true); } + /* -------------------------------- + * ExecMakeSlotContentsReadOnly + * Mark any R/W expanded datums in the slot as read-only. 
+ * + * This is needed when a slot that might contain R/W datum references is to be + * used as input for general expression evaluation. Since the expression(s) + * might contain more than one Var referencing the same R/W datum, we could + * get wrong answers if functions acting on those Vars thought they could + * modify the expanded value in-place. + * + * For notational reasons, we return the same slot passed in. + * -------------------------------- + */ + TupleTableSlot * + ExecMakeSlotContentsReadOnly(TupleTableSlot *slot) + { + /* + * sanity checks + */ + Assert(slot != NULL); + Assert(slot->tts_tupleDescriptor != NULL); + Assert(!slot->tts_isempty); + + /* + * If the slot contains a physical tuple, it can't contain any expanded + * datums, because we flatten those when making a physical tuple. This + * might change later; but for now, we need do nothing unless the slot is + * virtual. + */ + if (slot->tts_tuple == NULL) + { + Form_pg_attribute *att = slot->tts_tupleDescriptor->attrs; + int attnum; + + for (attnum = 0; attnum < slot->tts_nvalid; attnum++) + { + slot->tts_values[attnum] = + MakeExpandedObjectReadOnly(slot->tts_values[attnum], + slot->tts_isnull[attnum], + att[attnum]->attlen); + } + } + + return slot; + } + /* ---------------------------------------------------------------- * convenience initialization routines diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c index 3f66e24..e5d1e54 100644 *** a/src/backend/executor/nodeSubqueryscan.c --- b/src/backend/executor/nodeSubqueryscan.c *************** SubqueryNext(SubqueryScanState *node) *** 56,62 **** --- 56,70 ---- * We just return the subplan's result slot, rather than expending extra * cycles for ExecCopySlot(). (Our own ScanTupleSlot is used only for * EvalPlanQual rechecks.) + * + * We do need to mark the slot contents read-only to prevent interference + * between different functions reading the same datum from the slot. 
It's + * a bit hokey to do this to the subplan's slot, but should be safe + * enough. */ + if (!TupIsNull(slot)) + slot = ExecMakeSlotContentsReadOnly(slot); + return slot; } diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index b3c0502..54b5f2c 100644 *** a/src/backend/executor/spi.c --- b/src/backend/executor/spi.c *************** SPI_pfree(void *pointer) *** 1014,1019 **** --- 1014,1040 ---- pfree(pointer); } + Datum + SPI_datumTransfer(Datum value, bool typByVal, int typLen) + { + MemoryContext oldcxt = NULL; + Datum result; + + if (_SPI_curid + 1 == _SPI_connected) /* connected */ + { + if (_SPI_current != &(_SPI_stack[_SPI_curid + 1])) + elog(ERROR, "SPI stack corrupted"); + oldcxt = MemoryContextSwitchTo(_SPI_current->savedcxt); + } + + result = datumTransfer(value, typByVal, typLen); + + if (oldcxt) + MemoryContextSwitchTo(oldcxt); + + return result; + } + void SPI_freetuple(HeapTuple tuple) { diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile index 20e5ff1..d1ed33f 100644 *** a/src/backend/utils/adt/Makefile --- b/src/backend/utils/adt/Makefile *************** endif *** 16,25 **** endif # keep this list arranged alphabetically or it gets to be a mess ! OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \ ! array_userfuncs.o arrayutils.o ascii.o bool.o \ ! cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \ ! encode.o enum.o float.o format_type.o formatting.o genfile.o \ geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \ int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \ jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \ --- 16,26 ---- endif # keep this list arranged alphabetically or it gets to be a mess ! OBJS = acl.o arrayfuncs.o array_expanded.o array_selfuncs.o \ ! array_typanalyze.o array_userfuncs.o arrayutils.o ascii.o \ ! bool.o cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \ ! encode.o enum.o expandeddatum.o \ ! 
float.o format_type.o formatting.o genfile.o \ geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \ int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \ jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \ diff --git a/src/backend/utils/adt/array_expanded.c b/src/backend/utils/adt/array_expanded.c index ...6d3b724 . *** a/src/backend/utils/adt/array_expanded.c --- b/src/backend/utils/adt/array_expanded.c *************** *** 0 **** --- 1,374 ---- + /*------------------------------------------------------------------------- + * + * array_expanded.c + * Basic functions for manipulating expanded arrays. + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * + * IDENTIFICATION + * src/backend/utils/adt/array_expanded.c + * + *------------------------------------------------------------------------- + */ + #include "postgres.h" + + #include "access/tupmacs.h" + #include "utils/array.h" + #include "utils/lsyscache.h" + #include "utils/memutils.h" + + + /* "Methods" required for an expanded object */ + static Size EA_get_flat_size(ExpandedObjectHeader *eohptr); + static void EA_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + + static const ExpandedObjectMethods EA_methods = + { + EA_get_flat_size, + EA_flatten_into + }; + + + /* + * expand_array: convert an array Datum into an expanded array + * + * The expanded object will be a child of parentcontext. + * + * Some callers can provide cache space to avoid repeated lookups of element + * type data across calls; if so, pass a metacache pointer, making sure that + * metacache->element_type is initialized to InvalidOid before first call. + * If no cross-call caching is required, pass NULL for metacache. 
+ */ + Datum + expand_array(Datum arraydatum, MemoryContext parentcontext, + ArrayMetaState *metacache) + { + ArrayType *array; + ExpandedArrayHeader *eah; + MemoryContext objcxt; + MemoryContext oldcxt; + + /* + * Allocate private context for expanded object. We start by assuming + * that the array won't be very large; but if it does grow a lot, don't + * constrain aset.c's large-context behavior. + */ + objcxt = AllocSetContextCreate(parentcontext, + "expanded array", + ALLOCSET_SMALL_MINSIZE, + ALLOCSET_SMALL_INITSIZE, + ALLOCSET_DEFAULT_MAXSIZE); + + /* Set up expanded array header */ + eah = (ExpandedArrayHeader *) + MemoryContextAlloc(objcxt, sizeof(ExpandedArrayHeader)); + + EOH_init_header(&eah->hdr, &EA_methods, objcxt); + eah->ea_magic = EA_MAGIC; + + /* + * Detoast and copy original array into private context, as a flat array. + * We flatten it even if it's in expanded form; it's not clear that adding + * a special-case path for that would be worth the trouble. + * + * Note that this coding risks leaking some memory in the private context + * if we have to fetch data from a TOAST table; however, experimentation + * says that the leak is minimal. Doing it this way saves a copy step, + * which seems worthwhile, especially if the array is large enough to need + * external storage. + */ + oldcxt = MemoryContextSwitchTo(objcxt); + array = DatumGetArrayTypePCopy(arraydatum); + MemoryContextSwitchTo(oldcxt); + + eah->ndims = ARR_NDIM(array); + /* note these pointers point into the fvalue header! 
*/ + eah->dims = ARR_DIMS(array); + eah->lbound = ARR_LBOUND(array); + + /* Save array's element-type data for possible use later */ + eah->element_type = ARR_ELEMTYPE(array); + if (metacache && metacache->element_type == eah->element_type) + { + /* Caller provided valid cache of representational data */ + eah->typlen = metacache->typlen; + eah->typbyval = metacache->typbyval; + eah->typalign = metacache->typalign; + } + else + { + /* No, so look it up */ + get_typlenbyvalalign(eah->element_type, + &eah->typlen, + &eah->typbyval, + &eah->typalign); + /* Update cache if provided */ + if (metacache) + { + metacache->element_type = eah->element_type; + metacache->typlen = eah->typlen; + metacache->typbyval = eah->typbyval; + metacache->typalign = eah->typalign; + } + } + + /* we don't make a deconstructed representation now */ + eah->dvalues = NULL; + eah->dnulls = NULL; + eah->dvalueslen = 0; + eah->nelems = 0; + eah->flat_size = 0; + + /* remember we have a flat representation */ + eah->fvalue = array; + eah->fstartptr = ARR_DATA_PTR(array); + eah->fendptr = ((char *) array) + ARR_SIZE(array); + + /* return a R/W pointer to the expanded array */ + return EOHPGetRWDatum(&eah->hdr); + } + + /* + * get_flat_size method for expanded arrays + */ + static Size + EA_get_flat_size(ExpandedObjectHeader *eohptr) + { + ExpandedArrayHeader *eah = (ExpandedArrayHeader *) eohptr; + int nelems; + int ndims; + Datum *dvalues; + bool *dnulls; + Size nbytes; + int i; + + Assert(eah->ea_magic == EA_MAGIC); + + /* Easy if we have a valid flattened value */ + if (eah->fvalue) + return ARR_SIZE(eah->fvalue); + + /* If we have a cached size value, believe that */ + if (eah->flat_size) + return eah->flat_size; + + /* + * Compute space needed by examining dvalues/dnulls. Note that the result + * array will have a nulls bitmap if dnulls isn't NULL, even if the array + * doesn't actually contain any nulls now. 
+ */ + nelems = eah->nelems; + ndims = eah->ndims; + Assert(nelems == ArrayGetNItems(ndims, eah->dims)); + dvalues = eah->dvalues; + dnulls = eah->dnulls; + nbytes = 0; + for (i = 0; i < nelems; i++) + { + if (dnulls && dnulls[i]) + continue; + nbytes = att_addlength_datum(nbytes, eah->typlen, dvalues[i]); + nbytes = att_align_nominal(nbytes, eah->typalign); + /* check for overflow of total request */ + if (!AllocSizeIsValid(nbytes)) + ereport(ERROR, + (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), + errmsg("array size exceeds the maximum allowed (%d)", + (int) MaxAllocSize))); + } + + if (dnulls) + nbytes += ARR_OVERHEAD_WITHNULLS(ndims, nelems); + else + nbytes += ARR_OVERHEAD_NONULLS(ndims); + + /* cache for next time */ + eah->flat_size = nbytes; + + return nbytes; + } + + /* + * flatten_into method for expanded arrays + */ + static void + EA_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size) + { + ExpandedArrayHeader *eah = (ExpandedArrayHeader *) eohptr; + ArrayType *aresult = (ArrayType *) result; + int nelems; + int ndims; + int32 dataoffset; + + Assert(eah->ea_magic == EA_MAGIC); + + /* Easy if we have a valid flattened value */ + if (eah->fvalue) + { + Assert(allocated_size == ARR_SIZE(eah->fvalue)); + memcpy(result, eah->fvalue, allocated_size); + return; + } + + /* Else allocation should match previous get_flat_size result */ + Assert(allocated_size == eah->flat_size); + + /* Fill result array from dvalues/dnulls */ + nelems = eah->nelems; + ndims = eah->ndims; + + if (eah->dnulls) + dataoffset = ARR_OVERHEAD_WITHNULLS(ndims, nelems); + else + dataoffset = 0; /* marker for no null bitmap */ + + /* We must ensure that any pad space is zero-filled */ + memset(aresult, 0, allocated_size); + + SET_VARSIZE(aresult, allocated_size); + aresult->ndim = ndims; + aresult->dataoffset = dataoffset; + aresult->elemtype = eah->element_type; + memcpy(ARR_DIMS(aresult), eah->dims, ndims * sizeof(int)); + memcpy(ARR_LBOUND(aresult), eah->lbound, 
ndims * sizeof(int)); + + CopyArrayEls(aresult, + eah->dvalues, eah->dnulls, nelems, + eah->typlen, eah->typbyval, eah->typalign, + false); + } + + /* + * Argument fetching support code + */ + + /* + * DatumGetExpandedArray: get a writable expanded array from an input argument + */ + ExpandedArrayHeader * + DatumGetExpandedArray(Datum d) + { + ExpandedArrayHeader *eah; + + /* If it's a writable expanded array already, just return it */ + if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + { + eah = (ExpandedArrayHeader *) DatumGetEOHP(d); + Assert(eah->ea_magic == EA_MAGIC); + return eah; + } + + /* + * If it's a non-writable expanded array, copy it, extracting the element + * representational data to save a catalog lookup. + */ + if (VARATT_IS_EXTERNAL_EXPANDED_RO(DatumGetPointer(d))) + { + ArrayMetaState fakecache; + + eah = (ExpandedArrayHeader *) DatumGetEOHP(d); + Assert(eah->ea_magic == EA_MAGIC); + fakecache.element_type = eah->element_type; + fakecache.typlen = eah->typlen; + fakecache.typbyval = eah->typbyval; + fakecache.typalign = eah->typalign; + d = expand_array(d, CurrentMemoryContext, &fakecache); + return (ExpandedArrayHeader *) DatumGetEOHP(d); + } + + /* Else expand the hard way */ + d = expand_array(d, CurrentMemoryContext, NULL); + return (ExpandedArrayHeader *) DatumGetEOHP(d); + } + + /* + * As above, when caller has the ability to cache element type info + */ + ExpandedArrayHeader * + DatumGetExpandedArrayX(Datum d, ArrayMetaState *metacache) + { + ExpandedArrayHeader *eah; + + /* If it's a writable expanded array already, just return it */ + if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + { + eah = (ExpandedArrayHeader *) DatumGetEOHP(d); + Assert(eah->ea_magic == EA_MAGIC); + /* Update cache if provided */ + if (metacache) + { + metacache->element_type = eah->element_type; + metacache->typlen = eah->typlen; + metacache->typbyval = eah->typbyval; + metacache->typalign = eah->typalign; + } + return eah; + } + + /* Else expand 
using caller's cache if any */ + d = expand_array(d, CurrentMemoryContext, metacache); + return (ExpandedArrayHeader *) DatumGetEOHP(d); + } + + /* + * DatumGetAnyArray: return either an expanded array or a detoasted varlena + * array. The result must not be modified in-place. + */ + AnyArrayType * + DatumGetAnyArray(Datum d) + { + ExpandedArrayHeader *eah; + + /* + * If it's an expanded array (RW or RO), return the header pointer. + */ + if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(d))) + { + eah = (ExpandedArrayHeader *) DatumGetEOHP(d); + Assert(eah->ea_magic == EA_MAGIC); + return (AnyArrayType *) eah; + } + + /* Else do regular detoasting as needed */ + return (AnyArrayType *) PG_DETOAST_DATUM(d); + } + + /* + * Create the Datum/isnull representation of an expanded array object + * if we didn't do so previously + */ + void + deconstruct_expanded_array(ExpandedArrayHeader *eah) + { + if (eah->dvalues == NULL) + { + MemoryContext oldcxt = MemoryContextSwitchTo(eah->hdr.eoh_context); + Datum *dvalues; + bool *dnulls; + int nelems; + + dnulls = NULL; + deconstruct_array(eah->fvalue, + eah->element_type, + eah->typlen, eah->typbyval, eah->typalign, + &dvalues, + ARR_HASNULL(eah->fvalue) ? &dnulls : NULL, + &nelems); + + /* + * Update header only after successful completion of this step. If + * deconstruct_array fails partway through, worst consequence is some + * leaked memory in the object's context. If the caller fails at a + * later point, that's fine, since the deconstructed representation is + * valid anyhow. 
+ */ + eah->dvalues = dvalues; + eah->dnulls = dnulls; + eah->dvalueslen = eah->nelems = nelems; + MemoryContextSwitchTo(oldcxt); + } + } diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c index 57074e0..470275a 100644 *** a/src/backend/utils/adt/array_userfuncs.c --- b/src/backend/utils/adt/array_userfuncs.c *************** static Datum array_offset_common(Functio *** 25,46 **** /* * fetch_array_arg_replace_nulls * ! * Fetch an array-valued argument; if it's null, construct an empty array ! * value of the proper data type. Also cache basic element type information ! * in fn_extra. */ ! static ArrayType * fetch_array_arg_replace_nulls(FunctionCallInfo fcinfo, int argno) { ! ArrayType *v; Oid element_type; ArrayMetaState *my_extra; ! /* First collect the array value */ if (!PG_ARGISNULL(argno)) { ! v = PG_GETARG_ARRAYTYPE_P(argno); ! element_type = ARR_ELEMTYPE(v); } else { --- 25,56 ---- /* * fetch_array_arg_replace_nulls * ! * Fetch an array-valued argument in expanded form; if it's null, construct an ! * empty array value of the proper data type. Also cache basic element type ! * information in fn_extra. */ ! static ExpandedArrayHeader * fetch_array_arg_replace_nulls(FunctionCallInfo fcinfo, int argno) { ! ExpandedArrayHeader *eah; Oid element_type; ArrayMetaState *my_extra; ! /* If first time through, create datatype cache struct */ ! my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra; ! if (my_extra == NULL) ! { ! my_extra = (ArrayMetaState *) ! MemoryContextAlloc(fcinfo->flinfo->fn_mcxt, ! sizeof(ArrayMetaState)); ! my_extra->element_type = InvalidOid; ! fcinfo->flinfo->fn_extra = my_extra; ! } ! ! /* Now collect the array value */ if (!PG_ARGISNULL(argno)) { ! eah = PG_GETARG_EXPANDED_ARRAYX(argno, my_extra); } else { *************** fetch_array_arg_replace_nulls(FunctionCa *** 57,86 **** (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("input data type is not an array"))); ! 
v = construct_empty_array(element_type); ! } ! ! /* Now cache required info, which might change from call to call */ ! my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra; ! if (my_extra == NULL) ! { ! my_extra = (ArrayMetaState *) ! MemoryContextAlloc(fcinfo->flinfo->fn_mcxt, ! sizeof(ArrayMetaState)); ! my_extra->element_type = InvalidOid; ! fcinfo->flinfo->fn_extra = my_extra; ! } ! ! if (my_extra->element_type != element_type) ! { ! get_typlenbyvalalign(element_type, ! &my_extra->typlen, ! &my_extra->typbyval, ! &my_extra->typalign); ! my_extra->element_type = element_type; } ! return v; } /*----------------------------------------------------------------------------- --- 67,78 ---- (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("input data type is not an array"))); ! eah = construct_empty_expanded_array(element_type, ! CurrentMemoryContext, ! my_extra); } ! return eah; } /*----------------------------------------------------------------------------- *************** fetch_array_arg_replace_nulls(FunctionCa *** 91,119 **** Datum array_append(PG_FUNCTION_ARGS) { ! ArrayType *v; Datum newelem; bool isNull; ! ArrayType *result; int *dimv, *lb; int indx; ArrayMetaState *my_extra; ! v = fetch_array_arg_replace_nulls(fcinfo, 0); isNull = PG_ARGISNULL(1); if (isNull) newelem = (Datum) 0; else newelem = PG_GETARG_DATUM(1); ! if (ARR_NDIM(v) == 1) { /* append newelem */ int ub; ! lb = ARR_LBOUND(v); ! dimv = ARR_DIMS(v); ub = dimv[0] + lb[0] - 1; indx = ub + 1; --- 83,111 ---- Datum array_append(PG_FUNCTION_ARGS) { ! ExpandedArrayHeader *eah; Datum newelem; bool isNull; ! Datum result; int *dimv, *lb; int indx; ArrayMetaState *my_extra; ! eah = fetch_array_arg_replace_nulls(fcinfo, 0); isNull = PG_ARGISNULL(1); if (isNull) newelem = (Datum) 0; else newelem = PG_GETARG_DATUM(1); ! if (eah->ndims == 1) { /* append newelem */ int ub; ! lb = eah->lbound; ! 
dimv = eah->dims; ub = dimv[0] + lb[0] - 1; indx = ub + 1; *************** array_append(PG_FUNCTION_ARGS) *** 123,129 **** (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), errmsg("integer out of range"))); } ! else if (ARR_NDIM(v) == 0) indx = 1; else ereport(ERROR, --- 115,121 ---- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), errmsg("integer out of range"))); } ! else if (eah->ndims == 0) indx = 1; else ereport(ERROR, *************** array_append(PG_FUNCTION_ARGS) *** 133,142 **** /* Perform element insertion */ my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra; ! result = array_set(v, 1, &indx, newelem, isNull, -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign); ! PG_RETURN_ARRAYTYPE_P(result); } /*----------------------------------------------------------------------------- --- 125,135 ---- /* Perform element insertion */ my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra; ! result = array_set_element(EOHPGetRWDatum(&eah->hdr), ! 1, &indx, newelem, isNull, -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign); ! PG_RETURN_DATUM(result); } /*----------------------------------------------------------------------------- *************** array_append(PG_FUNCTION_ARGS) *** 147,158 **** Datum array_prepend(PG_FUNCTION_ARGS) { ! ArrayType *v; Datum newelem; bool isNull; ! ArrayType *result; int *lb; int indx; ArrayMetaState *my_extra; isNull = PG_ARGISNULL(0); --- 140,152 ---- Datum array_prepend(PG_FUNCTION_ARGS) { ! ExpandedArrayHeader *eah; Datum newelem; bool isNull; ! Datum result; int *lb; int indx; + int lb0; ArrayMetaState *my_extra; isNull = PG_ARGISNULL(0); *************** array_prepend(PG_FUNCTION_ARGS) *** 160,172 **** newelem = (Datum) 0; else newelem = PG_GETARG_DATUM(0); ! v = fetch_array_arg_replace_nulls(fcinfo, 1); ! if (ARR_NDIM(v) == 1) { /* prepend newelem */ ! lb = ARR_LBOUND(v); indx = lb[0] - 1; /* overflow? */ if (indx > lb[0]) --- 154,167 ---- newelem = (Datum) 0; else newelem = PG_GETARG_DATUM(0); ! 
eah = fetch_array_arg_replace_nulls(fcinfo, 1); ! if (eah->ndims == 1) { /* prepend newelem */ ! lb = eah->lbound; indx = lb[0] - 1; + lb0 = lb[0]; /* overflow? */ if (indx > lb[0]) *************** array_prepend(PG_FUNCTION_ARGS) *** 174,181 **** (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), errmsg("integer out of range"))); } ! else if (ARR_NDIM(v) == 0) indx = 1; else ereport(ERROR, (errcode(ERRCODE_DATA_EXCEPTION), --- 169,179 ---- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), errmsg("integer out of range"))); } ! else if (eah->ndims == 0) ! { indx = 1; + lb0 = 1; + } else ereport(ERROR, (errcode(ERRCODE_DATA_EXCEPTION), *************** array_prepend(PG_FUNCTION_ARGS) *** 184,197 **** /* Perform element insertion */ my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra; ! result = array_set(v, 1, &indx, newelem, isNull, -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign); /* Readjust result's LB to match the input's, as expected for prepend */ ! if (ARR_NDIM(v) == 1) ! ARR_LBOUND(result)[0] = ARR_LBOUND(v)[0]; ! PG_RETURN_ARRAYTYPE_P(result); } /*----------------------------------------------------------------------------- --- 182,200 ---- /* Perform element insertion */ my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra; ! result = array_set_element(EOHPGetRWDatum(&eah->hdr), ! 1, &indx, newelem, isNull, -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign); /* Readjust result's LB to match the input's, as expected for prepend */ ! Assert(result == EOHPGetRWDatum(&eah->hdr)); ! if (eah->ndims == 1) ! { ! /* This is ok whether we've deconstructed or not */ ! eah->lbound[0] = lb0; ! } ! 
      PG_RETURN_DATUM(result);
  }
  /*-----------------------------------------------------------------------------
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 9117a55..26fa648 100644
*** a/src/backend/utils/adt/arrayfuncs.c
--- b/src/backend/utils/adt/arrayfuncs.c
*************** bool Array_nulls = true;
*** 42,47 ****
--- 42,53 ----
   */
  #define ASSGN "="
+ #define AARR_FREE_IF_COPY(array,n) \
+     do { \
+         if (!VARATT_IS_EXPANDED_HEADER(array)) \
+             PG_FREE_IF_COPY(array, n); \
+     } while (0)
+
  typedef enum
  {
      ARRAY_NO_LEVEL,
*************** static void ReadArrayBinary(StringInfo b
*** 93,102 ****
                  int typlen, bool typbyval, char typalign,
                  Datum *values, bool *nulls,
                  bool *hasnulls, int32 *nbytes);
! static void CopyArrayEls(ArrayType *array,
!              Datum *values, bool *nulls, int nitems,
!              int typlen, bool typbyval, char typalign,
!              bool freedata);
  static bool array_get_isnull(const bits8 *nullbitmap, int offset);
  static void array_set_isnull(bits8 *nullbitmap, int offset, bool isNull);
  static Datum ArrayCast(char *value, bool byval, int len);
--- 99,114 ----
                  int typlen, bool typbyval, char typalign,
                  Datum *values, bool *nulls,
                  bool *hasnulls, int32 *nbytes);
! static Datum array_get_element_expanded(Datum arraydatum,
!                            int nSubscripts, int *indx,
!                            int arraytyplen,
!                            int elmlen, bool elmbyval, char elmalign,
!                            bool *isNull);
! static Datum array_set_element_expanded(Datum arraydatum,
!                            int nSubscripts, int *indx,
!                            Datum dataValue, bool isNull,
!                            int arraytyplen,
!                            int elmlen, bool elmbyval, char elmalign);
  static bool array_get_isnull(const bits8 *nullbitmap, int offset);
  static void array_set_isnull(bits8 *nullbitmap, int offset, bool isNull);
  static Datum ArrayCast(char *value, bool byval, int len);
*************** ReadArrayStr(char *arrayStr,
*** 939,945 ****
   * the values are not toasted. (Doing it here doesn't work since the
   * caller has already allocated space for the array...)
   */
! static void
  CopyArrayEls(ArrayType *array,
               Datum *values,
               bool *nulls,
--- 951,957 ----
   * the values are not toasted. (Doing it here doesn't work since the
   * caller has already allocated space for the array...)
   */
! void
  CopyArrayEls(ArrayType *array,
               Datum *values,
               bool *nulls,
*************** CopyArrayEls(ArrayType *array,
*** 997,1004 ****
  Datum
  array_out(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
!     Oid element_type = ARR_ELEMTYPE(v);
      int typlen;
      bool typbyval;
      char typalign;
--- 1009,1016 ----
  Datum
  array_out(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
!     Oid element_type = AARR_ELEMTYPE(v);
      int typlen;
      bool typbyval;
      char typalign;
*************** array_out(PG_FUNCTION_ARGS)
*** 1014,1021 ****
       *
       * +2 allows for assignment operator + trailing null
       */
-     bits8 *bitmap;
-     int bitmask;
      bool *needquotes,
           needdims = false;
      int nitems,
--- 1026,1031 ----
*************** array_out(PG_FUNCTION_ARGS)
*** 1027,1032 ****
--- 1037,1043 ----
      int ndim,
          *dims,
          *lb;
+     ARRAY_ITER ARRAY_ITER_VARS(iter);
      ArrayMetaState *my_extra;
      /*
*************** array_out(PG_FUNCTION_ARGS)
*** 1061,1069 ****
      typalign = my_extra->typalign;
      typdelim = my_extra->typdelim;
!     ndim = ARR_NDIM(v);
!     dims = ARR_DIMS(v);
!     lb = ARR_LBOUND(v);
      nitems = ArrayGetNItems(ndim, dims);
      if (nitems == 0)
--- 1072,1080 ----
      typalign = my_extra->typalign;
      typdelim = my_extra->typdelim;
!     ndim = AARR_NDIM(v);
!     dims = AARR_DIMS(v);
!     lb = AARR_LBOUND(v);
      nitems = ArrayGetNItems(ndim, dims);
      if (nitems == 0)
*************** array_out(PG_FUNCTION_ARGS)
*** 1094,1109 ****
      needquotes = (bool *) palloc(nitems * sizeof(bool));
      overall_length = 1; /* don't forget to count \0 at end. */
!     p = ARR_DATA_PTR(v);
!     bitmap = ARR_NULLBITMAP(v);
!     bitmask = 1;
      for (i = 0; i < nitems; i++)
      {
          bool needquote;
          /* Get source element, checking for NULL */
!         if (bitmap && (*bitmap & bitmask) == 0)
          {
              values[i] = pstrdup("NULL");
              overall_length += 4;
--- 1105,1122 ----
      needquotes = (bool *) palloc(nitems * sizeof(bool));
      overall_length = 1; /* don't forget to count \0 at end. */
!     ARRAY_ITER_SETUP(iter, v);
      for (i = 0; i < nitems; i++)
      {
+         Datum itemvalue;
+         bool isnull;
          bool needquote;
          /* Get source element, checking for NULL */
!         ARRAY_ITER_NEXT(iter, i, itemvalue, isnull, typlen, typbyval, typalign);
!
!         if (isnull)
          {
              values[i] = pstrdup("NULL");
              overall_length += 4;
*************** array_out(PG_FUNCTION_ARGS)
*** 1111,1122 ****
          }
          else
          {
-             Datum itemvalue;
-
-             itemvalue = fetch_att(p, typbyval, typlen);
              values[i] = OutputFunctionCall(&my_extra->proc, itemvalue);
-             p = att_addlength_pointer(p, typlen, p);
-             p = (char *) att_align_nominal(p, typalign);
              /* count data plus backslashes; detect chars needing quotes */
              if (values[i][0] == '\0')
--- 1124,1130 ----
*************** array_out(PG_FUNCTION_ARGS)
*** 1149,1165 ****
              overall_length += 2;
          /* and the comma */
          overall_length += 1;
-
-         /* advance bitmap pointer if any */
-         if (bitmap)
-         {
-             bitmask <<= 1;
-             if (bitmask == 0x100)
-             {
-                 bitmap++;
-                 bitmask = 1;
-             }
-         }
      }
      /*
--- 1157,1162 ----
*************** ReadArrayBinary(StringInfo buf,
*** 1534,1552 ****
  Datum
  array_send(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
!     Oid element_type = ARR_ELEMTYPE(v);
      int typlen;
      bool typbyval;
      char typalign;
-     char *p;
-     bits8 *bitmap;
-     int bitmask;
      int nitems,
          i;
      int ndim,
!         *dim;
      StringInfoData buf;
      ArrayMetaState *my_extra;
      /*
--- 1531,1548 ----
  Datum
  array_send(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
!     Oid element_type = AARR_ELEMTYPE(v);
      int typlen;
      bool typbyval;
      char typalign;
      int nitems,
          i;
      int ndim,
!         *dim,
!         *lb;
      StringInfoData buf;
+     ARRAY_ITER ARRAY_ITER_VARS(iter);
      ArrayMetaState *my_extra;
      /*
*************** array_send(PG_FUNCTION_ARGS)
*** 1583,1642 ****
      typbyval = my_extra->typbyval;
      typalign = my_extra->typalign;
!     ndim = ARR_NDIM(v);
!     dim = ARR_DIMS(v);
      nitems = ArrayGetNItems(ndim, dim);
      pq_begintypsend(&buf);
      /* Send the array header information */
      pq_sendint(&buf, ndim, 4);
!     pq_sendint(&buf, ARR_HASNULL(v) ? 1 : 0, 4);
      pq_sendint(&buf, element_type, sizeof(Oid));
      for (i = 0; i < ndim; i++)
      {
!         pq_sendint(&buf, ARR_DIMS(v)[i], 4);
!         pq_sendint(&buf, ARR_LBOUND(v)[i], 4);
      }
      /* Send the array elements using the element's own sendproc */
!     p = ARR_DATA_PTR(v);
!     bitmap = ARR_NULLBITMAP(v);
!     bitmask = 1;
      for (i = 0; i < nitems; i++)
      {
          /* Get source element, checking for NULL */
!         if (bitmap && (*bitmap & bitmask) == 0)
          {
              /* -1 length means a NULL */
              pq_sendint(&buf, -1, 4);
          }
          else
          {
-             Datum itemvalue;
              bytea *outputbytes;
-             itemvalue = fetch_att(p, typbyval, typlen);
              outputbytes = SendFunctionCall(&my_extra->proc, itemvalue);
              pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
              pq_sendbytes(&buf, VARDATA(outputbytes),
                           VARSIZE(outputbytes) - VARHDRSZ);
              pfree(outputbytes);
-
-             p = att_addlength_pointer(p, typlen, p);
-             p = (char *) att_align_nominal(p, typalign);
-         }
-
-         /* advance bitmap pointer if any */
-         if (bitmap)
-         {
-             bitmask <<= 1;
-             if (bitmask == 0x100)
-             {
-                 bitmap++;
-                 bitmask = 1;
-             }
          }
      }
--- 1579,1626 ----
      typbyval = my_extra->typbyval;
      typalign = my_extra->typalign;
!     ndim = AARR_NDIM(v);
!     dim = AARR_DIMS(v);
!     lb = AARR_LBOUND(v);
      nitems = ArrayGetNItems(ndim, dim);
      pq_begintypsend(&buf);
      /* Send the array header information */
      pq_sendint(&buf, ndim, 4);
!     pq_sendint(&buf, AARR_HASNULL(v) ? 1 : 0, 4);
      pq_sendint(&buf, element_type, sizeof(Oid));
      for (i = 0; i < ndim; i++)
      {
!         pq_sendint(&buf, dim[i], 4);
!         pq_sendint(&buf, lb[i], 4);
      }
      /* Send the array elements using the element's own sendproc */
!     ARRAY_ITER_SETUP(iter, v);
      for (i = 0; i < nitems; i++)
      {
+         Datum itemvalue;
+         bool isnull;
+
          /* Get source element, checking for NULL */
!         ARRAY_ITER_NEXT(iter, i, itemvalue, isnull, typlen, typbyval, typalign);
!
!         if (isnull)
          {
              /* -1 length means a NULL */
              pq_sendint(&buf, -1, 4);
          }
          else
          {
              bytea *outputbytes;
              outputbytes = SendFunctionCall(&my_extra->proc, itemvalue);
              pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4);
              pq_sendbytes(&buf, VARDATA(outputbytes),
                           VARSIZE(outputbytes) - VARHDRSZ);
              pfree(outputbytes);
          }
      }
*************** array_send(PG_FUNCTION_ARGS)
*** 1650,1662 ****
  Datum
  array_ndims(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
      /* Sanity check: does it look like an array at all? */
!     if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();
!     PG_RETURN_INT32(ARR_NDIM(v));
  }
  /*
--- 1634,1646 ----
  Datum
  array_ndims(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
      /* Sanity check: does it look like an array at all? */
!     if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();
!     PG_RETURN_INT32(AARR_NDIM(v));
  }
  /*
*************** array_ndims(PG_FUNCTION_ARGS)
*** 1666,1672 ****
  Datum
  array_dims(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
      char *p;
      int i;
      int *dimv,
--- 1650,1656 ----
  Datum
  array_dims(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
      char *p;
      int i;
      int *dimv,
*************** array_dims(PG_FUNCTION_ARGS)
*** 1680,1693 ****
      char buf[MAXDIM * 33 + 1];
      /* Sanity check: does it look like an array at all? */
!     if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();
!     dimv = ARR_DIMS(v);
!     lb = ARR_LBOUND(v);
      p = buf;
!     for (i = 0; i < ARR_NDIM(v); i++)
      {
          sprintf(p, "[%d:%d]", lb[i], dimv[i] + lb[i] - 1);
          p += strlen(p);
--- 1664,1677 ----
      char buf[MAXDIM * 33 + 1];
      /* Sanity check: does it look like an array at all? */
!     if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();
!     dimv = AARR_DIMS(v);
!     lb = AARR_LBOUND(v);
      p = buf;
!     for (i = 0; i < AARR_NDIM(v); i++)
      {
          sprintf(p, "[%d:%d]", lb[i], dimv[i] + lb[i] - 1);
          p += strlen(p);
*************** array_dims(PG_FUNCTION_ARGS)
*** 1704,1723 ****
  Datum
  array_lower(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
      int reqdim = PG_GETARG_INT32(1);
      int *lb;
      int result;
      /* Sanity check: does it look like an array at all? */
!     if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();
      /* Sanity check: was the requested dim valid */
!     if (reqdim <= 0 || reqdim > ARR_NDIM(v))
          PG_RETURN_NULL();
!     lb = ARR_LBOUND(v);
      result = lb[reqdim - 1];
      PG_RETURN_INT32(result);
--- 1688,1707 ----
  Datum
  array_lower(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
      int reqdim = PG_GETARG_INT32(1);
      int *lb;
      int result;
      /* Sanity check: does it look like an array at all? */
!     if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();
      /* Sanity check: was the requested dim valid */
!     if (reqdim <= 0 || reqdim > AARR_NDIM(v))
          PG_RETURN_NULL();
!     lb = AARR_LBOUND(v);
      result = lb[reqdim - 1];
      PG_RETURN_INT32(result);
*************** array_lower(PG_FUNCTION_ARGS)
*** 1731,1752 ****
  Datum
  array_upper(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
      int reqdim = PG_GETARG_INT32(1);
      int *dimv,
          *lb;
      int result;
      /* Sanity check: does it look like an array at all? */
!     if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();
      /* Sanity check: was the requested dim valid */
!     if (reqdim <= 0 || reqdim > ARR_NDIM(v))
          PG_RETURN_NULL();
!     lb = ARR_LBOUND(v);
!     dimv = ARR_DIMS(v);
      result = dimv[reqdim - 1] + lb[reqdim - 1] - 1;
--- 1715,1736 ----
  Datum
  array_upper(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
      int reqdim = PG_GETARG_INT32(1);
      int *dimv,
          *lb;
      int result;
      /* Sanity check: does it look like an array at all? */
!     if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();
      /* Sanity check: was the requested dim valid */
!     if (reqdim <= 0 || reqdim > AARR_NDIM(v))
          PG_RETURN_NULL();
!     lb = AARR_LBOUND(v);
!     dimv = AARR_DIMS(v);
      result = dimv[reqdim - 1] + lb[reqdim - 1] - 1;
*************** array_upper(PG_FUNCTION_ARGS)
*** 1761,1780 ****
  Datum
  array_length(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
      int reqdim = PG_GETARG_INT32(1);
      int *dimv;
      int result;
      /* Sanity check: does it look like an array at all? */
!     if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();
      /* Sanity check: was the requested dim valid */
!     if (reqdim <= 0 || reqdim > ARR_NDIM(v))
          PG_RETURN_NULL();
!     dimv = ARR_DIMS(v);
      result = dimv[reqdim - 1];
--- 1745,1764 ----
  Datum
  array_length(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
      int reqdim = PG_GETARG_INT32(1);
      int *dimv;
      int result;
      /* Sanity check: does it look like an array at all? */
!     if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM)
          PG_RETURN_NULL();
      /* Sanity check: was the requested dim valid */
!     if (reqdim <= 0 || reqdim > AARR_NDIM(v))
          PG_RETURN_NULL();
!     dimv = AARR_DIMS(v);
      result = dimv[reqdim - 1];
*************** array_length(PG_FUNCTION_ARGS)
*** 1788,1796 ****
  Datum
  array_cardinality(PG_FUNCTION_ARGS)
  {
!     ArrayType *v = PG_GETARG_ARRAYTYPE_P(0);
!     PG_RETURN_INT32(ArrayGetNItems(ARR_NDIM(v), ARR_DIMS(v)));
  }
--- 1772,1780 ----
  Datum
  array_cardinality(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
!     PG_RETURN_INT32(ArrayGetNItems(AARR_NDIM(v), AARR_DIMS(v)));
  }
*************** array_get_element(Datum arraydatum,
*** 1825,1831 ****
                    char elmalign,
                    bool *isNull)
  {
-     ArrayType *array;
      int i,
          ndim,
          *dim,
--- 1809,1814 ----
*************** array_get_element(Datum arraydatum,
*** 1850,1859 ****
          arraydataptr = (char *) DatumGetPointer(arraydatum);
          arraynullsptr = NULL;
      }
      else
      {
!         /* detoast input array if necessary */
!         array = DatumGetArrayTypeP(arraydatum);
          ndim = ARR_NDIM(array);
          dim = ARR_DIMS(array);
--- 1833,1854 ----
          arraydataptr = (char *) DatumGetPointer(arraydatum);
          arraynullsptr = NULL;
      }
+     else if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(arraydatum)))
+     {
+         /* expanded array: let's do this in a separate function */
+         return array_get_element_expanded(arraydatum,
+                                           nSubscripts,
+                                           indx,
+                                           arraytyplen,
+                                           elmlen,
+                                           elmbyval,
+                                           elmalign,
+                                           isNull);
+     }
      else
      {
!         /* detoast array if necessary, producing normal varlena input */
!         ArrayType *array = DatumGetArrayTypeP(arraydatum);
          ndim = ARR_NDIM(array);
          dim = ARR_DIMS(array);
*************** array_get_element(Datum arraydatum,
*** 1903,1908 ****
--- 1898,1985 ----
  }
  /*
+  * Implementation of array_get_element() for an expanded array
+  */
+ static Datum
+ array_get_element_expanded(Datum arraydatum,
+                            int nSubscripts, int *indx,
+                            int arraytyplen,
+                            int elmlen, bool elmbyval, char elmalign,
+                            bool *isNull)
+ {
+     ExpandedArrayHeader *eah;
+     int i,
+         ndim,
+         *dim,
+         *lb,
+         offset;
+     Datum *dvalues;
+     bool *dnulls;
+
+     eah = (ExpandedArrayHeader *) DatumGetEOHP(arraydatum);
+     Assert(eah->ea_magic == EA_MAGIC);
+
+     /* sanity-check caller's info against object */
+     Assert(arraytyplen == -1);
+     Assert(elmlen == eah->typlen);
+     Assert(elmbyval == eah->typbyval);
+     Assert(elmalign == eah->typalign);
+
+     ndim = eah->ndims;
+     dim = eah->dims;
+     lb = eah->lbound;
+
+     /*
+      * Return NULL for invalid subscript
+      */
+     if (ndim != nSubscripts || ndim <= 0 || ndim > MAXDIM)
+     {
+         *isNull = true;
+         return (Datum) 0;
+     }
+     for (i = 0; i < ndim; i++)
+     {
+         if (indx[i] < lb[i] || indx[i] >= (dim[i] + lb[i]))
+         {
+             *isNull = true;
+             return (Datum) 0;
+         }
+     }
+
+     /*
+      * Calculate the element number
+      */
+     offset = ArrayGetOffset(nSubscripts, dim, lb, indx);
+
+     /*
+      * Deconstruct array if we didn't already. Note that we apply this even
+      * if the input is nominally read-only: it should be safe enough.
+      */
+     deconstruct_expanded_array(eah);
+
+     dvalues = eah->dvalues;
+     dnulls = eah->dnulls;
+
+     /*
+      * Check for NULL array element
+      */
+     if (dnulls && dnulls[offset])
+     {
+         *isNull = true;
+         return (Datum) 0;
+     }
+
+     /*
+      * OK, get the element. It's OK to return a pass-by-ref value as a
+      * pointer into the expanded array, for the same reason that regular
+      * array_get_element can return a pointer into flat arrays: the value is
+      * assumed not to change for as long as the Datum reference can exist.
+      */
+     *isNull = false;
+     return dvalues[offset];
+ }
+
+ /*
   * array_get_slice :
   * This routine takes an array and a range of indices (upperIndex and
   * lowerIndx), creates a new array structure for the referred elements
*************** array_get_slice(Datum arraydatum,
*** 2083,2089 ****
   *
   * Result:
   * A new array is returned, just like the old except for the one
!  * modified entry. The original array object is not changed.
   *
   * For one-dimensional arrays only, we allow the array to be extended
   * by assigning to a position outside the existing subscript range; any
--- 2160,2168 ----
   *
   * Result:
   * A new array is returned, just like the old except for the one
!  * modified entry. The original array object is not changed,
!  * unless what is passed is a read-write reference to an expanded
!  * array object; in that case the expanded array is updated in-place.
   *
   * For one-dimensional arrays only, we allow the array to be extended
   * by assigning to a position outside the existing subscript range; any
*************** array_set_element(Datum arraydatum,
*** 2166,2171 ****
--- 2245,2264 ----
      if (elmlen == -1 && !isNull)
          dataValue = PointerGetDatum(PG_DETOAST_DATUM(dataValue));
+     if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(arraydatum)))
+     {
+         /* expanded array: let's do this in a separate function */
+         return array_set_element_expanded(arraydatum,
+                                           nSubscripts,
+                                           indx,
+                                           dataValue,
+                                           isNull,
+                                           arraytyplen,
+                                           elmlen,
+                                           elmbyval,
+                                           elmalign);
+     }
+
      /* detoast input array if necessary */
      array = DatumGetArrayTypeP(arraydatum);
*************** array_set_element(Datum arraydatum,
*** 2355,2360 ****
--- 2448,2698 ----
  }
  /*
+  * Implementation of array_set_element() for an expanded array
+  *
+  * Note: as with any operation on a read/write expanded object, we must
+  * take pains not to leave the object in a corrupt state if we fail partway
+  * through.
+  */
+ static Datum
+ array_set_element_expanded(Datum arraydatum,
+                            int nSubscripts, int *indx,
+                            Datum dataValue, bool isNull,
+                            int arraytyplen,
+                            int elmlen, bool elmbyval, char elmalign)
+ {
+     ExpandedArrayHeader *eah;
+     Datum *dvalues;
+     bool *dnulls;
+     int i,
+         ndim,
+         dim[MAXDIM],
+         lb[MAXDIM],
+         offset;
+     bool dimschanged,
+          newhasnulls;
+     int addedbefore,
+         addedafter;
+     char *oldValue;
+
+     /* Convert to R/W object if not so already */
+     eah = DatumGetExpandedArray(arraydatum);
+
+     /* Sanity-check caller's info against object; we don't use it otherwise */
+     Assert(arraytyplen == -1);
+     Assert(elmlen == eah->typlen);
+     Assert(elmbyval == eah->typbyval);
+     Assert(elmalign == eah->typalign);
+
+     /*
+      * Copy dimension info into local storage. This allows us to modify the
+      * dimensions if needed, while not messing up the expanded value if we
+      * fail partway through.
+      */
+     ndim = eah->ndims;
+     Assert(ndim >= 0 && ndim <= MAXDIM);
+     memcpy(dim, eah->dims, ndim * sizeof(int));
+     memcpy(lb, eah->lbound, ndim * sizeof(int));
+     dimschanged = false;
+
+     /*
+      * if number of dims is zero, i.e. an empty array, create an array with
+      * nSubscripts dimensions, and set the lower bounds to the supplied
+      * subscripts.
+      */
+     if (ndim == 0)
+     {
+         /*
+          * Allocate adequate space for new dimension info. This is harmless
+          * if we fail later.
+          */
+         Assert(nSubscripts > 0 && nSubscripts <= MAXDIM);
+         eah->dims = (int *) MemoryContextAllocZero(eah->hdr.eoh_context,
+                                                    nSubscripts * sizeof(int));
+         eah->lbound = (int *) MemoryContextAllocZero(eah->hdr.eoh_context,
+                                                      nSubscripts * sizeof(int));
+
+         /* Update local copies of dimension info */
+         ndim = nSubscripts;
+         for (i = 0; i < nSubscripts; i++)
+         {
+             dim[i] = 0;
+             lb[i] = indx[i];
+         }
+         dimschanged = true;
+     }
+     else if (ndim != nSubscripts)
+         ereport(ERROR,
+                 (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+                  errmsg("wrong number of array subscripts")));
+
+     /*
+      * Deconstruct array if we didn't already. (Someday maybe add a special
+      * case path for fixed-length, no-nulls cases, where we can overwrite an
+      * element in place without ever deconstructing. But today is not that
+      * day.)
+      */
+     deconstruct_expanded_array(eah);
+
+     /*
+      * Copy new element into array's context, if needed (we assume it's
+      * already detoasted, so no junk should be created). If we fail further
+      * down, this memory is leaked, but that's reasonably harmless.
+      */
+     if (!eah->typbyval && !isNull)
+     {
+         MemoryContext oldcxt = MemoryContextSwitchTo(eah->hdr.eoh_context);
+
+         dataValue = datumCopy(dataValue, false, eah->typlen);
+         MemoryContextSwitchTo(oldcxt);
+     }
+
+     dvalues = eah->dvalues;
+     dnulls = eah->dnulls;
+
+     newhasnulls = ((dnulls != NULL) || isNull);
+     addedbefore = addedafter = 0;
+
+     /*
+      * Check subscripts (this logic matches original array_set_element)
+      */
+     if (ndim == 1)
+     {
+         if (indx[0] < lb[0])
+         {
+             addedbefore = lb[0] - indx[0];
+             dim[0] += addedbefore;
+             lb[0] = indx[0];
+             dimschanged = true;
+             if (addedbefore > 1)
+                 newhasnulls = true; /* will insert nulls */
+         }
+         if (indx[0] >= (dim[0] + lb[0]))
+         {
+             addedafter = indx[0] - (dim[0] + lb[0]) + 1;
+             dim[0] += addedafter;
+             dimschanged = true;
+             if (addedafter > 1)
+                 newhasnulls = true; /* will insert nulls */
+         }
+     }
+     else
+     {
+         /*
+          * XXX currently we do not support extending multi-dimensional arrays
+          * during assignment
+          */
+         for (i = 0; i < ndim; i++)
+         {
+             if (indx[i] < lb[i] ||
+                 indx[i] >= (dim[i] + lb[i]))
+                 ereport(ERROR,
+                         (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+                          errmsg("array subscript out of range")));
+         }
+     }
+
+     /* Now we can calculate linear offset of target item in array */
+     offset = ArrayGetOffset(nSubscripts, dim, lb, indx);
+
+     /* Physically enlarge existing dvalues/dnulls arrays if needed */
+     if (dim[0] > eah->dvalueslen)
+     {
+         /* We want some extra space if we're enlarging */
+         int newlen = dim[0] + dim[0] / 8;
+
+         newlen = Max(newlen, dim[0]); /* integer overflow guard */
+         eah->dvalues = dvalues = (Datum *)
+             repalloc(dvalues, newlen * sizeof(Datum));
+         if (dnulls)
+             eah->dnulls = dnulls = (bool *)
+                 repalloc(dnulls, newlen * sizeof(bool));
+         eah->dvalueslen = newlen;
+     }
+
+     /*
+      * If we need a nulls bitmap and don't already have one, create it, being
+      * sure to mark all existing entries as not null.
+      */
+     if (newhasnulls && dnulls == NULL)
+         eah->dnulls = dnulls = (bool *)
+             MemoryContextAllocZero(eah->hdr.eoh_context,
+                                    eah->dvalueslen * sizeof(bool));
+
+     /*
+      * We now have all the needed space allocated, so we're ready to make
+      * irreversible changes. Be very wary of allowing failure below here.
+      */
+
+     /* Flattened value will no longer represent array accurately */
+     eah->fvalue = NULL;
+     /* And we don't know the flattened size either */
+     eah->flat_size = 0;
+
+     /* Update dimensionality info if needed */
+     if (dimschanged)
+     {
+         eah->ndims = ndim;
+         memcpy(eah->dims, dim, ndim * sizeof(int));
+         memcpy(eah->lbound, lb, ndim * sizeof(int));
+     }
+
+     /* Reposition items if needed, and fill addedbefore items with nulls */
+     if (addedbefore > 0)
+     {
+         memmove(dvalues + addedbefore, dvalues, eah->nelems * sizeof(Datum));
+         for (i = 0; i < addedbefore; i++)
+             dvalues[i] = (Datum) 0;
+         if (dnulls)
+         {
+             memmove(dnulls + addedbefore, dnulls, eah->nelems * sizeof(bool));
+             for (i = 0; i < addedbefore; i++)
+                 dnulls[i] = true;
+         }
+         eah->nelems += addedbefore;
+     }
+
+     /* fill addedafter items with nulls */
+     if (addedafter > 0)
+     {
+         for (i = 0; i < addedafter; i++)
+             dvalues[eah->nelems + i] = (Datum) 0;
+         if (dnulls)
+         {
+             for (i = 0; i < addedafter; i++)
+                 dnulls[eah->nelems + i] = true;
+         }
+         eah->nelems += addedafter;
+     }
+
+     /* Grab old element value for pfree'ing, if needed. */
+     if (!eah->typbyval && (dnulls == NULL || !dnulls[offset]))
+         oldValue = (char *) DatumGetPointer(dvalues[offset]);
+     else
+         oldValue = NULL;
+
+     /* And finally we can insert the new element. */
+     dvalues[offset] = dataValue;
+     if (dnulls)
+         dnulls[offset] = isNull;
+
+     /*
+      * Free old element if needed; this keeps repeated element replacements
+      * from bloating the array's storage. If the pfree somehow fails, it
+      * won't corrupt the array.
+      */
+     if (oldValue)
+     {
+         /* Don't try to pfree a part of the original flat array */
+         if (oldValue < eah->fstartptr || oldValue >= eah->fendptr)
+             pfree(oldValue);
+     }
+
+     /* Done, return standard TOAST pointer for object */
+     return EOHPGetRWDatum(&eah->hdr);
+ }
+
+ /*
   * array_set_slice :
   * This routine sets the value of a range of array locations (specified
   * by upper and lower subscript values) to new values passed as
*************** array_set(ArrayType *array, int nSubscri
*** 2734,2741 ****
   * the function fn(), and if nargs > 1 then argument positions after the
   * first must be preset to the additional values to be passed. The
   * first argument position initially holds the input array value.
-  * * inpType: OID of element type of input array. This must be the same as,
-  * or binary-compatible with, the first argument type of fn().
   * * retType: OID of element type of output array. This must be the same as,
   * or binary-compatible with, the result type of fn().
   * * amstate: workspace for array_map. Must be zeroed by caller before
--- 3072,3077 ----
*************** array_set(ArrayType *array, int nSubscri
*** 2749,2762 ****
   * the array are OK however.
   */
  Datum
! array_map(FunctionCallInfo fcinfo, Oid inpType, Oid retType,
!           ArrayMapState *amstate)
  {
!     ArrayType *v;
      ArrayType *result;
      Datum *values;
      bool *nulls;
-     Datum elt;
      int *dim;
      int ndim;
      int nitems;
--- 3085,3096 ----
   * the array are OK however.
   */
  Datum
! array_map(FunctionCallInfo fcinfo, Oid retType, ArrayMapState *amstate)
  {
!     AnyArrayType *v;
      ArrayType *result;
      Datum *values;
      bool *nulls;
      int *dim;
      int ndim;
      int nitems;
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2764,2778 ****
      int32 nbytes = 0;
      int32 dataoffset;
      bool hasnulls;
      int inp_typlen;
      bool inp_typbyval;
      char inp_typalign;
      int typlen;
      bool typbyval;
      char typalign;
!     char *s;
!     bits8 *bitmap;
!     int bitmask;
      ArrayMetaState *inp_extra;
      ArrayMetaState *ret_extra;
--- 3098,3111 ----
      int32 nbytes = 0;
      int32 dataoffset;
      bool hasnulls;
+     Oid inpType;
      int inp_typlen;
      bool inp_typbyval;
      char inp_typalign;
      int typlen;
      bool typbyval;
      char typalign;
!     ARRAY_ITER ARRAY_ITER_VARS(iter);
      ArrayMetaState *inp_extra;
      ArrayMetaState *ret_extra;
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2781,2792 ****
          elog(ERROR, "invalid nargs: %d", fcinfo->nargs);
      if (PG_ARGISNULL(0))
          elog(ERROR, "null input array");
!     v = PG_GETARG_ARRAYTYPE_P(0);
!
!     Assert(ARR_ELEMTYPE(v) == inpType);
!     ndim = ARR_NDIM(v);
!     dim = ARR_DIMS(v);
      nitems = ArrayGetNItems(ndim, dim);
      /* Check for empty array */
--- 3114,3124 ----
          elog(ERROR, "invalid nargs: %d", fcinfo->nargs);
      if (PG_ARGISNULL(0))
          elog(ERROR, "null input array");
!     v = PG_GETARG_ANY_ARRAY(0);
!     inpType = AARR_ELEMTYPE(v);
!     ndim = AARR_NDIM(v);
!     dim = AARR_DIMS(v);
      nitems = ArrayGetNItems(ndim, dim);
      /* Check for empty array */
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2833,2841 ****
      nulls = (bool *) palloc(nitems * sizeof(bool));
      /* Loop over source data */
!     s = ARR_DATA_PTR(v);
!     bitmap = ARR_NULLBITMAP(v);
!     bitmask = 1;
      hasnulls = false;
      for (i = 0; i < nitems; i++)
--- 3165,3171 ----
      nulls = (bool *) palloc(nitems * sizeof(bool));
      /* Loop over source data */
!     ARRAY_ITER_SETUP(iter, v);
      hasnulls = false;
      for (i = 0; i < nitems; i++)
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2843,2860 ****
          bool callit = true;
          /* Get source element, checking for NULL */
!         if (bitmap && (*bitmap & bitmask) == 0)
!         {
!             fcinfo->argnull[0] = true;
!         }
!         else
!         {
!             elt = fetch_att(s, inp_typbyval, inp_typlen);
!             s = att_addlength_datum(s, inp_typlen, elt);
!             s = (char *) att_align_nominal(s, inp_typalign);
!             fcinfo->arg[0] = elt;
!             fcinfo->argnull[0] = false;
!         }
          /*
           * Apply the given function to source elt and extra args.
--- 3173,3180 ----
          bool callit = true;
          /* Get source element, checking for NULL */
!         ARRAY_ITER_NEXT(iter, i, fcinfo->arg[0], fcinfo->argnull[0],
!                         inp_typlen, inp_typbyval, inp_typalign);
          /*
           * Apply the given function to source elt and extra args.
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2899,2915 ****
                       errmsg("array size exceeds the maximum allowed (%d)",
                              (int) MaxAllocSize)));
          }
-
-         /* advance bitmap pointer if any */
-         if (bitmap)
-         {
-             bitmask <<= 1;
-             if (bitmask == 0x100)
-             {
-                 bitmap++;
-                 bitmask = 1;
-             }
-         }
      }
      /* Allocate and initialize the result array */
--- 3219,3224 ----
*************** array_map(FunctionCallInfo fcinfo, Oid i
*** 2928,2934 ****
      result->ndim = ndim;
      result->dataoffset = dataoffset;
      result->elemtype = retType;
!     memcpy(ARR_DIMS(result), ARR_DIMS(v), 2 * ndim * sizeof(int));
      /*
       * Note: do not risk trying to pfree the results of the called function
--- 3237,3244 ----
      result->ndim = ndim;
      result->dataoffset = dataoffset;
      result->elemtype = retType;
!     memcpy(ARR_DIMS(result), AARR_DIMS(v), ndim * sizeof(int));
!     memcpy(ARR_LBOUND(result), AARR_LBOUND(v), ndim * sizeof(int));
      /*
       * Note: do not risk trying to pfree the results of the called function
*************** construct_empty_array(Oid elmtype)
*** 3092,3097 ****
--- 3402,3424 ----
  }
  /*
+  * construct_empty_expanded_array: make an empty expanded array
+  * given only type information. (metacache can be NULL if not needed.)
+  */
+ ExpandedArrayHeader *
+ construct_empty_expanded_array(Oid element_type,
+                                MemoryContext parentcontext,
+                                ArrayMetaState *metacache)
+ {
+     ArrayType *array = construct_empty_array(element_type);
+     Datum d;
+
+     d = expand_array(PointerGetDatum(array), parentcontext, metacache);
+     pfree(array);
+     return (ExpandedArrayHeader *) DatumGetEOHP(d);
+ }
+
+ /*
   * deconstruct_array --- simple method for extracting data from an array
   *
   * array: array object to examine (must not be NULL)
*************** array_contains_nulls(ArrayType *array)
*** 3229,3264 ****
  Datum
  array_eq(PG_FUNCTION_ARGS)
  {
!     ArrayType *array1 = PG_GETARG_ARRAYTYPE_P(0);
!     ArrayType *array2 = PG_GETARG_ARRAYTYPE_P(1);
      Oid collation = PG_GET_COLLATION();
!     int ndims1 = ARR_NDIM(array1);
!     int ndims2 = ARR_NDIM(array2);
!     int *dims1 = ARR_DIMS(array1);
!     int *dims2 = ARR_DIMS(array2);
!     Oid element_type = ARR_ELEMTYPE(array1);
      bool result = true;
      int nitems;
      TypeCacheEntry *typentry;
      int typlen;
      bool typbyval;
      char typalign;
!     char *ptr1;
!     char *ptr2;
!     bits8 *bitmap1;
!     bits8 *bitmap2;
!     int bitmask;
      int i;
      FunctionCallInfoData locfcinfo;
!     if (element_type != ARR_ELEMTYPE(array2))
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
                   errmsg("cannot compare arrays of different element types")));
      /* fast path if the arrays do not have the same dimensionality */
      if (ndims1 != ndims2 ||
!         memcmp(dims1, dims2, 2 * ndims1 * sizeof(int)) != 0)
          result = false;
      else
      {
--- 3556,3591 ----
  Datum
  array_eq(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
!     AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
      Oid collation = PG_GET_COLLATION();
!     int ndims1 = AARR_NDIM(array1);
!     int ndims2 = AARR_NDIM(array2);
!     int *dims1 = AARR_DIMS(array1);
!     int *dims2 = AARR_DIMS(array2);
!     int *lbs1 = AARR_LBOUND(array1);
!     int *lbs2 = AARR_LBOUND(array2);
!     Oid element_type = AARR_ELEMTYPE(array1);
      bool result = true;
      int nitems;
      TypeCacheEntry *typentry;
      int typlen;
      bool typbyval;
      char typalign;
!     ARRAY_ITER ARRAY_ITER_VARS(it1);
!     ARRAY_ITER ARRAY_ITER_VARS(it2);
      int i;
      FunctionCallInfoData locfcinfo;
!     if (element_type != AARR_ELEMTYPE(array2))
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
                   errmsg("cannot compare arrays of different element types")));
      /* fast path if the arrays do not have the same dimensionality */
      if (ndims1 != ndims2 ||
!         memcmp(dims1, dims2, ndims1 * sizeof(int)) != 0 ||
!         memcmp(lbs1, lbs2, ndims1 * sizeof(int)) != 0)
          result = false;
      else
      {
*************** array_eq(PG_FUNCTION_ARGS)
*** 3293,3303 ****
          /* Loop over source data */
          nitems = ArrayGetNItems(ndims1, dims1);
!         ptr1 = ARR_DATA_PTR(array1);
!         ptr2 = ARR_DATA_PTR(array2);
!         bitmap1 = ARR_NULLBITMAP(array1);
!         bitmap2 = ARR_NULLBITMAP(array2);
!         bitmask = 1; /* use same bitmask for both arrays */
          for (i = 0; i < nitems; i++)
          {
--- 3620,3627 ----
          /* Loop over source data */
          nitems = ArrayGetNItems(ndims1, dims1);
!         ARRAY_ITER_SETUP(it1, array1);
!         ARRAY_ITER_SETUP(it2, array2);
          for (i = 0; i < nitems; i++)
          {
*************** array_eq(PG_FUNCTION_ARGS)
*** 3308,3349 ****
              bool oprresult;
              /* Get elements, checking for NULL */
!             if (bitmap1 && (*bitmap1 & bitmask) == 0)
!             {
!                 isnull1 = true;
!                 elt1 = (Datum) 0;
!             }
!             else
!             {
!                 isnull1 = false;
!                 elt1 = fetch_att(ptr1, typbyval, typlen);
!                 ptr1 = att_addlength_pointer(ptr1, typlen, ptr1);
!                 ptr1 = (char *) att_align_nominal(ptr1, typalign);
!             }
!
!             if (bitmap2 && (*bitmap2 & bitmask) == 0)
!             {
!                 isnull2 = true;
!                 elt2 = (Datum) 0;
!             }
!             else
!             {
!                 isnull2 = false;
!                 elt2 = fetch_att(ptr2, typbyval, typlen);
!                 ptr2 = att_addlength_pointer(ptr2, typlen, ptr2);
!                 ptr2 = (char *) att_align_nominal(ptr2, typalign);
!             }
!
!             /* advance bitmap pointers if any */
!             bitmask <<= 1;
!             if (bitmask == 0x100)
!             {
!                 if (bitmap1)
!                     bitmap1++;
!                 if (bitmap2)
!                     bitmap2++;
!                 bitmask = 1;
!             }
              /*
               * We consider two NULLs equal; NULL and not-NULL are unequal.
--- 3632,3639 ----
              bool oprresult;
              /* Get elements, checking for NULL */
!             ARRAY_ITER_NEXT(it1, i, elt1, isnull1, typlen, typbyval, typalign);
!             ARRAY_ITER_NEXT(it2, i, elt2, isnull2, typlen, typbyval, typalign);
              /*
               * We consider two NULLs equal; NULL and not-NULL are unequal.
*************** array_eq(PG_FUNCTION_ARGS)
*** 3374,3381 ****
      }
      /* Avoid leaking memory when handed toasted input. */
!     PG_FREE_IF_COPY(array1, 0);
!     PG_FREE_IF_COPY(array2, 1);
      PG_RETURN_BOOL(result);
  }
--- 3664,3671 ----
      }
      /* Avoid leaking memory when handed toasted input. */
!     AARR_FREE_IF_COPY(array1, 0);
!     AARR_FREE_IF_COPY(array2, 1);
      PG_RETURN_BOOL(result);
  }
*************** btarraycmp(PG_FUNCTION_ARGS)
*** 3435,3465 ****
  static int
  array_cmp(FunctionCallInfo fcinfo)
  {
!     ArrayType *array1 = PG_GETARG_ARRAYTYPE_P(0);
!     ArrayType *array2 = PG_GETARG_ARRAYTYPE_P(1);
      Oid collation = PG_GET_COLLATION();
!     int ndims1 = ARR_NDIM(array1);
!     int ndims2 = ARR_NDIM(array2);
!     int *dims1 = ARR_DIMS(array1);
!     int *dims2 = ARR_DIMS(array2);
      int nitems1 = ArrayGetNItems(ndims1, dims1);
      int nitems2 = ArrayGetNItems(ndims2, dims2);
!     Oid element_type = ARR_ELEMTYPE(array1);
      int result = 0;
      TypeCacheEntry *typentry;
      int typlen;
      bool typbyval;
      char typalign;
      int min_nitems;
!     char *ptr1;
!     char *ptr2;
!     bits8 *bitmap1;
!     bits8 *bitmap2;
!     int bitmask;
      int i;
      FunctionCallInfoData locfcinfo;
!     if (element_type != ARR_ELEMTYPE(array2))
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
                   errmsg("cannot compare arrays of different element types")));
--- 3725,3752 ----
  static int
  array_cmp(FunctionCallInfo fcinfo)
  {
!     AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
!     AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
      Oid collation = PG_GET_COLLATION();
!     int ndims1 = AARR_NDIM(array1);
!     int ndims2 = AARR_NDIM(array2);
!     int *dims1 = AARR_DIMS(array1);
!     int *dims2 = AARR_DIMS(array2);
      int nitems1 = ArrayGetNItems(ndims1, dims1);
      int nitems2 = ArrayGetNItems(ndims2, dims2);
!     Oid element_type = AARR_ELEMTYPE(array1);
      int result = 0;
      TypeCacheEntry *typentry;
      int typlen;
      bool typbyval;
      char typalign;
      int min_nitems;
!     ARRAY_ITER ARRAY_ITER_VARS(it1);
!     ARRAY_ITER ARRAY_ITER_VARS(it2);
      int i;
      FunctionCallInfoData locfcinfo;
!     if (element_type != AARR_ELEMTYPE(array2))
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
                   errmsg("cannot compare arrays of different element types")));
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3495,3505 ****
      /* Loop over source data */
      min_nitems = Min(nitems1, nitems2);
!     ptr1 = ARR_DATA_PTR(array1);
!     ptr2 = ARR_DATA_PTR(array2);
!     bitmap1 = ARR_NULLBITMAP(array1);
!     bitmap2 = ARR_NULLBITMAP(array2);
!     bitmask = 1; /* use same bitmask for both arrays */
      for (i = 0; i < min_nitems; i++)
      {
--- 3782,3789 ----
      /* Loop over source data */
      min_nitems = Min(nitems1, nitems2);
!     ARRAY_ITER_SETUP(it1, array1);
!     ARRAY_ITER_SETUP(it2, array2);
      for (i = 0; i < min_nitems; i++)
      {
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3510,3551 ****
          int32 cmpresult;
          /* Get elements, checking for NULL */
!         if (bitmap1 && (*bitmap1 & bitmask) == 0)
!         {
!             isnull1 = true;
!             elt1 = (Datum) 0;
!         }
!         else
!         {
!             isnull1 = false;
!             elt1 = fetch_att(ptr1, typbyval, typlen);
!             ptr1 = att_addlength_pointer(ptr1, typlen, ptr1);
!             ptr1 = (char *) att_align_nominal(ptr1, typalign);
!         }
!
!         if (bitmap2 && (*bitmap2 & bitmask) == 0)
!         {
!             isnull2 = true;
!             elt2 = (Datum) 0;
!         }
!         else
!         {
!             isnull2 = false;
!             elt2 = fetch_att(ptr2, typbyval, typlen);
!             ptr2 = att_addlength_pointer(ptr2, typlen, ptr2);
!             ptr2 = (char *) att_align_nominal(ptr2, typalign);
!         }
!
!         /* advance bitmap pointers if any */
!         bitmask <<= 1;
!         if (bitmask == 0x100)
!         {
!             if (bitmap1)
!                 bitmap1++;
!             if (bitmap2)
!                 bitmap2++;
!             bitmask = 1;
!         }
          /*
           * We consider two NULLs equal; NULL > not-NULL.
--- 3794,3801 ----
          int32 cmpresult;
          /* Get elements, checking for NULL */
!         ARRAY_ITER_NEXT(it1, i, elt1, isnull1, typlen, typbyval, typalign);
!         ARRAY_ITER_NEXT(it2, i, elt2, isnull2, typlen, typbyval, typalign);
          /*
           * We consider two NULLs equal; NULL > not-NULL.
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3604,3611 ****
              result = (ndims1 < ndims2) ? -1 : 1;
          else
          {
!             /* this relies on LB array immediately following DIMS array */
!             for (i = 0; i < ndims1 * 2; i++)
              {
                  if (dims1[i] != dims2[i])
                  {
--- 3854,3860 ----
              result = (ndims1 < ndims2) ? -1 : 1;
          else
          {
!             for (i = 0; i < ndims1; i++)
              {
                  if (dims1[i] != dims2[i])
                  {
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3613,3624 ****
                      break;
                  }
              }
          }
      }
      /* Avoid leaking memory when handed toasted input. */
!     PG_FREE_IF_COPY(array1, 0);
!     PG_FREE_IF_COPY(array2, 1);
      return result;
  }
--- 3862,3887 ----
                      break;
                  }
              }
+             if (result == 0)
+             {
+                 int *lbound1 = AARR_LBOUND(array1);
+                 int *lbound2 = AARR_LBOUND(array2);
+
+                 for (i = 0; i < ndims1; i++)
+                 {
+                     if (lbound1[i] != lbound2[i])
+                     {
+                         result = (lbound1[i] < lbound2[i]) ? -1 : 1;
+                         break;
+                     }
+                 }
+             }
          }
      }
      /* Avoid leaking memory when handed toasted input. */
!     AARR_FREE_IF_COPY(array1, 0);
!     AARR_FREE_IF_COPY(array2, 1);
      return result;
  }
*************** array_cmp(FunctionCallInfo fcinfo)
*** 3633,3652 ****
  Datum
  hash_array(PG_FUNCTION_ARGS)
  {
!     ArrayType *array = PG_GETARG_ARRAYTYPE_P(0);
!     int ndims = ARR_NDIM(array);
!     int *dims = ARR_DIMS(array);
!     Oid element_type = ARR_ELEMTYPE(array);
      uint32 result = 1;
      int nitems;
      TypeCacheEntry *typentry;
      int typlen;
      bool typbyval;
      char typalign;
-     char *ptr;
-     bits8 *bitmap;
-     int bitmask;
      int i;
      FunctionCallInfoData locfcinfo;
      /*
--- 3896,3913 ----
  Datum
  hash_array(PG_FUNCTION_ARGS)
  {
!     AnyArrayType *array = PG_GETARG_ANY_ARRAY(0);
!     int ndims = AARR_NDIM(array);
!     int *dims = AARR_DIMS(array);
!     Oid element_type = AARR_ELEMTYPE(array);
      uint32 result = 1;
      int nitems;
      TypeCacheEntry *typentry;
      int typlen;
      bool typbyval;
      char typalign;
      int i;
+     ARRAY_ITER ARRAY_ITER_VARS(iter);
      FunctionCallInfoData locfcinfo;
      /*
*************** hash_array(PG_FUNCTION_ARGS)
*** 3680,3707 ****
      /* Loop over source data */
      nitems = ArrayGetNItems(ndims, dims);
!     ptr = ARR_DATA_PTR(array);
!     bitmap = ARR_NULLBITMAP(array);
!     bitmask = 1;
      for (i = 0; i < nitems; i++)
      {
          uint32 elthash;
          /* Get element, checking for NULL */
!         if (bitmap && (*bitmap & bitmask) == 0)
          {
              /* Treat nulls as having hashvalue 0 */
              elthash = 0;
          }
          else
          {
-             Datum elt;
-
-             elt = fetch_att(ptr, typbyval, typlen);
-             ptr = att_addlength_pointer(ptr, typlen, ptr);
-             ptr = (char *) att_align_nominal(ptr, typalign);
-
              /* Apply the hash function */
              locfcinfo.arg[0] = elt;
              locfcinfo.argnull[0] = false;
--- 3941,3964 ----
      /* Loop over source data */
      nitems = ArrayGetNItems(ndims, dims);
!     ARRAY_ITER_SETUP(iter, array);
      for (i = 0; i < nitems; i++)
      {
+         Datum elt;
+         bool isnull;
          uint32 elthash;
          /* Get element, checking for NULL */
!         ARRAY_ITER_NEXT(iter, i, elt, isnull, typlen, typbyval, typalign);
!
!         if (isnull)
          {
              /* Treat nulls as having hashvalue 0 */
              elthash = 0;
          }
          else
          {
              /* Apply the hash function */
              locfcinfo.arg[0] = elt;
              locfcinfo.argnull[0] = false;
*************** hash_array(PG_FUNCTION_ARGS)
*** 3709,3725 ****
              elthash = DatumGetUInt32(FunctionCallInvoke(&locfcinfo));
          }
-         /* advance bitmap pointer if any */
-         if (bitmap)
-         {
-             bitmask <<= 1;
-             if (bitmask == 0x100)
-             {
-                 bitmap++;
-                 bitmask = 1;
-             }
-         }
-
          /*
           * Combine hash values of successive elements by multiplying the
           * current value by 31 and adding on the new element's hash value.
--- 3966,3971 ----
*************** hash_array(PG_FUNCTION_ARGS)
*** 3735,3741 ****
      }
      /* Avoid leaking memory when handed toasted input. */
!     PG_FREE_IF_COPY(array, 0);
      PG_RETURN_UINT32(result);
  }
--- 3981,3987 ----
      }
      /* Avoid leaking memory when handed toasted input. */
!
	AARR_FREE_IF_COPY(array, 0);

  	PG_RETURN_UINT32(result);
  }
*************** hash_array(PG_FUNCTION_ARGS)
*** 3756,3766 ****
   * When matchall is false, return true if any members of array1 are in array2.
   */
  static bool
! array_contain_compare(ArrayType *array1, ArrayType *array2, Oid collation,
  					  bool matchall, void **fn_extra)
  {
  	bool		result = matchall;
! 	Oid			element_type = ARR_ELEMTYPE(array1);
  	TypeCacheEntry *typentry;
  	int			nelems1;
  	Datum	   *values2;
--- 4002,4012 ----
   * When matchall is false, return true if any members of array1 are in array2.
   */
  static bool
! array_contain_compare(AnyArrayType *array1, AnyArrayType *array2, Oid collation,
  					  bool matchall, void **fn_extra)
  {
  	bool		result = matchall;
! 	Oid			element_type = AARR_ELEMTYPE(array1);
  	TypeCacheEntry *typentry;
  	int			nelems1;
  	Datum	   *values2;
*************** array_contain_compare(ArrayType *array1,
*** 3769,3782 ****
  	int			typlen;
  	bool		typbyval;
  	char		typalign;
- 	char	   *ptr1;
- 	bits8	   *bitmap1;
- 	int			bitmask;
  	int			i;
  	int			j;
  	FunctionCallInfoData locfcinfo;

! 	if (element_type != ARR_ELEMTYPE(array2))
  		ereport(ERROR,
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("cannot compare arrays of different element types")));
--- 4015,4026 ----
  	int			typlen;
  	bool		typbyval;
  	char		typalign;
  	int			i;
  	int			j;
+ 	ARRAY_ITER	ARRAY_ITER_VARS(it1);
  	FunctionCallInfoData locfcinfo;

! 	if (element_type != AARR_ELEMTYPE(array2))
  		ereport(ERROR,
  				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("cannot compare arrays of different element types")));
*************** array_contain_compare(ArrayType *array1,
*** 3809,3816 ****
  	 * worthwhile to use deconstruct_array on it.  We scan array1 the hard way
  	 * however, since we very likely won't need to look at all of it.
  	 */
! 	deconstruct_array(array2, element_type, typlen, typbyval, typalign,
! 					  &values2, &nulls2, &nelems2);

  	/*
  	 * Apply the comparison operator to each pair of array elements.
--- 4053,4070 ----
  	 * worthwhile to use deconstruct_array on it.  We scan array1 the hard way
  	 * however, since we very likely won't need to look at all of it.
  	 */
! 	if (VARATT_IS_EXPANDED_HEADER(array2))
! 	{
! 		/* This should be safe even if input is read-only */
! 		deconstruct_expanded_array(&(array2->xpn));
! 		values2 = array2->xpn.dvalues;
! 		nulls2 = array2->xpn.dnulls;
! 		nelems2 = array2->xpn.nelems;
! 	}
! 	else
! 		deconstruct_array(&(array2->flt),
! 						  element_type, typlen, typbyval, typalign,
! 						  &values2, &nulls2, &nelems2);

  	/*
  	 * Apply the comparison operator to each pair of array elements.
*************** array_contain_compare(ArrayType *array1,
*** 3819,3828 ****
  							 collation, NULL, NULL);

  	/* Loop over source data */
! 	nelems1 = ArrayGetNItems(ARR_NDIM(array1), ARR_DIMS(array1));
! 	ptr1 = ARR_DATA_PTR(array1);
! 	bitmap1 = ARR_NULLBITMAP(array1);
! 	bitmask = 1;

  	for (i = 0; i < nelems1; i++)
  	{
--- 4073,4080 ----
  							 collation, NULL, NULL);

  	/* Loop over source data */
! 	nelems1 = ArrayGetNItems(AARR_NDIM(array1), AARR_DIMS(array1));
! 	ARRAY_ITER_SETUP(it1, array1);

  	for (i = 0; i < nelems1; i++)
  	{
*************** array_contain_compare(ArrayType *array1,
*** 3830,3856 ****
  		bool		isnull1;

  		/* Get element, checking for NULL */
! 		if (bitmap1 && (*bitmap1 & bitmask) == 0)
! 		{
! 			isnull1 = true;
! 			elt1 = (Datum) 0;
! 		}
! 		else
! 		{
! 			isnull1 = false;
! 			elt1 = fetch_att(ptr1, typbyval, typlen);
! 			ptr1 = att_addlength_pointer(ptr1, typlen, ptr1);
! 			ptr1 = (char *) att_align_nominal(ptr1, typalign);
! 		}
!
! 		/* advance bitmap pointer if any */
! 		bitmask <<= 1;
! 		if (bitmask == 0x100)
! 		{
! 			if (bitmap1)
! 				bitmap1++;
! 			bitmask = 1;
! 		}

  		/*
  		 * We assume that the comparison operator is strict, so a NULL can't
--- 4082,4088 ----
  		bool		isnull1;

  		/* Get element, checking for NULL */
! 		ARRAY_ITER_NEXT(it1, i, elt1, isnull1, typlen, typbyval, typalign);

  		/*
  		 * We assume that the comparison operator is strict, so a NULL can't
*************** array_contain_compare(ArrayType *array1,
*** 3909,3925 ****
  		}
  	}

- 	pfree(values2);
- 	pfree(nulls2);
-
  	return result;
  }

  Datum
  arrayoverlap(PG_FUNCTION_ARGS)
  {
! 	ArrayType  *array1 = PG_GETARG_ARRAYTYPE_P(0);
! 	ArrayType  *array2 = PG_GETARG_ARRAYTYPE_P(1);
  	Oid			collation = PG_GET_COLLATION();
  	bool		result;

--- 4141,4154 ----
  		}
  	}

  	return result;
  }

  Datum
  arrayoverlap(PG_FUNCTION_ARGS)
  {
! 	AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
! 	AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
  	Oid			collation = PG_GET_COLLATION();
  	bool		result;

*************** arrayoverlap(PG_FUNCTION_ARGS)
*** 3927,3934 ****
  								   &fcinfo->flinfo->fn_extra);

  	/* Avoid leaking memory when handed toasted input. */
! 	PG_FREE_IF_COPY(array1, 0);
! 	PG_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
--- 4156,4163 ----
  								   &fcinfo->flinfo->fn_extra);

  	/* Avoid leaking memory when handed toasted input. */
! 	AARR_FREE_IF_COPY(array1, 0);
! 	AARR_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
*************** arrayoverlap(PG_FUNCTION_ARGS)
*** 3936,3943 ****
  Datum
  arraycontains(PG_FUNCTION_ARGS)
  {
! 	ArrayType  *array1 = PG_GETARG_ARRAYTYPE_P(0);
! 	ArrayType  *array2 = PG_GETARG_ARRAYTYPE_P(1);
  	Oid			collation = PG_GET_COLLATION();
  	bool		result;

--- 4165,4172 ----
  Datum
  arraycontains(PG_FUNCTION_ARGS)
  {
! 	AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
! 	AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
  	Oid			collation = PG_GET_COLLATION();
  	bool		result;

*************** arraycontains(PG_FUNCTION_ARGS)
*** 3945,3952 ****
  								   &fcinfo->flinfo->fn_extra);

  	/* Avoid leaking memory when handed toasted input. */
! 	PG_FREE_IF_COPY(array1, 0);
! 	PG_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
--- 4174,4181 ----
  								   &fcinfo->flinfo->fn_extra);

  	/* Avoid leaking memory when handed toasted input. */
! 	AARR_FREE_IF_COPY(array1, 0);
!
	AARR_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
*************** arraycontains(PG_FUNCTION_ARGS)
*** 3954,3961 ****
  Datum
  arraycontained(PG_FUNCTION_ARGS)
  {
! 	ArrayType  *array1 = PG_GETARG_ARRAYTYPE_P(0);
! 	ArrayType  *array2 = PG_GETARG_ARRAYTYPE_P(1);
  	Oid			collation = PG_GET_COLLATION();
  	bool		result;

--- 4183,4190 ----
  Datum
  arraycontained(PG_FUNCTION_ARGS)
  {
! 	AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0);
! 	AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1);
  	Oid			collation = PG_GET_COLLATION();
  	bool		result;

*************** arraycontained(PG_FUNCTION_ARGS)
*** 3963,3970 ****
  								   &fcinfo->flinfo->fn_extra);

  	/* Avoid leaking memory when handed toasted input. */
! 	PG_FREE_IF_COPY(array1, 0);
! 	PG_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
--- 4192,4199 ----
  								   &fcinfo->flinfo->fn_extra);

  	/* Avoid leaking memory when handed toasted input. */
! 	AARR_FREE_IF_COPY(array1, 0);
! 	AARR_FREE_IF_COPY(array2, 1);

  	PG_RETURN_BOOL(result);
  }
*************** initArrayResult(Oid element_type, Memory
*** 4702,4708 ****
  		MemoryContextAlloc(arr_context, sizeof(ArrayBuildState));
  	astate->mcontext = arr_context;
  	astate->private_cxt = subcontext;
! 	astate->alen = (subcontext ? 64 : 8);	/* arbitrary starting array size */
  	astate->dvalues = (Datum *)
  		MemoryContextAlloc(arr_context, astate->alen * sizeof(Datum));
  	astate->dnulls = (bool *)
--- 4931,4938 ----
  		MemoryContextAlloc(arr_context, sizeof(ArrayBuildState));
  	astate->mcontext = arr_context;
  	astate->private_cxt = subcontext;
! 	astate->alen = (subcontext ? 64 : 8);		/* arbitrary starting array
! 												 * size */
  	astate->dvalues = (Datum *)
  		MemoryContextAlloc(arr_context, astate->alen * sizeof(Datum));
  	astate->dnulls = (bool *)
*************** initArrayResultArr(Oid array_type, Oid e
*** 4878,4887 ****
  				   bool subcontext)
  {
  	ArrayBuildStateArr *astate;
! 	MemoryContext arr_context = rcontext;		/* by default use the parent ctx */

  	/* Lookup element type, unless element_type already provided */
! 	if (! OidIsValid(element_type))
  	{
  		element_type = get_element_type(array_type);

--- 5108,5118 ----
  				   bool subcontext)
  {
  	ArrayBuildStateArr *astate;
! 	MemoryContext arr_context = rcontext;		/* by default use the parent
! 												 * ctx */

  	/* Lookup element type, unless element_type already provided */
! 	if (!OidIsValid(element_type))
  	{
  		element_type = get_element_type(array_type);

*************** makeArrayResultAny(ArrayBuildStateAny *a
*** 5259,5289 ****
  Datum
  array_larger(PG_FUNCTION_ARGS)
  {
! 	ArrayType  *v1,
! 			   *v2,
! 			   *result;
!
! 	v1 = PG_GETARG_ARRAYTYPE_P(0);
! 	v2 = PG_GETARG_ARRAYTYPE_P(1);
!
! 	result = ((array_cmp(fcinfo) > 0) ? v1 : v2);
!
! 	PG_RETURN_ARRAYTYPE_P(result);
  }

  Datum
  array_smaller(PG_FUNCTION_ARGS)
  {
! 	ArrayType  *v1,
! 			   *v2,
! 			   *result;
!
! 	v1 = PG_GETARG_ARRAYTYPE_P(0);
! 	v2 = PG_GETARG_ARRAYTYPE_P(1);
!
! 	result = ((array_cmp(fcinfo) < 0) ? v1 : v2);
!
! 	PG_RETURN_ARRAYTYPE_P(result);
  }


--- 5490,5508 ----
  Datum
  array_larger(PG_FUNCTION_ARGS)
  {
! 	if (array_cmp(fcinfo) > 0)
! 		PG_RETURN_DATUM(PG_GETARG_DATUM(0));
! 	else
! 		PG_RETURN_DATUM(PG_GETARG_DATUM(1));
  }

  Datum
  array_smaller(PG_FUNCTION_ARGS)
  {
! 	if (array_cmp(fcinfo) < 0)
! 		PG_RETURN_DATUM(PG_GETARG_DATUM(0));
! 	else
! 		PG_RETURN_DATUM(PG_GETARG_DATUM(1));
  }


*************** generate_subscripts(PG_FUNCTION_ARGS)
*** 5308,5314 ****
  	/* stuff done only on the first call of the function */
  	if (SRF_IS_FIRSTCALL())
  	{
! 		ArrayType  *v = PG_GETARG_ARRAYTYPE_P(0);
  		int			reqdim = PG_GETARG_INT32(1);
  		int		   *lb,
  				   *dimv;
--- 5527,5533 ----
  	/* stuff done only on the first call of the function */
  	if (SRF_IS_FIRSTCALL())
  	{
! 		AnyArrayType *v = PG_GETARG_ANY_ARRAY(0);
  		int			reqdim = PG_GETARG_INT32(1);
  		int		   *lb,
  				   *dimv;
*************** generate_subscripts(PG_FUNCTION_ARGS)
*** 5317,5327 ****
  		funcctx = SRF_FIRSTCALL_INIT();

  		/* Sanity check: does it look like an array at all? */
! 		if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM)
  			SRF_RETURN_DONE(funcctx);

  		/* Sanity check: was the requested dim valid */
! 		if (reqdim <= 0 || reqdim > ARR_NDIM(v))
  			SRF_RETURN_DONE(funcctx);

  		/*
--- 5536,5546 ----
  		funcctx = SRF_FIRSTCALL_INIT();

  		/* Sanity check: does it look like an array at all? */
! 		if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM)
  			SRF_RETURN_DONE(funcctx);

  		/* Sanity check: was the requested dim valid */
! 		if (reqdim <= 0 || reqdim > AARR_NDIM(v))
  			SRF_RETURN_DONE(funcctx);

  		/*
*************** generate_subscripts(PG_FUNCTION_ARGS)
*** 5330,5337 ****
  		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
  		fctx = (generate_subscripts_fctx *) palloc(sizeof(generate_subscripts_fctx));

! 		lb = ARR_LBOUND(v);
! 		dimv = ARR_DIMS(v);

  		fctx->lower = lb[reqdim - 1];
  		fctx->upper = dimv[reqdim - 1] + lb[reqdim - 1] - 1;
--- 5549,5556 ----
  		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
  		fctx = (generate_subscripts_fctx *) palloc(sizeof(generate_subscripts_fctx));

! 		lb = AARR_LBOUND(v);
! 		dimv = AARR_DIMS(v);

  		fctx->lower = lb[reqdim - 1];
  		fctx->upper = dimv[reqdim - 1] + lb[reqdim - 1] - 1;
*************** array_unnest(PG_FUNCTION_ARGS)
*** 5650,5660 ****
  {
  	typedef struct
  	{
! 		ArrayType  *arr;
  		int			nextelem;
  		int			numelems;
- 		char	   *elemdataptr;	/* this moves with nextelem */
- 		bits8	   *arraynullsptr;		/* this does not */
  		int16		elmlen;
  		bool		elmbyval;
  		char		elmalign;
--- 5869,5877 ----
  {
  	typedef struct
  	{
! 		ARRAY_ITER	ARRAY_ITER_VARS(iter);
  		int			nextelem;
  		int			numelems;
  		int16		elmlen;
  		bool		elmbyval;
  		char		elmalign;
*************** array_unnest(PG_FUNCTION_ARGS)
*** 5667,5673 ****
  	/* stuff done only on the first call of the function */
  	if (SRF_IS_FIRSTCALL())
  	{
! 		ArrayType  *arr;

  		/* create a function context for cross-call persistence */
  		funcctx = SRF_FIRSTCALL_INIT();
--- 5884,5890 ----
  	/* stuff done only on the first call of the function */
  	if (SRF_IS_FIRSTCALL())
  	{
! 		AnyArrayType *arr;

  		/* create a function context for cross-call persistence */
  		funcctx = SRF_FIRSTCALL_INIT();
*************** array_unnest(PG_FUNCTION_ARGS)
*** 5684,5706 ****
  		 * and not before.  (If no detoast happens, we assume the originally
  		 * passed array will stick around till then.)
  		 */
! 		arr = PG_GETARG_ARRAYTYPE_P(0);

  		/* allocate memory for user context */
  		fctx = (array_unnest_fctx *) palloc(sizeof(array_unnest_fctx));

  		/* initialize state */
! 		fctx->arr = arr;
  		fctx->nextelem = 0;
! 		fctx->numelems = ArrayGetNItems(ARR_NDIM(arr), ARR_DIMS(arr));
!
! 		fctx->elemdataptr = ARR_DATA_PTR(arr);
! 		fctx->arraynullsptr = ARR_NULLBITMAP(arr);

! 		get_typlenbyvalalign(ARR_ELEMTYPE(arr),
! 							 &fctx->elmlen,
! 							 &fctx->elmbyval,
! 							 &fctx->elmalign);

  		funcctx->user_fctx = fctx;
  		MemoryContextSwitchTo(oldcontext);
--- 5901,5928 ----
  		 * and not before.  (If no detoast happens, we assume the originally
  		 * passed array will stick around till then.)
  		 */
! 		arr = PG_GETARG_ANY_ARRAY(0);

  		/* allocate memory for user context */
  		fctx = (array_unnest_fctx *) palloc(sizeof(array_unnest_fctx));

  		/* initialize state */
! 		ARRAY_ITER_SETUP(fctx->iter, arr);
  		fctx->nextelem = 0;
! 		fctx->numelems = ArrayGetNItems(AARR_NDIM(arr), AARR_DIMS(arr));

! 		if (VARATT_IS_EXPANDED_HEADER(arr))
! 		{
! 			/* we can just grab the type data from expanded array */
! 			fctx->elmlen = arr->xpn.typlen;
! 			fctx->elmbyval = arr->xpn.typbyval;
! 			fctx->elmalign = arr->xpn.typalign;
! 		}
! 		else
! 			get_typlenbyvalalign(AARR_ELEMTYPE(arr),
! 								 &fctx->elmlen,
! 								 &fctx->elmbyval,
! 								 &fctx->elmalign);

  		funcctx->user_fctx = fctx;
  		MemoryContextSwitchTo(oldcontext);
*************** array_unnest(PG_FUNCTION_ARGS)
*** 5715,5746 ****
  		int			offset = fctx->nextelem++;
  		Datum		elem;

! 		/*
! 		 * Check for NULL array element
! 		 */
! 		if (array_get_isnull(fctx->arraynullsptr, offset))
! 		{
! 			fcinfo->isnull = true;
! 			elem = (Datum) 0;
! 			/* elemdataptr does not move */
! 		}
! 		else
! 		{
! 			/*
! 			 * OK, get the element
! 			 */
! 			char	   *ptr = fctx->elemdataptr;
!
! 			fcinfo->isnull = false;
! 			elem = ArrayCast(ptr, fctx->elmbyval, fctx->elmlen);
!
! 			/*
! 			 * Advance elemdataptr over it
! 			 */
! 			ptr = att_addlength_pointer(ptr, fctx->elmlen, ptr);
!
			ptr = (char *) att_align_nominal(ptr, fctx->elmalign);
! 			fctx->elemdataptr = ptr;
! 		}

  		SRF_RETURN_NEXT(funcctx, elem);
  	}
--- 5937,5944 ----
  		int			offset = fctx->nextelem++;
  		Datum		elem;

! 		ARRAY_ITER_NEXT(fctx->iter, offset, elem, fcinfo->isnull,
! 						fctx->elmlen, fctx->elmbyval, fctx->elmalign);

  		SRF_RETURN_NEXT(funcctx, elem);
  	}
*************** array_replace_internal(ArrayType *array,
*** 5992,5998 ****
  	result->ndim = ndim;
  	result->dataoffset = dataoffset;
  	result->elemtype = element_type;
! 	memcpy(ARR_DIMS(result), ARR_DIMS(array), 2 * ndim * sizeof(int));

  	if (remove)
  	{
--- 6190,6197 ----
  	result->ndim = ndim;
  	result->dataoffset = dataoffset;
  	result->elemtype = element_type;
! 	memcpy(ARR_DIMS(result), ARR_DIMS(array), ndim * sizeof(int));
! 	memcpy(ARR_LBOUND(result), ARR_LBOUND(array), ndim * sizeof(int));

  	if (remove)
  	{
diff --git a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c
index 014eca5..e8af030 100644
*** a/src/backend/utils/adt/datum.c
--- b/src/backend/utils/adt/datum.c
***************
*** 12,19 ****
   *
   *-------------------------------------------------------------------------
   */
  /*
!  * In the implementation of the next routines we assume the following:
   *
   * A) if a type is "byVal" then all the information is stored in the
   *	  Datum itself (i.e. no pointers involved!). In this case the
--- 12,20 ----
   *
   *-------------------------------------------------------------------------
   */
+
  /*
!  * In the implementation of these routines we assume the following:
   *
   * A) if a type is "byVal" then all the information is stored in the
   *	  Datum itself (i.e. no pointers involved!). In this case the
***************
*** 34,44 ****
--- 35,49 ----
   *
   * Note that we do not treat "toasted" datums specially; therefore what
   * will be copied or compared is the compressed data or toast reference.
+  * An exception is made for datumCopy() of an expanded object, however,
+  * because most callers expect to get a simple contiguous (and pfree'able)
+  * result from datumCopy().  See also datumTransfer().
   */

  #include "postgres.h"

  #include "utils/datum.h"
+ #include "utils/expandeddatum.h"


  /*-------------------------------------------------------------------------
***************
*** 46,51 ****
--- 51,57 ----
   *
   * Find the "real" size of a datum, given the datum value,
   * whether it is a "by value", and the declared type length.
+  * (For TOAST pointer datums, this is the size of the pointer datum.)
   *
   * This is essentially an out-of-line version of the att_addlength_datum()
   * macro in access/tupmacs.h.  We do a tad more error checking though.
*************** datumGetSize(Datum value, bool typByVal,
*** 106,114 ****
  /*-------------------------------------------------------------------------
   * datumCopy
   *
!  * make a copy of a datum
   *
   * If the datatype is pass-by-reference, memory is obtained with palloc().
   *-------------------------------------------------------------------------
   */
  Datum
--- 112,127 ----
  /*-------------------------------------------------------------------------
   * datumCopy
   *
!  * Make a copy of a non-NULL datum.
   *
   * If the datatype is pass-by-reference, memory is obtained with palloc().
+  *
+  * If the value is a reference to an expanded object, we flatten into memory
+  * obtained with palloc().  We need to copy because one of the main uses of
+  * this function is to copy a datum out of a transient memory context that's
+  * about to be destroyed, and the expanded object is probably in a child
+  * context that will also go away.  Moreover, many callers assume that the
+  * result is a single pfree-able chunk.
   *-------------------------------------------------------------------------
   */
  Datum
*************** datumCopy(Datum value, bool typByVal, in
*** 118,161 ****

  	if (typByVal)
  		res = value;
  	else
  	{
  		Size		realSize;
! 		char	   *s;
!
! 		if (DatumGetPointer(value) == NULL)
! 			return PointerGetDatum(NULL);

  		realSize = datumGetSize(value, typByVal, typLen);

! 		s = (char *) palloc(realSize);
! 		memcpy(s, DatumGetPointer(value), realSize);
! 		res = PointerGetDatum(s);
  	}
  	return res;
  }

  /*-------------------------------------------------------------------------
!  * datumFree
   *
!  * Free the space occupied by a datum CREATED BY "datumCopy"
   *
!  * NOTE: DO NOT USE THIS ROUTINE with datums returned by heap_getattr() etc.
!  * ONLY datums created by "datumCopy" can be freed!
   *-------------------------------------------------------------------------
   */
! #ifdef NOT_USED
! void
! datumFree(Datum value, bool typByVal, int typLen)
  {
! 	if (!typByVal)
! 	{
! 		Pointer		s = DatumGetPointer(value);
!
! 		pfree(s);
! 	}
  }
- #endif

  /*-------------------------------------------------------------------------
   * datumIsEqual
--- 131,201 ----

  	if (typByVal)
  		res = value;
+ 	else if (typLen == -1)
+ 	{
+ 		/* It is a varlena datatype */
+ 		struct varlena *vl = (struct varlena *) DatumGetPointer(value);
+
+ 		if (VARATT_IS_EXTERNAL_EXPANDED(vl))
+ 		{
+ 			/* Flatten into the caller's memory context */
+ 			ExpandedObjectHeader *eoh = DatumGetEOHP(value);
+ 			Size		resultsize;
+ 			char	   *resultptr;
+
+ 			resultsize = EOH_get_flat_size(eoh);
+ 			resultptr = (char *) palloc(resultsize);
+ 			EOH_flatten_into(eoh, (void *) resultptr, resultsize);
+ 			res = PointerGetDatum(resultptr);
+ 		}
+ 		else
+ 		{
+ 			/* Otherwise, just copy the varlena datum verbatim */
+ 			Size		realSize;
+ 			char	   *resultptr;
+
+ 			realSize = (Size) VARSIZE_ANY(vl);
+ 			resultptr = (char *) palloc(realSize);
+ 			memcpy(resultptr, vl, realSize);
+ 			res = PointerGetDatum(resultptr);
+ 		}
+ 	}
  	else
  	{
+ 		/* Pass by reference, but not varlena, so not toasted */
  		Size		realSize;
! 		char	   *resultptr;

  		realSize = datumGetSize(value, typByVal, typLen);

! 		resultptr = (char *) palloc(realSize);
! 		memcpy(resultptr, DatumGetPointer(value), realSize);
! 		res = PointerGetDatum(resultptr);
  	}
  	return res;
  }

  /*-------------------------------------------------------------------------
!  * datumTransfer
   *
!  * Transfer a non-NULL datum into the current memory context.
   *
!  * This is equivalent to datumCopy() except when the datum is a read-write
!  * pointer to an expanded object.  In that case we merely reparent the object
!  * into the current context, and return its standard R/W pointer (in case the
!  * given one is a transient pointer of shorter lifespan).
   *-------------------------------------------------------------------------
   */
! Datum
! datumTransfer(Datum value, bool typByVal, int typLen)
  {
! 	if (!typByVal && typLen == -1 &&
! 		VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(value)))
! 		value = TransferExpandedObject(value, CurrentMemoryContext);
! 	else
! 		value = datumCopy(value, typByVal, typLen);
! 	return value;
  }

  /*-------------------------------------------------------------------------
   * datumIsEqual
diff --git a/src/backend/utils/adt/expandeddatum.c b/src/backend/utils/adt/expandeddatum.c
index ...039671b .
*** a/src/backend/utils/adt/expandeddatum.c
--- b/src/backend/utils/adt/expandeddatum.c
***************
*** 0 ****
--- 1,163 ----
+ /*-------------------------------------------------------------------------
+  *
+  * expandeddatum.c
+  *	  Support functions for "expanded" value representations.
+  *
+  * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+  * Portions Copyright (c) 1994, Regents of the University of California
+  *
+  *
+  * IDENTIFICATION
+  *	  src/backend/utils/adt/expandeddatum.c
+  *
+  *-------------------------------------------------------------------------
+  */
+ #include "postgres.h"
+
+ #include "utils/expandeddatum.h"
+ #include "utils/memutils.h"
+
+ /*
+  * DatumGetEOHP
+  *
+  * Given a Datum that is an expanded-object reference, extract the pointer.
+  *
+  * This is a bit tedious since the pointer may not be properly aligned;
+  * compare VARATT_EXTERNAL_GET_POINTER().
+  */
+ ExpandedObjectHeader *
+ DatumGetEOHP(Datum d)
+ {
+ 	varattrib_1b_e *datum = (varattrib_1b_e *) DatumGetPointer(d);
+ 	varatt_expanded ptr;
+
+ 	Assert(VARATT_IS_EXTERNAL_EXPANDED(datum));
+ 	memcpy(&ptr, VARDATA_EXTERNAL(datum), sizeof(ptr));
+ 	Assert(VARATT_IS_EXPANDED_HEADER(ptr.eohptr));
+ 	return ptr.eohptr;
+ }
+
+ /*
+  * EOH_init_header
+  *
+  * Initialize the common header of an expanded object.
+  *
+  * The main thing this encapsulates is initializing the TOAST pointers.
+  */
+ void
+ EOH_init_header(ExpandedObjectHeader *eohptr,
+ 				const ExpandedObjectMethods *methods,
+ 				MemoryContext obj_context)
+ {
+ 	varatt_expanded ptr;
+
+ 	eohptr->vl_len_ = EOH_HEADER_MAGIC;
+ 	eohptr->eoh_methods = methods;
+ 	eohptr->eoh_context = obj_context;
+
+ 	ptr.eohptr = eohptr;
+
+ 	SET_VARTAG_EXTERNAL(eohptr->eoh_rw_ptr, VARTAG_EXPANDED_RW);
+ 	memcpy(VARDATA_EXTERNAL(eohptr->eoh_rw_ptr), &ptr, sizeof(ptr));
+
+ 	SET_VARTAG_EXTERNAL(eohptr->eoh_ro_ptr, VARTAG_EXPANDED_RO);
+ 	memcpy(VARDATA_EXTERNAL(eohptr->eoh_ro_ptr), &ptr, sizeof(ptr));
+ }
+
+ /*
+  * EOH_get_flat_size
+  * EOH_flatten_into
+  *
+  * Convenience functions for invoking the "methods" of an expanded object.
+  */
+
+ Size
+ EOH_get_flat_size(ExpandedObjectHeader *eohptr)
+ {
+ 	return (*eohptr->eoh_methods->get_flat_size) (eohptr);
+ }
+
+ void
+ EOH_flatten_into(ExpandedObjectHeader *eohptr,
+ 				 void *result, Size allocated_size)
+ {
+ 	(*eohptr->eoh_methods->flatten_into) (eohptr, result, allocated_size);
+ }
+
+ /*
+  * Does the Datum represent a writable expanded object?
+  */
+ bool
+ DatumIsReadWriteExpandedObject(Datum d, bool isnull, int16 typlen)
+ {
+ 	/* Reject if it's NULL or not a varlena type */
+ 	if (isnull || typlen != -1)
+ 		return false;
+
+ 	/* Reject if not a read-write expanded-object pointer */
+ 	if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)))
+ 		return false;
+
+ 	return true;
+ }
+
+ /*
+  * If the Datum represents a R/W expanded object, change it to R/O.
+  * Otherwise return the original Datum.
+  */
+ Datum
+ MakeExpandedObjectReadOnly(Datum d, bool isnull, int16 typlen)
+ {
+ 	ExpandedObjectHeader *eohptr;
+
+ 	/* Nothing to do if it's NULL or not a varlena type */
+ 	if (isnull || typlen != -1)
+ 		return d;
+
+ 	/* Nothing to do if not a read-write expanded-object pointer */
+ 	if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)))
+ 		return d;
+
+ 	/* Now safe to extract the object pointer */
+ 	eohptr = DatumGetEOHP(d);
+
+ 	/* Return the built-in read-only pointer instead of given pointer */
+ 	return EOHPGetRODatum(eohptr);
+ }
+
+ /*
+  * Transfer ownership of an expanded object to a new parent memory context.
+  * The object must be referenced by a R/W pointer, and what we return is
+  * always its "standard" R/W pointer, which is certain to have the same
+  * lifespan as the object itself.  (The passed-in pointer might not, and
+  * in any case wouldn't provide a unique identifier if it's not that one.)
+  */
+ Datum
+ TransferExpandedObject(Datum d, MemoryContext new_parent)
+ {
+ 	ExpandedObjectHeader *eohptr = DatumGetEOHP(d);
+
+ 	/* Assert caller gave a R/W pointer */
+ 	Assert(VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)));
+
+ 	/* Transfer ownership */
+ 	MemoryContextSetParent(eohptr->eoh_context, new_parent);
+
+ 	/* Return the object's standard read-write pointer */
+ 	return EOHPGetRWDatum(eohptr);
+ }
+
+ /*
+  * Delete an expanded object (must be referenced by a R/W pointer).
+  */
+ void
+ DeleteExpandedObject(Datum d)
+ {
+ 	ExpandedObjectHeader *eohptr = DatumGetEOHP(d);
+
+ 	/* Assert caller gave a R/W pointer */
+ 	Assert(VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)));
+
+ 	/* Kill it */
+ 	MemoryContextDelete(eohptr->eoh_context);
+ }
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index e2fbfd4..a13e5e8 100644
*** a/src/backend/utils/mmgr/mcxt.c
--- b/src/backend/utils/mmgr/mcxt.c
*************** MemoryContextSetParent(MemoryContext con
*** 323,328 ****
--- 323,332 ----
  	AssertArg(MemoryContextIsValid(context));
  	AssertArg(context != new_parent);

+ 	/* Fast path if it's got correct parent already */
+ 	if (new_parent == context->parent)
+ 		return;
+
  	/* Delink from existing parent, if any */
  	if (context->parent)
  	{
diff --git a/src/include/executor/spi.h b/src/include/executor/spi.h
index 9e912ba..fbcae0c 100644
*** a/src/include/executor/spi.h
--- b/src/include/executor/spi.h
*************** extern char *SPI_getnspname(Relation rel
*** 124,129 ****
--- 124,130 ----
  extern void *SPI_palloc(Size size);
  extern void *SPI_repalloc(void *pointer, Size size);
  extern void SPI_pfree(void *pointer);
+ extern Datum SPI_datumTransfer(Datum value, bool typByVal, int typLen);
  extern void SPI_freetuple(HeapTuple pointer);
  extern void SPI_freetuptable(SPITupleTable *tuptable);

diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index 48f84bf..00686b0 100644
*** a/src/include/executor/tuptable.h
--- b/src/include/executor/tuptable.h
*************** extern Datum ExecFetchSlotTupleDatum(Tup
*** 163,168 ****
--- 163,169 ----
  extern HeapTuple ExecMaterializeSlot(TupleTableSlot *slot);
  extern TupleTableSlot *ExecCopySlot(TupleTableSlot *dstslot,
  			 TupleTableSlot *srcslot);
+ extern TupleTableSlot *ExecMakeSlotContentsReadOnly(TupleTableSlot *slot);

  /* in access/common/heaptuple.c */
  extern Datum slot_getattr(TupleTableSlot *slot, int attnum, bool *isnull);
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 4f1d234..deaa3c5 100644
*** a/src/include/nodes/primnodes.h
--- b/src/include/nodes/primnodes.h
*************** typedef struct WindowFunc
*** 305,310 ****
--- 305,314 ----
   * Note: the result datatype is the element type when fetching a single
   * element; but it is the array type when doing subarray fetch or either
   * type of store.
+  *
+  * Note: for the cases where an array is returned, if refexpr yields a R/W
+  * expanded array, then the implementation is allowed to modify that object
+  * in-place and return the same object.)
   * ----------------
   */
  typedef struct ArrayRef
diff --git a/src/include/postgres.h b/src/include/postgres.h
index be37313..ccf1605 100644
*** a/src/include/postgres.h
--- b/src/include/postgres.h
*************** typedef struct varatt_indirect
*** 88,93 ****
--- 88,110 ----
  } varatt_indirect;

  /*
+  * struct varatt_expanded is a "TOAST pointer" representing an out-of-line
+  * Datum that is stored in memory, in some type-specific, not necessarily
+  * physically contiguous format that is convenient for computation not
+  * storage.  APIs for this, in particular the definition of struct
+  * ExpandedObjectHeader, are in src/include/utils/expandeddatum.h.
+  *
+  * Note that just as for struct varatt_external, this struct is stored
+  * unaligned within any containing tuple.
+  */
+ typedef struct ExpandedObjectHeader ExpandedObjectHeader;
+
+ typedef struct varatt_expanded
+ {
+ 	ExpandedObjectHeader *eohptr;
+ } varatt_expanded;
+
+ /*
   * Type tag for the various sorts of "TOAST pointer" datums.  The peculiar
   * value for VARTAG_ONDISK comes from a requirement for on-disk compatibility
   * with a previous notion that the tag field was the pointer datum's length.
*************** typedef struct varatt_indirect
*** 95,105 ****
--- 112,129 ----
  typedef enum vartag_external
  {
  	VARTAG_INDIRECT = 1,
+ 	VARTAG_EXPANDED_RO = 2,
+ 	VARTAG_EXPANDED_RW = 3,
  	VARTAG_ONDISK = 18
  } vartag_external;

+ /* this test relies on the specific tag values above */
+ #define VARTAG_IS_EXPANDED(tag) \
+ 	(((tag) & ~1) == VARTAG_EXPANDED_RO)
+
  #define VARTAG_SIZE(tag) \
  	((tag) == VARTAG_INDIRECT ? sizeof(varatt_indirect) : \
+ 	 VARTAG_IS_EXPANDED(tag) ? sizeof(varatt_expanded) : \
  	 (tag) == VARTAG_ONDISK ? sizeof(varatt_external) : \
  	 TrapMacro(true, "unrecognized TOAST vartag"))

*************** typedef struct
*** 294,299 ****
--- 318,329 ----
  	(VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_ONDISK)
  #define VARATT_IS_EXTERNAL_INDIRECT(PTR) \
  	(VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_INDIRECT)
+ #define VARATT_IS_EXTERNAL_EXPANDED_RO(PTR) \
+ 	(VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_EXPANDED_RO)
+ #define VARATT_IS_EXTERNAL_EXPANDED_RW(PTR) \
+ 	(VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_EXPANDED_RW)
+ #define VARATT_IS_EXTERNAL_EXPANDED(PTR) \
+ 	(VARATT_IS_EXTERNAL(PTR) && VARTAG_IS_EXPANDED(VARTAG_EXTERNAL(PTR)))
  #define VARATT_IS_SHORT(PTR)				VARATT_IS_1B(PTR)
  #define VARATT_IS_EXTENDED(PTR)				(!VARATT_IS_4B_U(PTR))

diff --git a/src/include/utils/array.h b/src/include/utils/array.h
index b78b42a..66452f5 100644
*** a/src/include/utils/array.h
--- b/src/include/utils/array.h
***************
*** 45,50 ****
--- 45,55 ----
   * We support subscripting on these types, but array_in() and array_out()
   * only work with varlena arrays.
   *
+  * In addition, arrays are a major user of the "expanded object" TOAST
+  * infrastructure.  This allows a varlena array to be converted to a
+  * separate representation that may include "deconstructed" Datum/isnull
+  * arrays holding the elements.
+ * * * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California *************** *** 57,62 **** --- 62,69 ---- #define ARRAY_H #include "fmgr.h" + #include "utils/expandeddatum.h" + /* * Arrays are varlena objects, so must meet the varlena convention that *************** typedef struct *** 75,80 **** --- 82,167 ---- } ArrayType; /* + * An expanded array is contained within a private memory context (as + * all expanded objects must be) and has a control structure as below. + * + * The expanded array might contain a regular "flat" array if that was the + * original input and we've not modified it significantly. Otherwise, the + * contents are represented by Datum/isnull arrays plus dimensionality and + * type information. We could also have both forms, if we've deconstructed + * the original array for access purposes but not yet changed it. For pass- + * by-reference element types, the Datums would point into the flat array in + * this situation. Once we start modifying array elements, new pass-by-ref + * elements are separately palloc'd within the memory context. + */ + #define EA_MAGIC 689375833 /* ID for debugging crosschecks */ + + typedef struct ExpandedArrayHeader + { + /* Standard header for expanded objects */ + ExpandedObjectHeader hdr; + + /* Magic value identifying an expanded array (for debugging only) */ + int ea_magic; + + /* Dimensionality info (always valid) */ + int ndims; /* # of dimensions */ + int *dims; /* array dimensions */ + int *lbound; /* index lower bounds for each dimension */ + + /* Element type info (always valid) */ + Oid element_type; /* element type OID */ + int16 typlen; /* needed info about element datatype */ + bool typbyval; + char typalign; + + /* + * If we have a Datum-array representation of the array, it's kept here; + * else dvalues/dnulls are NULL. 
The dvalues and dnulls arrays are always + * palloc'd within the object private context, but may change size from + * time to time. For pass-by-ref element types, dvalues entries might + * point either into the fstartptr..fendptr area, or to separately + * palloc'd chunks. Elements should always be fully detoasted, as they + * are in the standard flat representation. + * + * Even when dvalues is valid, dnulls can be NULL if there are no null + * elements. + */ + Datum *dvalues; /* array of Datums */ + bool *dnulls; /* array of is-null flags for Datums */ + int dvalueslen; /* allocated length of above arrays */ + int nelems; /* number of valid entries in above arrays */ + + /* + * flat_size is the current space requirement for the flat equivalent of + * the expanded array, if known; otherwise it's 0. We store this to make + * consecutive calls of get_flat_size cheap. + */ + Size flat_size; + + /* + * fvalue points to the flat representation if it is valid, else it is + * NULL. If we have or ever had a flat representation then + * fstartptr/fendptr point to the start and end+1 of its data area; this + * is so that we can tell which Datum pointers point into the flat + * representation rather than being pointers to separately palloc'd data. + */ + ArrayType *fvalue; /* must be a fully detoasted array */ + char *fstartptr; /* start of its data area */ + char *fendptr; /* end+1 of its data area */ + } ExpandedArrayHeader; + + /* + * Functions that can handle either a "flat" varlena array or an expanded + * array use this union to work with their input. + */ + typedef union AnyArrayType + { + ArrayType flt; + ExpandedArrayHeader xpn; + } AnyArrayType; + + /* * working state for accumArrayResult() and friends * note that the input must be scalars (legal array elements) */ *************** typedef struct ArrayMapState *** 151,167 **** /* ArrayIteratorData is private in arrayfuncs.c */ typedef struct ArrayIteratorData *ArrayIterator; ! /* ! * fmgr macros for array objects ! 
*/ #define DatumGetArrayTypeP(X) ((ArrayType *) PG_DETOAST_DATUM(X)) #define DatumGetArrayTypePCopy(X) ((ArrayType *) PG_DETOAST_DATUM_COPY(X)) #define PG_GETARG_ARRAYTYPE_P(n) DatumGetArrayTypeP(PG_GETARG_DATUM(n)) #define PG_GETARG_ARRAYTYPE_P_COPY(n) DatumGetArrayTypePCopy(PG_GETARG_DATUM(n)) #define PG_RETURN_ARRAYTYPE_P(x) PG_RETURN_POINTER(x) /* ! * Access macros for array header fields. * * ARR_DIMS returns a pointer to an array of array dimensions (number of * elements along the various array axes). --- 238,261 ---- /* ArrayIteratorData is private in arrayfuncs.c */ typedef struct ArrayIteratorData *ArrayIterator; ! /* fmgr macros for regular varlena array objects */ #define DatumGetArrayTypeP(X) ((ArrayType *) PG_DETOAST_DATUM(X)) #define DatumGetArrayTypePCopy(X) ((ArrayType *) PG_DETOAST_DATUM_COPY(X)) #define PG_GETARG_ARRAYTYPE_P(n) DatumGetArrayTypeP(PG_GETARG_DATUM(n)) #define PG_GETARG_ARRAYTYPE_P_COPY(n) DatumGetArrayTypePCopy(PG_GETARG_DATUM(n)) #define PG_RETURN_ARRAYTYPE_P(x) PG_RETURN_POINTER(x) + /* fmgr macros for expanded array objects */ + #define PG_GETARG_EXPANDED_ARRAY(n) DatumGetExpandedArray(PG_GETARG_DATUM(n)) + #define PG_GETARG_EXPANDED_ARRAYX(n, metacache) \ + DatumGetExpandedArrayX(PG_GETARG_DATUM(n), metacache) + #define PG_RETURN_EXPANDED_ARRAY(x) PG_RETURN_DATUM(EOHPGetRWDatum(&(x)->hdr)) + + /* fmgr macros for AnyArrayType (ie, get either varlena or expanded form) */ + #define PG_GETARG_ANY_ARRAY(n) DatumGetAnyArray(PG_GETARG_DATUM(n)) + /* ! * Access macros for varlena array header fields. * * ARR_DIMS returns a pointer to an array of array dimensions (number of * elements along the various array axes). *************** typedef struct ArrayIteratorData *ArrayI *** 209,214 **** --- 303,404 ---- #define ARR_DATA_PTR(a) \ (((char *) (a)) + ARR_DATA_OFFSET(a)) + /* + * Macros for working with AnyArrayType inputs. Beware multiple references! + */ + #define AARR_NDIM(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? 
(a)->xpn.ndims : ARR_NDIM(&(a)->flt)) + #define AARR_HASNULL(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? \ + ((a)->xpn.dvalues != NULL ? (a)->xpn.dnulls != NULL : ARR_HASNULL((a)->xpn.fvalue)) : \ + ARR_HASNULL(&(a)->flt)) + #define AARR_ELEMTYPE(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.element_type : ARR_ELEMTYPE(&(a)->flt)) + #define AARR_DIMS(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.dims : ARR_DIMS(&(a)->flt)) + #define AARR_LBOUND(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.lbound : ARR_LBOUND(&(a)->flt)) + + /* + * Macros for iterating through elements of a flat or expanded array. + * Use "ARRAY_ITER ARRAY_ITER_VARS(name);" to declare the local variables + * needed for an iterator (more than one set can be used in the same function, + * if they have different names). + * Use "ARRAY_ITER_SETUP(name, arrayptr);" to prepare to iterate, and + * "ARRAY_ITER_NEXT(name, index, datumvar, isnullvar, ...);" to fetch the + * next element into datumvar/isnullvar. "index" must be the zero-origin + * element number; we make caller provide this since caller is generally + * counting the elements anyway. 
+ */ + #define ARRAY_ITER /* dummy type name to keep pgindent happy */ + + #define ARRAY_ITER_VARS(iter) \ + Datum *iter##datumptr; \ + bool *iter##isnullptr; \ + char *iter##dataptr; \ + bits8 *iter##bitmapptr; \ + int iter##bitmask + + #define ARRAY_ITER_SETUP(iter, arrayptr) \ + do { \ + if (VARATT_IS_EXPANDED_HEADER(arrayptr)) \ + { \ + if ((arrayptr)->xpn.dvalues) \ + { \ + (iter##datumptr) = (arrayptr)->xpn.dvalues; \ + (iter##isnullptr) = (arrayptr)->xpn.dnulls; \ + (iter##dataptr) = NULL; \ + (iter##bitmapptr) = NULL; \ + } \ + else \ + { \ + (iter##datumptr) = NULL; \ + (iter##isnullptr) = NULL; \ + (iter##dataptr) = ARR_DATA_PTR((arrayptr)->xpn.fvalue); \ + (iter##bitmapptr) = ARR_NULLBITMAP((arrayptr)->xpn.fvalue); \ + } \ + } \ + else \ + { \ + (iter##datumptr) = NULL; \ + (iter##isnullptr) = NULL; \ + (iter##dataptr) = ARR_DATA_PTR(&(arrayptr)->flt); \ + (iter##bitmapptr) = ARR_NULLBITMAP(&(arrayptr)->flt); \ + } \ + (iter##bitmask) = 1; \ + } while (0) + + #define ARRAY_ITER_NEXT(iter,i, datumvar,isnullvar, elmlen,elmbyval,elmalign) \ + do { \ + if (iter##datumptr) \ + { \ + (datumvar) = (iter##datumptr)[i]; \ + (isnullvar) = (iter##isnullptr) ? 
(iter##isnullptr)[i] : false; \ + } \ + else \ + { \ + if ((iter##bitmapptr) && (*(iter##bitmapptr) & (iter##bitmask)) == 0) \ + { \ + (isnullvar) = true; \ + (datumvar) = (Datum) 0; \ + } \ + else \ + { \ + (isnullvar) = false; \ + (datumvar) = fetch_att(iter##dataptr, elmbyval, elmlen); \ + (iter##dataptr) = att_addlength_pointer(iter##dataptr, elmlen, iter##dataptr); \ + (iter##dataptr) = (char *) att_align_nominal(iter##dataptr, elmalign); \ + } \ + (iter##bitmask) <<= 1; \ + if ((iter##bitmask) == 0x100) \ + { \ + if (iter##bitmapptr) \ + (iter##bitmapptr)++; \ + (iter##bitmask) = 1; \ + } \ + } \ + } while (0) + /* * GUC parameter *************** extern Datum array_remove(PG_FUNCTION_AR *** 250,255 **** --- 440,454 ---- extern Datum array_replace(PG_FUNCTION_ARGS); extern Datum width_bucket_array(PG_FUNCTION_ARGS); + extern void CopyArrayEls(ArrayType *array, + Datum *values, + bool *nulls, + int nitems, + int typlen, + bool typbyval, + char typalign, + bool freedata); + extern Datum array_get_element(Datum arraydatum, int nSubscripts, int *indx, int arraytyplen, int elmlen, bool elmbyval, char elmalign, bool *isNull); *************** extern ArrayType *array_set(ArrayType *a *** 271,277 **** Datum dataValue, bool isNull, int arraytyplen, int elmlen, bool elmbyval, char elmalign); ! extern Datum array_map(FunctionCallInfo fcinfo, Oid inpType, Oid retType, ArrayMapState *amstate); extern void array_bitmap_copy(bits8 *destbitmap, int destoffset, --- 470,476 ---- Datum dataValue, bool isNull, int arraytyplen, int elmlen, bool elmbyval, char elmalign); ! 
extern Datum array_map(FunctionCallInfo fcinfo, Oid retType, ArrayMapState *amstate); extern void array_bitmap_copy(bits8 *destbitmap, int destoffset, *************** extern ArrayType *construct_md_array(Dat *** 288,293 **** --- 487,495 ---- int *lbs, Oid elmtype, int elmlen, bool elmbyval, char elmalign); extern ArrayType *construct_empty_array(Oid elmtype); + extern ExpandedArrayHeader *construct_empty_expanded_array(Oid element_type, + MemoryContext parentcontext, + ArrayMetaState *metacache); extern void deconstruct_array(ArrayType *array, Oid elmtype, int elmlen, bool elmbyval, char elmalign, *************** extern int mda_next_tuple(int n, int *cu *** 341,346 **** --- 543,559 ---- extern int32 *ArrayGetIntegerTypmods(ArrayType *arr, int *n); /* + * prototypes for functions defined in array_expanded.c + */ + extern Datum expand_array(Datum arraydatum, MemoryContext parentcontext, + ArrayMetaState *metacache); + extern ExpandedArrayHeader *DatumGetExpandedArray(Datum d); + extern ExpandedArrayHeader *DatumGetExpandedArrayX(Datum d, + ArrayMetaState *metacache); + extern AnyArrayType *DatumGetAnyArray(Datum d); + extern void deconstruct_expanded_array(ExpandedArrayHeader *eah); + + /* * prototypes for functions defined in array_userfuncs.c */ extern Datum array_append(PG_FUNCTION_ARGS); diff --git a/src/include/utils/datum.h b/src/include/utils/datum.h index 663414b..c572f79 100644 *** a/src/include/utils/datum.h --- b/src/include/utils/datum.h *************** *** 24,41 **** extern Size datumGetSize(Datum value, bool typByVal, int typLen); /* ! * datumCopy - make a copy of a datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). */ extern Datum datumCopy(Datum value, bool typByVal, int typLen); /* ! * datumFree - free a datum previously allocated by datumCopy, if any. * ! * Does nothing if datatype is pass-by-value. */ ! 
extern void datumFree(Datum value, bool typByVal, int typLen); /* * datumIsEqual --- 24,41 ---- extern Size datumGetSize(Datum value, bool typByVal, int typLen); /* ! * datumCopy - make a copy of a non-NULL datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). */ extern Datum datumCopy(Datum value, bool typByVal, int typLen); /* ! * datumTransfer - transfer a non-NULL datum into the current memory context. * ! * Differs from datumCopy() in its handling of read-write expanded objects. */ ! extern Datum datumTransfer(Datum value, bool typByVal, int typLen); /* * datumIsEqual diff --git a/src/include/utils/expandeddatum.h b/src/include/utils/expandeddatum.h index ...3a8336e . *** a/src/include/utils/expandeddatum.h --- b/src/include/utils/expandeddatum.h *************** *** 0 **** --- 1,148 ---- + /*------------------------------------------------------------------------- + * + * expandeddatum.h + * Declarations for access to "expanded" value representations. + * + * Complex data types, particularly container types such as arrays and + * records, usually have on-disk representations that are compact but not + * especially convenient to modify. What's more, when we do modify them, + * having to recopy all the rest of the value can be extremely inefficient. + * Therefore, we provide a notion of an "expanded" representation that is used + * only in memory and is optimized more for computation than storage. + * The format appearing on disk is called the data type's "flattened" + * representation, since it is required to be a contiguous blob of bytes -- + * but the type can have an expanded representation that is not. Data types + * must provide means to translate an expanded representation back to + * flattened form. + * + * An expanded object is meant to survive across multiple operations, but + * not to be enormously long-lived; for example it might be a local variable + * in a PL/pgSQL procedure. 
So its extra bulk compared to the on-disk format + * is a worthwhile trade-off. + * + * References to expanded objects are a type of TOAST pointer. + * Because of longstanding conventions in Postgres, this means that the + * flattened form of such an object must always be a varlena object. + * Fortunately that's no restriction in practice. + * + * There are actually two kinds of TOAST pointers for expanded objects: + * read-only and read-write pointers. Possession of one of the latter + * authorizes a function to modify the value in-place rather than copying it + * as would normally be required. Functions should always return a read-write + * pointer to any new expanded object they create. Functions that modify an + * argument value in-place must take care that they do not corrupt the old + * value if they fail partway through. + * + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * src/include/utils/expandeddatum.h + * + *------------------------------------------------------------------------- + */ + #ifndef EXPANDEDDATUM_H + #define EXPANDEDDATUM_H + + /* Size of an EXTERNAL datum that contains a pointer to an expanded object */ + #define EXPANDED_POINTER_SIZE (VARHDRSZ_EXTERNAL + sizeof(varatt_expanded)) + + /* + * "Methods" that must be provided for any expanded object. + * + * get_flat_size: compute space needed for flattened representation (which + * must be a valid in-line, non-compressed, 4-byte-header varlena object). + * + * flatten_into: construct flattened representation in the caller-allocated + * space at *result, of size allocated_size (which will always be the result + * of a preceding get_flat_size call; it's passed for cross-checking). + * + * Note: construction of a heap tuple from an expanded datum calls + * get_flat_size twice, so it's worthwhile to make sure that that doesn't + * incur too much overhead. 
+ */ + typedef Size (*EOM_get_flat_size_method) (ExpandedObjectHeader *eohptr); + typedef void (*EOM_flatten_into_method) (ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + + /* Struct of function pointers for an expanded object's methods */ + typedef struct ExpandedObjectMethods + { + EOM_get_flat_size_method get_flat_size; + EOM_flatten_into_method flatten_into; + } ExpandedObjectMethods; + + /* + * Every expanded object must contain this header; typically the header + * is embedded in some larger struct that adds type-specific fields. + * + * It is presumed that the header object and all subsidiary data are stored + * in eoh_context, so that the object can be freed by deleting that context, + * or its storage lifespan can be altered by reparenting the context. + * (In principle the object could own additional resources, such as malloc'd + * storage, and use a memory context reset callback to free them upon reset or + * deletion of eoh_context.) + * + * We set up two TOAST pointers within the standard header, one read-write + * and one read-only. This allows functions to return either kind of pointer + * without making an additional allocation, and in particular without worrying + * whether a separately palloc'd object would have sufficient lifespan. + * But note that these pointers are just a convenience; a pointer object + * appearing somewhere else would still be legal. + * + * The typedef declaration for this appears in postgres.h. 
+ */ + struct ExpandedObjectHeader + { + /* Phony varlena header */ + int32 vl_len_; /* always EOH_HEADER_MAGIC, see below */ + + /* Pointer to methods required for object type */ + const ExpandedObjectMethods *eoh_methods; + + /* Memory context containing this header and subsidiary data */ + MemoryContext eoh_context; + + /* Standard R/W TOAST pointer for this object is kept here */ + char eoh_rw_ptr[EXPANDED_POINTER_SIZE]; + + /* Standard R/O TOAST pointer for this object is kept here */ + char eoh_ro_ptr[EXPANDED_POINTER_SIZE]; + }; + + /* + * Particularly for read-only functions, it is handy to be able to work with + * either regular "flat" varlena inputs or expanded inputs of the same data + * type. To allow determining which case an argument-fetching function has + * returned, the first int32 of an ExpandedObjectHeader always contains -1 + * (EOH_HEADER_MAGIC to the code). This works since no 4-byte-header varlena + * could have that as its first 4 bytes. Caution: we could not reliably tell + * the difference between an ExpandedObjectHeader and a short-header object + * with this trick. However, it works fine if the argument fetching code + * always returns either a 4-byte-header flat object or an expanded object. + */ + #define EOH_HEADER_MAGIC (-1) + #define VARATT_IS_EXPANDED_HEADER(PTR) \ + (((ExpandedObjectHeader *) (PTR))->vl_len_ == EOH_HEADER_MAGIC) + + /* + * Generic support functions for expanded objects. + * (More of these might be worth inlining later.) 
+ */ + + #define EOHPGetRWDatum(eohptr) PointerGetDatum((eohptr)->eoh_rw_ptr) + #define EOHPGetRODatum(eohptr) PointerGetDatum((eohptr)->eoh_ro_ptr) + + extern ExpandedObjectHeader *DatumGetEOHP(Datum d); + extern void EOH_init_header(ExpandedObjectHeader *eohptr, + const ExpandedObjectMethods *methods, + MemoryContext obj_context); + extern Size EOH_get_flat_size(ExpandedObjectHeader *eohptr); + extern void EOH_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + extern bool DatumIsReadWriteExpandedObject(Datum d, bool isnull, int16 typlen); + extern Datum MakeExpandedObjectReadOnly(Datum d, bool isnull, int16 typlen); + extern Datum TransferExpandedObject(Datum d, MemoryContext new_parent); + extern void DeleteExpandedObject(Datum d); + + #endif /* EXPANDEDDATUM_H */ diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c index 650cc48..0ff2086 100644 *** a/src/pl/plpgsql/src/pl_comp.c --- b/src/pl/plpgsql/src/pl_comp.c *************** build_datatype(HeapTuple typeTup, int32 *** 2200,2205 **** --- 2200,2221 ---- typ->collation = typeStruct->typcollation; if (OidIsValid(collation) && OidIsValid(typ->collation)) typ->collation = collation; + /* Detect if type is true array, or domain thereof */ + /* NB: this is only used to decide whether to apply expand_array */ + if (typeStruct->typtype == TYPTYPE_BASE) + { + /* this test should match what get_element_type() checks */ + typ->typisarray = (typeStruct->typlen == -1 && + OidIsValid(typeStruct->typelem)); + } + else if (typeStruct->typtype == TYPTYPE_DOMAIN) + { + /* we can short-circuit looking up base types if it's not varlena */ + typ->typisarray = (typeStruct->typlen == -1 && + OidIsValid(get_base_element_type(typeStruct->typbasetype))); + } + else + typ->typisarray = false; typ->atttypmod = typmod; return typ; diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index deefb1f..14969c8 100644 *** a/src/pl/plpgsql/src/pl_exec.c --- 
b/src/pl/plpgsql/src/pl_exec.c *************** plpgsql_exec_function(PLpgSQL_function * *** 312,317 **** --- 312,355 ---- var->value = fcinfo->arg[i]; var->isnull = fcinfo->argnull[i]; var->freeval = false; + + /* + * Force any array-valued parameter to be stored in + * expanded form in our local variable, in hopes of + * improving efficiency of uses of the variable. (This is + * a hack, really: why only arrays? Need more thought + * about which cases are likely to win. See also + * typisarray-specific heuristic in exec_assign_value.) + * + * Special cases: If passed a R/W expanded pointer, assume + * we can commandeer the object rather than having to copy + * it. If passed a R/O expanded pointer, just keep it as + * the value of the variable for the moment. (We'll force + * it to R/W if the variable gets modified, but that may + * very well never happen.) + */ + if (!var->isnull && var->datatype->typisarray) + { + if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(var->value))) + { + /* take ownership of R/W object */ + var->value = TransferExpandedObject(var->value, + CurrentMemoryContext); + var->freeval = true; + } + else if (VARATT_IS_EXTERNAL_EXPANDED_RO(DatumGetPointer(var->value))) + { + /* R/O pointer, keep it as-is until assigned to */ + } + else + { + /* flat array, so force to expanded form */ + var->value = expand_array(var->value, + CurrentMemoryContext, + NULL); + var->freeval = true; + } + } } break; *************** plpgsql_exec_function(PLpgSQL_function * *** 477,494 **** /* * If the function's return type isn't by value, copy the value ! * into upper executor memory context. */ if (!fcinfo->isnull && !func->fn_retbyval) ! { ! Size len; ! void *tmp; ! ! len = datumGetSize(estate.retval, false, func->fn_rettyplen); ! tmp = SPI_palloc(len); ! memcpy(tmp, DatumGetPointer(estate.retval), len); ! estate.retval = PointerGetDatum(tmp); ! } } } --- 515,528 ---- /* * If the function's return type isn't by value, copy the value ! 
* into upper executor memory context. However, if we have a R/W ! * expanded datum, we can just transfer its ownership out to the ! * upper executor context. */ if (!fcinfo->isnull && !func->fn_retbyval) ! estate.retval = SPI_datumTransfer(estate.retval, ! false, ! func->fn_rettyplen); } } *************** exec_stmt_return(PLpgSQL_execstate *esta *** 2476,2481 **** --- 2510,2522 ---- * Special case path when the RETURN expression is a simple variable * reference; in particular, this path is always taken in functions with * one or more OUT parameters. + * + * This special case is especially efficient for returning variables that + * have R/W expanded values: we can put the R/W pointer directly into + * estate->retval, leading to transferring the value to the caller's + * context cheaply. If we went through exec_eval_expr we'd end up with a + * R/O pointer. It's okay to skip MakeExpandedObjectReadOnly here since + * we know we won't need the variable's value within the function anymore. */ if (stmt->retvarno >= 0) { *************** exec_stmt_return_next(PLpgSQL_execstate *** 2604,2609 **** --- 2645,2655 ---- * Special case path when the RETURN NEXT expression is a simple variable * reference; in particular, this path is always taken in functions with * one or more OUT parameters. + * + * Unlike exec_statement_return, there's no special win here for R/W + * expanded values, since they'll have to get flattened to go into the + * tuplestore. Indeed, we'd better make them R/O to avoid any risk of the + * casting step changing them in-place. 
*/ if (stmt->retvarno >= 0) { *************** exec_stmt_return_next(PLpgSQL_execstate *** 2622,2627 **** --- 2668,2678 ---- (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("wrong result type supplied in RETURN NEXT"))); + /* let's be very paranoid about the cast step */ + retval = MakeExpandedObjectReadOnly(retval, + isNull, + var->datatype->typlen); + /* coerce type if needed */ retval = exec_cast_value(estate, retval, *************** exec_assign_value(PLpgSQL_execstate *est *** 4140,4165 **** /* * If type is by-reference, copy the new value (which is * probably in the eval_econtext) into the procedure's memory ! * context. */ if (!var->datatype->typbyval && !isNull) ! newvalue = datumCopy(newvalue, ! false, ! var->datatype->typlen); /* ! * Now free the old value. (We can't do this any earlier ! * because of the possibility that we are assigning the var's ! * old value to it, eg "foo := foo". We could optimize out ! * the assignment altogether in such cases, but it's too ! * infrequent to be worth testing for.) */ ! free_var(var); var->value = newvalue; var->isnull = isNull; ! if (!var->datatype->typbyval && !isNull) ! var->freeval = true; break; } --- 4191,4241 ---- /* * If type is by-reference, copy the new value (which is * probably in the eval_econtext) into the procedure's memory ! * context. But if it's a read/write reference to an expanded ! * object, no physical copy needs to happen; at most we need ! * to reparent the object's memory context. ! * ! * If it's an array, we force the value to be stored in R/W ! * expanded form. This wins if the function later does, say, ! * a lot of array subscripting operations on the variable, and ! * otherwise might lose. We might need to use a different ! * heuristic, but it's too soon to tell. Also, are there ! * cases where it'd be useful to force non-array values into ! * expanded form? */ if (!var->datatype->typbyval && !isNull) ! { ! if (var->datatype->typisarray && ! 
!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(newvalue))) ! { ! /* array and not already R/W, so apply expand_array */ ! newvalue = expand_array(newvalue, ! CurrentMemoryContext, ! NULL); ! } ! else ! { ! /* else transfer value if R/W, else just datumCopy */ ! newvalue = datumTransfer(newvalue, ! false, ! var->datatype->typlen); ! } ! } /* ! * Now free the old value, unless it's the same as the new ! * value (ie, we're doing "foo := foo"). Note that for ! * expanded objects, this test is necessary and cannot ! * reliably be made any earlier; we have to be looking at the ! * object's standard R/W pointer to be sure pointer equality ! * is meaningful. */ ! if (var->value != newvalue || var->isnull || isNull) ! free_var(var); var->value = newvalue; var->isnull = isNull; ! var->freeval = (!var->datatype->typbyval && !isNull); break; } *************** exec_assign_value(PLpgSQL_execstate *est *** 4505,4514 **** * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: caller must not modify the returned value, since it points right ! * at the stored value in the case of pass-by-reference datatypes. In some ! * cases we have to palloc a return value, and in such cases we put it into ! * the estate's short-term memory context. */ static void exec_eval_datum(PLpgSQL_execstate *estate, --- 4581,4594 ---- * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: the returned Datum points right at the stored value in the case of ! * pass-by-reference datatypes. Generally callers should take care not to ! * modify the stored value. Some callers intentionally manipulate variables ! * referenced by R/W expanded pointers, though; it is those callers' ! * responsibility that the results are semantically OK. ! * ! * In some cases we have to palloc a return value, and in such cases we put ! * it into the estate's short-term memory context. 
*/ static void exec_eval_datum(PLpgSQL_execstate *estate, *************** setup_param_list(PLpgSQL_execstate *esta *** 5373,5379 **** PLpgSQL_var *var = (PLpgSQL_var *) datum; ParamExternData *prm = &paramLI->params[dno]; ! prm->value = var->value; prm->isnull = var->isnull; prm->pflags = PARAM_FLAG_CONST; prm->ptype = var->datatype->typoid; --- 5453,5461 ---- PLpgSQL_var *var = (PLpgSQL_var *) datum; ParamExternData *prm = &paramLI->params[dno]; ! prm->value = MakeExpandedObjectReadOnly(var->value, ! var->isnull, ! var->datatype->typlen); prm->isnull = var->isnull; prm->pflags = PARAM_FLAG_CONST; prm->ptype = var->datatype->typoid; *************** plpgsql_param_fetch(ParamListInfo params *** 5442,5447 **** --- 5524,5535 ---- exec_eval_datum(estate, datum, &prm->ptype, &prmtypmod, &prm->value, &prm->isnull); + + /* If it's a read/write expanded datum, convert reference to read-only */ + if (datum->dtype == PLPGSQL_DTYPE_VAR) + prm->value = MakeExpandedObjectReadOnly(prm->value, + prm->isnull, + ((PLpgSQL_var *) datum)->datatype->typlen); } *************** free_var(PLpgSQL_var *var) *** 6540,6546 **** { if (var->freeval) { ! pfree(DatumGetPointer(var->value)); var->freeval = false; } } --- 6628,6639 ---- { if (var->freeval) { ! if (DatumIsReadWriteExpandedObject(var->value, ! var->isnull, ! var->datatype->typlen)) ! DeleteExpandedObject(var->value); ! else ! pfree(DatumGetPointer(var->value)); var->freeval = false; } } *************** format_expr_params(PLpgSQL_execstate *es *** 6750,6757 **** curvar = (PLpgSQL_var *) estate->datums[dno]; ! exec_eval_datum(estate, (PLpgSQL_datum *) curvar, &paramtypeid, ! &paramtypmod, &paramdatum, &paramisnull); appendStringInfo(&paramstr, "%s%s = ", paramno > 0 ? ", " : "", --- 6843,6851 ---- curvar = (PLpgSQL_var *) estate->datums[dno]; ! exec_eval_datum(estate, (PLpgSQL_datum *) curvar, ! &paramtypeid, &paramtypmod, ! &paramdatum, &paramisnull); appendStringInfo(&paramstr, "%s%s = ", paramno > 0 ?
", " : "", diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h index bec773a..d21ff0b 100644 *** a/src/pl/plpgsql/src/plpgsql.h --- b/src/pl/plpgsql/src/plpgsql.h *************** typedef struct *** 183,188 **** --- 183,189 ---- char typtype; Oid typrelid; Oid collation; /* from pg_type, but can be overridden */ + bool typisarray; /* is "true" array, or domain over one */ int32 atttypmod; /* typmod (taken from someplace else) */ } PLpgSQL_type;
On 03/28/2015 11:24 PM, Tom Lane wrote:
> + /*
> +  * Macros for iterating through elements of a flat or expanded array.
> +  * Use "ARRAY_ITER ARRAY_ITER_VARS(name);" to declare the local variables
> +  * needed for an iterator (more than one set can be used in the same function,
> +  * if they have different names).
> +  * Use "ARRAY_ITER_SETUP(name, arrayptr);" to prepare to iterate, and
> +  * "ARRAY_ITER_NEXT(name, index, datumvar, isnullvar, ...);" to fetch the
> +  * next element into datumvar/isnullvar.  "index" must be the zero-origin
> +  * element number; we make caller provide this since caller is generally
> +  * counting the elements anyway.
> +  */
> + #define ARRAY_ITER				/* dummy type name to keep pgindent happy */
> +
> + #define ARRAY_ITER_VARS(iter) \
> + 	Datum	   *iter##datumptr; \
> + 	bool	   *iter##isnullptr; \
> + 	char	   *iter##dataptr; \
> + 	bits8	   *iter##bitmapptr; \
> + 	int			iter##bitmask

How about a struct instead?

struct ArrayIter {
	Datum	datumptr;
	bool	isnullptr;
	char	dataptr;
	bits8	bitmapptr;
	int		bitmask
}

Seems more natural.

> + #define ARRAY_ITER_SETUP(iter, arrayptr) \
> [long and complicated macro]
> +
> + #define ARRAY_ITER_NEXT(iter,i, datumvar,isnullvar, elmlen,elmbyval,elmalign) \
> [another long and complicated macro]

How about turning these into functions? We have a bunch of macros like
this, but IMHO functions are much more readable and easier to debug, so
would prefer functions in new code.

In general, refactoring the array iteration code to a macro/function
like this is a good idea. It would make sense to commit that separately,
regardless of the rest of the patch.

- Heikki
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> On 03/28/2015 11:24 PM, Tom Lane wrote:
>> + * Macros for iterating through elements of a flat or expanded array.

> How about a struct instead?

> struct ArrayIter {
> 	Datum	datumptr;
> 	bool	isnullptr;
> 	char	dataptr;
> 	bits8	bitmapptr;
> 	int		bitmask
> }

> Seems more natural.

Yes, and much less efficient I'm afraid.  Most compilers would be unable
to put the variables into registers, which is important for these inner
loops.

> How about turning these into functions?

Likewise.  The point of doing it like this was to avoid taking an
efficiency hit compared to the existing code.

It's conceivable that we could avoid such a hit by marking the functions
all "inline", but I'm not certain that they'd get inlined, and the
question of whether the variables could be in registers would remain.

			regards, tom lane
On 04/17/2015 03:58 PM, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>> On 03/28/2015 11:24 PM, Tom Lane wrote:
>>> + * Macros for iterating through elements of a flat or expanded array.
>
>> How about a struct instead?
>
>> struct ArrayIter {
>>     Datum datumptr;
>>     bool isnullptr;
>>     char dataptr;
>>     bits8 bitmapptr;
>>     int bitmask
>> }
>
>> Seems more natural.
>
> Yes, and much less efficient I'm afraid. Most compilers would be unable
> to put the variables into registers, which is important for these inner
> loops.

That would surprise me. Surely most compilers know to keep fields of a
struct in registers, when the struct itself or a pointer to it is not
passed anywhere.

>> How about turning these into functions?
>
> Likewise. The point of doing it like this was to avoid taking an
> efficiency hit compared to the existing code.
>
> It's conceivable that we could avoid such a hit by marking the functions
> all "inline", but I'm not certain that they'd get inlined, and the
> question of whether the variables could be in registers would remain.

Ok, this one I believe.

- Heikki
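For readers following the struct-versus-macro exchange, here is a minimal sketch of what a struct-plus-inline-function iterator could look like. It is not code from the patch: the Datum typedef is a stand-in, the names array_iter_setup, array_iter_next and sum_nonnull are hypothetical, and it covers only the already-expanded case (a plain Datum array with optional null flags), not the flat format that the real macros also handle. Whether a given compiler keeps the struct fields in registers is exactly the point under dispute above, and this sketch does not settle it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Datum; an assumption made for this sketch. */
typedef uintptr_t Datum;

/* Hypothetical struct-based iterator state for an expanded array. */
typedef struct ArrayIter
{
    const Datum *datumptr;   /* element values */
    const bool  *isnullptr;  /* per-element null flags, or NULL if no nulls */
    int          nelems;     /* number of elements */
} ArrayIter;

/* Prepare to iterate; counterpart of the ARRAY_ITER_SETUP idea. */
static inline void
array_iter_setup(ArrayIter *it, const Datum *values, const bool *nulls,
                 int nelems)
{
    it->datumptr = values;
    it->isnullptr = nulls;
    it->nelems = nelems;
}

/*
 * Fetch element i (zero-origin).  As with the macro version, the caller
 * supplies the index because it is usually counting elements anyway.
 */
static inline Datum
array_iter_next(const ArrayIter *it, int i, bool *isnull)
{
    assert(i >= 0 && i < it->nelems);
    *isnull = (it->isnullptr != NULL) ? it->isnullptr[i] : false;
    return *isnull ? (Datum) 0 : it->datumptr[i];
}

/* Example inner loop: sum all non-null elements. */
static long
sum_nonnull(const Datum *values, const bool *nulls, int nelems)
{
    ArrayIter it;
    long      sum = 0;

    array_iter_setup(&it, values, nulls, nelems);
    for (int i = 0; i < nelems; i++)
    {
        bool  isnull;
        Datum d = array_iter_next(&it, i, &isnull);

        if (!isnull)
            sum += (long) d;
    }
    return sum;
}
```

Because the iterator struct is a local whose address is passed only to static inline functions, a compiler is free to inline the calls and keep the fields in registers, but nothing in the C standard requires it to; comparing the generated code for both styles would be the way to check.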
On 2015-03-28 17:24:36 -0400, Tom Lane wrote: > I wrote: > > [ expanded-arrays-1.0.patch ] > > This is overdue for a rebase; attached. No functional changes, but some > of what was in the original patch has already been merged, and other parts > were superseded. What are your plans with this WRT 9.5? Andres
Andres Freund <andres@anarazel.de> writes: > On 2015-03-28 17:24:36 -0400, Tom Lane wrote: >> This is overdue for a rebase; attached. No functional changes, but some >> of what was in the original patch has already been merged, and other parts >> were superseded. > What are your plans with this WRT 9.5? I'd like to get it committed into 9.5. I've been hoping somebody would do a performance review. regards, tom lane
On 2015-05-01 09:35:08 -0700, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > What are your plans with this WRT 9.5? > > I'd like to get it committed into 9.5. I've been hoping somebody would do > a performance review. Ok. I'll try to have a look, but it'll be the second half of next week. Greetings, Andres Freund
2015-05-01 18:35 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
> Andres Freund <andres@anarazel.de> writes:
>> On 2015-03-28 17:24:36 -0400, Tom Lane wrote:
>>> This is overdue for a rebase; attached. No functional changes, but some
>>> of what was in the original patch has already been merged, and other parts
>>> were superseded.
>> What are your plans with this WRT 9.5?
> I'd like to get it committed into 9.5. I've been hoping somebody would do
> a performance review.
I am looking at this patch, but it cannot be applied now.
lxml2 -lssl -lcrypto -lrt -lcrypt -ldl -lm -o postgres
utils/fmgrtab.o:(.rodata+0x2678): undefined reference to `array_append'
utils/fmgrtab.o:(.rodata+0x2698): undefined reference to `array_prepend'
collect2: error: ld returned 1 exit status
Makefile:57: recipe for target 'postgres' failed
make[2]: *** [postgres] Error 1
make[2]: Leaving directory '/home/pavel/src/postgresql/src/backend'
Makefile:34: recipe for target 'all-backend-recurse' failed
make[1]: *** [all-backend-recurse] Error 2
make[1]: Leaving directory '/home/pavel/src/postgresql/src'
GNUmakefile:11: recipe for target 'all-src-recurse' failed
Regards
Pavel
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I am looking on this patch, but it cannot be applied now.

> lxml2 -lssl -lcrypto -lrt -lcrypt -ldl -lm -o postgres
> utils/fmgrtab.o:(.rodata+0x2678): undefined reference to `array_append'
> utils/fmgrtab.o:(.rodata+0x2698): undefined reference to `array_prepend'

What are you trying to apply it to? I see array_append() in
src/backend/utils/adt/array_userfuncs.c in HEAD. Also, are
you checking the 1.1 version of the patch?

regards, tom lane
On 2015-05-01 11:11:14 -0700, Tom Lane wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
> > I am looking on this patch, but it cannot be applied now.
>
> > lxml2 -lssl -lcrypto -lrt -lcrypt -ldl -lm -o postgres
> > utils/fmgrtab.o:(.rodata+0x2678): undefined reference to `array_append'
> > utils/fmgrtab.o:(.rodata+0x2698): undefined reference to `array_prepend'
>
> What are you trying to apply it to? I see array_append() in
> src/backend/utils/adt/array_userfuncs.c in HEAD. Also, are
> you checking the 1.1 version of the patch?

That's very likely due to the transforms patch, which added another
column to pg_proc...

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2015-05-01 11:11:14 -0700, Tom Lane wrote:
>> What are you trying to apply it to? I see array_append() in
>> src/backend/utils/adt/array_userfuncs.c in HEAD. Also, are
>> you checking the 1.1 version of the patch?

> That's very likely due to the transforms patch, which added another
> column to pg_proc...

No, my patch doesn't touch pg_proc.h. I'm certainly prepared to believe
it's suffered bit rot in the last couple of weeks, but I don't understand
how it would apply successfully and then generate a complaint about
array_append not being there. array_append *is* there in HEAD, and has
been for awhile.

regards, tom lane
2015-05-01 20:11 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> I am looking on this patch, but it cannot be applied now.
>> lxml2 -lssl -lcrypto -lrt -lcrypt -ldl -lm -o postgres
>> utils/fmgrtab.o:(.rodata+0x2678): undefined reference to `array_append'
>> utils/fmgrtab.o:(.rodata+0x2698): undefined reference to `array_prepend'
> What are you trying to apply it to? I see array_append() in
> src/backend/utils/adt/array_userfuncs.c in HEAD. Also, are
> you checking the 1.1 version of the patch?
I tested the old version. Version 1.1 looks good.
Regards
Pavel
2015-05-01 20:53 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:
> 2015-05-01 20:11 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>>> I am looking on this patch, but it cannot be applied now.
>>> lxml2 -lssl -lcrypto -lrt -lcrypt -ldl -lm -o postgres
>>> utils/fmgrtab.o:(.rodata+0x2678): undefined reference to `array_append'
>>> utils/fmgrtab.o:(.rodata+0x2698): undefined reference to `array_prepend'
>> What are you trying to apply it to? I see array_append() in
>> src/backend/utils/adt/array_userfuncs.c in HEAD. Also, are
>> you checking the 1.1 version of the patch?
> I tested old version. 1.1. looks well.

It is hard to believe how fast it is.

I use bubble sort for plpgsql benchmarking. The following variant is suboptimal (but it is perfect for this test):

CREATE OR REPLACE FUNCTION public.buble(a anyarray, OUT r anyarray)
 RETURNS anyarray
 LANGUAGE plpgsql
AS $function$
DECLARE
  aux r%type;
  sorted bool := false;
BEGIN
  r := a;
  WHILE NOT sorted
  LOOP
    sorted := true;
    FOR i IN array_lower(a,1) .. array_upper(a,1) - 1
    LOOP
      IF r[i] > r[i+1] THEN
        sorted := false;
        aux[1] := r[i];
        r[i] := r[i+1]; r[i+1] := aux[1];
      END IF;
    END LOOP;
  END LOOP;
END;
$function$;

CREATE OR REPLACE FUNCTION public.array_generator(integer, anyelement, OUT r anyarray)
 RETURNS anyarray
 LANGUAGE plpgsql
AS $function$
BEGIN
  r := (SELECT ARRAY(SELECT random()*$2 FROM generate_series(1,$1)));
END;
$function$;

Test for 3000 elements:

           Original   Patch
Integer    55sec      8sec
Numeric    341sec     8sec

Quicksort is about 3x faster -- so a benefit of this patch is clear.

Regards

Pavel
Pavel Stehule <pavel.stehule@gmail.com> writes:
> Test for 3000 elements:
>            Original   Patch
> Integer    55sec      8sec
> Numeric    341sec     8sec
> Quicksort is about 3x faster -- so a benefit of this patch is clear.

Yeah, the patch should pretty much blow the doors off any case that's
heavily dependent on access or update of individual array elements ...
especially for arrays with variable-length element type, such as numeric.

What I'm concerned about is that it could make things *slower* for
scenarios where that isn't the main thing being done with the arrays,
as a result of useless conversions between "flat" and "expanded"
array formats. So what we need is to try to benchmark some cases
that don't involve single-element operations but rather whole-array
operations (on arrays that are plpgsql variables), and see if those
cases have gotten noticeably worse.

regards, tom lane
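To make the point about variable-length element types concrete, here is a toy sketch of why single-element access is so much cheaper once element start locations are known. It is an illustration under stated assumptions, not PostgreSQL code: the "flat" format here is a made-up one where each element is a one-byte length followed by that many payload bytes (real varlena headers, alignment padding and null bitmaps are ignored), and flat_nth and expand_offsets are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy flat format (an assumption for this sketch, not PostgreSQL's varlena
 * layout): elements are stored back to back, each as a 1-byte length
 * followed by that many payload bytes.
 */

/* Find element n in the flat format: must walk over all earlier elements. */
static const unsigned char *
flat_nth(const unsigned char *buf, int n)
{
    const unsigned char *p = buf;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        p += 1 + p[0];          /* skip length byte plus payload */
    return p;
}

/*
 * "Expand" the flat format once by recording where each element starts.
 * After this, any element can be reached with a single array lookup.
 */
static void
expand_offsets(const unsigned char *buf, int nelems,
               const unsigned char **starts)
{
    const unsigned char *p = buf;

    for (int i = 0; i < nelems; i++)
    {
        starts[i] = p;
        p += 1 + p[0];
    }
}
```

A loop that touches every element by index costs O(N^2) element skips with flat_nth but O(N) after one expand_offsets pass, which is consistent with the numeric case improving far more than the integer case in the timings above. The flip side is the concern raised in the same message: the expansion pass itself is wasted work if the array is only passed along whole.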
Hi
I did some tests with an unlogged table in shared buffers.
do $$ declare a int[] := '{}'; begin for i in 1..90000 loop a := a || 10; end loop; end$$ language plpgsql;
do $$ declare a numeric[] := '{}'; begin for i in 1..90000 loop a := a || 10.1; end loop; end$$ language plpgsql;
2015-05-01 21:59 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> Test for 3000 elements:
>>            Original   Patch
>> Integer    55sec      8sec
>> Numeric    341sec     8sec
>> Quicksort is about 3x faster -- so a benefit of this patch is clear.
> Yeah, the patch should pretty much blow the doors off any case that's
> heavily dependent on access or update of individual array elements ...
> especially for arrays with variable-length element type, such as numeric.
> What I'm concerned about is that it could make things *slower* for
> scenarios where that isn't the main thing being done with the arrays,
> as a result of useless conversions between "flat" and "expanded"
> array formats. So what we need is to try to benchmark some cases
> that don't involve single-element operations but rather whole-array
> operations (on arrays that are plpgsql variables), and see if those
> cases have gotten noticeably worse.
Pavel Stehule <pavel.stehule@gmail.com> writes:
> Some slowdown is visible (about 10%) for query
> update foo set a = a || 1;

> Significant slowdown is on following test:

> do $$ declare a int[] := '{}'; begin for i in 1..90000 loop a := a || 10;
> end loop; end$$ language plpgsql;
> do $$ declare a numeric[] := '{}'; begin for i in 1..90000 loop a := a ||
> 10.1; end loop; end$$ language plpgsql;

> integer master 14sec x patched 55sec
> numeric master 43sec x patched 108sec

> It is probably worst case - and it is known plpgsql antipattern

Yeah, I have not expended a great deal of effort on the array_append/
array_prepend/array_cat code paths. Still, in these plpgsql cases,
we should in principle have gotten down from two array copies per loop to
one, so it's disappointing to not have better results there, even granting
that the new "copy" step is not just a byte-by-byte copy. Let me see if
there's anything simple to be done about that.

regards, tom lane
I wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> Significant slowdown is on following test:

>> do $$ declare a int[] := '{}'; begin for i in 1..90000 loop a := a || 10;
>> end loop; end$$ language plpgsql;
>> do $$ declare a numeric[] := '{}'; begin for i in 1..90000 loop a := a ||
>> 10.1; end loop; end$$ language plpgsql;

>> integer master 14sec x patched 55sec
>> numeric master 43sec x patched 108sec

>> It is probably worst case - and it is known plpgsql antipattern

> Yeah, I have not expended a great deal of effort on the array_append/
> array_prepend/array_cat code paths. Still, in these plpgsql cases,
> we should in principle have gotten down from two array copies per loop to
> one, so it's disappointing to not have better results there, even granting
> that the new "copy" step is not just a byte-by-byte copy. Let me see if
> there's anything simple to be done about that.

The attached updated patch reduces both of those do-loop tests to about
60 msec on my machine. It contains two improvements over the 1.1 patch:

1. There's a fast path for copying an expanded array to another expanded
array when the element type is pass-by-value: we can just memcpy the Datum
array instead of working element-by-element. In isolation, that change
made the patch a little faster than 9.4 on your int-array case, though
of course it doesn't help for the numeric-array case (and I do not see a
way to avoid working element-by-element for pass-by-ref cases).

2. pl/pgsql now detects cases like "a := a || x" and allows the array "a"
to be passed as a read-write pointer to array_append, so that array_append
can modify expanded arrays in-place and avoid inessential data copying
altogether. (The earlier patch had made array_append and array_prepend
safe for this usage, but there wasn't actually any way to invoke them
with read-write pointers.) I had speculated about doing this in my
earliest discussion of this patch, but there was no code for it before.
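The two changes can be pictured with a toy model. This is a sketch under stated assumptions rather than the patch's code: ToyExpandedArray, toy_new, toy_copy and toy_append are hypothetical names, Datum is a stand-in typedef, malloc/realloc replace memory contexts, and the read-write versus read-only distinction is modeled as a plain bool argument instead of being encoded in the TOAST pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for PostgreSQL's Datum; an assumption made for this sketch. */
typedef uintptr_t Datum;

/* Toy expanded array holding pass-by-value elements only. */
typedef struct ToyExpandedArray
{
    Datum *dvalues;     /* element values, or NULL when capacity is 0 */
    int    nelems;      /* number of valid elements */
    int    capacity;    /* allocated length of dvalues */
} ToyExpandedArray;

/* Create an empty array. */
static ToyExpandedArray *
toy_new(void)
{
    ToyExpandedArray *arr = calloc(1, sizeof(*arr));

    if (arr == NULL)
        abort();
    return arr;
}

/* Change #1 in miniature: by-value elements can be copied with one memcpy. */
static ToyExpandedArray *
toy_copy(const ToyExpandedArray *src)
{
    ToyExpandedArray *dst = toy_new();

    if (src->nelems > 0)
    {
        dst->dvalues = malloc((size_t) src->nelems * sizeof(Datum));
        if (dst->dvalues == NULL)
            abort();
        memcpy(dst->dvalues, src->dvalues,
               (size_t) src->nelems * sizeof(Datum));
        dst->nelems = src->nelems;
        dst->capacity = src->nelems;
    }
    return dst;
}

/*
 * Change #2 in miniature: with a read-write reference the append happens
 * in place; with a read-only reference the input is copied first and left
 * untouched.  Returns the array that now holds the appended element.
 */
static ToyExpandedArray *
toy_append(ToyExpandedArray *arr, bool readwrite, Datum newval)
{
    ToyExpandedArray *target;

    assert(arr != NULL);
    target = readwrite ? arr : toy_copy(arr);

    if (target->nelems >= target->capacity)
    {
        int    newcap = (target->capacity > 0) ? target->capacity * 2 : 4;
        Datum *newvals = realloc(target->dvalues,
                                 (size_t) newcap * sizeof(Datum));

        /* Grow before touching any other field, so a failure here cannot
         * leave the array half-updated. */
        if (newvals == NULL)
            abort();
        target->dvalues = newvals;
        target->capacity = newcap;
    }
    target->dvalues[target->nelems++] = newval;
    return target;
}
```

In this model a loop like "a := a || x" that always holds a read-write reference performs no per-iteration copy at all, only occasional reallocation, which matches the observation in the message that change #1 stops mattering for these test cases once change #2 applies.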
The key question for change #2 is how do we identify what is a "safe" top-level function that can be trusted not to corrupt the read-write value if it fails partway through. I did not have a good answer before, and I still don't; what this version of the patch does is to hard-wire array_append and array_prepend as the functions considered safe. Obviously that is crying out for improvement, but we can leave that question for later; at least now we have infrastructure that makes it possible to do it. Change #1 is actually not relevant to these example cases, because we don't copy any arrays within the loop given change #2. But I left it in because it's not much code and it will help for situations where change #2 doesn't apply. regards, tom lane diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml index d8c5287..e5b7b4b 100644 *** a/doc/src/sgml/storage.sgml --- b/doc/src/sgml/storage.sgml *************** comparison table, in which all the HTML *** 503,510 **** <acronym>TOAST</> pointers can point to data that is not on disk, but is elsewhere in the memory of the current server process. Such pointers obviously cannot be long-lived, but they are nonetheless useful. There ! is currently just one sub-case: ! pointers to <firstterm>indirect</> data. </para> <para> --- 503,511 ---- <acronym>TOAST</> pointers can point to data that is not on disk, but is elsewhere in the memory of the current server process. Such pointers obviously cannot be long-lived, but they are nonetheless useful. There ! are currently two sub-cases: ! pointers to <firstterm>indirect</> data and ! pointers to <firstterm>expanded</> data. </para> <para> *************** and there is no infrastructure to help w *** 519,524 **** --- 520,562 ---- </para> <para> + Expanded <acronym>TOAST</> pointers are useful for complex data types + whose on-disk representation is not especially suited for computational + purposes. 
As an example, the standard varlena representation of a + <productname>PostgreSQL</> array includes dimensionality information, a + nulls bitmap if there are any null elements, then the values of all the + elements in order. When the element type itself is variable-length, the + only way to find the <replaceable>N</>'th element is to scan through all the + preceding elements. This representation is appropriate for on-disk storage + because of its compactness, but for computations with the array it's much + nicer to have an <quote>expanded</> or <quote>deconstructed</> + representation in which all the element starting locations have been + identified. The <acronym>TOAST</> pointer mechanism supports this need by + allowing a pass-by-reference Datum to point to either a standard varlena + value (the on-disk representation) or a <acronym>TOAST</> pointer that + points to an expanded representation somewhere in memory. The details of + this expanded representation are up to the data type, though it must have + a standard header and meet the other API requirements given + in <filename>src/include/utils/expandeddatum.h</>. C-level functions + working with the data type can choose to handle either representation. + Functions that do not know about the expanded representation, but simply + apply <function>PG_DETOAST_DATUM</> to their inputs, will automatically + receive the traditional varlena representation; so support for an expanded + representation can be introduced incrementally, one function at a time. + </para> + + <para> + <acronym>TOAST</> pointers to expanded values are further broken down + into <firstterm>read-write</> and <firstterm>read-only</> pointers. + The pointed-to representation is the same either way, but a function that + receives a read-write pointer is allowed to modify the referenced value + in-place, whereas one that receives a read-only pointer must not; it must + first create a copy if it wants to make a modified version of the value. 
+ This distinction and some associated conventions make it possible to avoid + unnecessary copying of expanded values during query execution. + </para> + + <para> For all types of in-memory <acronym>TOAST</> pointer, the <acronym>TOAST</> management code ensures that no such pointer datum can accidentally get stored on disk. In-memory <acronym>TOAST</> pointers are automatically diff --git a/doc/src/sgml/xtypes.sgml b/doc/src/sgml/xtypes.sgml index 2459616..ac0b8a2 100644 *** a/doc/src/sgml/xtypes.sgml --- b/doc/src/sgml/xtypes.sgml *************** CREATE TYPE complex ( *** 300,305 **** --- 300,376 ---- </para> </note> + <para> + Another feature that's enabled by <acronym>TOAST</> support is the + possibility of having an <firstterm>expanded</> in-memory data + representation that is more convenient to work with than the format that + is stored on disk. The regular or <quote>flat</> varlena storage format + is ultimately just a blob of bytes; it cannot for example contain + pointers, since it may get copied to other locations in memory. + For complex data types, the flat format may be quite expensive to work + with, so <productname>PostgreSQL</> provides a way to <quote>expand</> + the flat format into a representation that is more suited to computation, + and then pass that format in-memory between functions of the data type. + </para> + + <para> + To use expanded storage, a data type must define an expanded format that + follows the rules given in <filename>src/include/utils/expandeddatum.h</>, + and provide functions to <quote>expand</> a flat varlena value into + expanded format and <quote>flatten</> the expanded format back to the + regular varlena representation. Then ensure that all C functions for + the data type can accept either representation, possibly by converting + one into the other immediately upon receipt. 
This does not require fixing + all existing functions for the data type at once, because the standard + <function>PG_DETOAST_DATUM</> macro is defined to convert expanded inputs + into regular flat format. Therefore, existing functions that work with + the flat varlena format will continue to work, though slightly + inefficiently, with expanded inputs; they need not be converted until and + unless better performance is important. + </para> + + <para> + C functions that know how to work with an expanded representation + typically fall into two categories: those that can only handle expanded + format, and those that can handle either expanded or flat varlena inputs. + The former are easier to write but may be less efficient overall, because + converting a flat input to expanded form for use by a single function may + cost more than is saved by operating on the expanded format. + When only expanded format need be handled, conversion of flat inputs to + expanded form can be hidden inside an argument-fetching macro, so that + the function appears no more complex than one working with traditional + varlena input. + To handle both types of input, write an argument-fetching function that + will detoast external, short-header, and compressed varlena inputs, but + not expanded inputs. Such a function can be defined as returning a + pointer to a union of the flat varlena format and the expanded format. + Callers can use the <function>VARATT_IS_EXPANDED_HEADER()</> macro to + determine which format they received. + </para> + + <para> + The <acronym>TOAST</> infrastructure not only allows regular varlena + values to be distinguished from expanded values, but also + distinguishes <quote>read-write</> and <quote>read-only</> pointers to + expanded values. C functions that only need to examine an expanded + value, or will only change it in safe and non-semantically-visible ways, + need not care which type of pointer they receive. 
C functions that + produce a modified version of an input value are allowed to modify an + expanded input value in-place if they receive a read-write pointer, but + must not modify the input if they receive a read-only pointer; in that + case they have to copy the value first, producing a new value to modify. + A C function that has constructed a new expanded value should always + return a read-write pointer to it. Also, a C function that is modifying + a read-write expanded value in-place should take care to leave the value + in a sane state if it fails partway through. + </para> + + <para> + For examples of working with expanded values, see the standard array + infrastructure, particularly + <filename>src/backend/utils/adt/array_expanded.c</>. + </para> + </sect2> </sect1> diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c index 6cd4e8e..de7f02f 100644 *** a/src/backend/access/common/heaptuple.c --- b/src/backend/access/common/heaptuple.c *************** *** 60,65 **** --- 60,66 ---- #include "access/sysattr.h" #include "access/tuptoaster.h" #include "executor/tuptable.h" + #include "utils/expandeddatum.h" /* Does att's datatype allow packing into the 1-byte-header varlena format? */ *************** heap_compute_data_size(TupleDesc tupleDe *** 93,105 **** for (i = 0; i < numberOfAttributes; i++) { Datum val; if (isnull[i]) continue; val = values[i]; ! if (ATT_IS_PACKABLE(att[i]) && VARATT_CAN_MAKE_SHORT(DatumGetPointer(val))) { /* --- 94,108 ---- for (i = 0; i < numberOfAttributes; i++) { Datum val; + Form_pg_attribute atti; if (isnull[i]) continue; val = values[i]; + atti = att[i]; ! if (ATT_IS_PACKABLE(atti) && VARATT_CAN_MAKE_SHORT(DatumGetPointer(val))) { /* *************** heap_compute_data_size(TupleDesc tupleDe *** 108,118 **** */ data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val)); } else { ! data_length = att_align_datum(data_length, att[i]->attalign, ! att[i]->attlen, val); ! 
data_length = att_addlength_datum(data_length, att[i]->attlen, val); } } --- 111,131 ---- */ data_length += VARATT_CONVERTED_SHORT_SIZE(DatumGetPointer(val)); } + else if (atti->attlen == -1 && + VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(val))) + { + /* + * we want to flatten the expanded value so that the constructed + * tuple doesn't depend on it + */ + data_length = att_align_nominal(data_length, atti->attalign); + data_length += EOH_get_flat_size(DatumGetEOHP(val)); + } else { ! data_length = att_align_datum(data_length, atti->attalign, ! atti->attlen, val); ! data_length = att_addlength_datum(data_length, atti->attlen, val); } } *************** heap_fill_tuple(TupleDesc tupleDesc, *** 203,212 **** *infomask |= HEAP_HASVARWIDTH; if (VARATT_IS_EXTERNAL(val)) { ! *infomask |= HEAP_HASEXTERNAL; ! /* no alignment, since it's short by definition */ ! data_length = VARSIZE_EXTERNAL(val); ! memcpy(data, val, data_length); } else if (VARATT_IS_SHORT(val)) { --- 216,241 ---- *infomask |= HEAP_HASVARWIDTH; if (VARATT_IS_EXTERNAL(val)) { ! if (VARATT_IS_EXTERNAL_EXPANDED(val)) ! { ! /* ! * we want to flatten the expanded value so that the ! * constructed tuple doesn't depend on it ! */ ! ExpandedObjectHeader *eoh = DatumGetEOHP(values[i]); ! ! data = (char *) att_align_nominal(data, ! att[i]->attalign); ! data_length = EOH_get_flat_size(eoh); ! EOH_flatten_into(eoh, data, data_length); ! } ! else ! { ! *infomask |= HEAP_HASEXTERNAL; ! /* no alignment, since it's short by definition */ ! data_length = VARSIZE_EXTERNAL(val); ! memcpy(data, val, data_length); ! 
} } else if (VARATT_IS_SHORT(val)) { diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c index 8464e87..c3ebbef 100644 *** a/src/backend/access/heap/tuptoaster.c --- b/src/backend/access/heap/tuptoaster.c *************** *** 37,42 **** --- 37,43 ---- #include "catalog/catalog.h" #include "common/pg_lzcompress.h" #include "miscadmin.h" + #include "utils/expandeddatum.h" #include "utils/fmgroids.h" #include "utils/rel.h" #include "utils/typcache.h" *************** heap_tuple_fetch_attr(struct varlena * a *** 130,135 **** --- 131,149 ---- result = (struct varlena *) palloc(VARSIZE_ANY(attr)); memcpy(result, attr, VARSIZE_ANY(attr)); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* + * This is an expanded-object pointer --- get flat format + */ + ExpandedObjectHeader *eoh; + Size resultsize; + + eoh = DatumGetEOHP(PointerGetDatum(attr)); + resultsize = EOH_get_flat_size(eoh); + result = (struct varlena *) palloc(resultsize); + EOH_flatten_into(eoh, (void *) result, resultsize); + } else { /* *************** heap_tuple_untoast_attr(struct varlena * *** 196,201 **** --- 210,224 ---- attr = result; } } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* + * This is an expanded-object pointer --- get flat format + */ + attr = heap_tuple_fetch_attr(attr); + /* flatteners are not allowed to produce compressed/short output */ + Assert(!VARATT_IS_EXTENDED(attr)); + } else if (VARATT_IS_COMPRESSED(attr)) { /* *************** heap_tuple_untoast_attr_slice(struct var *** 263,268 **** --- 286,296 ---- return heap_tuple_untoast_attr_slice(redirect.pointer, sliceoffset, slicelength); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + /* pass it off to heap_tuple_fetch_attr to flatten */ + preslice = heap_tuple_fetch_attr(attr); + } else preslice = attr; *************** toast_raw_datum_size(Datum value) *** 344,349 **** --- 372,381 ---- return toast_raw_datum_size(PointerGetDatum(toast_pointer.pointer)); } + else if 
(VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + result = EOH_get_flat_size(DatumGetEOHP(value)); + } else if (VARATT_IS_COMPRESSED(attr)) { /* here, va_rawsize is just the payload size */ *************** toast_datum_size(Datum value) *** 400,405 **** --- 432,441 ---- return toast_datum_size(PointerGetDatum(toast_pointer.pointer)); } + else if (VARATT_IS_EXTERNAL_EXPANDED(attr)) + { + result = EOH_get_flat_size(DatumGetEOHP(value)); + } else if (VARATT_IS_SHORT(attr)) { result = VARSIZE_SHORT(attr); diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c index d94fe58..e599411 100644 *** a/src/backend/executor/execQual.c --- b/src/backend/executor/execQual.c *************** ExecEvalArrayCoerceExpr(ArrayCoerceExprS *** 4248,4254 **** { ArrayCoerceExpr *acoerce = (ArrayCoerceExpr *) astate->xprstate.expr; Datum result; - ArrayType *array; FunctionCallInfoData locfcinfo; result = ExecEvalExpr(astate->arg, econtext, isNull, isDone); --- 4248,4253 ---- *************** ExecEvalArrayCoerceExpr(ArrayCoerceExprS *** 4265,4278 **** if (!OidIsValid(acoerce->elemfuncid)) { /* Detoast input array if necessary, and copy in any case */ ! array = DatumGetArrayTypePCopy(result); ARR_ELEMTYPE(array) = astate->resultelemtype; PG_RETURN_ARRAYTYPE_P(array); } - /* Detoast input array if necessary, but don't make a useless copy */ - array = DatumGetArrayTypeP(result); - /* Initialize function cache if first time through */ if (astate->elemfunc.fn_oid == InvalidOid) { --- 4264,4275 ---- if (!OidIsValid(acoerce->elemfuncid)) { /* Detoast input array if necessary, and copy in any case */ ! ArrayType *array = DatumGetArrayTypePCopy(result); ! 
ARR_ELEMTYPE(array) = astate->resultelemtype; PG_RETURN_ARRAYTYPE_P(array); } /* Initialize function cache if first time through */ if (astate->elemfunc.fn_oid == InvalidOid) { *************** ExecEvalArrayCoerceExpr(ArrayCoerceExprS *** 4302,4316 **** */ InitFunctionCallInfoData(locfcinfo, &(astate->elemfunc), 3, InvalidOid, NULL, NULL); ! locfcinfo.arg[0] = PointerGetDatum(array); locfcinfo.arg[1] = Int32GetDatum(acoerce->resulttypmod); locfcinfo.arg[2] = BoolGetDatum(acoerce->isExplicit); locfcinfo.argnull[0] = false; locfcinfo.argnull[1] = false; locfcinfo.argnull[2] = false; ! return array_map(&locfcinfo, ARR_ELEMTYPE(array), astate->resultelemtype, ! astate->amstate); } /* ---------------------------------------------------------------- --- 4299,4312 ---- */ InitFunctionCallInfoData(locfcinfo, &(astate->elemfunc), 3, InvalidOid, NULL, NULL); ! locfcinfo.arg[0] = result; locfcinfo.arg[1] = Int32GetDatum(acoerce->resulttypmod); locfcinfo.arg[2] = BoolGetDatum(acoerce->isExplicit); locfcinfo.argnull[0] = false; locfcinfo.argnull[1] = false; locfcinfo.argnull[2] = false; ! return array_map(&locfcinfo, astate->resultelemtype, astate->amstate); } /* ---------------------------------------------------------------- diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c index 753754d..a05d8b1 100644 *** a/src/backend/executor/execTuples.c --- b/src/backend/executor/execTuples.c *************** *** 88,93 **** --- 88,94 ---- #include "nodes/nodeFuncs.h" #include "storage/bufmgr.h" #include "utils/builtins.h" + #include "utils/expandeddatum.h" #include "utils/lsyscache.h" #include "utils/typcache.h" *************** ExecCopySlot(TupleTableSlot *dstslot, Tu *** 812,817 **** --- 813,864 ---- return ExecStoreTuple(newTuple, dstslot, InvalidBuffer, true); } + /* -------------------------------- + * ExecMakeSlotContentsReadOnly + * Mark any R/W expanded datums in the slot as read-only. 
+ * + * This is needed when a slot that might contain R/W datum references is to be + * used as input for general expression evaluation. Since the expression(s) + * might contain more than one Var referencing the same R/W datum, we could + * get wrong answers if functions acting on those Vars thought they could + * modify the expanded value in-place. + * + * For notational reasons, we return the same slot passed in. + * -------------------------------- + */ + TupleTableSlot * + ExecMakeSlotContentsReadOnly(TupleTableSlot *slot) + { + /* + * sanity checks + */ + Assert(slot != NULL); + Assert(slot->tts_tupleDescriptor != NULL); + Assert(!slot->tts_isempty); + + /* + * If the slot contains a physical tuple, it can't contain any expanded + * datums, because we flatten those when making a physical tuple. This + * might change later; but for now, we need do nothing unless the slot is + * virtual. + */ + if (slot->tts_tuple == NULL) + { + Form_pg_attribute *att = slot->tts_tupleDescriptor->attrs; + int attnum; + + for (attnum = 0; attnum < slot->tts_nvalid; attnum++) + { + slot->tts_values[attnum] = + MakeExpandedObjectReadOnly(slot->tts_values[attnum], + slot->tts_isnull[attnum], + att[attnum]->attlen); + } + } + + return slot; + } + /* ---------------------------------------------------------------- * convenience initialization routines diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c index 3f66e24..e5d1e54 100644 *** a/src/backend/executor/nodeSubqueryscan.c --- b/src/backend/executor/nodeSubqueryscan.c *************** SubqueryNext(SubqueryScanState *node) *** 56,62 **** --- 56,70 ---- * We just return the subplan's result slot, rather than expending extra * cycles for ExecCopySlot(). (Our own ScanTupleSlot is used only for * EvalPlanQual rechecks.) + * + * We do need to mark the slot contents read-only to prevent interference + * between different functions reading the same datum from the slot. 
It's + * a bit hokey to do this to the subplan's slot, but should be safe + * enough. */ + if (!TupIsNull(slot)) + slot = ExecMakeSlotContentsReadOnly(slot); + return slot; } diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 557d153..472de41 100644 *** a/src/backend/executor/spi.c --- b/src/backend/executor/spi.c *************** SPI_pfree(void *pointer) *** 1015,1020 **** --- 1015,1041 ---- pfree(pointer); } + Datum + SPI_datumTransfer(Datum value, bool typByVal, int typLen) + { + MemoryContext oldcxt = NULL; + Datum result; + + if (_SPI_curid + 1 == _SPI_connected) /* connected */ + { + if (_SPI_current != &(_SPI_stack[_SPI_curid + 1])) + elog(ERROR, "SPI stack corrupted"); + oldcxt = MemoryContextSwitchTo(_SPI_current->savedcxt); + } + + result = datumTransfer(value, typByVal, typLen); + + if (oldcxt) + MemoryContextSwitchTo(oldcxt); + + return result; + } + void SPI_freetuple(HeapTuple tuple) { diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile index 1f1bee7..3ed0b44 100644 *** a/src/backend/utils/adt/Makefile --- b/src/backend/utils/adt/Makefile *************** endif *** 16,25 **** endif # keep this list arranged alphabetically or it gets to be a mess ! OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \ ! array_userfuncs.o arrayutils.o ascii.o bool.o \ ! cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \ ! encode.o enum.o float.o format_type.o formatting.o genfile.o \ geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \ int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \ jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \ --- 16,26 ---- endif # keep this list arranged alphabetically or it gets to be a mess ! OBJS = acl.o arrayfuncs.o array_expanded.o array_selfuncs.o \ ! array_typanalyze.o array_userfuncs.o arrayutils.o ascii.o \ ! bool.o cash.o char.o date.o datetime.o datum.o dbsize.o domains.o \ ! encode.o enum.o expandeddatum.o \ ! 
float.o format_type.o formatting.o genfile.o \ geo_ops.o geo_selfuncs.o inet_cidr_ntop.o inet_net_pton.o int.o \ int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \ jsonfuncs.o like.o lockfuncs.o mac.o misc.o nabstime.o name.o \ diff --git a/src/backend/utils/adt/array_expanded.c b/src/backend/utils/adt/array_expanded.c index ...97fd444 . *** a/src/backend/utils/adt/array_expanded.c --- b/src/backend/utils/adt/array_expanded.c *************** *** 0 **** --- 1,455 ---- + /*------------------------------------------------------------------------- + * + * array_expanded.c + * Basic functions for manipulating expanded arrays. + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * + * IDENTIFICATION + * src/backend/utils/adt/array_expanded.c + * + *------------------------------------------------------------------------- + */ + #include "postgres.h" + + #include "access/tupmacs.h" + #include "utils/array.h" + #include "utils/lsyscache.h" + #include "utils/memutils.h" + + + /* "Methods" required for an expanded object */ + static Size EA_get_flat_size(ExpandedObjectHeader *eohptr); + static void EA_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + + static const ExpandedObjectMethods EA_methods = + { + EA_get_flat_size, + EA_flatten_into + }; + + /* Other local functions */ + static void copy_byval_expanded_array(ExpandedArrayHeader *eah, + ExpandedArrayHeader *oldeah); + + + /* + * expand_array: convert an array Datum into an expanded array + * + * The expanded object will be a child of parentcontext. + * + * Some callers can provide cache space to avoid repeated lookups of element + * type data across calls; if so, pass a metacache pointer, making sure that + * metacache->element_type is initialized to InvalidOid before first call. + * If no cross-call caching is required, pass NULL for metacache. 
+ */ + Datum + expand_array(Datum arraydatum, MemoryContext parentcontext, + ArrayMetaState *metacache) + { + ArrayType *array; + ExpandedArrayHeader *eah; + MemoryContext objcxt; + MemoryContext oldcxt; + ArrayMetaState fakecache; + + /* + * Allocate private context for expanded object. We start by assuming + * that the array won't be very large; but if it does grow a lot, don't + * constrain aset.c's large-context behavior. + */ + objcxt = AllocSetContextCreate(parentcontext, + "expanded array", + ALLOCSET_SMALL_MINSIZE, + ALLOCSET_SMALL_INITSIZE, + ALLOCSET_DEFAULT_MAXSIZE); + + /* Set up expanded array header */ + eah = (ExpandedArrayHeader *) + MemoryContextAlloc(objcxt, sizeof(ExpandedArrayHeader)); + + EOH_init_header(&eah->hdr, &EA_methods, objcxt); + eah->ea_magic = EA_MAGIC; + + /* If the source is an expanded array, we may be able to optimize */ + if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(arraydatum))) + { + ExpandedArrayHeader *oldeah = (ExpandedArrayHeader *) DatumGetEOHP(arraydatum); + + Assert(oldeah->ea_magic == EA_MAGIC); + + /* + * Update caller's cache if provided; we don't need it this time, but + * next call might be for a non-expanded source array. Furthermore, + * if the caller didn't provide a cache area, use some local storage + * to cache anyway, thereby avoiding a catalog lookup in the case + * where we fall through to the flat-copy code path. + */ + if (metacache == NULL) + metacache = &fakecache; + metacache->element_type = oldeah->element_type; + metacache->typlen = oldeah->typlen; + metacache->typbyval = oldeah->typbyval; + metacache->typalign = oldeah->typalign; + + /* + * If element type is pass-by-value and we have a Datum-array + * representation, just copy the source's metadata and Datum/isnull + * arrays. The original flat array, if present at all, adds no + * additional information so we need not copy it. 
+ */ + if (oldeah->typbyval && oldeah->dvalues != NULL) + { + copy_byval_expanded_array(eah, oldeah); + /* return a R/W pointer to the expanded array */ + return EOHPGetRWDatum(&eah->hdr); + } + + /* + * Otherwise, either we have only a flat representation or the + * elements are pass-by-reference. In either case, the best thing + * seems to be to copy the source as a flat representation and then + * deconstruct that later if necessary. For the pass-by-ref case, we + * could perhaps save some cycles with custom code that generates the + * deconstructed representation in parallel with copying the values, + * but it would be a lot of extra code for fairly marginal gain. So, + * fall through into the flat-source code path. + */ + } + + /* + * Detoast and copy source array into private context, as a flat array. + * + * Note that this coding risks leaking some memory in the private context + * if we have to fetch data from a TOAST table; however, experimentation + * says that the leak is minimal. Doing it this way saves a copy step, + * which seems worthwhile, especially if the array is large enough to need + * external storage. + */ + oldcxt = MemoryContextSwitchTo(objcxt); + array = DatumGetArrayTypePCopy(arraydatum); + MemoryContextSwitchTo(oldcxt); + + eah->ndims = ARR_NDIM(array); + /* note these pointers point into the fvalue header! 
*/ + eah->dims = ARR_DIMS(array); + eah->lbound = ARR_LBOUND(array); + + /* Save array's element-type data for possible use later */ + eah->element_type = ARR_ELEMTYPE(array); + if (metacache && metacache->element_type == eah->element_type) + { + /* We have a valid cache of representational data */ + eah->typlen = metacache->typlen; + eah->typbyval = metacache->typbyval; + eah->typalign = metacache->typalign; + } + else + { + /* No, so look it up */ + get_typlenbyvalalign(eah->element_type, + &eah->typlen, + &eah->typbyval, + &eah->typalign); + /* Update cache if provided */ + if (metacache) + { + metacache->element_type = eah->element_type; + metacache->typlen = eah->typlen; + metacache->typbyval = eah->typbyval; + metacache->typalign = eah->typalign; + } + } + + /* we don't make a deconstructed representation now */ + eah->dvalues = NULL; + eah->dnulls = NULL; + eah->dvalueslen = 0; + eah->nelems = 0; + eah->flat_size = 0; + + /* remember we have a flat representation */ + eah->fvalue = array; + eah->fstartptr = ARR_DATA_PTR(array); + eah->fendptr = ((char *) array) + ARR_SIZE(array); + + /* return a R/W pointer to the expanded array */ + return EOHPGetRWDatum(&eah->hdr); + } + + /* + * helper for expand_array(): copy pass-by-value Datum-array representation + */ + static void + copy_byval_expanded_array(ExpandedArrayHeader *eah, + ExpandedArrayHeader *oldeah) + { + MemoryContext objcxt = eah->hdr.eoh_context; + int ndims = oldeah->ndims; + int dvalueslen = oldeah->dvalueslen; + + /* Copy array dimensionality information */ + eah->ndims = ndims; + /* We can alloc both dimensionality arrays with one palloc */ + eah->dims = (int *) MemoryContextAlloc(objcxt, ndims * 2 * sizeof(int)); + eah->lbound = eah->dims + ndims; + /* .. 
but don't assume the source's arrays are contiguous */ + memcpy(eah->dims, oldeah->dims, ndims * sizeof(int)); + memcpy(eah->lbound, oldeah->lbound, ndims * sizeof(int)); + + /* Copy element-type data */ + eah->element_type = oldeah->element_type; + eah->typlen = oldeah->typlen; + eah->typbyval = oldeah->typbyval; + eah->typalign = oldeah->typalign; + + /* Copy the deconstructed representation */ + eah->dvalues = (Datum *) MemoryContextAlloc(objcxt, + dvalueslen * sizeof(Datum)); + memcpy(eah->dvalues, oldeah->dvalues, dvalueslen * sizeof(Datum)); + if (oldeah->dnulls) + { + eah->dnulls = (bool *) MemoryContextAlloc(objcxt, + dvalueslen * sizeof(bool)); + memcpy(eah->dnulls, oldeah->dnulls, dvalueslen * sizeof(bool)); + } + else + eah->dnulls = NULL; + eah->dvalueslen = dvalueslen; + eah->nelems = oldeah->nelems; + eah->flat_size = oldeah->flat_size; + + /* we don't make a flat representation */ + eah->fvalue = NULL; + eah->fstartptr = NULL; + eah->fendptr = NULL; + } + + /* + * get_flat_size method for expanded arrays + */ + static Size + EA_get_flat_size(ExpandedObjectHeader *eohptr) + { + ExpandedArrayHeader *eah = (ExpandedArrayHeader *) eohptr; + int nelems; + int ndims; + Datum *dvalues; + bool *dnulls; + Size nbytes; + int i; + + Assert(eah->ea_magic == EA_MAGIC); + + /* Easy if we have a valid flattened value */ + if (eah->fvalue) + return ARR_SIZE(eah->fvalue); + + /* If we have a cached size value, believe that */ + if (eah->flat_size) + return eah->flat_size; + + /* + * Compute space needed by examining dvalues/dnulls. Note that the result + * array will have a nulls bitmap if dnulls isn't NULL, even if the array + * doesn't actually contain any nulls now. 
+ */ + nelems = eah->nelems; + ndims = eah->ndims; + Assert(nelems == ArrayGetNItems(ndims, eah->dims)); + dvalues = eah->dvalues; + dnulls = eah->dnulls; + nbytes = 0; + for (i = 0; i < nelems; i++) + { + if (dnulls && dnulls[i]) + continue; + nbytes = att_addlength_datum(nbytes, eah->typlen, dvalues[i]); + nbytes = att_align_nominal(nbytes, eah->typalign); + /* check for overflow of total request */ + if (!AllocSizeIsValid(nbytes)) + ereport(ERROR, + (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), + errmsg("array size exceeds the maximum allowed (%d)", + (int) MaxAllocSize))); + } + + if (dnulls) + nbytes += ARR_OVERHEAD_WITHNULLS(ndims, nelems); + else + nbytes += ARR_OVERHEAD_NONULLS(ndims); + + /* cache for next time */ + eah->flat_size = nbytes; + + return nbytes; + } + + /* + * flatten_into method for expanded arrays + */ + static void + EA_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size) + { + ExpandedArrayHeader *eah = (ExpandedArrayHeader *) eohptr; + ArrayType *aresult = (ArrayType *) result; + int nelems; + int ndims; + int32 dataoffset; + + Assert(eah->ea_magic == EA_MAGIC); + + /* Easy if we have a valid flattened value */ + if (eah->fvalue) + { + Assert(allocated_size == ARR_SIZE(eah->fvalue)); + memcpy(result, eah->fvalue, allocated_size); + return; + } + + /* Else allocation should match previous get_flat_size result */ + Assert(allocated_size == eah->flat_size); + + /* Fill result array from dvalues/dnulls */ + nelems = eah->nelems; + ndims = eah->ndims; + + if (eah->dnulls) + dataoffset = ARR_OVERHEAD_WITHNULLS(ndims, nelems); + else + dataoffset = 0; /* marker for no null bitmap */ + + /* We must ensure that any pad space is zero-filled */ + memset(aresult, 0, allocated_size); + + SET_VARSIZE(aresult, allocated_size); + aresult->ndim = ndims; + aresult->dataoffset = dataoffset; + aresult->elemtype = eah->element_type; + memcpy(ARR_DIMS(aresult), eah->dims, ndims * sizeof(int)); + memcpy(ARR_LBOUND(aresult), eah->lbound, 
ndims * sizeof(int)); + + CopyArrayEls(aresult, + eah->dvalues, eah->dnulls, nelems, + eah->typlen, eah->typbyval, eah->typalign, + false); + } + + /* + * Argument fetching support code + */ + + /* + * DatumGetExpandedArray: get a writable expanded array from an input argument + * + * Caution: if the input is a read/write pointer, this returns the input + * argument; so callers must be sure that their changes are "safe", that is + * they cannot leave the array in a corrupt state. + */ + ExpandedArrayHeader * + DatumGetExpandedArray(Datum d) + { + /* If it's a writable expanded array already, just return it */ + if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + { + ExpandedArrayHeader *eah = (ExpandedArrayHeader *) DatumGetEOHP(d); + + Assert(eah->ea_magic == EA_MAGIC); + return eah; + } + + /* Else expand the hard way */ + d = expand_array(d, CurrentMemoryContext, NULL); + return (ExpandedArrayHeader *) DatumGetEOHP(d); + } + + /* + * As above, when caller has the ability to cache element type info + */ + ExpandedArrayHeader * + DatumGetExpandedArrayX(Datum d, ArrayMetaState *metacache) + { + /* If it's a writable expanded array already, just return it */ + if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + { + ExpandedArrayHeader *eah = (ExpandedArrayHeader *) DatumGetEOHP(d); + + Assert(eah->ea_magic == EA_MAGIC); + /* Update cache if provided */ + if (metacache) + { + metacache->element_type = eah->element_type; + metacache->typlen = eah->typlen; + metacache->typbyval = eah->typbyval; + metacache->typalign = eah->typalign; + } + return eah; + } + + /* Else expand using caller's cache if any */ + d = expand_array(d, CurrentMemoryContext, metacache); + return (ExpandedArrayHeader *) DatumGetEOHP(d); + } + + /* + * DatumGetAnyArray: return either an expanded array or a detoasted varlena + * array. The result must not be modified in-place. 
+ */ + AnyArrayType * + DatumGetAnyArray(Datum d) + { + ExpandedArrayHeader *eah; + + /* + * If it's an expanded array (RW or RO), return the header pointer. + */ + if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(d))) + { + eah = (ExpandedArrayHeader *) DatumGetEOHP(d); + Assert(eah->ea_magic == EA_MAGIC); + return (AnyArrayType *) eah; + } + + /* Else do regular detoasting as needed */ + return (AnyArrayType *) PG_DETOAST_DATUM(d); + } + + /* + * Create the Datum/isnull representation of an expanded array object + * if we didn't do so previously + */ + void + deconstruct_expanded_array(ExpandedArrayHeader *eah) + { + if (eah->dvalues == NULL) + { + MemoryContext oldcxt = MemoryContextSwitchTo(eah->hdr.eoh_context); + Datum *dvalues; + bool *dnulls; + int nelems; + + dnulls = NULL; + deconstruct_array(eah->fvalue, + eah->element_type, + eah->typlen, eah->typbyval, eah->typalign, + &dvalues, + ARR_HASNULL(eah->fvalue) ? &dnulls : NULL, + &nelems); + + /* + * Update header only after successful completion of this step. If + * deconstruct_array fails partway through, worst consequence is some + * leaked memory in the object's context. If the caller fails at a + * later point, that's fine, since the deconstructed representation is + * valid anyhow. + */ + eah->dvalues = dvalues; + eah->dnulls = dnulls; + eah->dvalueslen = eah->nelems = nelems; + MemoryContextSwitchTo(oldcxt); + } + } diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c index 4177d2d..f7b57da 100644 *** a/src/backend/utils/adt/array_userfuncs.c --- b/src/backend/utils/adt/array_userfuncs.c *************** static Datum array_position_common(Funct *** 25,46 **** /* * fetch_array_arg_replace_nulls * ! * Fetch an array-valued argument; if it's null, construct an empty array ! * value of the proper data type. Also cache basic element type information ! * in fn_extra. */ ! static ArrayType * fetch_array_arg_replace_nulls(FunctionCallInfo fcinfo, int argno) { ! 
ArrayType *v; Oid element_type; ArrayMetaState *my_extra; ! /* First collect the array value */ if (!PG_ARGISNULL(argno)) { ! v = PG_GETARG_ARRAYTYPE_P(argno); ! element_type = ARR_ELEMTYPE(v); } else { --- 25,60 ---- /* * fetch_array_arg_replace_nulls * ! * Fetch an array-valued argument in expanded form; if it's null, construct an ! * empty array value of the proper data type. Also cache basic element type ! * information in fn_extra. ! * ! * Caution: if the input is a read/write pointer, this returns the input ! * argument; so callers must be sure that their changes are "safe", that is ! * they cannot leave the array in a corrupt state. */ ! static ExpandedArrayHeader * fetch_array_arg_replace_nulls(FunctionCallInfo fcinfo, int argno) { ! ExpandedArrayHeader *eah; Oid element_type; ArrayMetaState *my_extra; ! /* If first time through, create datatype cache struct */ ! my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra; ! if (my_extra == NULL) ! { ! my_extra = (ArrayMetaState *) ! MemoryContextAlloc(fcinfo->flinfo->fn_mcxt, ! sizeof(ArrayMetaState)); ! my_extra->element_type = InvalidOid; ! fcinfo->flinfo->fn_extra = my_extra; ! } ! ! /* Now collect the array value */ if (!PG_ARGISNULL(argno)) { ! eah = PG_GETARG_EXPANDED_ARRAYX(argno, my_extra); } else { *************** fetch_array_arg_replace_nulls(FunctionCa *** 57,86 **** (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("input data type is not an array"))); ! v = construct_empty_array(element_type); ! } ! ! /* Now cache required info, which might change from call to call */ ! my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra; ! if (my_extra == NULL) ! { ! my_extra = (ArrayMetaState *) ! MemoryContextAlloc(fcinfo->flinfo->fn_mcxt, ! sizeof(ArrayMetaState)); ! my_extra->element_type = InvalidOid; ! fcinfo->flinfo->fn_extra = my_extra; ! } ! ! if (my_extra->element_type != element_type) ! { ! get_typlenbyvalalign(element_type, ! &my_extra->typlen, ! &my_extra->typbyval, ! &my_extra->typalign); ! 
my_extra->element_type = element_type; } ! return v; } /*----------------------------------------------------------------------------- --- 71,82 ---- (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("input data type is not an array"))); ! eah = construct_empty_expanded_array(element_type, ! CurrentMemoryContext, ! my_extra); } ! return eah; } /*----------------------------------------------------------------------------- *************** fetch_array_arg_replace_nulls(FunctionCa *** 91,119 **** Datum array_append(PG_FUNCTION_ARGS) { ! ArrayType *v; Datum newelem; bool isNull; ! ArrayType *result; int *dimv, *lb; int indx; ArrayMetaState *my_extra; ! v = fetch_array_arg_replace_nulls(fcinfo, 0); isNull = PG_ARGISNULL(1); if (isNull) newelem = (Datum) 0; else newelem = PG_GETARG_DATUM(1); ! if (ARR_NDIM(v) == 1) { /* append newelem */ int ub; ! lb = ARR_LBOUND(v); ! dimv = ARR_DIMS(v); ub = dimv[0] + lb[0] - 1; indx = ub + 1; --- 87,115 ---- Datum array_append(PG_FUNCTION_ARGS) { ! ExpandedArrayHeader *eah; Datum newelem; bool isNull; ! Datum result; int *dimv, *lb; int indx; ArrayMetaState *my_extra; ! eah = fetch_array_arg_replace_nulls(fcinfo, 0); isNull = PG_ARGISNULL(1); if (isNull) newelem = (Datum) 0; else newelem = PG_GETARG_DATUM(1); ! if (eah->ndims == 1) { /* append newelem */ int ub; ! lb = eah->lbound; ! dimv = eah->dims; ub = dimv[0] + lb[0] - 1; indx = ub + 1; *************** array_append(PG_FUNCTION_ARGS) *** 123,129 **** (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), errmsg("integer out of range"))); } ! else if (ARR_NDIM(v) == 0) indx = 1; else ereport(ERROR, --- 119,125 ---- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), errmsg("integer out of range"))); } ! else if (eah->ndims == 0) indx = 1; else ereport(ERROR, *************** array_append(PG_FUNCTION_ARGS) *** 133,142 **** /* Perform element insertion */ my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra; ! 
result = array_set(v, 1, &indx, newelem, isNull, -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign); ! PG_RETURN_ARRAYTYPE_P(result); } /*----------------------------------------------------------------------------- --- 129,139 ---- /* Perform element insertion */ my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra; ! result = array_set_element(EOHPGetRWDatum(&eah->hdr), ! 1, &indx, newelem, isNull, -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign); ! PG_RETURN_DATUM(result); } /*----------------------------------------------------------------------------- *************** array_append(PG_FUNCTION_ARGS) *** 147,158 **** Datum array_prepend(PG_FUNCTION_ARGS) { ! ArrayType *v; Datum newelem; bool isNull; ! ArrayType *result; int *lb; int indx; ArrayMetaState *my_extra; isNull = PG_ARGISNULL(0); --- 144,156 ---- Datum array_prepend(PG_FUNCTION_ARGS) { ! ExpandedArrayHeader *eah; Datum newelem; bool isNull; ! Datum result; int *lb; int indx; + int lb0; ArrayMetaState *my_extra; isNull = PG_ARGISNULL(0); *************** array_prepend(PG_FUNCTION_ARGS) *** 160,172 **** newelem = (Datum) 0; else newelem = PG_GETARG_DATUM(0); ! v = fetch_array_arg_replace_nulls(fcinfo, 1); ! if (ARR_NDIM(v) == 1) { /* prepend newelem */ ! lb = ARR_LBOUND(v); indx = lb[0] - 1; /* overflow? */ if (indx > lb[0]) --- 158,171 ---- newelem = (Datum) 0; else newelem = PG_GETARG_DATUM(0); ! eah = fetch_array_arg_replace_nulls(fcinfo, 1); ! if (eah->ndims == 1) { /* prepend newelem */ ! lb = eah->lbound; indx = lb[0] - 1; + lb0 = lb[0]; /* overflow? */ if (indx > lb[0]) *************** array_prepend(PG_FUNCTION_ARGS) *** 174,181 **** (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), errmsg("integer out of range"))); } ! else if (ARR_NDIM(v) == 0) indx = 1; else ereport(ERROR, (errcode(ERRCODE_DATA_EXCEPTION), --- 173,183 ---- (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), errmsg("integer out of range"))); } ! else if (eah->ndims == 0) ! 
{ indx = 1; + lb0 = 1; + } else ereport(ERROR, (errcode(ERRCODE_DATA_EXCEPTION), *************** array_prepend(PG_FUNCTION_ARGS) *** 184,197 **** /* Perform element insertion */ my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra; ! result = array_set(v, 1, &indx, newelem, isNull, -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign); /* Readjust result's LB to match the input's, as expected for prepend */ ! if (ARR_NDIM(v) == 1) ! ARR_LBOUND(result)[0] = ARR_LBOUND(v)[0]; ! PG_RETURN_ARRAYTYPE_P(result); } /*----------------------------------------------------------------------------- --- 186,204 ---- /* Perform element insertion */ my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra; ! result = array_set_element(EOHPGetRWDatum(&eah->hdr), ! 1, &indx, newelem, isNull, -1, my_extra->typlen, my_extra->typbyval, my_extra->typalign); /* Readjust result's LB to match the input's, as expected for prepend */ ! Assert(result == EOHPGetRWDatum(&eah->hdr)); ! if (eah->ndims == 1) ! { ! /* This is ok whether we've deconstructed or not */ ! eah->lbound[0] = lb0; ! } ! PG_RETURN_DATUM(result); } /*----------------------------------------------------------------------------- diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c index 9117a55..26fa648 100644 *** a/src/backend/utils/adt/arrayfuncs.c --- b/src/backend/utils/adt/arrayfuncs.c *************** bool Array_nulls = true; *** 42,47 **** --- 42,53 ---- */ #define ASSGN "=" + #define AARR_FREE_IF_COPY(array,n) \ + do { \ + if (!VARATT_IS_EXPANDED_HEADER(array)) \ + PG_FREE_IF_COPY(array, n); \ + } while (0) + typedef enum { ARRAY_NO_LEVEL, *************** static void ReadArrayBinary(StringInfo b *** 93,102 **** int typlen, bool typbyval, char typalign, Datum *values, bool *nulls, bool *hasnulls, int32 *nbytes); ! static void CopyArrayEls(ArrayType *array, ! Datum *values, bool *nulls, int nitems, ! int typlen, bool typbyval, char typalign, ! 
bool freedata); static bool array_get_isnull(const bits8 *nullbitmap, int offset); static void array_set_isnull(bits8 *nullbitmap, int offset, bool isNull); static Datum ArrayCast(char *value, bool byval, int len); --- 99,114 ---- int typlen, bool typbyval, char typalign, Datum *values, bool *nulls, bool *hasnulls, int32 *nbytes); ! static Datum array_get_element_expanded(Datum arraydatum, ! int nSubscripts, int *indx, ! int arraytyplen, ! int elmlen, bool elmbyval, char elmalign, ! bool *isNull); ! static Datum array_set_element_expanded(Datum arraydatum, ! int nSubscripts, int *indx, ! Datum dataValue, bool isNull, ! int arraytyplen, ! int elmlen, bool elmbyval, char elmalign); static bool array_get_isnull(const bits8 *nullbitmap, int offset); static void array_set_isnull(bits8 *nullbitmap, int offset, bool isNull); static Datum ArrayCast(char *value, bool byval, int len); *************** ReadArrayStr(char *arrayStr, *** 939,945 **** * the values are not toasted. (Doing it here doesn't work since the * caller has already allocated space for the array...) */ ! static void CopyArrayEls(ArrayType *array, Datum *values, bool *nulls, --- 951,957 ---- * the values are not toasted. (Doing it here doesn't work since the * caller has already allocated space for the array...) */ ! void CopyArrayEls(ArrayType *array, Datum *values, bool *nulls, *************** CopyArrayEls(ArrayType *array, *** 997,1004 **** Datum array_out(PG_FUNCTION_ARGS) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); ! Oid element_type = ARR_ELEMTYPE(v); int typlen; bool typbyval; char typalign; --- 1009,1016 ---- Datum array_out(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); ! 
Oid element_type = AARR_ELEMTYPE(v); int typlen; bool typbyval; char typalign; *************** array_out(PG_FUNCTION_ARGS) *** 1014,1021 **** * * +2 allows for assignment operator + trailing null */ - bits8 *bitmap; - int bitmask; bool *needquotes, needdims = false; int nitems, --- 1026,1031 ---- *************** array_out(PG_FUNCTION_ARGS) *** 1027,1032 **** --- 1037,1043 ---- int ndim, *dims, *lb; + ARRAY_ITER ARRAY_ITER_VARS(iter); ArrayMetaState *my_extra; /* *************** array_out(PG_FUNCTION_ARGS) *** 1061,1069 **** typalign = my_extra->typalign; typdelim = my_extra->typdelim; ! ndim = ARR_NDIM(v); ! dims = ARR_DIMS(v); ! lb = ARR_LBOUND(v); nitems = ArrayGetNItems(ndim, dims); if (nitems == 0) --- 1072,1080 ---- typalign = my_extra->typalign; typdelim = my_extra->typdelim; ! ndim = AARR_NDIM(v); ! dims = AARR_DIMS(v); ! lb = AARR_LBOUND(v); nitems = ArrayGetNItems(ndim, dims); if (nitems == 0) *************** array_out(PG_FUNCTION_ARGS) *** 1094,1109 **** needquotes = (bool *) palloc(nitems * sizeof(bool)); overall_length = 1; /* don't forget to count \0 at end. */ ! p = ARR_DATA_PTR(v); ! bitmap = ARR_NULLBITMAP(v); ! bitmask = 1; for (i = 0; i < nitems; i++) { bool needquote; /* Get source element, checking for NULL */ ! if (bitmap && (*bitmap & bitmask) == 0) { values[i] = pstrdup("NULL"); overall_length += 4; --- 1105,1122 ---- needquotes = (bool *) palloc(nitems * sizeof(bool)); overall_length = 1; /* don't forget to count \0 at end. */ ! ARRAY_ITER_SETUP(iter, v); for (i = 0; i < nitems; i++) { + Datum itemvalue; + bool isnull; bool needquote; /* Get source element, checking for NULL */ ! ARRAY_ITER_NEXT(iter, i, itemvalue, isnull, typlen, typbyval, typalign); ! ! 
if (isnull) { values[i] = pstrdup("NULL"); overall_length += 4; *************** array_out(PG_FUNCTION_ARGS) *** 1111,1122 **** } else { - Datum itemvalue; - - itemvalue = fetch_att(p, typbyval, typlen); values[i] = OutputFunctionCall(&my_extra->proc, itemvalue); - p = att_addlength_pointer(p, typlen, p); - p = (char *) att_align_nominal(p, typalign); /* count data plus backslashes; detect chars needing quotes */ if (values[i][0] == '\0') --- 1124,1130 ---- *************** array_out(PG_FUNCTION_ARGS) *** 1149,1165 **** overall_length += 2; /* and the comma */ overall_length += 1; - - /* advance bitmap pointer if any */ - if (bitmap) - { - bitmask <<= 1; - if (bitmask == 0x100) - { - bitmap++; - bitmask = 1; - } - } } /* --- 1157,1162 ---- *************** ReadArrayBinary(StringInfo buf, *** 1534,1552 **** Datum array_send(PG_FUNCTION_ARGS) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); ! Oid element_type = ARR_ELEMTYPE(v); int typlen; bool typbyval; char typalign; - char *p; - bits8 *bitmap; - int bitmask; int nitems, i; int ndim, ! *dim; StringInfoData buf; ArrayMetaState *my_extra; /* --- 1531,1548 ---- Datum array_send(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); ! Oid element_type = AARR_ELEMTYPE(v); int typlen; bool typbyval; char typalign; int nitems, i; int ndim, ! *dim, ! *lb; StringInfoData buf; + ARRAY_ITER ARRAY_ITER_VARS(iter); ArrayMetaState *my_extra; /* *************** array_send(PG_FUNCTION_ARGS) *** 1583,1642 **** typbyval = my_extra->typbyval; typalign = my_extra->typalign; ! ndim = ARR_NDIM(v); ! dim = ARR_DIMS(v); nitems = ArrayGetNItems(ndim, dim); pq_begintypsend(&buf); /* Send the array header information */ pq_sendint(&buf, ndim, 4); ! pq_sendint(&buf, ARR_HASNULL(v) ? 1 : 0, 4); pq_sendint(&buf, element_type, sizeof(Oid)); for (i = 0; i < ndim; i++) { ! pq_sendint(&buf, ARR_DIMS(v)[i], 4); ! pq_sendint(&buf, ARR_LBOUND(v)[i], 4); } /* Send the array elements using the element's own sendproc */ ! p = ARR_DATA_PTR(v); ! 
bitmap = ARR_NULLBITMAP(v); ! bitmask = 1; for (i = 0; i < nitems; i++) { /* Get source element, checking for NULL */ ! if (bitmap && (*bitmap & bitmask) == 0) { /* -1 length means a NULL */ pq_sendint(&buf, -1, 4); } else { - Datum itemvalue; bytea *outputbytes; - itemvalue = fetch_att(p, typbyval, typlen); outputbytes = SendFunctionCall(&my_extra->proc, itemvalue); pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4); pq_sendbytes(&buf, VARDATA(outputbytes), VARSIZE(outputbytes) - VARHDRSZ); pfree(outputbytes); - - p = att_addlength_pointer(p, typlen, p); - p = (char *) att_align_nominal(p, typalign); - } - - /* advance bitmap pointer if any */ - if (bitmap) - { - bitmask <<= 1; - if (bitmask == 0x100) - { - bitmap++; - bitmask = 1; - } } } --- 1579,1626 ---- typbyval = my_extra->typbyval; typalign = my_extra->typalign; ! ndim = AARR_NDIM(v); ! dim = AARR_DIMS(v); ! lb = AARR_LBOUND(v); nitems = ArrayGetNItems(ndim, dim); pq_begintypsend(&buf); /* Send the array header information */ pq_sendint(&buf, ndim, 4); ! pq_sendint(&buf, AARR_HASNULL(v) ? 1 : 0, 4); pq_sendint(&buf, element_type, sizeof(Oid)); for (i = 0; i < ndim; i++) { ! pq_sendint(&buf, dim[i], 4); ! pq_sendint(&buf, lb[i], 4); } /* Send the array elements using the element's own sendproc */ ! ARRAY_ITER_SETUP(iter, v); for (i = 0; i < nitems; i++) { + Datum itemvalue; + bool isnull; + /* Get source element, checking for NULL */ ! ARRAY_ITER_NEXT(iter, i, itemvalue, isnull, typlen, typbyval, typalign); ! ! if (isnull) { /* -1 length means a NULL */ pq_sendint(&buf, -1, 4); } else { bytea *outputbytes; outputbytes = SendFunctionCall(&my_extra->proc, itemvalue); pq_sendint(&buf, VARSIZE(outputbytes) - VARHDRSZ, 4); pq_sendbytes(&buf, VARDATA(outputbytes), VARSIZE(outputbytes) - VARHDRSZ); pfree(outputbytes); } } *************** array_send(PG_FUNCTION_ARGS) *** 1650,1662 **** Datum array_ndims(PG_FUNCTION_ARGS) { ! 
ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); /* Sanity check: does it look like an array at all? */ ! if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); ! PG_RETURN_INT32(ARR_NDIM(v)); } /* --- 1634,1646 ---- Datum array_ndims(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); /* Sanity check: does it look like an array at all? */ ! if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); ! PG_RETURN_INT32(AARR_NDIM(v)); } /* *************** array_ndims(PG_FUNCTION_ARGS) *** 1666,1672 **** Datum array_dims(PG_FUNCTION_ARGS) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); char *p; int i; int *dimv, --- 1650,1656 ---- Datum array_dims(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); char *p; int i; int *dimv, *************** array_dims(PG_FUNCTION_ARGS) *** 1680,1693 **** char buf[MAXDIM * 33 + 1]; /* Sanity check: does it look like an array at all? */ ! if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); ! dimv = ARR_DIMS(v); ! lb = ARR_LBOUND(v); p = buf; ! for (i = 0; i < ARR_NDIM(v); i++) { sprintf(p, "[%d:%d]", lb[i], dimv[i] + lb[i] - 1); p += strlen(p); --- 1664,1677 ---- char buf[MAXDIM * 33 + 1]; /* Sanity check: does it look like an array at all? */ ! if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); ! dimv = AARR_DIMS(v); ! lb = AARR_LBOUND(v); p = buf; ! for (i = 0; i < AARR_NDIM(v); i++) { sprintf(p, "[%d:%d]", lb[i], dimv[i] + lb[i] - 1); p += strlen(p); *************** array_dims(PG_FUNCTION_ARGS) *** 1704,1723 **** Datum array_lower(PG_FUNCTION_ARGS) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); int reqdim = PG_GETARG_INT32(1); int *lb; int result; /* Sanity check: does it look like an array at all? */ ! if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > ARR_NDIM(v)) PG_RETURN_NULL(); ! 
lb = ARR_LBOUND(v); result = lb[reqdim - 1]; PG_RETURN_INT32(result); --- 1688,1707 ---- Datum array_lower(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); int reqdim = PG_GETARG_INT32(1); int *lb; int result; /* Sanity check: does it look like an array at all? */ ! if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > AARR_NDIM(v)) PG_RETURN_NULL(); ! lb = AARR_LBOUND(v); result = lb[reqdim - 1]; PG_RETURN_INT32(result); *************** array_lower(PG_FUNCTION_ARGS) *** 1731,1752 **** Datum array_upper(PG_FUNCTION_ARGS) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); int reqdim = PG_GETARG_INT32(1); int *dimv, *lb; int result; /* Sanity check: does it look like an array at all? */ ! if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > ARR_NDIM(v)) PG_RETURN_NULL(); ! lb = ARR_LBOUND(v); ! dimv = ARR_DIMS(v); result = dimv[reqdim - 1] + lb[reqdim - 1] - 1; --- 1715,1736 ---- Datum array_upper(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); int reqdim = PG_GETARG_INT32(1); int *dimv, *lb; int result; /* Sanity check: does it look like an array at all? */ ! if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > AARR_NDIM(v)) PG_RETURN_NULL(); ! lb = AARR_LBOUND(v); ! dimv = AARR_DIMS(v); result = dimv[reqdim - 1] + lb[reqdim - 1] - 1; *************** array_upper(PG_FUNCTION_ARGS) *** 1761,1780 **** Datum array_length(PG_FUNCTION_ARGS) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); int reqdim = PG_GETARG_INT32(1); int *dimv; int result; /* Sanity check: does it look like an array at all? */ ! if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > ARR_NDIM(v)) PG_RETURN_NULL(); ! 
dimv = ARR_DIMS(v); result = dimv[reqdim - 1]; --- 1745,1764 ---- Datum array_length(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); int reqdim = PG_GETARG_INT32(1); int *dimv; int result; /* Sanity check: does it look like an array at all? */ ! if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) PG_RETURN_NULL(); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > AARR_NDIM(v)) PG_RETURN_NULL(); ! dimv = AARR_DIMS(v); result = dimv[reqdim - 1]; *************** array_length(PG_FUNCTION_ARGS) *** 1788,1796 **** Datum array_cardinality(PG_FUNCTION_ARGS) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); ! PG_RETURN_INT32(ArrayGetNItems(ARR_NDIM(v), ARR_DIMS(v))); } --- 1772,1780 ---- Datum array_cardinality(PG_FUNCTION_ARGS) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); ! PG_RETURN_INT32(ArrayGetNItems(AARR_NDIM(v), AARR_DIMS(v))); } *************** array_get_element(Datum arraydatum, *** 1825,1831 **** char elmalign, bool *isNull) { - ArrayType *array; int i, ndim, *dim, --- 1809,1814 ---- *************** array_get_element(Datum arraydatum, *** 1850,1859 **** arraydataptr = (char *) DatumGetPointer(arraydatum); arraynullsptr = NULL; } else { ! /* detoast input array if necessary */ ! array = DatumGetArrayTypeP(arraydatum); ndim = ARR_NDIM(array); dim = ARR_DIMS(array); --- 1833,1854 ---- arraydataptr = (char *) DatumGetPointer(arraydatum); arraynullsptr = NULL; } + else if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(arraydatum))) + { + /* expanded array: let's do this in a separate function */ + return array_get_element_expanded(arraydatum, + nSubscripts, + indx, + arraytyplen, + elmlen, + elmbyval, + elmalign, + isNull); + } else { ! /* detoast array if necessary, producing normal varlena input */ ! 
ArrayType *array = DatumGetArrayTypeP(arraydatum);
  		ndim = ARR_NDIM(array);
  		dim = ARR_DIMS(array);
*************** array_get_element(Datum arraydatum,
*** 1903,1908 ****
--- 1898,1985 ----
  }
  /*
+  * Implementation of array_get_element() for an expanded array
+  */
+ static Datum
+ array_get_element_expanded(Datum arraydatum,
+ 						   int nSubscripts, int *indx,
+ 						   int arraytyplen,
+ 						   int elmlen, bool elmbyval, char elmalign,
+ 						   bool *isNull)
+ {
+ 	ExpandedArrayHeader *eah;
+ 	int i,
+ 		ndim,
+ 		*dim,
+ 		*lb,
+ 		offset;
+ 	Datum *dvalues;
+ 	bool *dnulls;
+
+ 	eah = (ExpandedArrayHeader *) DatumGetEOHP(arraydatum);
+ 	Assert(eah->ea_magic == EA_MAGIC);
+
+ 	/* sanity-check caller's info against object */
+ 	Assert(arraytyplen == -1);
+ 	Assert(elmlen == eah->typlen);
+ 	Assert(elmbyval == eah->typbyval);
+ 	Assert(elmalign == eah->typalign);
+
+ 	ndim = eah->ndims;
+ 	dim = eah->dims;
+ 	lb = eah->lbound;
+
+ 	/*
+ 	 * Return NULL for invalid subscript
+ 	 */
+ 	if (ndim != nSubscripts || ndim <= 0 || ndim > MAXDIM)
+ 	{
+ 		*isNull = true;
+ 		return (Datum) 0;
+ 	}
+ 	for (i = 0; i < ndim; i++)
+ 	{
+ 		if (indx[i] < lb[i] || indx[i] >= (dim[i] + lb[i]))
+ 		{
+ 			*isNull = true;
+ 			return (Datum) 0;
+ 		}
+ 	}
+
+ 	/*
+ 	 * Calculate the element number
+ 	 */
+ 	offset = ArrayGetOffset(nSubscripts, dim, lb, indx);
+
+ 	/*
+ 	 * Deconstruct array if we didn't already. Note that we apply this even
+ 	 * if the input is nominally read-only: it should be safe enough.
+ 	 */
+ 	deconstruct_expanded_array(eah);
+
+ 	dvalues = eah->dvalues;
+ 	dnulls = eah->dnulls;
+
+ 	/*
+ 	 * Check for NULL array element
+ 	 */
+ 	if (dnulls && dnulls[offset])
+ 	{
+ 		*isNull = true;
+ 		return (Datum) 0;
+ 	}
+
+ 	/*
+ 	 * OK, get the element. It's OK to return a pass-by-ref value as a
+ 	 * pointer into the expanded array, for the same reason that regular
+ 	 * array_get_element can return a pointer into flat arrays: the value is
+ 	 * assumed not to change for as long as the Datum reference can exist.
+ 	 */
+ 	*isNull = false;
+ 	return dvalues[offset];
+ }
+
  /*
   * array_get_slice :
   *		   This routine takes an array and a range of indices (upperIndex and
   *		   lowerIndx), creates a new array structure for the referred elements
*************** array_get_slice(Datum arraydatum,
*** 2083,2089 ****
   *
   * Result:
   *		  A new array is returned, just like the old except for the one
!  *		  modified entry. The original array object is not changed.
   *
   * For one-dimensional arrays only, we allow the array to be extended
   * by assigning to a position outside the existing subscript range; any
--- 2160,2168 ----
   *
   * Result:
   *		  A new array is returned, just like the old except for the one
!  *		  modified entry. The original array object is not changed,
!  *		  unless what is passed is a read-write reference to an expanded
!  *		  array object; in that case the expanded array is updated in-place.
   *
   * For one-dimensional arrays only, we allow the array to be extended
   * by assigning to a position outside the existing subscript range; any
*************** array_set_element(Datum arraydatum,
*** 2166,2171 ****
--- 2245,2264 ----
  	if (elmlen == -1 && !isNull)
  		dataValue = PointerGetDatum(PG_DETOAST_DATUM(dataValue));
+ 	if (VARATT_IS_EXTERNAL_EXPANDED(DatumGetPointer(arraydatum)))
+ 	{
+ 		/* expanded array: let's do this in a separate function */
+ 		return array_set_element_expanded(arraydatum,
+ 										  nSubscripts,
+ 										  indx,
+ 										  dataValue,
+ 										  isNull,
+ 										  arraytyplen,
+ 										  elmlen,
+ 										  elmbyval,
+ 										  elmalign);
+ 	}
+
  	/* detoast input array if necessary */
  	array = DatumGetArrayTypeP(arraydatum);
*************** array_set_element(Datum arraydatum,
*** 2355,2360 ****
--- 2448,2698 ----
  }
  /*
+  * Implementation of array_set_element() for an expanded array
+  *
+  * Note: as with any operation on a read/write expanded object, we must
+  * take pains not to leave the object in a corrupt state if we fail partway
+  * through.
+  */
+ static Datum
+ array_set_element_expanded(Datum arraydatum,
+ 						   int nSubscripts, int *indx,
+ 						   Datum dataValue, bool isNull,
+ 						   int arraytyplen,
+ 						   int elmlen, bool elmbyval, char elmalign)
+ {
+ 	ExpandedArrayHeader *eah;
+ 	Datum *dvalues;
+ 	bool *dnulls;
+ 	int i,
+ 		ndim,
+ 		dim[MAXDIM],
+ 		lb[MAXDIM],
+ 		offset;
+ 	bool dimschanged,
+ 		newhasnulls;
+ 	int addedbefore,
+ 		addedafter;
+ 	char *oldValue;
+
+ 	/* Convert to R/W object if not so already */
+ 	eah = DatumGetExpandedArray(arraydatum);
+
+ 	/* Sanity-check caller's info against object; we don't use it otherwise */
+ 	Assert(arraytyplen == -1);
+ 	Assert(elmlen == eah->typlen);
+ 	Assert(elmbyval == eah->typbyval);
+ 	Assert(elmalign == eah->typalign);
+
+ 	/*
+ 	 * Copy dimension info into local storage. This allows us to modify the
+ 	 * dimensions if needed, while not messing up the expanded value if we
+ 	 * fail partway through.
+ 	 */
+ 	ndim = eah->ndims;
+ 	Assert(ndim >= 0 && ndim <= MAXDIM);
+ 	memcpy(dim, eah->dims, ndim * sizeof(int));
+ 	memcpy(lb, eah->lbound, ndim * sizeof(int));
+ 	dimschanged = false;
+
+ 	/*
+ 	 * if number of dims is zero, i.e. an empty array, create an array with
+ 	 * nSubscripts dimensions, and set the lower bounds to the supplied
+ 	 * subscripts.
+ 	 */
+ 	if (ndim == 0)
+ 	{
+ 		/*
+ 		 * Allocate adequate space for new dimension info. This is harmless
+ 		 * if we fail later.
+ 		 */
+ 		Assert(nSubscripts > 0 && nSubscripts <= MAXDIM);
+ 		eah->dims = (int *) MemoryContextAllocZero(eah->hdr.eoh_context,
+ 												   nSubscripts * sizeof(int));
+ 		eah->lbound = (int *) MemoryContextAllocZero(eah->hdr.eoh_context,
+ 													 nSubscripts * sizeof(int));
+
+ 		/* Update local copies of dimension info */
+ 		ndim = nSubscripts;
+ 		for (i = 0; i < nSubscripts; i++)
+ 		{
+ 			dim[i] = 0;
+ 			lb[i] = indx[i];
+ 		}
+ 		dimschanged = true;
+ 	}
+ 	else if (ndim != nSubscripts)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+ 				 errmsg("wrong number of array subscripts")));
+
+ 	/*
+ 	 * Deconstruct array if we didn't already. (Someday maybe add a special
+ 	 * case path for fixed-length, no-nulls cases, where we can overwrite an
+ 	 * element in place without ever deconstructing. But today is not that
+ 	 * day.)
+ 	 */
+ 	deconstruct_expanded_array(eah);
+
+ 	/*
+ 	 * Copy new element into array's context, if needed (we assume it's
+ 	 * already detoasted, so no junk should be created). If we fail further
+ 	 * down, this memory is leaked, but that's reasonably harmless.
+ 	 */
+ 	if (!eah->typbyval && !isNull)
+ 	{
+ 		MemoryContext oldcxt = MemoryContextSwitchTo(eah->hdr.eoh_context);
+
+ 		dataValue = datumCopy(dataValue, false, eah->typlen);
+ 		MemoryContextSwitchTo(oldcxt);
+ 	}
+
+ 	dvalues = eah->dvalues;
+ 	dnulls = eah->dnulls;
+
+ 	newhasnulls = ((dnulls != NULL) || isNull);
+ 	addedbefore = addedafter = 0;
+
+ 	/*
+ 	 * Check subscripts (this logic matches original array_set_element)
+ 	 */
+ 	if (ndim == 1)
+ 	{
+ 		if (indx[0] < lb[0])
+ 		{
+ 			addedbefore = lb[0] - indx[0];
+ 			dim[0] += addedbefore;
+ 			lb[0] = indx[0];
+ 			dimschanged = true;
+ 			if (addedbefore > 1)
+ 				newhasnulls = true; /* will insert nulls */
+ 		}
+ 		if (indx[0] >= (dim[0] + lb[0]))
+ 		{
+ 			addedafter = indx[0] - (dim[0] + lb[0]) + 1;
+ 			dim[0] += addedafter;
+ 			dimschanged = true;
+ 			if (addedafter > 1)
+ 				newhasnulls = true; /* will insert nulls */
+ 		}
+ 	}
+ 	else
+ 	{
+ 		/*
+ 		 * XXX currently we do not support extending multi-dimensional arrays
+ 		 * during assignment
+ 		 */
+ 		for (i = 0; i < ndim; i++)
+ 		{
+ 			if (indx[i] < lb[i] ||
+ 				indx[i] >= (dim[i] + lb[i]))
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+ 						 errmsg("array subscript out of range")));
+ 		}
+ 	}
+
+ 	/* Now we can calculate linear offset of target item in array */
+ 	offset = ArrayGetOffset(nSubscripts, dim, lb, indx);
+
+ 	/* Physically enlarge existing dvalues/dnulls arrays if needed */
+ 	if (dim[0] > eah->dvalueslen)
+ 	{
+ 		/* We want some extra space if we're enlarging */
+ 		int newlen = dim[0] + dim[0] / 8;
+
+ 		newlen = Max(newlen, dim[0]); /* integer overflow guard */
+ 		eah->dvalues = dvalues = (Datum *)
+ 			repalloc(dvalues, newlen * sizeof(Datum));
+ 		if (dnulls)
+ 			eah->dnulls = dnulls = (bool *)
+ 				repalloc(dnulls, newlen * sizeof(bool));
+ 		eah->dvalueslen = newlen;
+ 	}
+
+ 	/*
+ 	 * If we need a nulls bitmap and don't already have one, create it, being
+ 	 * sure to mark all existing entries as not null.
+ 	 */
+ 	if (newhasnulls && dnulls == NULL)
+ 		eah->dnulls = dnulls = (bool *)
+ 			MemoryContextAllocZero(eah->hdr.eoh_context,
+ 								   eah->dvalueslen * sizeof(bool));
+
+ 	/*
+ 	 * We now have all the needed space allocated, so we're ready to make
+ 	 * irreversible changes. Be very wary of allowing failure below here.
+ 	 */
+
+ 	/* Flattened value will no longer represent array accurately */
+ 	eah->fvalue = NULL;
+ 	/* And we don't know the flattened size either */
+ 	eah->flat_size = 0;
+
+ 	/* Update dimensionality info if needed */
+ 	if (dimschanged)
+ 	{
+ 		eah->ndims = ndim;
+ 		memcpy(eah->dims, dim, ndim * sizeof(int));
+ 		memcpy(eah->lbound, lb, ndim * sizeof(int));
+ 	}
+
+ 	/* Reposition items if needed, and fill addedbefore items with nulls */
+ 	if (addedbefore > 0)
+ 	{
+ 		memmove(dvalues + addedbefore, dvalues, eah->nelems * sizeof(Datum));
+ 		for (i = 0; i < addedbefore; i++)
+ 			dvalues[i] = (Datum) 0;
+ 		if (dnulls)
+ 		{
+ 			memmove(dnulls + addedbefore, dnulls, eah->nelems * sizeof(bool));
+ 			for (i = 0; i < addedbefore; i++)
+ 				dnulls[i] = true;
+ 		}
+ 		eah->nelems += addedbefore;
+ 	}
+
+ 	/* fill addedafter items with nulls */
+ 	if (addedafter > 0)
+ 	{
+ 		for (i = 0; i < addedafter; i++)
+ 			dvalues[eah->nelems + i] = (Datum) 0;
+ 		if (dnulls)
+ 		{
+ 			for (i = 0; i < addedafter; i++)
+ 				dnulls[eah->nelems + i] = true;
+ 		}
+ 		eah->nelems += addedafter;
+ 	}
+
+ 	/* Grab old element value for pfree'ing, if needed. */
+ 	if (!eah->typbyval && (dnulls == NULL || !dnulls[offset]))
+ 		oldValue = (char *) DatumGetPointer(dvalues[offset]);
+ 	else
+ 		oldValue = NULL;
+
+ 	/* And finally we can insert the new element. */
+ 	dvalues[offset] = dataValue;
+ 	if (dnulls)
+ 		dnulls[offset] = isNull;
+
+ 	/*
+ 	 * Free old element if needed; this keeps repeated element replacements
+ 	 * from bloating the array's storage. If the pfree somehow fails, it
+ 	 * won't corrupt the array.
+ 	 */
+ 	if (oldValue)
+ 	{
+ 		/* Don't try to pfree a part of the original flat array */
+ 		if (oldValue < eah->fstartptr || oldValue >= eah->fendptr)
+ 			pfree(oldValue);
+ 	}
+
+ 	/* Done, return standard TOAST pointer for object */
+ 	return EOHPGetRWDatum(&eah->hdr);
+ }
+
  /*
   * array_set_slice :
   *		  This routine sets the value of a range of array locations (specified
   *		  by upper and lower subscript values) to new values passed as
*************** array_set(ArrayType *array, int nSubscri
*** 2734,2741 ****
   *	 the function fn(), and if nargs > 1 then argument positions after the
   *	 first must be preset to the additional values to be passed. The
   *	 first argument position initially holds the input array value.
-  * * inpType: OID of element type of input array. This must be the same as,
-  *	 or binary-compatible with, the first argument type of fn().
   * * retType: OID of element type of output array. This must be the same as,
   *	 or binary-compatible with, the result type of fn().
   * * amstate: workspace for array_map. Must be zeroed by caller before
--- 3072,3077 ----
*************** array_set(ArrayType *array, int nSubscri
*** 2749,2762 ****
   * the array are OK however.
   */
  Datum
! array_map(FunctionCallInfo fcinfo, Oid inpType, Oid retType,
! 		  ArrayMapState *amstate)
  {
! 	ArrayType *v;
  	ArrayType *result;
  	Datum *values;
  	bool *nulls;
- 	Datum elt;
  	int *dim;
  	int ndim;
  	int nitems;
--- 3085,3096 ----
   * the array are OK however.
   */
  Datum
! array_map(FunctionCallInfo fcinfo, Oid retType, ArrayMapState *amstate)
  {
! 
AnyArrayType *v; ArrayType *result; Datum *values; bool *nulls; int *dim; int ndim; int nitems; *************** array_map(FunctionCallInfo fcinfo, Oid i *** 2764,2778 **** int32 nbytes = 0; int32 dataoffset; bool hasnulls; int inp_typlen; bool inp_typbyval; char inp_typalign; int typlen; bool typbyval; char typalign; ! char *s; ! bits8 *bitmap; ! int bitmask; ArrayMetaState *inp_extra; ArrayMetaState *ret_extra; --- 3098,3111 ---- int32 nbytes = 0; int32 dataoffset; bool hasnulls; + Oid inpType; int inp_typlen; bool inp_typbyval; char inp_typalign; int typlen; bool typbyval; char typalign; ! ARRAY_ITER ARRAY_ITER_VARS(iter); ArrayMetaState *inp_extra; ArrayMetaState *ret_extra; *************** array_map(FunctionCallInfo fcinfo, Oid i *** 2781,2792 **** elog(ERROR, "invalid nargs: %d", fcinfo->nargs); if (PG_ARGISNULL(0)) elog(ERROR, "null input array"); ! v = PG_GETARG_ARRAYTYPE_P(0); ! ! Assert(ARR_ELEMTYPE(v) == inpType); ! ndim = ARR_NDIM(v); ! dim = ARR_DIMS(v); nitems = ArrayGetNItems(ndim, dim); /* Check for empty array */ --- 3114,3124 ---- elog(ERROR, "invalid nargs: %d", fcinfo->nargs); if (PG_ARGISNULL(0)) elog(ERROR, "null input array"); ! v = PG_GETARG_ANY_ARRAY(0); ! inpType = AARR_ELEMTYPE(v); ! ndim = AARR_NDIM(v); ! dim = AARR_DIMS(v); nitems = ArrayGetNItems(ndim, dim); /* Check for empty array */ *************** array_map(FunctionCallInfo fcinfo, Oid i *** 2833,2841 **** nulls = (bool *) palloc(nitems * sizeof(bool)); /* Loop over source data */ ! s = ARR_DATA_PTR(v); ! bitmap = ARR_NULLBITMAP(v); ! bitmask = 1; hasnulls = false; for (i = 0; i < nitems; i++) --- 3165,3171 ---- nulls = (bool *) palloc(nitems * sizeof(bool)); /* Loop over source data */ ! ARRAY_ITER_SETUP(iter, v); hasnulls = false; for (i = 0; i < nitems; i++) *************** array_map(FunctionCallInfo fcinfo, Oid i *** 2843,2860 **** bool callit = true; /* Get source element, checking for NULL */ ! if (bitmap && (*bitmap & bitmask) == 0) ! { ! fcinfo->argnull[0] = true; ! } ! 
else ! { ! elt = fetch_att(s, inp_typbyval, inp_typlen); ! s = att_addlength_datum(s, inp_typlen, elt); ! s = (char *) att_align_nominal(s, inp_typalign); ! fcinfo->arg[0] = elt; ! fcinfo->argnull[0] = false; ! } /* * Apply the given function to source elt and extra args. --- 3173,3180 ---- bool callit = true; /* Get source element, checking for NULL */ ! ARRAY_ITER_NEXT(iter, i, fcinfo->arg[0], fcinfo->argnull[0], ! inp_typlen, inp_typbyval, inp_typalign); /* * Apply the given function to source elt and extra args. *************** array_map(FunctionCallInfo fcinfo, Oid i *** 2899,2915 **** errmsg("array size exceeds the maximum allowed (%d)", (int) MaxAllocSize))); } - - /* advance bitmap pointer if any */ - if (bitmap) - { - bitmask <<= 1; - if (bitmask == 0x100) - { - bitmap++; - bitmask = 1; - } - } } /* Allocate and initialize the result array */ --- 3219,3224 ---- *************** array_map(FunctionCallInfo fcinfo, Oid i *** 2928,2934 **** result->ndim = ndim; result->dataoffset = dataoffset; result->elemtype = retType; ! memcpy(ARR_DIMS(result), ARR_DIMS(v), 2 * ndim * sizeof(int)); /* * Note: do not risk trying to pfree the results of the called function --- 3237,3244 ---- result->ndim = ndim; result->dataoffset = dataoffset; result->elemtype = retType; ! memcpy(ARR_DIMS(result), AARR_DIMS(v), ndim * sizeof(int)); ! memcpy(ARR_LBOUND(result), AARR_LBOUND(v), ndim * sizeof(int)); /* * Note: do not risk trying to pfree the results of the called function *************** construct_empty_array(Oid elmtype) *** 3092,3097 **** --- 3402,3424 ---- } /* + * construct_empty_expanded_array: make an empty expanded array + * given only type information. (metacache can be NULL if not needed.) 
+ */ + ExpandedArrayHeader * + construct_empty_expanded_array(Oid element_type, + MemoryContext parentcontext, + ArrayMetaState *metacache) + { + ArrayType *array = construct_empty_array(element_type); + Datum d; + + d = expand_array(PointerGetDatum(array), parentcontext, metacache); + pfree(array); + return (ExpandedArrayHeader *) DatumGetEOHP(d); + } + + /* * deconstruct_array --- simple method for extracting data from an array * * array: array object to examine (must not be NULL) *************** array_contains_nulls(ArrayType *array) *** 3229,3264 **** Datum array_eq(PG_FUNCTION_ARGS) { ! ArrayType *array1 = PG_GETARG_ARRAYTYPE_P(0); ! ArrayType *array2 = PG_GETARG_ARRAYTYPE_P(1); Oid collation = PG_GET_COLLATION(); ! int ndims1 = ARR_NDIM(array1); ! int ndims2 = ARR_NDIM(array2); ! int *dims1 = ARR_DIMS(array1); ! int *dims2 = ARR_DIMS(array2); ! Oid element_type = ARR_ELEMTYPE(array1); bool result = true; int nitems; TypeCacheEntry *typentry; int typlen; bool typbyval; char typalign; ! char *ptr1; ! char *ptr2; ! bits8 *bitmap1; ! bits8 *bitmap2; ! int bitmask; int i; FunctionCallInfoData locfcinfo; ! if (element_type != ARR_ELEMTYPE(array2)) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("cannot compare arrays of different element types"))); /* fast path if the arrays do not have the same dimensionality */ if (ndims1 != ndims2 || ! memcmp(dims1, dims2, 2 * ndims1 * sizeof(int)) != 0) result = false; else { --- 3556,3591 ---- Datum array_eq(PG_FUNCTION_ARGS) { ! AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0); ! AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1); Oid collation = PG_GET_COLLATION(); ! int ndims1 = AARR_NDIM(array1); ! int ndims2 = AARR_NDIM(array2); ! int *dims1 = AARR_DIMS(array1); ! int *dims2 = AARR_DIMS(array2); ! int *lbs1 = AARR_LBOUND(array1); ! int *lbs2 = AARR_LBOUND(array2); ! Oid element_type = AARR_ELEMTYPE(array1); bool result = true; int nitems; TypeCacheEntry *typentry; int typlen; bool typbyval; char typalign; ! 
ARRAY_ITER ARRAY_ITER_VARS(it1); ! ARRAY_ITER ARRAY_ITER_VARS(it2); int i; FunctionCallInfoData locfcinfo; ! if (element_type != AARR_ELEMTYPE(array2)) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("cannot compare arrays of different element types"))); /* fast path if the arrays do not have the same dimensionality */ if (ndims1 != ndims2 || ! memcmp(dims1, dims2, ndims1 * sizeof(int)) != 0 || ! memcmp(lbs1, lbs2, ndims1 * sizeof(int)) != 0) result = false; else { *************** array_eq(PG_FUNCTION_ARGS) *** 3293,3303 **** /* Loop over source data */ nitems = ArrayGetNItems(ndims1, dims1); ! ptr1 = ARR_DATA_PTR(array1); ! ptr2 = ARR_DATA_PTR(array2); ! bitmap1 = ARR_NULLBITMAP(array1); ! bitmap2 = ARR_NULLBITMAP(array2); ! bitmask = 1; /* use same bitmask for both arrays */ for (i = 0; i < nitems; i++) { --- 3620,3627 ---- /* Loop over source data */ nitems = ArrayGetNItems(ndims1, dims1); ! ARRAY_ITER_SETUP(it1, array1); ! ARRAY_ITER_SETUP(it2, array2); for (i = 0; i < nitems; i++) { *************** array_eq(PG_FUNCTION_ARGS) *** 3308,3349 **** bool oprresult; /* Get elements, checking for NULL */ ! if (bitmap1 && (*bitmap1 & bitmask) == 0) ! { ! isnull1 = true; ! elt1 = (Datum) 0; ! } ! else ! { ! isnull1 = false; ! elt1 = fetch_att(ptr1, typbyval, typlen); ! ptr1 = att_addlength_pointer(ptr1, typlen, ptr1); ! ptr1 = (char *) att_align_nominal(ptr1, typalign); ! } ! ! if (bitmap2 && (*bitmap2 & bitmask) == 0) ! { ! isnull2 = true; ! elt2 = (Datum) 0; ! } ! else ! { ! isnull2 = false; ! elt2 = fetch_att(ptr2, typbyval, typlen); ! ptr2 = att_addlength_pointer(ptr2, typlen, ptr2); ! ptr2 = (char *) att_align_nominal(ptr2, typalign); ! } ! ! /* advance bitmap pointers if any */ ! bitmask <<= 1; ! if (bitmask == 0x100) ! { ! if (bitmap1) ! bitmap1++; ! if (bitmap2) ! bitmap2++; ! bitmask = 1; ! } /* * We consider two NULLs equal; NULL and not-NULL are unequal. --- 3632,3639 ---- bool oprresult; /* Get elements, checking for NULL */ ! 
ARRAY_ITER_NEXT(it1, i, elt1, isnull1, typlen, typbyval, typalign); ! ARRAY_ITER_NEXT(it2, i, elt2, isnull2, typlen, typbyval, typalign); /* * We consider two NULLs equal; NULL and not-NULL are unequal. *************** array_eq(PG_FUNCTION_ARGS) *** 3374,3381 **** } /* Avoid leaking memory when handed toasted input. */ ! PG_FREE_IF_COPY(array1, 0); ! PG_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } --- 3664,3671 ---- } /* Avoid leaking memory when handed toasted input. */ ! AARR_FREE_IF_COPY(array1, 0); ! AARR_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } *************** btarraycmp(PG_FUNCTION_ARGS) *** 3435,3465 **** static int array_cmp(FunctionCallInfo fcinfo) { ! ArrayType *array1 = PG_GETARG_ARRAYTYPE_P(0); ! ArrayType *array2 = PG_GETARG_ARRAYTYPE_P(1); Oid collation = PG_GET_COLLATION(); ! int ndims1 = ARR_NDIM(array1); ! int ndims2 = ARR_NDIM(array2); ! int *dims1 = ARR_DIMS(array1); ! int *dims2 = ARR_DIMS(array2); int nitems1 = ArrayGetNItems(ndims1, dims1); int nitems2 = ArrayGetNItems(ndims2, dims2); ! Oid element_type = ARR_ELEMTYPE(array1); int result = 0; TypeCacheEntry *typentry; int typlen; bool typbyval; char typalign; int min_nitems; ! char *ptr1; ! char *ptr2; ! bits8 *bitmap1; ! bits8 *bitmap2; ! int bitmask; int i; FunctionCallInfoData locfcinfo; ! if (element_type != ARR_ELEMTYPE(array2)) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("cannot compare arrays of different element types"))); --- 3725,3752 ---- static int array_cmp(FunctionCallInfo fcinfo) { ! AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0); ! AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1); Oid collation = PG_GET_COLLATION(); ! int ndims1 = AARR_NDIM(array1); ! int ndims2 = AARR_NDIM(array2); ! int *dims1 = AARR_DIMS(array1); ! int *dims2 = AARR_DIMS(array2); int nitems1 = ArrayGetNItems(ndims1, dims1); int nitems2 = ArrayGetNItems(ndims2, dims2); ! 
Oid element_type = AARR_ELEMTYPE(array1); int result = 0; TypeCacheEntry *typentry; int typlen; bool typbyval; char typalign; int min_nitems; ! ARRAY_ITER ARRAY_ITER_VARS(it1); ! ARRAY_ITER ARRAY_ITER_VARS(it2); int i; FunctionCallInfoData locfcinfo; ! if (element_type != AARR_ELEMTYPE(array2)) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("cannot compare arrays of different element types"))); *************** array_cmp(FunctionCallInfo fcinfo) *** 3495,3505 **** /* Loop over source data */ min_nitems = Min(nitems1, nitems2); ! ptr1 = ARR_DATA_PTR(array1); ! ptr2 = ARR_DATA_PTR(array2); ! bitmap1 = ARR_NULLBITMAP(array1); ! bitmap2 = ARR_NULLBITMAP(array2); ! bitmask = 1; /* use same bitmask for both arrays */ for (i = 0; i < min_nitems; i++) { --- 3782,3789 ---- /* Loop over source data */ min_nitems = Min(nitems1, nitems2); ! ARRAY_ITER_SETUP(it1, array1); ! ARRAY_ITER_SETUP(it2, array2); for (i = 0; i < min_nitems; i++) { *************** array_cmp(FunctionCallInfo fcinfo) *** 3510,3551 **** int32 cmpresult; /* Get elements, checking for NULL */ ! if (bitmap1 && (*bitmap1 & bitmask) == 0) ! { ! isnull1 = true; ! elt1 = (Datum) 0; ! } ! else ! { ! isnull1 = false; ! elt1 = fetch_att(ptr1, typbyval, typlen); ! ptr1 = att_addlength_pointer(ptr1, typlen, ptr1); ! ptr1 = (char *) att_align_nominal(ptr1, typalign); ! } ! ! if (bitmap2 && (*bitmap2 & bitmask) == 0) ! { ! isnull2 = true; ! elt2 = (Datum) 0; ! } ! else ! { ! isnull2 = false; ! elt2 = fetch_att(ptr2, typbyval, typlen); ! ptr2 = att_addlength_pointer(ptr2, typlen, ptr2); ! ptr2 = (char *) att_align_nominal(ptr2, typalign); ! } ! ! /* advance bitmap pointers if any */ ! bitmask <<= 1; ! if (bitmask == 0x100) ! { ! if (bitmap1) ! bitmap1++; ! if (bitmap2) ! bitmap2++; ! bitmask = 1; ! } /* * We consider two NULLs equal; NULL > not-NULL. --- 3794,3801 ---- int32 cmpresult; /* Get elements, checking for NULL */ ! ARRAY_ITER_NEXT(it1, i, elt1, isnull1, typlen, typbyval, typalign); ! 
ARRAY_ITER_NEXT(it2, i, elt2, isnull2, typlen, typbyval, typalign); /* * We consider two NULLs equal; NULL > not-NULL. *************** array_cmp(FunctionCallInfo fcinfo) *** 3604,3611 **** result = (ndims1 < ndims2) ? -1 : 1; else { ! /* this relies on LB array immediately following DIMS array */ ! for (i = 0; i < ndims1 * 2; i++) { if (dims1[i] != dims2[i]) { --- 3854,3860 ---- result = (ndims1 < ndims2) ? -1 : 1; else { ! for (i = 0; i < ndims1; i++) { if (dims1[i] != dims2[i]) { *************** array_cmp(FunctionCallInfo fcinfo) *** 3613,3624 **** break; } } } } /* Avoid leaking memory when handed toasted input. */ ! PG_FREE_IF_COPY(array1, 0); ! PG_FREE_IF_COPY(array2, 1); return result; } --- 3862,3887 ---- break; } } + if (result == 0) + { + int *lbound1 = AARR_LBOUND(array1); + int *lbound2 = AARR_LBOUND(array2); + + for (i = 0; i < ndims1; i++) + { + if (lbound1[i] != lbound2[i]) + { + result = (lbound1[i] < lbound2[i]) ? -1 : 1; + break; + } + } + } } } /* Avoid leaking memory when handed toasted input. */ ! AARR_FREE_IF_COPY(array1, 0); ! AARR_FREE_IF_COPY(array2, 1); return result; } *************** array_cmp(FunctionCallInfo fcinfo) *** 3633,3652 **** Datum hash_array(PG_FUNCTION_ARGS) { ! ArrayType *array = PG_GETARG_ARRAYTYPE_P(0); ! int ndims = ARR_NDIM(array); ! int *dims = ARR_DIMS(array); ! Oid element_type = ARR_ELEMTYPE(array); uint32 result = 1; int nitems; TypeCacheEntry *typentry; int typlen; bool typbyval; char typalign; - char *ptr; - bits8 *bitmap; - int bitmask; int i; FunctionCallInfoData locfcinfo; /* --- 3896,3913 ---- Datum hash_array(PG_FUNCTION_ARGS) { ! AnyArrayType *array = PG_GETARG_ANY_ARRAY(0); ! int ndims = AARR_NDIM(array); ! int *dims = AARR_DIMS(array); ! 
Oid element_type = AARR_ELEMTYPE(array); uint32 result = 1; int nitems; TypeCacheEntry *typentry; int typlen; bool typbyval; char typalign; int i; + ARRAY_ITER ARRAY_ITER_VARS(iter); FunctionCallInfoData locfcinfo; /* *************** hash_array(PG_FUNCTION_ARGS) *** 3680,3707 **** /* Loop over source data */ nitems = ArrayGetNItems(ndims, dims); ! ptr = ARR_DATA_PTR(array); ! bitmap = ARR_NULLBITMAP(array); ! bitmask = 1; for (i = 0; i < nitems; i++) { uint32 elthash; /* Get element, checking for NULL */ ! if (bitmap && (*bitmap & bitmask) == 0) { /* Treat nulls as having hashvalue 0 */ elthash = 0; } else { - Datum elt; - - elt = fetch_att(ptr, typbyval, typlen); - ptr = att_addlength_pointer(ptr, typlen, ptr); - ptr = (char *) att_align_nominal(ptr, typalign); - /* Apply the hash function */ locfcinfo.arg[0] = elt; locfcinfo.argnull[0] = false; --- 3941,3964 ---- /* Loop over source data */ nitems = ArrayGetNItems(ndims, dims); ! ARRAY_ITER_SETUP(iter, array); for (i = 0; i < nitems; i++) { + Datum elt; + bool isnull; uint32 elthash; /* Get element, checking for NULL */ ! ARRAY_ITER_NEXT(iter, i, elt, isnull, typlen, typbyval, typalign); ! ! if (isnull) { /* Treat nulls as having hashvalue 0 */ elthash = 0; } else { /* Apply the hash function */ locfcinfo.arg[0] = elt; locfcinfo.argnull[0] = false; *************** hash_array(PG_FUNCTION_ARGS) *** 3709,3725 **** elthash = DatumGetUInt32(FunctionCallInvoke(&locfcinfo)); } - /* advance bitmap pointer if any */ - if (bitmap) - { - bitmask <<= 1; - if (bitmask == 0x100) - { - bitmap++; - bitmask = 1; - } - } - /* * Combine hash values of successive elements by multiplying the * current value by 31 and adding on the new element's hash value. --- 3966,3971 ---- *************** hash_array(PG_FUNCTION_ARGS) *** 3735,3741 **** } /* Avoid leaking memory when handed toasted input. */ ! PG_FREE_IF_COPY(array, 0); PG_RETURN_UINT32(result); } --- 3981,3987 ---- } /* Avoid leaking memory when handed toasted input. */ ! 
AARR_FREE_IF_COPY(array, 0); PG_RETURN_UINT32(result); } *************** hash_array(PG_FUNCTION_ARGS) *** 3756,3766 **** * When matchall is false, return true if any members of array1 are in array2. */ static bool ! array_contain_compare(ArrayType *array1, ArrayType *array2, Oid collation, bool matchall, void **fn_extra) { bool result = matchall; ! Oid element_type = ARR_ELEMTYPE(array1); TypeCacheEntry *typentry; int nelems1; Datum *values2; --- 4002,4012 ---- * When matchall is false, return true if any members of array1 are in array2. */ static bool ! array_contain_compare(AnyArrayType *array1, AnyArrayType *array2, Oid collation, bool matchall, void **fn_extra) { bool result = matchall; ! Oid element_type = AARR_ELEMTYPE(array1); TypeCacheEntry *typentry; int nelems1; Datum *values2; *************** array_contain_compare(ArrayType *array1, *** 3769,3782 **** int typlen; bool typbyval; char typalign; - char *ptr1; - bits8 *bitmap1; - int bitmask; int i; int j; FunctionCallInfoData locfcinfo; ! if (element_type != ARR_ELEMTYPE(array2)) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("cannot compare arrays of different element types"))); --- 4015,4026 ---- int typlen; bool typbyval; char typalign; int i; int j; + ARRAY_ITER ARRAY_ITER_VARS(it1); FunctionCallInfoData locfcinfo; ! if (element_type != AARR_ELEMTYPE(array2)) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("cannot compare arrays of different element types"))); *************** array_contain_compare(ArrayType *array1, *** 3809,3816 **** * worthwhile to use deconstruct_array on it. We scan array1 the hard way * however, since we very likely won't need to look at all of it. */ ! deconstruct_array(array2, element_type, typlen, typbyval, typalign, ! &values2, &nulls2, &nelems2); /* * Apply the comparison operator to each pair of array elements. --- 4053,4070 ---- * worthwhile to use deconstruct_array on it. 
We scan array1 the hard way * however, since we very likely won't need to look at all of it. */ ! if (VARATT_IS_EXPANDED_HEADER(array2)) ! { ! /* This should be safe even if input is read-only */ ! deconstruct_expanded_array(&(array2->xpn)); ! values2 = array2->xpn.dvalues; ! nulls2 = array2->xpn.dnulls; ! nelems2 = array2->xpn.nelems; ! } ! else ! deconstruct_array(&(array2->flt), ! element_type, typlen, typbyval, typalign, ! &values2, &nulls2, &nelems2); /* * Apply the comparison operator to each pair of array elements. *************** array_contain_compare(ArrayType *array1, *** 3819,3828 **** collation, NULL, NULL); /* Loop over source data */ ! nelems1 = ArrayGetNItems(ARR_NDIM(array1), ARR_DIMS(array1)); ! ptr1 = ARR_DATA_PTR(array1); ! bitmap1 = ARR_NULLBITMAP(array1); ! bitmask = 1; for (i = 0; i < nelems1; i++) { --- 4073,4080 ---- collation, NULL, NULL); /* Loop over source data */ ! nelems1 = ArrayGetNItems(AARR_NDIM(array1), AARR_DIMS(array1)); ! ARRAY_ITER_SETUP(it1, array1); for (i = 0; i < nelems1; i++) { *************** array_contain_compare(ArrayType *array1, *** 3830,3856 **** bool isnull1; /* Get element, checking for NULL */ ! if (bitmap1 && (*bitmap1 & bitmask) == 0) ! { ! isnull1 = true; ! elt1 = (Datum) 0; ! } ! else ! { ! isnull1 = false; ! elt1 = fetch_att(ptr1, typbyval, typlen); ! ptr1 = att_addlength_pointer(ptr1, typlen, ptr1); ! ptr1 = (char *) att_align_nominal(ptr1, typalign); ! } ! ! /* advance bitmap pointer if any */ ! bitmask <<= 1; ! if (bitmask == 0x100) ! { ! if (bitmap1) ! bitmap1++; ! bitmask = 1; ! } /* * We assume that the comparison operator is strict, so a NULL can't --- 4082,4088 ---- bool isnull1; /* Get element, checking for NULL */ ! 
ARRAY_ITER_NEXT(it1, i, elt1, isnull1, typlen, typbyval, typalign); /* * We assume that the comparison operator is strict, so a NULL can't *************** array_contain_compare(ArrayType *array1, *** 3909,3925 **** } } - pfree(values2); - pfree(nulls2); - return result; } Datum arrayoverlap(PG_FUNCTION_ARGS) { ! ArrayType *array1 = PG_GETARG_ARRAYTYPE_P(0); ! ArrayType *array2 = PG_GETARG_ARRAYTYPE_P(1); Oid collation = PG_GET_COLLATION(); bool result; --- 4141,4154 ---- } } return result; } Datum arrayoverlap(PG_FUNCTION_ARGS) { ! AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0); ! AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1); Oid collation = PG_GET_COLLATION(); bool result; *************** arrayoverlap(PG_FUNCTION_ARGS) *** 3927,3934 **** &fcinfo->flinfo->fn_extra); /* Avoid leaking memory when handed toasted input. */ ! PG_FREE_IF_COPY(array1, 0); ! PG_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } --- 4156,4163 ---- &fcinfo->flinfo->fn_extra); /* Avoid leaking memory when handed toasted input. */ ! AARR_FREE_IF_COPY(array1, 0); ! AARR_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } *************** arrayoverlap(PG_FUNCTION_ARGS) *** 3936,3943 **** Datum arraycontains(PG_FUNCTION_ARGS) { ! ArrayType *array1 = PG_GETARG_ARRAYTYPE_P(0); ! ArrayType *array2 = PG_GETARG_ARRAYTYPE_P(1); Oid collation = PG_GET_COLLATION(); bool result; --- 4165,4172 ---- Datum arraycontains(PG_FUNCTION_ARGS) { ! AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0); ! AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1); Oid collation = PG_GET_COLLATION(); bool result; *************** arraycontains(PG_FUNCTION_ARGS) *** 3945,3952 **** &fcinfo->flinfo->fn_extra); /* Avoid leaking memory when handed toasted input. */ ! PG_FREE_IF_COPY(array1, 0); ! PG_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } --- 4174,4181 ---- &fcinfo->flinfo->fn_extra); /* Avoid leaking memory when handed toasted input. */ ! AARR_FREE_IF_COPY(array1, 0); ! 
AARR_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } *************** arraycontains(PG_FUNCTION_ARGS) *** 3954,3961 **** Datum arraycontained(PG_FUNCTION_ARGS) { ! ArrayType *array1 = PG_GETARG_ARRAYTYPE_P(0); ! ArrayType *array2 = PG_GETARG_ARRAYTYPE_P(1); Oid collation = PG_GET_COLLATION(); bool result; --- 4183,4190 ---- Datum arraycontained(PG_FUNCTION_ARGS) { ! AnyArrayType *array1 = PG_GETARG_ANY_ARRAY(0); ! AnyArrayType *array2 = PG_GETARG_ANY_ARRAY(1); Oid collation = PG_GET_COLLATION(); bool result; *************** arraycontained(PG_FUNCTION_ARGS) *** 3963,3970 **** &fcinfo->flinfo->fn_extra); /* Avoid leaking memory when handed toasted input. */ ! PG_FREE_IF_COPY(array1, 0); ! PG_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } --- 4192,4199 ---- &fcinfo->flinfo->fn_extra); /* Avoid leaking memory when handed toasted input. */ ! AARR_FREE_IF_COPY(array1, 0); ! AARR_FREE_IF_COPY(array2, 1); PG_RETURN_BOOL(result); } *************** initArrayResult(Oid element_type, Memory *** 4702,4708 **** MemoryContextAlloc(arr_context, sizeof(ArrayBuildState)); astate->mcontext = arr_context; astate->private_cxt = subcontext; ! astate->alen = (subcontext ? 64 : 8); /* arbitrary starting array size */ astate->dvalues = (Datum *) MemoryContextAlloc(arr_context, astate->alen * sizeof(Datum)); astate->dnulls = (bool *) --- 4931,4938 ---- MemoryContextAlloc(arr_context, sizeof(ArrayBuildState)); astate->mcontext = arr_context; astate->private_cxt = subcontext; ! astate->alen = (subcontext ? 64 : 8); /* arbitrary starting array ! * size */ astate->dvalues = (Datum *) MemoryContextAlloc(arr_context, astate->alen * sizeof(Datum)); astate->dnulls = (bool *) *************** initArrayResultArr(Oid array_type, Oid e *** 4878,4887 **** bool subcontext) { ArrayBuildStateArr *astate; ! MemoryContext arr_context = rcontext; /* by default use the parent ctx */ /* Lookup element type, unless element_type already provided */ ! if (! 
OidIsValid(element_type)) { element_type = get_element_type(array_type); --- 5108,5118 ---- bool subcontext) { ArrayBuildStateArr *astate; ! MemoryContext arr_context = rcontext; /* by default use the parent ! * ctx */ /* Lookup element type, unless element_type already provided */ ! if (!OidIsValid(element_type)) { element_type = get_element_type(array_type); *************** makeArrayResultAny(ArrayBuildStateAny *a *** 5259,5289 **** Datum array_larger(PG_FUNCTION_ARGS) { ! ArrayType *v1, ! *v2, ! *result; ! ! v1 = PG_GETARG_ARRAYTYPE_P(0); ! v2 = PG_GETARG_ARRAYTYPE_P(1); ! ! result = ((array_cmp(fcinfo) > 0) ? v1 : v2); ! ! PG_RETURN_ARRAYTYPE_P(result); } Datum array_smaller(PG_FUNCTION_ARGS) { ! ArrayType *v1, ! *v2, ! *result; ! ! v1 = PG_GETARG_ARRAYTYPE_P(0); ! v2 = PG_GETARG_ARRAYTYPE_P(1); ! ! result = ((array_cmp(fcinfo) < 0) ? v1 : v2); ! ! PG_RETURN_ARRAYTYPE_P(result); } --- 5490,5508 ---- Datum array_larger(PG_FUNCTION_ARGS) { ! if (array_cmp(fcinfo) > 0) ! PG_RETURN_DATUM(PG_GETARG_DATUM(0)); ! else ! PG_RETURN_DATUM(PG_GETARG_DATUM(1)); } Datum array_smaller(PG_FUNCTION_ARGS) { ! if (array_cmp(fcinfo) < 0) ! PG_RETURN_DATUM(PG_GETARG_DATUM(0)); ! else ! PG_RETURN_DATUM(PG_GETARG_DATUM(1)); } *************** generate_subscripts(PG_FUNCTION_ARGS) *** 5308,5314 **** /* stuff done only on the first call of the function */ if (SRF_IS_FIRSTCALL()) { ! ArrayType *v = PG_GETARG_ARRAYTYPE_P(0); int reqdim = PG_GETARG_INT32(1); int *lb, *dimv; --- 5527,5533 ---- /* stuff done only on the first call of the function */ if (SRF_IS_FIRSTCALL()) { ! AnyArrayType *v = PG_GETARG_ANY_ARRAY(0); int reqdim = PG_GETARG_INT32(1); int *lb, *dimv; *************** generate_subscripts(PG_FUNCTION_ARGS) *** 5317,5327 **** funcctx = SRF_FIRSTCALL_INIT(); /* Sanity check: does it look like an array at all? */ ! if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM) SRF_RETURN_DONE(funcctx); /* Sanity check: was the requested dim valid */ ! 
if (reqdim <= 0 || reqdim > ARR_NDIM(v)) SRF_RETURN_DONE(funcctx); /* --- 5536,5546 ---- funcctx = SRF_FIRSTCALL_INIT(); /* Sanity check: does it look like an array at all? */ ! if (AARR_NDIM(v) <= 0 || AARR_NDIM(v) > MAXDIM) SRF_RETURN_DONE(funcctx); /* Sanity check: was the requested dim valid */ ! if (reqdim <= 0 || reqdim > AARR_NDIM(v)) SRF_RETURN_DONE(funcctx); /* *************** generate_subscripts(PG_FUNCTION_ARGS) *** 5330,5337 **** oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); fctx = (generate_subscripts_fctx *) palloc(sizeof(generate_subscripts_fctx)); ! lb = ARR_LBOUND(v); ! dimv = ARR_DIMS(v); fctx->lower = lb[reqdim - 1]; fctx->upper = dimv[reqdim - 1] + lb[reqdim - 1] - 1; --- 5549,5556 ---- oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); fctx = (generate_subscripts_fctx *) palloc(sizeof(generate_subscripts_fctx)); ! lb = AARR_LBOUND(v); ! dimv = AARR_DIMS(v); fctx->lower = lb[reqdim - 1]; fctx->upper = dimv[reqdim - 1] + lb[reqdim - 1] - 1; *************** array_unnest(PG_FUNCTION_ARGS) *** 5650,5660 **** { typedef struct { ! ArrayType *arr; int nextelem; int numelems; - char *elemdataptr; /* this moves with nextelem */ - bits8 *arraynullsptr; /* this does not */ int16 elmlen; bool elmbyval; char elmalign; --- 5869,5877 ---- { typedef struct { ! ARRAY_ITER ARRAY_ITER_VARS(iter); int nextelem; int numelems; int16 elmlen; bool elmbyval; char elmalign; *************** array_unnest(PG_FUNCTION_ARGS) *** 5667,5673 **** /* stuff done only on the first call of the function */ if (SRF_IS_FIRSTCALL()) { ! ArrayType *arr; /* create a function context for cross-call persistence */ funcctx = SRF_FIRSTCALL_INIT(); --- 5884,5890 ---- /* stuff done only on the first call of the function */ if (SRF_IS_FIRSTCALL()) { ! AnyArrayType *arr; /* create a function context for cross-call persistence */ funcctx = SRF_FIRSTCALL_INIT(); *************** array_unnest(PG_FUNCTION_ARGS) *** 5684,5706 **** * and not before. 
(If no detoast happens, we assume the originally * passed array will stick around till then.) */ ! arr = PG_GETARG_ARRAYTYPE_P(0); /* allocate memory for user context */ fctx = (array_unnest_fctx *) palloc(sizeof(array_unnest_fctx)); /* initialize state */ ! fctx->arr = arr; fctx->nextelem = 0; ! fctx->numelems = ArrayGetNItems(ARR_NDIM(arr), ARR_DIMS(arr)); ! ! fctx->elemdataptr = ARR_DATA_PTR(arr); ! fctx->arraynullsptr = ARR_NULLBITMAP(arr); ! get_typlenbyvalalign(ARR_ELEMTYPE(arr), ! &fctx->elmlen, ! &fctx->elmbyval, ! &fctx->elmalign); funcctx->user_fctx = fctx; MemoryContextSwitchTo(oldcontext); --- 5901,5928 ---- * and not before. (If no detoast happens, we assume the originally * passed array will stick around till then.) */ ! arr = PG_GETARG_ANY_ARRAY(0); /* allocate memory for user context */ fctx = (array_unnest_fctx *) palloc(sizeof(array_unnest_fctx)); /* initialize state */ ! ARRAY_ITER_SETUP(fctx->iter, arr); fctx->nextelem = 0; ! fctx->numelems = ArrayGetNItems(AARR_NDIM(arr), AARR_DIMS(arr)); ! if (VARATT_IS_EXPANDED_HEADER(arr)) ! { ! /* we can just grab the type data from expanded array */ ! fctx->elmlen = arr->xpn.typlen; ! fctx->elmbyval = arr->xpn.typbyval; ! fctx->elmalign = arr->xpn.typalign; ! } ! else ! get_typlenbyvalalign(AARR_ELEMTYPE(arr), ! &fctx->elmlen, ! &fctx->elmbyval, ! &fctx->elmalign); funcctx->user_fctx = fctx; MemoryContextSwitchTo(oldcontext); *************** array_unnest(PG_FUNCTION_ARGS) *** 5715,5746 **** int offset = fctx->nextelem++; Datum elem; ! /* ! * Check for NULL array element ! */ ! if (array_get_isnull(fctx->arraynullsptr, offset)) ! { ! fcinfo->isnull = true; ! elem = (Datum) 0; ! /* elemdataptr does not move */ ! } ! else ! { ! /* ! * OK, get the element ! */ ! char *ptr = fctx->elemdataptr; ! ! fcinfo->isnull = false; ! elem = ArrayCast(ptr, fctx->elmbyval, fctx->elmlen); ! ! /* ! * Advance elemdataptr over it ! */ ! ptr = att_addlength_pointer(ptr, fctx->elmlen, ptr); ! 
ptr = (char *) att_align_nominal(ptr, fctx->elmalign); ! fctx->elemdataptr = ptr; ! } SRF_RETURN_NEXT(funcctx, elem); } --- 5937,5944 ---- int offset = fctx->nextelem++; Datum elem; ! ARRAY_ITER_NEXT(fctx->iter, offset, elem, fcinfo->isnull, ! fctx->elmlen, fctx->elmbyval, fctx->elmalign); SRF_RETURN_NEXT(funcctx, elem); } *************** array_replace_internal(ArrayType *array, *** 5992,5998 **** result->ndim = ndim; result->dataoffset = dataoffset; result->elemtype = element_type; ! memcpy(ARR_DIMS(result), ARR_DIMS(array), 2 * ndim * sizeof(int)); if (remove) { --- 6190,6197 ---- result->ndim = ndim; result->dataoffset = dataoffset; result->elemtype = element_type; ! memcpy(ARR_DIMS(result), ARR_DIMS(array), ndim * sizeof(int)); ! memcpy(ARR_LBOUND(result), ARR_LBOUND(array), ndim * sizeof(int)); if (remove) { diff --git a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c index 014eca5..e8af030 100644 *** a/src/backend/utils/adt/datum.c --- b/src/backend/utils/adt/datum.c *************** *** 12,19 **** * *------------------------------------------------------------------------- */ /* ! * In the implementation of the next routines we assume the following: * * A) if a type is "byVal" then all the information is stored in the * Datum itself (i.e. no pointers involved!). In this case the --- 12,20 ---- * *------------------------------------------------------------------------- */ + /* ! * In the implementation of these routines we assume the following: * * A) if a type is "byVal" then all the information is stored in the * Datum itself (i.e. no pointers involved!). In this case the *************** *** 34,44 **** --- 35,49 ---- * * Note that we do not treat "toasted" datums specially; therefore what * will be copied or compared is the compressed data or toast reference. + * An exception is made for datumCopy() of an expanded object, however, + * because most callers expect to get a simple contiguous (and pfree'able) + * result from datumCopy(). 
See also datumTransfer(). */ #include "postgres.h" #include "utils/datum.h" + #include "utils/expandeddatum.h" /*------------------------------------------------------------------------- *************** *** 46,51 **** --- 51,57 ---- * * Find the "real" size of a datum, given the datum value, * whether it is a "by value", and the declared type length. + * (For TOAST pointer datums, this is the size of the pointer datum.) * * This is essentially an out-of-line version of the att_addlength_datum() * macro in access/tupmacs.h. We do a tad more error checking though. *************** datumGetSize(Datum value, bool typByVal, *** 106,114 **** /*------------------------------------------------------------------------- * datumCopy * ! * make a copy of a datum * * If the datatype is pass-by-reference, memory is obtained with palloc(). *------------------------------------------------------------------------- */ Datum --- 112,127 ---- /*------------------------------------------------------------------------- * datumCopy * ! * Make a copy of a non-NULL datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). + * + * If the value is a reference to an expanded object, we flatten into memory + * obtained with palloc(). We need to copy because one of the main uses of + * this function is to copy a datum out of a transient memory context that's + * about to be destroyed, and the expanded object is probably in a child + * context that will also go away. Moreover, many callers assume that the + * result is a single pfree-able chunk. *------------------------------------------------------------------------- */ Datum *************** datumCopy(Datum value, bool typByVal, in *** 118,161 **** if (typByVal) res = value; else { Size realSize; ! char *s; ! ! if (DatumGetPointer(value) == NULL) ! return PointerGetDatum(NULL); realSize = datumGetSize(value, typByVal, typLen); ! s = (char *) palloc(realSize); ! memcpy(s, DatumGetPointer(value), realSize); ! 
res = PointerGetDatum(s); } return res; } /*------------------------------------------------------------------------- ! * datumFree * ! * Free the space occupied by a datum CREATED BY "datumCopy" * ! * NOTE: DO NOT USE THIS ROUTINE with datums returned by heap_getattr() etc. ! * ONLY datums created by "datumCopy" can be freed! *------------------------------------------------------------------------- */ ! #ifdef NOT_USED ! void ! datumFree(Datum value, bool typByVal, int typLen) { ! if (!typByVal) ! { ! Pointer s = DatumGetPointer(value); ! ! pfree(s); ! } } - #endif /*------------------------------------------------------------------------- * datumIsEqual --- 131,201 ---- if (typByVal) res = value; + else if (typLen == -1) + { + /* It is a varlena datatype */ + struct varlena *vl = (struct varlena *) DatumGetPointer(value); + + if (VARATT_IS_EXTERNAL_EXPANDED(vl)) + { + /* Flatten into the caller's memory context */ + ExpandedObjectHeader *eoh = DatumGetEOHP(value); + Size resultsize; + char *resultptr; + + resultsize = EOH_get_flat_size(eoh); + resultptr = (char *) palloc(resultsize); + EOH_flatten_into(eoh, (void *) resultptr, resultsize); + res = PointerGetDatum(resultptr); + } + else + { + /* Otherwise, just copy the varlena datum verbatim */ + Size realSize; + char *resultptr; + + realSize = (Size) VARSIZE_ANY(vl); + resultptr = (char *) palloc(realSize); + memcpy(resultptr, vl, realSize); + res = PointerGetDatum(resultptr); + } + } else { + /* Pass by reference, but not varlena, so not toasted */ Size realSize; ! char *resultptr; realSize = datumGetSize(value, typByVal, typLen); ! resultptr = (char *) palloc(realSize); ! memcpy(resultptr, DatumGetPointer(value), realSize); ! res = PointerGetDatum(resultptr); } return res; } /*------------------------------------------------------------------------- ! * datumTransfer * ! * Transfer a non-NULL datum into the current memory context. * ! * This is equivalent to datumCopy() except when the datum is a read-write ! 
* pointer to an expanded object. In that case we merely reparent the object ! * into the current context, and return its standard R/W pointer (in case the ! * given one is a transient pointer of shorter lifespan). *------------------------------------------------------------------------- */ ! Datum ! datumTransfer(Datum value, bool typByVal, int typLen) { ! if (!typByVal && typLen == -1 && ! VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(value))) ! value = TransferExpandedObject(value, CurrentMemoryContext); ! else ! value = datumCopy(value, typByVal, typLen); ! return value; } /*------------------------------------------------------------------------- * datumIsEqual diff --git a/src/backend/utils/adt/expandeddatum.c b/src/backend/utils/adt/expandeddatum.c index ...039671b . *** a/src/backend/utils/adt/expandeddatum.c --- b/src/backend/utils/adt/expandeddatum.c *************** *** 0 **** --- 1,163 ---- + /*------------------------------------------------------------------------- + * + * expandeddatum.c + * Support functions for "expanded" value representations. + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * + * IDENTIFICATION + * src/backend/utils/adt/expandeddatum.c + * + *------------------------------------------------------------------------- + */ + #include "postgres.h" + + #include "utils/expandeddatum.h" + #include "utils/memutils.h" + + /* + * DatumGetEOHP + * + * Given a Datum that is an expanded-object reference, extract the pointer. + * + * This is a bit tedious since the pointer may not be properly aligned; + * compare VARATT_EXTERNAL_GET_POINTER(). 
+ */ + ExpandedObjectHeader * + DatumGetEOHP(Datum d) + { + varattrib_1b_e *datum = (varattrib_1b_e *) DatumGetPointer(d); + varatt_expanded ptr; + + Assert(VARATT_IS_EXTERNAL_EXPANDED(datum)); + memcpy(&ptr, VARDATA_EXTERNAL(datum), sizeof(ptr)); + Assert(VARATT_IS_EXPANDED_HEADER(ptr.eohptr)); + return ptr.eohptr; + } + + /* + * EOH_init_header + * + * Initialize the common header of an expanded object. + * + * The main thing this encapsulates is initializing the TOAST pointers. + */ + void + EOH_init_header(ExpandedObjectHeader *eohptr, + const ExpandedObjectMethods *methods, + MemoryContext obj_context) + { + varatt_expanded ptr; + + eohptr->vl_len_ = EOH_HEADER_MAGIC; + eohptr->eoh_methods = methods; + eohptr->eoh_context = obj_context; + + ptr.eohptr = eohptr; + + SET_VARTAG_EXTERNAL(eohptr->eoh_rw_ptr, VARTAG_EXPANDED_RW); + memcpy(VARDATA_EXTERNAL(eohptr->eoh_rw_ptr), &ptr, sizeof(ptr)); + + SET_VARTAG_EXTERNAL(eohptr->eoh_ro_ptr, VARTAG_EXPANDED_RO); + memcpy(VARDATA_EXTERNAL(eohptr->eoh_ro_ptr), &ptr, sizeof(ptr)); + } + + /* + * EOH_get_flat_size + * EOH_flatten_into + * + * Convenience functions for invoking the "methods" of an expanded object. + */ + + Size + EOH_get_flat_size(ExpandedObjectHeader *eohptr) + { + return (*eohptr->eoh_methods->get_flat_size) (eohptr); + } + + void + EOH_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size) + { + (*eohptr->eoh_methods->flatten_into) (eohptr, result, allocated_size); + } + + /* + * Does the Datum represent a writable expanded object? + */ + bool + DatumIsReadWriteExpandedObject(Datum d, bool isnull, int16 typlen) + { + /* Reject if it's NULL or not a varlena type */ + if (isnull || typlen != -1) + return false; + + /* Reject if not a read-write expanded-object pointer */ + if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + return false; + + return true; + } + + /* + * If the Datum represents a R/W expanded object, change it to R/O. + * Otherwise return the original Datum. 
+ */ + Datum + MakeExpandedObjectReadOnly(Datum d, bool isnull, int16 typlen) + { + ExpandedObjectHeader *eohptr; + + /* Nothing to do if it's NULL or not a varlena type */ + if (isnull || typlen != -1) + return d; + + /* Nothing to do if not a read-write expanded-object pointer */ + if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))) + return d; + + /* Now safe to extract the object pointer */ + eohptr = DatumGetEOHP(d); + + /* Return the built-in read-only pointer instead of given pointer */ + return EOHPGetRODatum(eohptr); + } + + /* + * Transfer ownership of an expanded object to a new parent memory context. + * The object must be referenced by a R/W pointer, and what we return is + * always its "standard" R/W pointer, which is certain to have the same + * lifespan as the object itself. (The passed-in pointer might not, and + * in any case wouldn't provide a unique identifier if it's not that one.) + */ + Datum + TransferExpandedObject(Datum d, MemoryContext new_parent) + { + ExpandedObjectHeader *eohptr = DatumGetEOHP(d); + + /* Assert caller gave a R/W pointer */ + Assert(VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))); + + /* Transfer ownership */ + MemoryContextSetParent(eohptr->eoh_context, new_parent); + + /* Return the object's standard read-write pointer */ + return EOHPGetRWDatum(eohptr); + } + + /* + * Delete an expanded object (must be referenced by a R/W pointer). 
+ */ + void + DeleteExpandedObject(Datum d) + { + ExpandedObjectHeader *eohptr = DatumGetEOHP(d); + + /* Assert caller gave a R/W pointer */ + Assert(VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d))); + + /* Kill it */ + MemoryContextDelete(eohptr->eoh_context); + } diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c index c42a6b6..34f4e72 100644 *** a/src/backend/utils/mmgr/mcxt.c --- b/src/backend/utils/mmgr/mcxt.c *************** MemoryContextSetParent(MemoryContext con *** 323,328 **** --- 323,332 ---- AssertArg(MemoryContextIsValid(context)); AssertArg(context != new_parent); + /* Fast path if it's got correct parent already */ + if (new_parent == context->parent) + return; + /* Delink from existing parent, if any */ if (context->parent) { diff --git a/src/include/executor/spi.h b/src/include/executor/spi.h index 9e912ba..fbcae0c 100644 *** a/src/include/executor/spi.h --- b/src/include/executor/spi.h *************** extern char *SPI_getnspname(Relation rel *** 124,129 **** --- 124,130 ---- extern void *SPI_palloc(Size size); extern void *SPI_repalloc(void *pointer, Size size); extern void SPI_pfree(void *pointer); + extern Datum SPI_datumTransfer(Datum value, bool typByVal, int typLen); extern void SPI_freetuple(HeapTuple pointer); extern void SPI_freetuptable(SPITupleTable *tuptable); diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h index 48f84bf..00686b0 100644 *** a/src/include/executor/tuptable.h --- b/src/include/executor/tuptable.h *************** extern Datum ExecFetchSlotTupleDatum(Tup *** 163,168 **** --- 163,169 ---- extern HeapTuple ExecMaterializeSlot(TupleTableSlot *slot); extern TupleTableSlot *ExecCopySlot(TupleTableSlot *dstslot, TupleTableSlot *srcslot); + extern TupleTableSlot *ExecMakeSlotContentsReadOnly(TupleTableSlot *slot); /* in access/common/heaptuple.c */ extern Datum slot_getattr(TupleTableSlot *slot, int attnum, bool *isnull); diff --git a/src/include/nodes/primnodes.h 
b/src/include/nodes/primnodes.h index 4f1d234..deaa3c5 100644 *** a/src/include/nodes/primnodes.h --- b/src/include/nodes/primnodes.h *************** typedef struct WindowFunc *** 305,310 **** --- 305,314 ---- * Note: the result datatype is the element type when fetching a single * element; but it is the array type when doing subarray fetch or either * type of store. + * + * Note: for the cases where an array is returned, if refexpr yields a R/W + * expanded array, then the implementation is allowed to modify that object + * in-place and return the same object. * ---------------- */ typedef struct ArrayRef diff --git a/src/include/postgres.h b/src/include/postgres.h index be37313..ccf1605 100644 *** a/src/include/postgres.h --- b/src/include/postgres.h *************** typedef struct varatt_indirect *** 88,93 **** --- 88,110 ---- } varatt_indirect; /* + * struct varatt_expanded is a "TOAST pointer" representing an out-of-line + * Datum that is stored in memory, in some type-specific, not necessarily + * physically contiguous format that is convenient for computation not + * storage. APIs for this, in particular the definition of struct + * ExpandedObjectHeader, are in src/include/utils/expandeddatum.h. + * + * Note that just as for struct varatt_external, this struct is stored + * unaligned within any containing tuple. + */ + typedef struct ExpandedObjectHeader ExpandedObjectHeader; + + typedef struct varatt_expanded + { + ExpandedObjectHeader *eohptr; + } varatt_expanded; + + /* * Type tag for the various sorts of "TOAST pointer" datums. The peculiar * value for VARTAG_ONDISK comes from a requirement for on-disk compatibility * with a previous notion that the tag field was the pointer datum's length. 
*************** typedef struct varatt_indirect *** 95,105 **** --- 112,129 ---- typedef enum vartag_external { VARTAG_INDIRECT = 1, + VARTAG_EXPANDED_RO = 2, + VARTAG_EXPANDED_RW = 3, VARTAG_ONDISK = 18 } vartag_external; + /* this test relies on the specific tag values above */ + #define VARTAG_IS_EXPANDED(tag) \ + (((tag) & ~1) == VARTAG_EXPANDED_RO) + #define VARTAG_SIZE(tag) \ ((tag) == VARTAG_INDIRECT ? sizeof(varatt_indirect) : \ + VARTAG_IS_EXPANDED(tag) ? sizeof(varatt_expanded) : \ (tag) == VARTAG_ONDISK ? sizeof(varatt_external) : \ TrapMacro(true, "unrecognized TOAST vartag")) *************** typedef struct *** 294,299 **** --- 318,329 ---- (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_ONDISK) #define VARATT_IS_EXTERNAL_INDIRECT(PTR) \ (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_INDIRECT) + #define VARATT_IS_EXTERNAL_EXPANDED_RO(PTR) \ + (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_EXPANDED_RO) + #define VARATT_IS_EXTERNAL_EXPANDED_RW(PTR) \ + (VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_EXPANDED_RW) + #define VARATT_IS_EXTERNAL_EXPANDED(PTR) \ + (VARATT_IS_EXTERNAL(PTR) && VARTAG_IS_EXPANDED(VARTAG_EXTERNAL(PTR))) #define VARATT_IS_SHORT(PTR) VARATT_IS_1B(PTR) #define VARATT_IS_EXTENDED(PTR) (!VARATT_IS_4B_U(PTR)) diff --git a/src/include/utils/array.h b/src/include/utils/array.h index 0a488e7..f76443d 100644 *** a/src/include/utils/array.h --- b/src/include/utils/array.h *************** *** 45,50 **** --- 45,55 ---- * We support subscripting on these types, but array_in() and array_out() * only work with varlena arrays. * + * In addition, arrays are a major user of the "expanded object" TOAST + * infrastructure. This allows a varlena array to be converted to a + * separate representation that may include "deconstructed" Datum/isnull + * arrays holding the elements. 
+ * * * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California *************** *** 57,62 **** --- 62,69 ---- #define ARRAY_H #include "fmgr.h" + #include "utils/expandeddatum.h" + /* * Arrays are varlena objects, so must meet the varlena convention that *************** typedef struct *** 75,80 **** --- 82,167 ---- } ArrayType; /* + * An expanded array is contained within a private memory context (as + * all expanded objects must be) and has a control structure as below. + * + * The expanded array might contain a regular "flat" array if that was the + * original input and we've not modified it significantly. Otherwise, the + * contents are represented by Datum/isnull arrays plus dimensionality and + * type information. We could also have both forms, if we've deconstructed + * the original array for access purposes but not yet changed it. For pass- + * by-reference element types, the Datums would point into the flat array in + * this situation. Once we start modifying array elements, new pass-by-ref + * elements are separately palloc'd within the memory context. + */ + #define EA_MAGIC 689375833 /* ID for debugging crosschecks */ + + typedef struct ExpandedArrayHeader + { + /* Standard header for expanded objects */ + ExpandedObjectHeader hdr; + + /* Magic value identifying an expanded array (for debugging only) */ + int ea_magic; + + /* Dimensionality info (always valid) */ + int ndims; /* # of dimensions */ + int *dims; /* array dimensions */ + int *lbound; /* index lower bounds for each dimension */ + + /* Element type info (always valid) */ + Oid element_type; /* element type OID */ + int16 typlen; /* needed info about element datatype */ + bool typbyval; + char typalign; + + /* + * If we have a Datum-array representation of the array, it's kept here; + * else dvalues/dnulls are NULL. 
The dvalues and dnulls arrays are always + * palloc'd within the object private context, but may change size from + * time to time. For pass-by-ref element types, dvalues entries might + * point either into the fstartptr..fendptr area, or to separately + * palloc'd chunks. Elements should always be fully detoasted, as they + * are in the standard flat representation. + * + * Even when dvalues is valid, dnulls can be NULL if there are no null + * elements. + */ + Datum *dvalues; /* array of Datums */ + bool *dnulls; /* array of is-null flags for Datums */ + int dvalueslen; /* allocated length of above arrays */ + int nelems; /* number of valid entries in above arrays */ + + /* + * flat_size is the current space requirement for the flat equivalent of + * the expanded array, if known; otherwise it's 0. We store this to make + * consecutive calls of get_flat_size cheap. + */ + Size flat_size; + + /* + * fvalue points to the flat representation if it is valid, else it is + * NULL. If we have or ever had a flat representation then + * fstartptr/fendptr point to the start and end+1 of its data area; this + * is so that we can tell which Datum pointers point into the flat + * representation rather than being pointers to separately palloc'd data. + */ + ArrayType *fvalue; /* must be a fully detoasted array */ + char *fstartptr; /* start of its data area */ + char *fendptr; /* end+1 of its data area */ + } ExpandedArrayHeader; + + /* + * Functions that can handle either a "flat" varlena array or an expanded + * array use this union to work with their input. + */ + typedef union AnyArrayType + { + ArrayType flt; + ExpandedArrayHeader xpn; + } AnyArrayType; + + /* * working state for accumArrayResult() and friends * note that the input must be scalars (legal array elements) */ *************** typedef struct ArrayMapState *** 151,167 **** /* ArrayIteratorData is private in arrayfuncs.c */ typedef struct ArrayIteratorData *ArrayIterator; ! /* ! * fmgr macros for array objects ! 
*/ #define DatumGetArrayTypeP(X) ((ArrayType *) PG_DETOAST_DATUM(X)) #define DatumGetArrayTypePCopy(X) ((ArrayType *) PG_DETOAST_DATUM_COPY(X)) #define PG_GETARG_ARRAYTYPE_P(n) DatumGetArrayTypeP(PG_GETARG_DATUM(n)) #define PG_GETARG_ARRAYTYPE_P_COPY(n) DatumGetArrayTypePCopy(PG_GETARG_DATUM(n)) #define PG_RETURN_ARRAYTYPE_P(x) PG_RETURN_POINTER(x) /* ! * Access macros for array header fields. * * ARR_DIMS returns a pointer to an array of array dimensions (number of * elements along the various array axes). --- 238,261 ---- /* ArrayIteratorData is private in arrayfuncs.c */ typedef struct ArrayIteratorData *ArrayIterator; ! /* fmgr macros for regular varlena array objects */ #define DatumGetArrayTypeP(X) ((ArrayType *) PG_DETOAST_DATUM(X)) #define DatumGetArrayTypePCopy(X) ((ArrayType *) PG_DETOAST_DATUM_COPY(X)) #define PG_GETARG_ARRAYTYPE_P(n) DatumGetArrayTypeP(PG_GETARG_DATUM(n)) #define PG_GETARG_ARRAYTYPE_P_COPY(n) DatumGetArrayTypePCopy(PG_GETARG_DATUM(n)) #define PG_RETURN_ARRAYTYPE_P(x) PG_RETURN_POINTER(x) + /* fmgr macros for expanded array objects */ + #define PG_GETARG_EXPANDED_ARRAY(n) DatumGetExpandedArray(PG_GETARG_DATUM(n)) + #define PG_GETARG_EXPANDED_ARRAYX(n, metacache) \ + DatumGetExpandedArrayX(PG_GETARG_DATUM(n), metacache) + #define PG_RETURN_EXPANDED_ARRAY(x) PG_RETURN_DATUM(EOHPGetRWDatum(&(x)->hdr)) + + /* fmgr macros for AnyArrayType (ie, get either varlena or expanded form) */ + #define PG_GETARG_ANY_ARRAY(n) DatumGetAnyArray(PG_GETARG_DATUM(n)) + /* ! * Access macros for varlena array header fields. * * ARR_DIMS returns a pointer to an array of array dimensions (number of * elements along the various array axes). *************** typedef struct ArrayIteratorData *ArrayI *** 209,214 **** --- 303,404 ---- #define ARR_DATA_PTR(a) \ (((char *) (a)) + ARR_DATA_OFFSET(a)) + /* + * Macros for working with AnyArrayType inputs. Beware multiple references! + */ + #define AARR_NDIM(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? 
(a)->xpn.ndims : ARR_NDIM(&(a)->flt)) + #define AARR_HASNULL(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? \ + ((a)->xpn.dvalues != NULL ? (a)->xpn.dnulls != NULL : ARR_HASNULL((a)->xpn.fvalue)) : \ + ARR_HASNULL(&(a)->flt)) + #define AARR_ELEMTYPE(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.element_type : ARR_ELEMTYPE(&(a)->flt)) + #define AARR_DIMS(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.dims : ARR_DIMS(&(a)->flt)) + #define AARR_LBOUND(a) \ + (VARATT_IS_EXPANDED_HEADER(a) ? (a)->xpn.lbound : ARR_LBOUND(&(a)->flt)) + + /* + * Macros for iterating through elements of a flat or expanded array. + * Use "ARRAY_ITER ARRAY_ITER_VARS(name);" to declare the local variables + * needed for an iterator (more than one set can be used in the same function, + * if they have different names). + * Use "ARRAY_ITER_SETUP(name, arrayptr);" to prepare to iterate, and + * "ARRAY_ITER_NEXT(name, index, datumvar, isnullvar, ...);" to fetch the + * next element into datumvar/isnullvar. "index" must be the zero-origin + * element number; we make caller provide this since caller is generally + * counting the elements anyway. 
+ */ + #define ARRAY_ITER /* dummy type name to keep pgindent happy */ + + #define ARRAY_ITER_VARS(iter) \ + Datum *iter##datumptr; \ + bool *iter##isnullptr; \ + char *iter##dataptr; \ + bits8 *iter##bitmapptr; \ + int iter##bitmask + + #define ARRAY_ITER_SETUP(iter, arrayptr) \ + do { \ + if (VARATT_IS_EXPANDED_HEADER(arrayptr)) \ + { \ + if ((arrayptr)->xpn.dvalues) \ + { \ + (iter##datumptr) = (arrayptr)->xpn.dvalues; \ + (iter##isnullptr) = (arrayptr)->xpn.dnulls; \ + (iter##dataptr) = NULL; \ + (iter##bitmapptr) = NULL; \ + } \ + else \ + { \ + (iter##datumptr) = NULL; \ + (iter##isnullptr) = NULL; \ + (iter##dataptr) = ARR_DATA_PTR((arrayptr)->xpn.fvalue); \ + (iter##bitmapptr) = ARR_NULLBITMAP((arrayptr)->xpn.fvalue); \ + } \ + } \ + else \ + { \ + (iter##datumptr) = NULL; \ + (iter##isnullptr) = NULL; \ + (iter##dataptr) = ARR_DATA_PTR(&(arrayptr)->flt); \ + (iter##bitmapptr) = ARR_NULLBITMAP(&(arrayptr)->flt); \ + } \ + (iter##bitmask) = 1; \ + } while (0) + + #define ARRAY_ITER_NEXT(iter,i, datumvar,isnullvar, elmlen,elmbyval,elmalign) \ + do { \ + if (iter##datumptr) \ + { \ + (datumvar) = (iter##datumptr)[i]; \ + (isnullvar) = (iter##isnullptr) ? 
(iter##isnullptr)[i] : false; \ + } \ + else \ + { \ + if ((iter##bitmapptr) && (*(iter##bitmapptr) & (iter##bitmask)) == 0) \ + { \ + (isnullvar) = true; \ + (datumvar) = (Datum) 0; \ + } \ + else \ + { \ + (isnullvar) = false; \ + (datumvar) = fetch_att(iter##dataptr, elmbyval, elmlen); \ + (iter##dataptr) = att_addlength_pointer(iter##dataptr, elmlen, iter##dataptr); \ + (iter##dataptr) = (char *) att_align_nominal(iter##dataptr, elmalign); \ + } \ + (iter##bitmask) <<= 1; \ + if ((iter##bitmask) == 0x100) \ + { \ + if (iter##bitmapptr) \ + (iter##bitmapptr)++; \ + (iter##bitmask) = 1; \ + } \ + } \ + } while (0) + /* * GUC parameter *************** extern Datum array_remove(PG_FUNCTION_AR *** 250,255 **** --- 440,454 ---- extern Datum array_replace(PG_FUNCTION_ARGS); extern Datum width_bucket_array(PG_FUNCTION_ARGS); + extern void CopyArrayEls(ArrayType *array, + Datum *values, + bool *nulls, + int nitems, + int typlen, + bool typbyval, + char typalign, + bool freedata); + extern Datum array_get_element(Datum arraydatum, int nSubscripts, int *indx, int arraytyplen, int elmlen, bool elmbyval, char elmalign, bool *isNull); *************** extern ArrayType *array_set(ArrayType *a *** 271,277 **** Datum dataValue, bool isNull, int arraytyplen, int elmlen, bool elmbyval, char elmalign); ! extern Datum array_map(FunctionCallInfo fcinfo, Oid inpType, Oid retType, ArrayMapState *amstate); extern void array_bitmap_copy(bits8 *destbitmap, int destoffset, --- 470,476 ---- Datum dataValue, bool isNull, int arraytyplen, int elmlen, bool elmbyval, char elmalign); ! 
extern Datum array_map(FunctionCallInfo fcinfo, Oid retType, ArrayMapState *amstate); extern void array_bitmap_copy(bits8 *destbitmap, int destoffset, *************** extern ArrayType *construct_md_array(Dat *** 288,293 **** --- 487,495 ---- int *lbs, Oid elmtype, int elmlen, bool elmbyval, char elmalign); extern ArrayType *construct_empty_array(Oid elmtype); + extern ExpandedArrayHeader *construct_empty_expanded_array(Oid element_type, + MemoryContext parentcontext, + ArrayMetaState *metacache); extern void deconstruct_array(ArrayType *array, Oid elmtype, int elmlen, bool elmbyval, char elmalign, *************** extern int mda_next_tuple(int n, int *cu *** 341,346 **** --- 543,559 ---- extern int32 *ArrayGetIntegerTypmods(ArrayType *arr, int *n); /* + * prototypes for functions defined in array_expanded.c + */ + extern Datum expand_array(Datum arraydatum, MemoryContext parentcontext, + ArrayMetaState *metacache); + extern ExpandedArrayHeader *DatumGetExpandedArray(Datum d); + extern ExpandedArrayHeader *DatumGetExpandedArrayX(Datum d, + ArrayMetaState *metacache); + extern AnyArrayType *DatumGetAnyArray(Datum d); + extern void deconstruct_expanded_array(ExpandedArrayHeader *eah); + + /* * prototypes for functions defined in array_userfuncs.c */ extern Datum array_append(PG_FUNCTION_ARGS); diff --git a/src/include/utils/datum.h b/src/include/utils/datum.h index 663414b..c572f79 100644 *** a/src/include/utils/datum.h --- b/src/include/utils/datum.h *************** *** 24,41 **** extern Size datumGetSize(Datum value, bool typByVal, int typLen); /* ! * datumCopy - make a copy of a datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). */ extern Datum datumCopy(Datum value, bool typByVal, int typLen); /* ! * datumFree - free a datum previously allocated by datumCopy, if any. * ! * Does nothing if datatype is pass-by-value. */ ! 
extern void datumFree(Datum value, bool typByVal, int typLen); /* * datumIsEqual --- 24,41 ---- extern Size datumGetSize(Datum value, bool typByVal, int typLen); /* ! * datumCopy - make a copy of a non-NULL datum. * * If the datatype is pass-by-reference, memory is obtained with palloc(). */ extern Datum datumCopy(Datum value, bool typByVal, int typLen); /* ! * datumTransfer - transfer a non-NULL datum into the current memory context. * ! * Differs from datumCopy() in its handling of read-write expanded objects. */ ! extern Datum datumTransfer(Datum value, bool typByVal, int typLen); /* * datumIsEqual diff --git a/src/include/utils/expandeddatum.h b/src/include/utils/expandeddatum.h index ...3a8336e . *** a/src/include/utils/expandeddatum.h --- b/src/include/utils/expandeddatum.h *************** *** 0 **** --- 1,148 ---- + /*------------------------------------------------------------------------- + * + * expandeddatum.h + * Declarations for access to "expanded" value representations. + * + * Complex data types, particularly container types such as arrays and + * records, usually have on-disk representations that are compact but not + * especially convenient to modify. What's more, when we do modify them, + * having to recopy all the rest of the value can be extremely inefficient. + * Therefore, we provide a notion of an "expanded" representation that is used + * only in memory and is optimized more for computation than storage. + * The format appearing on disk is called the data type's "flattened" + * representation, since it is required to be a contiguous blob of bytes -- + * but the type can have an expanded representation that is not. Data types + * must provide means to translate an expanded representation back to + * flattened form. + * + * An expanded object is meant to survive across multiple operations, but + * not to be enormously long-lived; for example it might be a local variable + * in a PL/pgSQL procedure. 
So its extra bulk compared to the on-disk format + * is a worthwhile trade-off. + * + * References to expanded objects are a type of TOAST pointer. + * Because of longstanding conventions in Postgres, this means that the + * flattened form of such an object must always be a varlena object. + * Fortunately that's no restriction in practice. + * + * There are actually two kinds of TOAST pointers for expanded objects: + * read-only and read-write pointers. Possession of one of the latter + * authorizes a function to modify the value in-place rather than copying it + * as would normally be required. Functions should always return a read-write + * pointer to any new expanded object they create. Functions that modify an + * argument value in-place must take care that they do not corrupt the old + * value if they fail partway through. + * + * + * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * src/include/utils/expandeddatum.h + * + *------------------------------------------------------------------------- + */ + #ifndef EXPANDEDDATUM_H + #define EXPANDEDDATUM_H + + /* Size of an EXTERNAL datum that contains a pointer to an expanded object */ + #define EXPANDED_POINTER_SIZE (VARHDRSZ_EXTERNAL + sizeof(varatt_expanded)) + + /* + * "Methods" that must be provided for any expanded object. + * + * get_flat_size: compute space needed for flattened representation (which + * must be a valid in-line, non-compressed, 4-byte-header varlena object). + * + * flatten_into: construct flattened representation in the caller-allocated + * space at *result, of size allocated_size (which will always be the result + * of a preceding get_flat_size call; it's passed for cross-checking). + * + * Note: construction of a heap tuple from an expanded datum calls + * get_flat_size twice, so it's worthwhile to make sure that that doesn't + * incur too much overhead. 
+ */ + typedef Size (*EOM_get_flat_size_method) (ExpandedObjectHeader *eohptr); + typedef void (*EOM_flatten_into_method) (ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + + /* Struct of function pointers for an expanded object's methods */ + typedef struct ExpandedObjectMethods + { + EOM_get_flat_size_method get_flat_size; + EOM_flatten_into_method flatten_into; + } ExpandedObjectMethods; + + /* + * Every expanded object must contain this header; typically the header + * is embedded in some larger struct that adds type-specific fields. + * + * It is presumed that the header object and all subsidiary data are stored + * in eoh_context, so that the object can be freed by deleting that context, + * or its storage lifespan can be altered by reparenting the context. + * (In principle the object could own additional resources, such as malloc'd + * storage, and use a memory context reset callback to free them upon reset or + * deletion of eoh_context.) + * + * We set up two TOAST pointers within the standard header, one read-write + * and one read-only. This allows functions to return either kind of pointer + * without making an additional allocation, and in particular without worrying + * whether a separately palloc'd object would have sufficient lifespan. + * But note that these pointers are just a convenience; a pointer object + * appearing somewhere else would still be legal. + * + * The typedef declaration for this appears in postgres.h. 
+ */ + struct ExpandedObjectHeader + { + /* Phony varlena header */ + int32 vl_len_; /* always EOH_HEADER_MAGIC, see below */ + + /* Pointer to methods required for object type */ + const ExpandedObjectMethods *eoh_methods; + + /* Memory context containing this header and subsidiary data */ + MemoryContext eoh_context; + + /* Standard R/W TOAST pointer for this object is kept here */ + char eoh_rw_ptr[EXPANDED_POINTER_SIZE]; + + /* Standard R/O TOAST pointer for this object is kept here */ + char eoh_ro_ptr[EXPANDED_POINTER_SIZE]; + }; + + /* + * Particularly for read-only functions, it is handy to be able to work with + * either regular "flat" varlena inputs or expanded inputs of the same data + * type. To allow determining which case an argument-fetching function has + * returned, the first int32 of an ExpandedObjectHeader always contains -1 + * (EOH_HEADER_MAGIC to the code). This works since no 4-byte-header varlena + * could have that as its first 4 bytes. Caution: we could not reliably tell + * the difference between an ExpandedObjectHeader and a short-header object + * with this trick. However, it works fine if the argument fetching code + * always returns either a 4-byte-header flat object or an expanded object. + */ + #define EOH_HEADER_MAGIC (-1) + #define VARATT_IS_EXPANDED_HEADER(PTR) \ + (((ExpandedObjectHeader *) (PTR))->vl_len_ == EOH_HEADER_MAGIC) + + /* + * Generic support functions for expanded objects. + * (More of these might be worth inlining later.) 
+ */ + + #define EOHPGetRWDatum(eohptr) PointerGetDatum((eohptr)->eoh_rw_ptr) + #define EOHPGetRODatum(eohptr) PointerGetDatum((eohptr)->eoh_ro_ptr) + + extern ExpandedObjectHeader *DatumGetEOHP(Datum d); + extern void EOH_init_header(ExpandedObjectHeader *eohptr, + const ExpandedObjectMethods *methods, + MemoryContext obj_context); + extern Size EOH_get_flat_size(ExpandedObjectHeader *eohptr); + extern void EOH_flatten_into(ExpandedObjectHeader *eohptr, + void *result, Size allocated_size); + extern bool DatumIsReadWriteExpandedObject(Datum d, bool isnull, int16 typlen); + extern Datum MakeExpandedObjectReadOnly(Datum d, bool isnull, int16 typlen); + extern Datum TransferExpandedObject(Datum d, MemoryContext new_parent); + extern void DeleteExpandedObject(Datum d); + + #endif /* EXPANDEDDATUM_H */ diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c index 650cc48..0ff2086 100644 *** a/src/pl/plpgsql/src/pl_comp.c --- b/src/pl/plpgsql/src/pl_comp.c *************** build_datatype(HeapTuple typeTup, int32 *** 2200,2205 **** --- 2200,2221 ---- typ->collation = typeStruct->typcollation; if (OidIsValid(collation) && OidIsValid(typ->collation)) typ->collation = collation; + /* Detect if type is true array, or domain thereof */ + /* NB: this is only used to decide whether to apply expand_array */ + if (typeStruct->typtype == TYPTYPE_BASE) + { + /* this test should match what get_element_type() checks */ + typ->typisarray = (typeStruct->typlen == -1 && + OidIsValid(typeStruct->typelem)); + } + else if (typeStruct->typtype == TYPTYPE_DOMAIN) + { + /* we can short-circuit looking up base types if it's not varlena */ + typ->typisarray = (typeStruct->typlen == -1 && + OidIsValid(get_base_element_type(typeStruct->typbasetype))); + } + else + typ->typisarray = false; typ->atttypmod = typmod; return typ; diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index deefb1f..aac7cda 100644 *** a/src/pl/plpgsql/src/pl_exec.c --- 
b/src/pl/plpgsql/src/pl_exec.c *************** *** 34,39 **** --- 34,40 ---- #include "utils/array.h" #include "utils/builtins.h" #include "utils/datum.h" + #include "utils/fmgroids.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/rel.h" *************** static void exec_prepare_plan(PLpgSQL_ex *** 173,178 **** --- 174,181 ---- static bool exec_simple_check_node(Node *node); static void exec_simple_check_plan(PLpgSQL_expr *expr); static void exec_simple_recheck_plan(PLpgSQL_expr *expr, CachedPlan *cplan); + static void exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno); + static bool contains_target_param(Node *node, int *target_dno); static bool exec_eval_simple_expr(PLpgSQL_execstate *estate, PLpgSQL_expr *expr, Datum *result, *************** plpgsql_exec_function(PLpgSQL_function * *** 312,317 **** --- 315,358 ---- var->value = fcinfo->arg[i]; var->isnull = fcinfo->argnull[i]; var->freeval = false; + + /* + * Force any array-valued parameter to be stored in + * expanded form in our local variable, in hopes of + * improving efficiency of uses of the variable. (This is + * a hack, really: why only arrays? Need more thought + * about which cases are likely to win. See also + * typisarray-specific heuristic in exec_assign_value.) + * + * Special cases: If passed a R/W expanded pointer, assume + * we can commandeer the object rather than having to copy + * it. If passed a R/O expanded pointer, just keep it as + * the value of the variable for the moment. (We'll force + * it to R/W if the variable gets modified, but that may + * very well never happen.) 
+ */ + if (!var->isnull && var->datatype->typisarray) + { + if (VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(var->value))) + { + /* take ownership of R/W object */ + var->value = TransferExpandedObject(var->value, + CurrentMemoryContext); + var->freeval = true; + } + else if (VARATT_IS_EXTERNAL_EXPANDED_RO(DatumGetPointer(var->value))) + { + /* R/O pointer, keep it as-is until assigned to */ + } + else + { + /* flat array, so force to expanded form */ + var->value = expand_array(var->value, + CurrentMemoryContext, + NULL); + var->freeval = true; + } + } } break; *************** plpgsql_exec_function(PLpgSQL_function * *** 477,494 **** /* * If the function's return type isn't by value, copy the value ! * into upper executor memory context. */ if (!fcinfo->isnull && !func->fn_retbyval) ! { ! Size len; ! void *tmp; ! ! len = datumGetSize(estate.retval, false, func->fn_rettyplen); ! tmp = SPI_palloc(len); ! memcpy(tmp, DatumGetPointer(estate.retval), len); ! estate.retval = PointerGetDatum(tmp); ! } } } --- 518,531 ---- /* * If the function's return type isn't by value, copy the value ! * into upper executor memory context. However, if we have a R/W ! * expanded datum, we can just transfer its ownership out to the ! * upper executor context. */ if (!fcinfo->isnull && !func->fn_retbyval) ! estate.retval = SPI_datumTransfer(estate.retval, ! false, ! func->fn_rettyplen); } } *************** exec_stmt_return(PLpgSQL_execstate *esta *** 2476,2481 **** --- 2513,2525 ---- * Special case path when the RETURN expression is a simple variable * reference; in particular, this path is always taken in functions with * one or more OUT parameters. + * + * This special case is especially efficient for returning variables that + * have R/W expanded values: we can put the R/W pointer directly into + * estate->retval, leading to transferring the value to the caller's + * context cheaply. If we went through exec_eval_expr we'd end up with a + * R/O pointer. 
It's okay to skip MakeExpandedObjectReadOnly here since + * we know we won't need the variable's value within the function anymore. */ if (stmt->retvarno >= 0) { *************** exec_stmt_return_next(PLpgSQL_execstate *** 2604,2609 **** --- 2648,2658 ---- * Special case path when the RETURN NEXT expression is a simple variable * reference; in particular, this path is always taken in functions with * one or more OUT parameters. + * + * Unlike exec_statement_return, there's no special win here for R/W + * expanded values, since they'll have to get flattened to go into the + * tuplestore. Indeed, we'd better make them R/O to avoid any risk of the + * casting step changing them in-place. */ if (stmt->retvarno >= 0) { *************** exec_stmt_return_next(PLpgSQL_execstate *** 2622,2627 **** --- 2671,2681 ---- (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("wrong result type supplied in RETURN NEXT"))); + /* let's be very paranoid about the cast step */ + retval = MakeExpandedObjectReadOnly(retval, + isNull, + var->datatype->typlen); + /* coerce type if needed */ retval = exec_cast_value(estate, retval, *************** exec_prepare_plan(PLpgSQL_execstate *est *** 3333,3338 **** --- 3387,3399 ---- /* Check to see if it's a simple expression */ exec_simple_check_plan(expr); + + /* + * Mark expression as not using a read-write param. exec_assign_value has + * to take steps to override this if appropriate; that seems cleaner than + * adding parameters to all other callers. + */ + expr->rwparam = -1; } *************** exec_assign_expr(PLpgSQL_execstate *esta *** 4071,4076 **** --- 4132,4150 ---- Oid valtype; int32 valtypmod; + /* + * If first time through, create a plan for this expression, and then see + * if we can pass the target variable as a read-write parameter to the + * expression. (This is a bit messy, but it seems cleaner than modifying + * the API of exec_eval_expr for the purpose.) 
+ */ + if (expr->plan == NULL) + { + exec_prepare_plan(estate, expr, 0); + if (target->dtype == PLPGSQL_DTYPE_VAR) + exec_check_rw_parameter(expr, target->dno); + } + value = exec_eval_expr(estate, expr, &isnull, &valtype, &valtypmod); exec_assign_value(estate, target, value, isnull, valtype, valtypmod); exec_eval_cleanup(estate); *************** exec_assign_value(PLpgSQL_execstate *est *** 4140,4165 **** /* * If type is by-reference, copy the new value (which is * probably in the eval_econtext) into the procedure's memory ! * context. */ if (!var->datatype->typbyval && !isNull) ! newvalue = datumCopy(newvalue, ! false, ! var->datatype->typlen); /* ! * Now free the old value. (We can't do this any earlier ! * because of the possibility that we are assigning the var's ! * old value to it, eg "foo := foo". We could optimize out ! * the assignment altogether in such cases, but it's too ! * infrequent to be worth testing for.) */ ! free_var(var); var->value = newvalue; var->isnull = isNull; ! if (!var->datatype->typbyval && !isNull) ! var->freeval = true; break; } --- 4214,4264 ---- /* * If type is by-reference, copy the new value (which is * probably in the eval_econtext) into the procedure's memory ! * context. But if it's a read/write reference to an expanded ! * object, no physical copy needs to happen; at most we need ! * to reparent the object's memory context. ! * ! * If it's an array, we force the value to be stored in R/W ! * expanded form. This wins if the function later does, say, ! * a lot of array subscripting operations on the variable, and ! * otherwise might lose. We might need to use a different ! * heuristic, but it's too soon to tell. Also, are there ! * cases where it'd be useful to force non-array values into ! * expanded form? */ if (!var->datatype->typbyval && !isNull) ! { ! if (var->datatype->typisarray && ! !VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(newvalue))) ! { ! /* array and not already R/W, so apply expand_array */ ! 
newvalue = expand_array(newvalue, ! CurrentMemoryContext, ! NULL); ! } ! else ! { ! /* else transfer value if R/W, else just datumCopy */ ! newvalue = datumTransfer(newvalue, ! false, ! var->datatype->typlen); ! } ! } /* ! * Now free the old value, unless it's the same as the new ! * value (ie, we're doing "foo := foo"). Note that for ! * expanded objects, this test is necessary and cannot ! * reliably be made any earlier; we have to be looking at the ! * object's standard R/W pointer to be sure pointer equality ! * is meaningful. */ ! if (var->value != newvalue || var->isnull || isNull) ! free_var(var); var->value = newvalue; var->isnull = isNull; ! var->freeval = (!var->datatype->typbyval && !isNull); break; } *************** exec_assign_value(PLpgSQL_execstate *est *** 4505,4514 **** * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: caller must not modify the returned value, since it points right ! * at the stored value in the case of pass-by-reference datatypes. In some ! * cases we have to palloc a return value, and in such cases we put it into ! * the estate's short-term memory context. */ static void exec_eval_datum(PLpgSQL_execstate *estate, --- 4604,4617 ---- * * At present this doesn't handle PLpgSQL_expr or PLpgSQL_arrayelem datums. * ! * NOTE: the returned Datum points right at the stored value in the case of ! * pass-by-reference datatypes. Generally callers should take care not to ! * modify the stored value. Some callers intentionally manipulate variables ! * referenced by R/W expanded pointers, though; it is those callers' ! * responsibility that the results are semantically OK. ! * ! * In some cases we have to palloc a return value, and in such cases we put ! * it into the estate's short-term memory context. */ static void exec_eval_datum(PLpgSQL_execstate *estate, *************** exec_eval_simple_expr(PLpgSQL_execstate *** 5216,5221 **** --- 5319,5327 ---- { /* It got replanned ... is it still simple? 
*/ exec_simple_recheck_plan(expr, cplan); + /* better recheck r/w safety, as well */ + if (expr->rwparam >= 0) + exec_check_rw_parameter(expr, expr->rwparam); if (expr->expr_simple_expr == NULL) { /* Ooops, release refcount and fail */ *************** setup_param_list(PLpgSQL_execstate *esta *** 5362,5368 **** */ MemSet(paramLI->params, 0, estate->ndatums * sizeof(ParamExternData)); ! /* Instantiate values for "safe" parameters of the expression */ dno = -1; while ((dno = bms_next_member(expr->paramnos, dno)) >= 0) { --- 5468,5480 ---- */ MemSet(paramLI->params, 0, estate->ndatums * sizeof(ParamExternData)); ! /* ! * Instantiate values for "safe" parameters of the expression. One of ! * them might be the variable the expression result will be assigned ! * to, in which case we can pass the variable's value as-is even if ! * it's a read-write expanded object; otherwise, convert read-write ! * pointers to read-only pointers for safety. ! */ dno = -1; while ((dno = bms_next_member(expr->paramnos, dno)) >= 0) { *************** setup_param_list(PLpgSQL_execstate *esta *** 5373,5379 **** PLpgSQL_var *var = (PLpgSQL_var *) datum; ParamExternData *prm = &paramLI->params[dno]; ! prm->value = var->value; prm->isnull = var->isnull; prm->pflags = PARAM_FLAG_CONST; prm->ptype = var->datatype->typoid; --- 5485,5496 ---- PLpgSQL_var *var = (PLpgSQL_var *) datum; ParamExternData *prm = &paramLI->params[dno]; ! if (dno == expr->rwparam) ! prm->value = var->value; ! else ! prm->value = MakeExpandedObjectReadOnly(var->value, ! var->isnull, ! var->datatype->typlen); prm->isnull = var->isnull; prm->pflags = PARAM_FLAG_CONST; prm->ptype = var->datatype->typoid; *************** plpgsql_param_fetch(ParamListInfo params *** 5442,5447 **** --- 5559,5573 ---- exec_eval_datum(estate, datum, &prm->ptype, &prmtypmod, &prm->value, &prm->isnull); + + /* + * If it's a read/write expanded datum, convert reference to read-only, + * unless it's safe to pass as read-write.
+ */ + if (datum->dtype == PLPGSQL_DTYPE_VAR && dno != expr->rwparam) + prm->value = MakeExpandedObjectReadOnly(prm->value, + prm->isnull, + ((PLpgSQL_var *) datum)->datatype->typlen); } *************** exec_simple_recheck_plan(PLpgSQL_expr *e *** 6384,6389 **** --- 6510,6622 ---- expr->expr_simple_typmod = exprTypmod((Node *) tle->expr); } + /* + * exec_check_rw_parameter --- can we pass expanded object as read/write param? + * + * If we have an assignment like "x := array_append(x, foo)" in which the + * top-level function is trusted not to corrupt its argument in case of an + * error, then when x has an expanded object as value, it is safe to pass the + * value as a read/write pointer and let the function modify the value + * in-place. + * + * This function checks for a safe expression, and sets expr->rwparam to the + * dno of the target variable (x) if safe, or -1 if not safe. + */ + static void + exec_check_rw_parameter(PLpgSQL_expr *expr, int target_dno) + { + Oid funcid; + List *fargs; + ListCell *lc; + + /* Assume unsafe */ + expr->rwparam = -1; + + /* + * If the expression isn't simple, there's no point in trying to optimize + * (because the exec_run_select code path will flatten any expanded result + * anyway). Even without that, this seems like a good safety restriction. + */ + if (expr->expr_simple_expr == NULL) + return; + + /* + * If target variable isn't referenced by expression, no need to look + * further. + */ + if (!bms_is_member(target_dno, expr->paramnos)) + return; + + /* + * Top level of expression must be a simple FuncExpr or OpExpr. 
+ */ + if (IsA(expr->expr_simple_expr, FuncExpr)) + { + FuncExpr *fexpr = (FuncExpr *) expr->expr_simple_expr; + + funcid = fexpr->funcid; + fargs = fexpr->args; + } + else if (IsA(expr->expr_simple_expr, OpExpr)) + { + OpExpr *opexpr = (OpExpr *) expr->expr_simple_expr; + + funcid = opexpr->opfuncid; + fargs = opexpr->args; + } + else + return; + + /* + * The top-level function must be one that we trust to be "safe". + * Currently we hard-wire the list, but it would be very desirable to + * allow extensions to mark their functions as safe ... + */ + if (!(funcid == F_ARRAY_APPEND || + funcid == F_ARRAY_PREPEND)) + return; + + /* + * The target variable (in the form of a Param) must only appear as a + * direct argument of the top-level function. + */ + foreach(lc, fargs) + { + Node *arg = (Node *) lfirst(lc); + + /* A Param is OK, whether it's the target variable or not */ + if (arg && IsA(arg, Param)) + continue; + /* Otherwise, argument expression must not reference target */ + if (contains_target_param(arg, &target_dno)) + return; + } + + /* OK, we can pass target as a read-write parameter */ + expr->rwparam = target_dno; + } + + /* + * Recursively check for a Param referencing the target variable + */ + static bool + contains_target_param(Node *node, int *target_dno) + { + if (node == NULL) + return false; + if (IsA(node, Param)) + { + Param *param = (Param *) node; + + if (param->paramkind == PARAM_EXTERN && + param->paramid == *target_dno + 1) + return true; + return false; + } + return expression_tree_walker(node, contains_target_param, + (void *) target_dno); + } + /* ---------- * exec_set_found Set the global found variable to true/false * ---------- *************** free_var(PLpgSQL_var *var) *** 6540,6546 **** { if (var->freeval) { ! pfree(DatumGetPointer(var->value)); var->freeval = false; } } --- 6773,6784 ---- { if (var->freeval) { ! if (DatumIsReadWriteExpandedObject(var->value, ! var->isnull, ! var->datatype->typlen)) ! 
DeleteExpandedObject(var->value); ! else ! pfree(DatumGetPointer(var->value)); var->freeval = false; } } *************** format_expr_params(PLpgSQL_execstate *es *** 6750,6757 **** curvar = (PLpgSQL_var *) estate->datums[dno]; ! exec_eval_datum(estate, (PLpgSQL_datum *) curvar, &paramtypeid, ! &paramtypmod, &paramdatum, &paramisnull); appendStringInfo(&paramstr, "%s%s = ", paramno > 0 ? ", " : "", --- 6988,6996 ---- curvar = (PLpgSQL_var *) estate->datums[dno]; ! exec_eval_datum(estate, (PLpgSQL_datum *) curvar, ! &paramtypeid, &paramtypmod, ! &paramdatum, &paramisnull); appendStringInfo(&paramstr, "%s%s = ", paramno > 0 ? ", " : "", diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y index 4026e41..0097890 100644 *** a/src/pl/plpgsql/src/pl_gram.y --- b/src/pl/plpgsql/src/pl_gram.y *************** read_sql_construct(int until, *** 2625,2630 **** --- 2625,2631 ---- expr->query = pstrdup(ds.data); expr->plan = NULL; expr->paramnos = NULL; + expr->rwparam = -1; expr->ns = plpgsql_ns_top(); pfree(ds.data); *************** make_execsql_stmt(int firsttoken, int lo *** 2849,2854 **** --- 2850,2856 ---- expr->query = pstrdup(ds.data); expr->plan = NULL; expr->paramnos = NULL; + expr->rwparam = -1; expr->ns = plpgsql_ns_top(); pfree(ds.data); *************** read_cursor_args(PLpgSQL_var *cursor, in *** 3732,3737 **** --- 3734,3740 ---- expr->query = pstrdup(ds.data); expr->plan = NULL; expr->paramnos = NULL; + expr->rwparam = -1; expr->ns = plpgsql_ns_top(); pfree(ds.data); diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h index bec773a..93c2504 100644 *** a/src/pl/plpgsql/src/plpgsql.h --- b/src/pl/plpgsql/src/plpgsql.h *************** typedef struct *** 183,188 **** --- 183,189 ---- char typtype; Oid typrelid; Oid collation; /* from pg_type, but can be overridden */ + bool typisarray; /* is "true" array, or domain over one */ int32 atttypmod; /* typmod (taken from someplace else) */ } PLpgSQL_type; *************** typedef struct PLpgSQL_expr *** 216,221 **** ---
217,223 ---- char *query; SPIPlanPtr plan; Bitmapset *paramnos; /* all dnos referenced by this query */ + int rwparam; /* dno of read/write param, or -1 if none */ /* function containing this expr (not set until we first parse query) */ struct PLpgSQL_function *func;
2015-05-06 0:50 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
I wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> Significant slowdown is on following test:
>> do $$ declare a int[] := '{}'; begin for i in 1..90000 loop a := a || 10;
>> end loop; end$$ language plpgsql;
>> do $$ declare a numeric[] := '{}'; begin for i in 1..90000 loop a := a ||
>> 10.1; end loop; end$$ language plpgsql;
>> integer master 14sec x patched 55sec
>> numeric master 43sec x patched 108sec
>> It is probably worst case - and it is known plpgsql antipattern
> Yeah, I have not expended a great deal of effort on the array_append/
> array_prepend/array_cat code paths. Still, in these plpgsql cases,
> we should in principle have gotten down from two array copies per loop to
> one, so it's disappointing to not have better results there, even granting
> that the new "copy" step is not just a byte-by-byte copy. Let me see if
> there's anything simple to be done about that.
The attached updated patch reduces both of those do-loop tests to about
60 msec on my machine. It contains two improvements over the 1.1 patch:
1. There's a fast path for copying an expanded array to another expanded
array when the element type is pass-by-value: we can just memcpy the
Datum array instead of working element-by-element. In isolation, that
change made the patch a little faster than 9.4 on your int-array case,
though of course it doesn't help for the numeric-array case (and I do not
see a way to avoid working element-by-element for pass-by-ref cases).
2. pl/pgsql now detects cases like "a := a || x" and allows the array "a"
to be passed as a read-write pointer to array_append, so that array_append
can modify expanded arrays in-place and avoid inessential data copying
altogether. (The earlier patch had made array_append and array_prepend
safe for this usage, but there wasn't actually any way to invoke them
with read-write pointers.) I had speculated about doing this in my
earliest discussion of this patch, but there was no code for it before.
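The two changes above can be sketched in standalone C. This is only an illustration, not code from the patch: ToyArray, toy_copy and toy_append are hypothetical names, intptr_t stands in for a pass-by-value Datum, and the doubling growth policy is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an expanded array of pass-by-value elements. */
typedef struct ToyArray
{
    intptr_t   *elems;      /* element values, playing the role of Datums */
    size_t      nelems;     /* slots in use */
    size_t      cap;        /* slots allocated */
} ToyArray;

/*
 * Change #1 in miniature: pass-by-value elements can be copied with one
 * memcpy instead of an element-by-element loop.
 */
static bool
toy_copy(ToyArray *dst, const ToyArray *src)
{
    dst->elems = NULL;
    dst->nelems = dst->cap = 0;
    if (src->nelems > 0)
    {
        dst->elems = malloc(src->nelems * sizeof(intptr_t));
        if (dst->elems == NULL)
            return false;
        memcpy(dst->elems, src->elems, src->nelems * sizeof(intptr_t));
    }
    dst->nelems = dst->cap = src->nelems;
    return true;
}

/*
 * Change #2 in miniature: with read-write access the callee appends in
 * place, so a loop of N appends no longer copies the whole array N times.
 * On allocation failure the array is left exactly as it was.
 */
static bool
toy_append(ToyArray *a, intptr_t value)
{
    assert(a->nelems <= a->cap);
    if (a->nelems == a->cap)
    {
        size_t      newcap = a->cap ? a->cap * 2 : 8;   /* assumed policy */
        intptr_t   *p = realloc(a->elems, newcap * sizeof(intptr_t));

        if (p == NULL)
            return false;
        a->elems = p;
        a->cap = newcap;
    }
    a->elems[a->nelems++] = value;
    return true;
}
```

Calling toy_append 90000 times on one ToyArray mirrors the shape of the "a := a || x" loop once the variable is passed read-write; toy_copy is the case that still applies when a copy cannot be avoided.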
The key question for change #2 is how do we identify what is a "safe"
top-level function that can be trusted not to corrupt the read-write value
if it fails partway through. I did not have a good answer before, and
I still don't; what this version of the patch does is to hard-wire
array_append and array_prepend as the functions considered safe.
Obviously that is crying out for improvement, but we can leave that
question for later; at least now we have infrastructure that makes it
possible to do it.
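As a rough model of the hard-wired safety test described above, the following standalone C sketch walks a toy expression tree. None of these names come from PostgreSQL: ToyNode, the FN_* ids and both functions are hypothetical, and the sketch covers only the two rules visible in the patch hunk (the top-level function must be on the safe list, and the target may appear only as a direct argument).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { NODE_CONST, NODE_PARAM, NODE_CALL } ToyNodeKind;

typedef struct ToyNode
{
    ToyNodeKind kind;
    int         id;                 /* PARAM: variable number; CALL: function id */
    const struct ToyNode *args[2];  /* CALL: up to two arguments, NULL if unused */
} ToyNode;

/* Hypothetical function ids; only the first two are treated as "safe". */
enum { FN_APPEND = 1, FN_PREPEND = 2, FN_CAT = 3, FN_BUILD_ARRAY = 4 };

/* Does the subtree reference the target variable anywhere? */
static bool
toy_contains_param(const ToyNode *node, int target)
{
    if (node == NULL)
        return false;
    if (node->kind == NODE_PARAM)
        return node->id == target;
    if (node->kind == NODE_CALL)
        return toy_contains_param(node->args[0], target) ||
               toy_contains_param(node->args[1], target);
    return false;
}

/* May "target := expr" hand the target to the top-level function read-write? */
static bool
toy_can_pass_read_write(const ToyNode *expr, int target)
{
    if (expr == NULL || expr->kind != NODE_CALL)
        return false;
    if (expr->id != FN_APPEND && expr->id != FN_PREPEND)
        return false;               /* not on the hard-wired safe list */
    for (int i = 0; i < 2; i++)
    {
        const ToyNode *arg = expr->args[i];

        /* A bare Param is fine, whether or not it is the target. */
        if (arg != NULL && arg->kind == NODE_PARAM)
            continue;
        /* Anything else must not mention the target at all. */
        if (toy_contains_param(arg, target))
            return false;
    }
    return true;
}
```

Under this model "a := a || x" qualifies, while "a := a || ARRAY[x]" does not, because its top-level function is the concatenation function rather than one of the two listed ones.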
Change #1 is actually not relevant to these example cases, because we
don't copy any arrays within the loop given change #2. But I left it in
because it's not much code and it will help for situations where change #2
doesn't apply.
I can confirm this speedup - pretty nice.
Multidimensional append is slower 2x .. but it is probably corner case
do $$ declare a int[] := '{}'; begin for i in 1..90000 loop a := a || ARRAY[[i ]]; end loop; raise notice '%', 'aa'; end$$ language plpgsql;
But this optimization doesn't apply to code that is semantically the same as a || i:
do $$ declare a int[] := '{}'; begin for i in 1..90000 loop a := a || ARRAY[i ]; end loop; raise notice '%', 'aa'; end$$ language plpgsql;
So the detection is somewhat too sensitive.
There is a slowdown with MD arrays, but that is not a typical use case, and the speedup elsewhere is 5-10x or more, so I'll be very happy if this patch gets into 9.5.
Regards
Pavel
regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes:
> Multidimensional append is slower 2x .. but it is probably corner case
> declare a int[] := '{}'; begin for i in 1..90000 loop a := a || ARRAY[[i
> ]]; end loop; raise notice '%', 'aa'; end$$ language plpgsql;
Yeah, that's array_cat(), which I've not done anything with. I'm not
really excited about adding code to it; I think use-cases like this one
are probably too uncommon to justify more code. In any case we could
go back and improve it later if there are enough complaints.
Another way to look at it is that in this example, plpgsql's attempts to
force the "a" array into expanded form are a mistake: we never get any
benefit because array_cat() just wants it in flat form again, and delivers
it in flat form. (It's likely that this is an unrealistic worst case:
it's hard to imagine real array-using applications that never do any
element-by-element access.) Possibly we could improve matters with a more
refined heuristic about whether to force arrays to expanded form during
assignments --- but I'm not sure what that would look like. plpgsql has
very little direct knowledge of which operations will be applied to the
array later.
regards, tom lane
2015-05-06 15:50 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> Multidimensional append is slower 2x .. but it is probably corner case
> declare a int[] := '{}'; begin for i in 1..90000 loop a := a || ARRAY[[i
> ]]; end loop; raise notice '%', 'aa'; end$$ language plpgsql;
Yeah, that's array_cat(), which I've not done anything with. I'm not
really excited about adding code to it; I think use-cases like this one
are probably too uncommon to justify more code. In any case we could
go back and improve it later if there are enough complaints.
Another way to look at it is that in this example, plpgsql's attempts to
force the "a" array into expanded form are a mistake: we never get any
benefit because array_cat() just wants it in flat form again, and delivers
it in flat form. (It's likely that this is an unrealistic worst case:
it's hard to imagine real array-using applications that never do any
element-by-element access.) Possibly we could improve matters with a more
refined heuristic about whether to force arrays to expanded form during
assignments --- but I'm not sure what that would look like. plpgsql has
very little direct knowledge of which operations will be applied to the
array later.
Isn't better to push information about possible target to function?
array_cat(a, b, result)
{
    if (undef(result))
        return a || b;
    if (b == result)
    {
        array_extend(result, a);    /* result is b, so a has to go in front */
        return result;
    }
    else if (a == result)
    {
        array_extend(result, b);    /* result is a, so b is appended */
        return result;
    }
    else
        return a || b;
}
Could this be used for arrays, and for strings too?
On the other hand, it decreases the readability of the related functions :( (but not every function would have to support this optimization).
Regards
Pavel
regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2015-05-06 15:50 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
>> Another way to look at it is that in this example, plpgsql's attempts to
>> force the "a" array into expanded form are a mistake: we never get any
>> benefit because array_cat() just wants it in flat form again, and delivers
>> it in flat form. (It's likely that this is an unrealistic worst case:
>> it's hard to imagine real array-using applications that never do any
>> element-by-element access.) Possibly we could improve matters with a more
>> refined heuristic about whether to force arrays to expanded form during
>> assignments --- but I'm not sure what that would look like. plpgsql has
>> very little direct knowledge of which operations will be applied to the
>> array later.
> Isn't better to push information about possible target to function?
I don't think that would solve the problem. For example, one of the cases
I worry about is a function that does read-only examination of an array
argument; consider something like
create function sum_squares(a numeric[]) returns numeric as $$
declare s numeric := 0;
begin
  for i in array_lower(a, 1) .. array_upper(a, 1) loop
    s := s + a[i] * a[i];
  end loop;
  return s;
end;
$$ language plpgsql strict immutable;
array_get_element() is not in a position here to force expansion of the
array variable, so unless plpgsql itself does something we're not going
to get a performance win (unless the argument happens to be already
expanded on arrival).
I'm inclined to think that we need to add information to pg_type about
whether a type supports expansion (and how to convert to expanded form
if so). In the patch as it stands, plpgsql just has hard-wired knowledge
that it can call expand_array() on arrays that it's putting into function
local variables. I'd be okay with shipping 9.5 like that, but pretty soon
we'll want a solution that extension data types can use too.
More generally, it'd be nice if the mechanism could be more flexible than
"always force variables of this type to expanded form". But I don't see
how to teach plpgsql itself how to decide that intelligently, let alone
how we might design an API that lets some third-party data type decide it
at arm's length from plpgsql ...
regards, tom lane
2015-05-06 18:54 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2015-05-06 15:50 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
>> Another way to look at it is that in this example, plpgsql's attempts to
>> force the "a" array into expanded form are a mistake: we never get any
>> benefit because array_cat() just wants it in flat form again, and delivers
>> it in flat form. (It's likely that this is an unrealistic worst case:
>> it's hard to imagine real array-using applications that never do any
>> element-by-element access.) Possibly we could improve matters with a more
>> refined heuristic about whether to force arrays to expanded form during
>> assignments --- but I'm not sure what that would look like. plpgsql has
>> very little direct knowledge of which operations will be applied to the
>> array later.
> Isn't better to push information about possible target to function?
I don't think that would solve the problem. For example, one of the cases
I worry about is a function that does read-only examination of an array
argument; consider something like
create function sum_squares(a numeric[]) returns numeric as $$
declare s numeric := 0;
begin
  for i in array_lower(a, 1) .. array_upper(a, 1) loop
    s := s + a[i] * a[i];
  end loop;
  return s;
end;
$$ language plpgsql strict immutable;
I remember this issue
array_get_element() is not in a position here to force expansion of the
array variable, so unless plpgsql itself does something we're not going
to get a performance win (unless the argument happens to be already
expanded on arrival).
I'm inclined to think that we need to add information to pg_type about
whether a type supports expansion (and how to convert to expanded form
if so). In the patch as it stands, plpgsql just has hard-wired knowledge
that it can call expand_array() on arrays that it's putting into function
local variables. I'd be okay with shipping 9.5 like that, but pretty soon
we'll want a solution that extension data types can use too.
More generally, it'd be nice if the mechanism could be more flexible than
"always force variables of this type to expanded form". But I don't see
how to teach plpgsql itself how to decide that intelligently, let alone
how we might design an API that lets some third-party data type decide it
at arm's length from plpgsql ...
I agree - the core of the work has to be somewhere other than in plpgsql. Some years ago there was an idea about a toast cache.
regards, tom lane
Hi,
> The attached updated patch reduces both of those do-loop tests to about
> 60 msec on my machine. It contains two improvements over the 1.1 patch:
Looking at this. First reading the patch to understand the details.
* The VARTAG_IS_EXPANDED(tag) trick in VARTAG_SIZE is unlikely to be
  beneficial, before the compiler could implement the whole thing as a
  computed goto or lookup table, afterwards not.
* It'd be nice if the get_flat_size comment in expandeddatum.h could
  specify whether the header size is included. That varies enough around
  toast that it seems worthwhile.
* You were rather bothered by the potential of multiple evaluations for
  the ilist stuff. And now the AARR macros are full of them...
* I find the ARRAY_ITER_VARS/ARRAY_ITER_NEXT macros rather ugly. I don't
  buy the argument that turning them into functions will be slower. I'd
  bet the contrary on common platforms.
* Not a fan of the EH_ prefix in array_expanded.c and EOH_
  elsewhere. Just looks ugly to me. Whatever.
* The list of hardwired safe ops in exec_check_rw_parameter is somewhat
  sad. Don't have a better idea though.
* "Also, a C function that is modifying a read-write expanded value
  in-place should take care to leave the value in a sane state if it
  fails partway through." - that's a pretty hefty requirement imo. I
  wonder if it'd not be possible to convert RW to RO if a value
  originates from outside an exception block. IIRC that'd be useful for a
  bunch of other error cases we currently basically shrug away (something
  around toast and aborted xacts comes to mind).
* The forced RW->RO conversion in subquery scans is a bit sad, but it
  seems like something left for later.
These are more judgement calls than anything else...
Somewhere in the thread you comment on the fact that it's a bit sad that
plpgsql is the sole beneficiary of this (unless some function forces
expansion internally). I'd be ok to leave it at that for now. It'd be
quite cool to get some feedback from postgis folks about the suitability
of this for their cases.
I've not really looked into performance improvements around this,
choosing to look into somewhat reasonable cases where it'll regress.
ISTM that the worst case for the new situation is large arrays that
exist as plpgsql variables but are only ever passed on. Say e.g. a
function that accepts an array among other parameters and passes it on to
another function.
As rather extreme case of this:
CREATE OR REPLACE FUNCTION plpgsql_array_length(p_a anyarray)
RETURNS int LANGUAGE plpgsql AS $$
BEGIN
    RETURN array_length(p_a, 1);
END;
$$;
SELECT plpgsql_array_length(b.arr)
FROM (SELECT array_agg(d) FROM generate_series(1, 10000) d) b(arr),
     generate_series(1, 100000) repeat;
with \o /dev/null redirecting the output.
in an assert build it goes from 325.511 ms to 655.733 ms
optimized from 94.648 ms to 287.574 ms.
Now this is a fairly extreme example; and I don't think it'll get much
worse than that. But I do think there's a bunch of cases where values
exist in plpgsql that won't actually be accessed. Say, e.g. return values
from queries that are then conditionally returned and such.
I'm not sure it's possible to do anything about that. Expanding only in
cases where it'd be beneficial is going to be hard.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
> Looking at this. First reading the patch to understand the details.
> * The VARTAG_IS_EXPANDED(tag) trick in VARTAG_SIZE is unlikely to
> beneficial, before the compiler could implement the whole thing as a
> computed goto or lookup table, afterwards not.
Well, if you're worried about the speed of VARTAG_SIZE() then the right
thing to do would be to revert your change that made enum vartag_external
distinct from the size of the struct, so that we could go back to just
using the second byte of a varattrib_1b_e datum as its size. As I said
at the time, inserting pad bytes to force each different type of toast
pointer to be a different size would probably be a better tradeoff than
what commit 3682025015 did.
> * It'd be nice if the get_flat_size comment in expandeddatm.h could
> specify whether the header size is included. That varies enough around
> toast that it seems worthwhile.
OK.
> * You were rather bothered by the potential of multiple evaluations for
> the ilist stuff. And now the AARR macros are full of them...
Yeah, there is doubtless some added cost there. But I think it's a better
answer than duplicating each function in toto; the code space that that
would take isn't free either.
> * I find the ARRAY_ITER_VARS/ARRAY_ITER_NEXT macros rather ugly. I don't
> buy the argument that turning them into functions will be slower. I'd
> bet the contrary on common platforms.
Perhaps; do you want to do some testing and see?
> * Not a fan of the EH_ prefix in array_expanded.c and EOH_
> elsewhere. Just looks ugly to me. Whatever.
I'm not wedded to that naming if you have a better suggestion.
> * The list of hardwired safe ops in exec_check_rw_parameter is somewhat
> sad. Don't have a better idea though.
It's very sad, and it will be high on my list to improve that in 9.6.
But I do not think it's a fatal problem to ship it that way in 9.5,
because *as things stand today* those are the only two functions that
could benefit anyway. It won't really matter until we have extensions
that want to use this mechanism.
> * "Also, a C function that is modifying a read-write expanded value
> in-place should take care to leave the value in a sane state if it
> fails partway through." - that's a pretty hefty requirement imo.
It is, which is one reason that I'm nervous about relaxing the test in
exec_check_rw_parameter. Still, it was possible to code array_set_element
to meet that restriction without too much pain. I'm inclined to leave the
stronger requirement in the docs for now, until we get more push-back.
> * The forced RW->RO conversion in subquery scans is a bit sad, but I
> seems like something left for later.
Yes, there are definitely some things that look like future opportunities
here.
> Somewhere in the thread you comment on the fact that it's a bit sad that
> plpgsql is the sole benefactor of this (unless some function forces
> expansion internally). I'd be ok to leave it at that for now. It'd be
> quite cool to get some feedback from postgis folks about the suitability
> of this for their cases.
Indeed, that's the main reason I'm eager to ship something in 9.5, even if
it's not perfect. I hope those guys will look at it and use it, and maybe
give us feedback on ways to improve it.
> ISTM that the worst case for the new situation is large arrays that
> exist as plpgsql variables but are only ever passed on.
Well, more to the point, large arrays that are forced into expanded format
(costing a conversion step) but then we never do anything with them that
would benefit from that. Just saying they're "passed on" doesn't prove
much since the called function might or might not get any benefit.
array_length doesn't, but some other things would.
Your example with array_agg() is interesting, since one of the things on
my to-do list is to see whether we could change array_agg to return an
expanded array. It would not be hard to make it build that representation
directly, instead of its present ad-hoc internal state. The trick would
be, when can you return the internal state without an additional copy
step? But maybe it could return a R/O pointer ...
> ... Expanding only in
> cases where it'd be beneficial is going to be hard.
Yeah, improving that heuristic looks like a research project. Still, even
with all the limitations and to-do items in the patch now, I'm pretty sure
this will be a net win for practically all applications.
regards, tom lane
On 2015-05-10 12:09:41 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Looking at this. First reading the patch to understand the details.
>
> > * The VARTAG_IS_EXPANDED(tag) trick in VARTAG_SIZE is unlikely to
> > beneficial, before the compiler could implement the whole thing as a
> > computed goto or lookup table, afterwards not.
>
> Well, if you're worried about the speed of VARTAG_SIZE() then the right
> thing to do would be to revert your change that made enum vartag_external
> distinct from the size of the struct, so that we could go back to just
> using the second byte of a varattrib_1b_e datum as its size. As I said
> at the time, inserting pad bytes to force each different type of toast
> pointer to be a different size would probably be a better tradeoff than
> what commit 3682025015 did.
I doubt that'd be a net positive. Anyway, all I'm saying is that I can't
see the VARTAG_IS_EXPANDED trick being beneficial in comparison to
checking both explicit values.
> > * You were rather bothered by the potential of multiple evaluations for
> > the ilist stuff. And now the AARR macros are full of them...
>
> Yeah, there is doubtless some added cost there. But I think it's a better
> answer than duplicating each function in toto; the code space that that
> would take isn't free either.
Yea, duplicating would be horrid. I'm more thinking of declaring some
iterator state outside the macro, or just using an inline function.
> > * I find the ARRAY_ITER_VARS/ARRAY_ITER_NEXT macros rather ugly. I don't
> > buy the argument that turning them into functions will be slower. I'd
> > bet the contrary on common platforms.
>
> Perhaps; do you want to do some testing and see?
Not exactly with great joy, but I will.
> > * The list of hardwired safe ops in exec_check_rw_parameter is somewhat
> > sad. Don't have a better idea though.
>
> It's very sad, and it will be high on my list to improve that in 9.6.
> But I do not think it's a fatal problem to ship it that way in 9.5,
> because *as things stand today* those are the only two functions that
> could benefit anyway. It won't really matter until we have extensions
> that want to use this mechanism.
Agreed that it's not fatal.
> > ISTM that the worst case for the new situation is large arrays that
> > exist as plpgsql variables but are only ever passed on.
>
> Well, more to the point, large arrays that are forced into expanded format
> (costing a conversion step) but then we never do anything with them that
> would benefit from that. Just saying they're "passed on" doesn't prove
> much since the called function might or might not get any benefit.
> array_length doesn't, but some other things would.
Right. But I'm not sure it's that uncommon.
> Your example with array_agg() is interesting, since one of the things on
> my to-do list is to see whether we could change array_agg to return an
> expanded array.
Well, I chose array_agg only because it was a trivial way to generate a
large array. The values could actually come from disk or something.
> It would not be hard to make it build that representation
> directly, instead of its present ad-hoc internal state. The trick would
> be, when can you return the internal state without an additional copy
> step? But maybe it could return a R/O pointer ...
R/O or R/W?
> > ... Expanding only in
> > cases where it'd be beneficial is going to be hard.
>
> Yeah, improving that heuristic looks like a research project. Still, even
> with all the limitations and to-do items in the patch now, I'm pretty sure
> this will be a net win for practically all applications.
I wonder if we could somehow 'mark' other toast pointers as 'expand if
useful'. I.e. have something pretty much like ExpandedObjectHeader,
except that it initially works like the indirect toast stuff. So
eoh_context is set, but the data is still in the original datum. When
accessed via 'plain' accessors that don't know about the expanded format
the pointed to datum is returned. But when accessed by something
"desiring" the expanded version it's expanded. It seemed that'd be doable
expanding the new infrastructure a bit more.
Greetings,
Andres Freund
On 2015-05-10 12:09:41 -0400, Tom Lane wrote:
> > * I find the ARRAY_ITER_VARS/ARRAY_ITER_NEXT macros rather ugly. I don't
> > buy the argument that turning them into functions will be slower. I'd
> > bet the contrary on common platforms.
> Perhaps; do you want to do some testing and see?
I've added new iterator functions using an on-stack state variable and
array_iter_setup/next functions pretty analogous to the macros. And then
converted arrayfuncs.c to use them.
Codesize before introducing inline functions:
assert:
   text    data     bss     dec     hex filename
8142400   50562  295952 8488914  8187d2 src/backend/postgres
optimize:
   text    data     bss     dec     hex filename
6892928   50022  295920 7238870  6e74d6 src/backend/postgres
After:
assert:
   text    data     bss     dec     hex filename
8133040   50562  295952 8479554  816342 src/backend/postgres
optimize:
   text    data     bss     dec     hex filename
6890256   50022  295920 7236198  6e6a66 src/backend/postgres
That's a small decrease.
I'm not sure what exactly to use as a performance benchmark
here. For now I chose
SELECT * FROM (SELECT ARRAY(SELECT generate_series(1, 10000))) d, generate_series(1, 1000) repeat(i);
that'll hit array_out, which uses iterators.
pgbench -P 10 -h /tmp -p 5440 postgres -n -f /tmp/bench.sql -c 4 -T 60
(I chose parallel because it'll show icache efficiency differences)
before, best of four:
tps = 4.921260 (including connections establishing)
after, best of four:
tps = 5.046437 (including connections establishing)
That's a relatively small difference. I'm not surprised, I'd not have
expected anything major.
Personally I think something roughly along those lines is both more
robust and easier to maintain. Even if we possibly need to protect
against inlines not being available.
Similarly using inline funcs for AARR_NDIMS/HASNULL does not appear to
hamper performance and gets rid of the multiple evaluation risk.
These patches are obviously WIP. Especially with the iter stuff it's
possible that the concept could be extended a bit further.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
> I'm not sure what exactly to use as a performance benchmark
> here. For now I chose
> SELECT * FROM (SELECT ARRAY(SELECT generate_series(1, 10000))) d, generate_series(1, 1000) repeat(i);
> that'll hit array_out, which uses iterators.
Hmm, probably those results are swamped by I/O functions though. I'd
suggest trying something that exercises array_map(), which it looks like
means doing an array coercion. Perhaps like so:
do $$
declare a int4[];
x int;
begin
  a := array(select generate_series(1,1000));
  for i in 1..100000 loop
    x := array_length(a::int8[], 1);
  end loop;
end$$;
Anyway, thanks for poking at it!
regards, tom lane
On 2015-05-10 21:09:14 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I'm not sure what exactly to use as a performance benchmark
> > here. For now I chose
> > SELECT * FROM (SELECT ARRAY(SELECT generate_series(1, 10000))) d, generate_series(1, 1000) repeat(i);
> > that'll hit array_out, which uses iterators.
>
> Hmm, probably those results are swamped by I/O functions though.
I did check with a quick profile, and the iteration itself is a
significant part of the total execution time.
> I'd suggest trying something that exercises array_map(), which
> it looks like means doing an array coercion. Perhaps like so:
> do $$
> declare a int4[];
> x int;
> begin
> a := array(select generate_series(1,1000));
> for i in 1..100000 loop
> x := array_length(a::int8[], 1);
> end loop;
> end$$;
with the loop count set to 10000 instead, I get:
before: after: tps = 20.940092 (including connections establishing)
after: tps = 20.568730 (including connections establishing)
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2015-05-10 12:09:41 -0400, Tom Lane wrote:
>>> * I find the ARRAY_ITER_VARS/ARRAY_ITER_NEXT macros rather ugly. I don't
>>> buy the argument that turning them into functions will be slower. I'd
>>> bet the contrary on common platforms.
>> Perhaps; do you want to do some testing and see?
> I've added new iterator functions using a on-stack state variable and
> array_iter_setup/next functions pretty analogous to the macros. And then
> converted arrayfuncs.c to use them.
I confirm that this doesn't seem to be any slower (at least not on a
compiler with inline functions). And it's certainly less ugly, so I've
adopted it.
> Similarly using inline funcs for AARR_NDIMS/HASNULL does not appear to
> hamper performance and gets rid of the multiple evaluation risk.
I'm less excited about that part though. The original ARR_FOO macros
mostly have multiple-evaluation risks as well, and that's been totally
academic so far. By the time you get done dealing with the
STATIC_IF_INLINE dance, it's quite messy to have these be inline
functions, and I am not seeing a useful return from adding the mess.
regards, tom lane
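For readers who have not seen the patch, the iterator style discussed above (state held in a caller-owned struct, advanced by small inline functions instead of macros that declare hidden variables) might look roughly like this standalone C sketch. It is an assumption-laden toy, not the array_iter_setup/array_iter_next code itself: ToyIter and both functions are hypothetical, elements are fixed-width int32 values, and a set bit in the bitmap means "not null".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Iterator state lives on the caller's stack. */
typedef struct ToyIter
{
    const int32_t *values;      /* packed non-null element values */
    const uint8_t *nullbitmap;  /* bit set => element present; NULL => no nulls */
    size_t         nitems;      /* total number of elements, nulls included */
    size_t         pos;         /* index of the next element to return */
    size_t         valpos;      /* index of the next unread slot in values[] */
} ToyIter;

static inline void
toy_iter_setup(ToyIter *it, const int32_t *values,
               const uint8_t *nullbitmap, size_t nitems)
{
    it->values = values;
    it->nullbitmap = nullbitmap;
    it->nitems = nitems;
    it->pos = 0;
    it->valpos = 0;
}

/* Fetch the next element; returns false once the array is exhausted. */
static inline bool
toy_iter_next(ToyIter *it, int32_t *value, bool *isnull)
{
    if (it->pos >= it->nitems)
        return false;
    if (it->nullbitmap != NULL &&
        !(it->nullbitmap[it->pos / 8] & (1 << (it->pos % 8))))
    {
        *isnull = true;
        *value = 0;
    }
    else
    {
        *isnull = false;
        *value = it->values[it->valpos++];
    }
    it->pos++;
    return true;
}
```

Because each argument of an inline function is evaluated exactly once, this form also sidesteps the multiple-evaluation concern raised about the macro versions.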
On 2015-05-13 20:28:52 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Similarly using inline funcs for AARR_NDIMS/HASNULL does not appear to
> > hamper performance and gets rid of the multiple evaluation risk.
>
> I'm less excited about that part though. The original ARR_FOO macros
> mostly have multiple-evaluation risks as well, and that's been totally
> academic so far.
Fair point.
Andres Freund <andres@anarazel.de> writes:
> On 2015-05-10 12:09:41 -0400, Tom Lane wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> * The VARTAG_IS_EXPANDED(tag) trick in VARTAG_SIZE is unlikely to
>>> beneficial, before the compiler could implement the whole thing as a
>>> computed goto or lookup table, afterwards not.
>> Well, if you're worried about the speed of VARTAG_SIZE() then the right
>> thing to do would be to revert your change that made enum vartag_external
>> distinct from the size of the struct, so that we could go back to just
>> using the second byte of a varattrib_1b_e datum as its size. As I said
>> at the time, inserting pad bytes to force each different type of toast
>> pointer to be a different size would probably be a better tradeoff than
>> what commit 3682025015 did.
> I doubt that'd be a net positive. Anyway, all I'm saying is that I can't
> see the VARTAG_IS_EXPANDED trick being beneficial in comparison to
> checking both explicit values.
I did some microbenchmarking on this, and AFAICT doing it your way makes
it slower.
I still think that going back to defining the second byte as the size
would be better. Fortunately, since this is only a matter of in-memory
representations, we aren't committed to any particular answer.
regards, tom lane
On 2015-05-13 20:48:51 -0400, Tom Lane wrote:
> I still think that going back to defining the second byte as the size
> would be better. Fortunately, since this is only a matter of in-memory
> representations, we aren't committed to any particular answer.
Requiring sizes to be different still strikes me as a disaster. Or is
that not what you're proposing?
Andres Freund <andres@anarazel.de> writes:
> On 2015-05-13 20:48:51 -0400, Tom Lane wrote:
>> I still think that going back to defining the second byte as the size
>> would be better. Fortunately, since this is only a matter of in-memory
>> representations, we aren't committed to any particular answer.
> Requiring sizes to be different still strikes me as a disaster. Or is
> that not what you're proposing?
It is, but why would it be a disaster? We could add StaticAsserts
verifying that the sizes actually are different. I doubt that the pad
space itself could amount to any issue performance-wise, since it would
only ever exist in transient in-memory tuples, and even that only seldom.
regards, tom lane
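The two VARTAG_SIZE strategies being argued over can be contrasted in a small standalone C sketch. Everything here is hypothetical and simplified: the Toy* structs are byte arrays rather than the real varatt_* structs, the pad byte and the tag values are invented for the example, and _Static_assert stands in for PostgreSQL's StaticAssert macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy pointer payloads, stored as raw bytes so they carry no alignment padding. */
typedef struct ToyIndirect { unsigned char pointer[sizeof(void *)]; } ToyIndirect;
typedef struct ToyExpanded
{
    unsigned char eohptr[sizeof(void *)];
    unsigned char pad[1];       /* exists only to make the size unique */
} ToyExpanded;
typedef struct ToyOnDisk { unsigned char fields[16]; } ToyOnDisk;

/* Strategy A: the tag byte *is* the payload size. */
enum
{
    TOY_TAG_INDIRECT = sizeof(ToyIndirect),
    TOY_TAG_EXPANDED = sizeof(ToyExpanded),
    TOY_TAG_ONDISK = sizeof(ToyOnDisk)
};

/* This only works while every payload size is distinct and fits in a byte. */
_Static_assert(sizeof(ToyIndirect) != sizeof(ToyExpanded), "sizes must differ");
_Static_assert(sizeof(ToyIndirect) != sizeof(ToyOnDisk), "sizes must differ");
_Static_assert(sizeof(ToyExpanded) != sizeof(ToyOnDisk), "sizes must differ");
_Static_assert(sizeof(ToyOnDisk) <= 255, "tag must fit in one byte");

static size_t
toy_size_from_tag(uint8_t tag)
{
    return tag;                 /* no branch, no lookup */
}

/* Strategy B: arbitrary tag values, with the size found by a lookup. */
typedef enum ToyKind
{
    TOY_KIND_INDIRECT = 1,
    TOY_KIND_EXPANDED = 2,
    TOY_KIND_ONDISK = 18
} ToyKind;

static size_t
toy_size_from_kind(ToyKind kind)
{
    switch (kind)
    {
        case TOY_KIND_INDIRECT:
            return sizeof(ToyIndirect);
        case TOY_KIND_EXPANDED:
            return sizeof(ToyExpanded);
        case TOY_KIND_ONDISK:
            return sizeof(ToyOnDisk);
    }
    return 0;                   /* unknown tag */
}
```

Strategy A trades a pad byte and platform-dependent tag values for a branch-free size lookup; strategy B keeps the tags stable at the cost of the switch. Which one is faster in practice is exactly what the microbenchmarks in this exchange are about, and the sketch takes no position on that.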
On 2015-05-13 21:01:43 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2015-05-13 20:48:51 -0400, Tom Lane wrote:
> >> I still think that going back to defining the second byte as the size
> >> would be better. Fortunately, since this is only a matter of in-memory
> >> representations, we aren't committed to any particular answer.
>
> > Requiring sizes to be different still strikes me as a disaster. Or is
> > that not what you're proposing?
>
> It is, but why would it be a disaster? We could add StaticAsserts
> verifying that the sizes actually are different. I doubt that the pad
> space itself could amount to any issue performance-wise, since it would
> only ever exist in transient in-memory tuples, and even that only seldom.
The sizes would be platform dependent. It's also just incredibly ugly to
have to add pad bytes to structures so we can disambiguate them.
Anyway, I think we can live with your & or my proposed additional branch
for now. I can't see either variant being a relevant performance
bottleneck anytime soon.
Andres Freund <andres@anarazel.de> writes:
> On 2015-05-13 21:01:43 -0400, Tom Lane wrote:
>> It is, but why would it be a disaster? We could add StaticAsserts
>> verifying that the sizes actually are different. I doubt that the pad
>> space itself could amount to any issue performance-wise, since it would
>> only ever exist in transient in-memory tuples, and even that only seldom.
> The sizes would be platform dependant.
So what? There are lots of platform-dependent constants in PG.
> It's also just incredibly ugly to
> have to add pad bytes to structures so we can disambiguate them.
Well, I agree it's not too pretty, but you were the one who brought up
the issue of the speed of VARTAG_SIZE(). We definitely gave up some
performance there already, and my patch will make it worse.
> Anyway, I think we can live with your & or my proposed additional branch
> for now. I can't see either variant being a relevant performance
> bottleneck anytime soon.
Actually, after having microbenchmarked the difference between those
two proposals, I'm not too sure that VARTAG_SIZE() is down in the noise.
But it doesn't matter for the moment --- any one of these alternatives
would be a very localized code change, and none of them would create an
on-disk compatibility break. We can let it go until someone wants to
put together a more definitive benchmark for testing.
regards, tom lane