Question about using AggCheckCallContext in a C function - Mailing list pgsql-general

From Matt Solnit
Subject Question about using AggCheckCallContext in a C function
Date
Msg-id 525FEA14-9548-4C3C-B88F-4201E8E9BA1E@soasta.com
Whole thread Raw
Responses Re: Question about using AggCheckCallContext in a C function  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hi everyone.  A while back, I wrote a function to sum array contents (think
1-D matrix addition), and am using it in a custom SUM aggregate.  Here's an
example:

CREATE TABLE foo (arr INT[]);
INSERT INTO foo VALUES ('{1, 2, 3}'), ('{4, 5, 6}');

SELECT SUM(arr) FROM foo;
   sum
---------
 {5,7,9}
(1 row)

This works, but I got some great advice from Craig Ringer about how I can
make it even faster using AggCheckCallContext().  See http://stackoverflow.com/a/16996606/6198

I'm trying to implement this now, and I'm running into occasional memory
corruption.  Sometimes the back-end segfaults, sometimes I get seemingly-
random "ERROR:  could not open relation with OID xyz" errors.

Here's the relevant code snippet:

// Assumptions:
// 1. We will never be called with a null array.  The CREATE FUNCTION call should specify STRICT to prevent PostgreSQL
fromdoing this. 
// 2. The arrays will never contain null values.  It's up to the caller to ensure this.
// 3. The arrays will always be the same length.  It's up to the caller to ensure this.

ArrayType *array1, *array2;
Datum *arrayData1, *arrayData2;
int arrayLength1;

array2 = PG_GETARG_ARRAYTYPE_P(1);

if (AggCheckCallContext(fcinfo, NULL))
{
  // We are being called by an aggregate.

  if (PG_ARGISNULL(0))
  {
    // This can happen on the very first call as PostgreSQL sets up the "temporary transition value" for the aggregate.

    // NOTE: This never seems to happen!  Is it because the function is STRICT?

    // Return a copy of the second array.
    PG_RETURN_ARRAYTYPE_P(copy_intArrayType(array2));
  }
  else
  {
    // This means that the first array is a "temporary transition value", and we can safely modify it directly with no
sideeffects. 
    // This avoids the overhead of creating a new array object.

    array1 = PG_GETARG_ARRAYTYPE_P(0);

    // Get a direct pointer to each array's contents.
    arrayData1 = ARR_DATA_PTR(array1);
    arrayData2 = ARR_DATA_PTR(array2);

    // Get the length of the first array (should be the same as the second).
    arrayLength1 = (ARR_DIMS(array1))[0];

    // Add the contents of the second array to the first.
    for (i = 0; i < arrayLength1; i++)
    {
      arrayData1[i] += arrayData2[i];
    }

    // Return the updated array.
    PG_RETURN_ARRAYTYPE_P(array1);
  }
}
else
{
  // We are not being called by an aggregate.
  // <omitted>
}

After poring over the code in nodeAgg.c, and looking at the in8inc()
function, I think I know what the problem is:  the typical use of
AggCheckCallContext() is not compatible with TOAST-able data types.

When the state transition function is STRICT (as mine is), and
advance_transition_function() comes across the first non-NULL value,
it makes a copy using datumCopy().  However, from what I can tell,
datumCopy() is not really "compatible" with TOAST for this
particular use case.

Am I on the right track here?  If so, what's the best way to solve
this problem?  I would appreciate any help you can offer.

Sincerely,
Matt Solnit <msolnit@soasta.com>

pgsql-general by date:

Previous
From: "Day, David"
Date:
Subject: Re: plpgsql FOR LOOP CTE problem ?
Next
From: Tom Lane
Date:
Subject: Re: Question about using AggCheckCallContext in a C function