Re: Combining Aggregates - Mailing list pgsql-hackers

From David Rowley
Subject Re: Combining Aggregates
Date
Msg-id CAKJS1f_fpgAE_VSF1gH7obq00wt_Mg3Cr=C2iG9erzaBAvMJjA@mail.gmail.com
In response to Re: Combining Aggregates  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List pgsql-hackers
On 19 January 2016 at 02:44, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> On Mon, Jan 18, 2016 at 10:32 PM, David Rowley
> <david.rowley@2ndquadrant.com> wrote:
>
> I just thought like direct mapping of the structure with text pointer.
> something like the below.
>
> result = PG_ARGISNULL(0) ? NULL : (text *) PG_GETARG_POINTER(0);
> state = (PolyNumAggState *)VARDATA(result);
>
> To handle the big-endian or little-endian, we may need some extra changes.
>
> Instead of adding 3 new columns to the pg_aggregate catalog table to handle
> the internal types, either something like the above to handle the internal types
> or some other way is better IMO.

The problem with that is that most of these internal structs for the aggregate states contain pointers to other memory. Even if we laid those bytes down into a bytea or something similar, that would copy only the pointer values, not the memory they point to. When we then dereference those pointers in the other process, we'll have problems, as those addresses belong to the original process.
 
For example PolyNumAggState is defined as:

typedef NumericAggState PolyNumAggState;

and NumericAggState has:

NumericVar sumX; /* sum of processed numbers */
NumericVar sumX2; /* sum of squares of processed numbers */

And NumericVar has:

NumericDigit *buf; /* start of palloc'd space for digits[] */
NumericDigit *digits; /* base-NBASE digits */

Both of these point to other memory which won't be in the varlena type.

Serialization is the process of collecting everything these pointers refer to into some consecutive bytes.
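To make that concrete, here is a minimal sketch of the idea, not the proposed patch. ToyVar, toy_serialize() and toy_deserialize() are hypothetical names made up for illustration: ToyVar is a much simplified stand-in for NumericVar, and the sketch uses malloc() rather than palloc() so that it compiles outside the backend. The buffer keeps host byte order, so it only suits another process on the same machine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for NumericVar. */
typedef struct ToyVar
{
    int32_t  ndigits;   /* number of entries in digits[] */
    int16_t *digits;    /* points to memory outside the struct */
} ToyVar;

/*
 * Copy the digit count and the pointed-to digits into one contiguous
 * buffer.  The pointer itself is never written out.  Returns NULL on
 * allocation failure; the caller frees the result.
 */
static unsigned char *
toy_serialize(const ToyVar *var, size_t *len)
{
    size_t         payload;
    unsigned char *out;

    assert(var != NULL && len != NULL && var->ndigits >= 0);

    payload = (size_t) var->ndigits * sizeof(int16_t);
    *len = sizeof(int32_t) + payload;
    out = malloc(*len);
    if (out == NULL)
        return NULL;

    memcpy(out, &var->ndigits, sizeof(int32_t));
    if (payload > 0)
        memcpy(out + sizeof(int32_t), var->digits, payload);
    return out;
}

/*
 * Rebuild a ToyVar from the flat buffer.  The digits pointer refers to
 * memory allocated here, so it is valid in the receiving process.
 * Returns 0 on success, -1 on malformed input or allocation failure.
 */
static int
toy_deserialize(const unsigned char *buf, size_t len, ToyVar *var)
{
    int32_t n;
    size_t  payload;

    if (buf == NULL || var == NULL || len < sizeof(int32_t))
        return -1;
    memcpy(&n, buf, sizeof(int32_t));
    if (n < 0)
        return -1;
    payload = (size_t) n * sizeof(int16_t);
    if (len != sizeof(int32_t) + payload)
        return -1;

    var->ndigits = n;
    var->digits = NULL;
    if (payload > 0)
    {
        var->digits = malloc(payload);
        if (var->digits == NULL)
            return -1;
        memcpy(var->digits, buf + sizeof(int32_t), payload);
    }
    return 0;
}
```

The point is that only the count and the digits travel; the receiver ends up with a pointer into its own memory rather than an address from the sending process.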

Of course, that's not to say there are no aggregate state structs without pointers. I've not checked, but in those cases we could (perhaps) just make the serialize and deserialize functions a simple memcpy() into a bytea. In reality, though, as you mentioned, we'd likely want to agree on some format that's portable across platforms with different byte orders, as we'll probably, one day, want to forward these values to some other server to finish off the aggregation.
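As a rough illustration of what a byte-order-independent format could look like for a single 64-bit field, something along these lines would do. The helper names put_u64_be() and get_u64_be() are hypothetical, and fixing on big-endian (network order) is just an assumption for the example, not an agreed format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Write v into dst[0..7], most significant byte first.  Shifts operate
 * on the value rather than on its in-memory layout, so the output is
 * the same on big-endian and little-endian hosts.
 */
static void
put_u64_be(unsigned char *dst, uint64_t v)
{
    int i;

    assert(dst != NULL);
    for (i = 0; i < 8; i++)
        dst[i] = (unsigned char) (v >> (56 - 8 * i));
}

/* Read back a value written by put_u64_be(). */
static uint64_t
get_u64_be(const unsigned char *src)
{
    uint64_t v = 0;
    int      i;

    assert(src != NULL);
    for (i = 0; i < 8; i++)
        v = (v << 8) | src[i];
    return v;
}
```

A plain memcpy() of the same field would instead write whatever byte order the sending machine uses, which is fine between processes on one host but not between servers of different architectures.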

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
