Re: Combining Aggregates - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Combining Aggregates
Date
Msg-id CA+TgmoZ31om+auYEYb6W7e5HSQGLQzDJ1JOVajbs5+w1w9csXg@mail.gmail.com
Whole thread Raw
In response to Re: Combining Aggregates  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Combining Aggregates  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, Apr 5, 2016 at 9:30 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On 6 April 2016 at 01:01, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Apr 5, 2016 at 3:55 AM, David Rowley
>>> To be really honest, I'm quite worried that if I go and make this
>>> change then my time might be wasted as I really think making that
>>> change this late in the day is just setting this up for failure.  I
>>> really don't think we can bat this patch over the Pacific Ocean too
>>> many times before we find ourselves inside the feature freeze. Of
>>> course, if you really think it's no good, that's different, it can't
>>> be committed, but "I think it might be better" does not seem like a
>>> solid enough argument for me to want to risk trying this and delaying
>>> further for that.
>>
>> I think I agree.  Certainly, with what we've got here today, these are
>> not user-exposed, so I think we could certainly change them next time
>> around if need be.  If they ever become user-exposed, then maybe we
>> should think a little harder.
>
> I understand your concern. Painting ourselves into a corner would be pretty bad.
>
> Although, I'm not so sure we'd want to go down the route of trying to
> stabilise the format, even into something human readable. As far as I
> know we've never mentioned anything about what's stored in the
> internal aggregate states in the documentation, and we've made some
> fairly hefty changes to these over the last few releases. For example;
> 959277a4, where internal int128 support was added to improve
> performance of various aggregates, like sum(bigint). I think if we're
> going to head down the route of trying to keep this format stable,
> then we're going to struggle to make improvements to aggregates in the
> future.

That might be true, but it's much more likely to be true if the
internal format is an opaque blob.  If the serialized transition state
for average is {count,sum} then we can switch between having the
internal representation being numeric and having it be int128 and
nothing breaks.  With the opaque blob, maybe nothing breaks, or maybe
something does.

> It's also not all that clear that all of the internal fields really
> make sense to other databases, for example NumericAggState's maxScale
> and maxScaleCount. These really only are needed for moving aggregates.
> Would we want some other database to have to magic up some numbers for
> these fields because our format requires it?

Well, if we want to do cross-node aggregation with that database on
the other side, those numbers are going to have to get magicked up
someplace along the line.

I'm going to concede the point that this shouldn't really be a
priority for 9.6, but I might want to come back to it later.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: dealing with extension dependencies that aren't quite 'e'
Next
From: Alexander Korotkov
Date:
Subject: Re: Move PinBuffer and UnpinBuffer to atomics