Re: Spilling hashed SetOps and aggregates to disk - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Spilling hashed SetOps and aggregates to disk
Date
Msg-id 20180605130907.4vmefdhbvkagoq2r@alap3.anarazel.de
In response to Re: Spilling hashed SetOps and aggregates to disk  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Spilling hashed SetOps and aggregates to disk
List pgsql-hackers
Hi,

On 2018-06-06 01:06:39 +1200, David Rowley wrote:
> On 6 June 2018 at 00:57, Andres Freund <andres@anarazel.de> wrote:
> > I think it's ok to only handle this gracefully if serialization is
> > supported.
> >
> > But I think my proposal to continue to use a hashtable for the already
> > known groups, and sorting for additional groups would largely address
> > that, right?  We couldn't deal with groups becoming too large,
> > but easily with the number of groups becoming too large.
> 
> My concern is that only accounting memory for the group and not the
> state is only solving half the problem. It might be fine for
> aggregates that don't stray far from their aggtransspace, but for the
> other ones, we could still see OOM.

> If solving the problem completely is too hard, then a half fix (maybe
> 3/4) is better than nothing, but if we can get a design for a full fix
> before too much work is done, then isn't that better?

I don't think we actually disagree.  I was really primarily talking
about the case where we can't really do better because we don't have
serialization support.  I mean we could just rescan from scratch, using
a groupagg, but that obviously sucks.
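
To make the hashtable-plus-sorting idea concrete, here is a minimal
sketch in Python. It is an illustration only, not executor code: the
names hybrid_sum and max_groups are hypothetical, the "spill" is an
in-memory list standing in for a tape or temp file, and a plain sum
stands in for the transition function. It bounds the number of groups
held in the hashtable, not the size of any individual group's state.

```python
from itertools import groupby
from operator import itemgetter


def hybrid_sum(rows, max_groups):
    """Sum values per key, keeping at most max_groups groups in the hash table.

    Hypothetical sketch: rows is an iterable of (key, value) pairs.
    """
    table = {}    # hash table for the groups admitted before the limit was hit
    spilled = []  # stand-in for a spill file holding rows of all other groups

    for key, value in rows:
        if key in table:
            table[key] += value           # known group: keep advancing it in place
        elif len(table) < max_groups:
            table[key] = value            # still room: admit a new group
        else:
            spilled.append((key, value))  # table full: defer row to the sort path

    results = dict(table)

    # Sort path: sorting makes each group's rows adjacent, so only one
    # group's state is live at a time, as in a sorted group aggregate.
    spilled.sort(key=itemgetter(0))
    for key, group in groupby(spilled, key=itemgetter(0)):
        results[key] = sum(v for _, v in group)

    return results
```

A key that is already in the hashtable never reaches the spill list, so
the two paths produce disjoint sets of groups and their results can
simply be merged. For example, with max_groups=2 and input rows
a, b, c, a, c, d, the groups a and b stay in the hashtable while the
rows for c and d go through the sort path.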

Greetings,

Andres Freund

