Re: Parallel grouping sets - Mailing list pgsql-hackers

From	Pengzhou Tang
Subject	Re: Parallel grouping sets
Date	February 10, 2020 06:37:19
Msg-id	CAG4reASx+p3z5W59O57xhPC57Z4MW3mSnE=s-MJGromnmgg2fA@mail.gmail.com
In response to	Re: Parallel grouping sets (Jesse Zhang <sbjesse@gmail.com>)
Responses	Re: Parallel grouping sets (Richard Guo <guofenglinux@gmail.com>)
List	pgsql-hackers

Thanks for reviewing those patches.

Ha, I believe you meant to say a "normal aggregate", because what's
performed above gather is no longer "grouping sets", right?

The group key idea is clever in that it helps "discriminate" tuples by
their grouping set id. I haven't completely thought this through, but my
hunch is that this leaves some money on the table, for example, won't it
also lead to more expensive (and unnecessary) sorting and hashing? The
groupings with a few partials are now sharing the same tuplesort with
the groupings with a lot of groups even though we only want to tell
grouping 1 *apart from* grouping 10, not necessarily that grouping 1
needs to come before grouping 10. That's why I like the multiplexed pipe
/ "dispatched by grouping set id" idea: we only pay for sorting (or
hashing) within each grouping. That said, I'm open to the criticism that
keeping multiple tuplesorts and agg hash tables running is expensive in
itself, memory-wise ...

Cheers,
Jesse
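
To make the trade-off in the quoted text concrete, here is a minimal Python
sketch, not PostgreSQL code. It assumes workers emit partial rows shaped as
(grouping set id, group key, partial count); the row layout, the sample data,
and both function names are hypothetical. The first function treats the
grouping set id as the leading key of one shared sort; the second dispatches
rows by grouping set id and sorts each bucket separately.

```python
from collections import defaultdict

# Hypothetical partial results from workers:
# (grouping_set_id, group_key, partial_count)
partials = [
    (1, ("a",), 2),
    (2, ("a", "x"), 1),
    (1, ("b",), 3),
    (2, ("a", "x"), 4),
    (1, ("a",), 5),
    (2, ("b", "y"), 1),
]


def finalize_with_gset_key(rows):
    """One shared sort; the grouping set id is the leading sort column."""
    out = {}
    # Sorting all rows together also orders the sets relative to each other,
    # which the final aggregate does not actually need.
    for gset_id, key, cnt in sorted(rows):
        out[(gset_id, key)] = out.get((gset_id, key), 0) + cnt
    return out


def finalize_dispatched(rows):
    """Route rows by grouping set id, then sort each bucket on its own."""
    buckets = defaultdict(list)
    for gset_id, key, cnt in rows:
        buckets[gset_id].append((key, cnt))
    out = {}
    # Each bucket is sorted independently; no ordering between sets is imposed.
    for gset_id, bucket in buckets.items():
        for key, cnt in sorted(bucket):
            out[(gset_id, key)] = out.get((gset_id, key), 0) + cnt
    return out


by_key = finalize_with_gset_key(partials)
dispatched = finalize_dispatched(partials)
```

Both functions produce the same final groups. The difference is only in how
much sorting work is shared, and the dispatched version keeps one buffer per
grouping set alive, which is the memory concern raised above.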

That's something we need to test, thanks. Meanwhile, for the approach of
using a "normal aggregate" with the grouping set id, one concern is that it
cannot use Mixed Hashed, which means that if the grouping sets contain both
non-hashable and non-sortable sets, it will fall back to one-phase
aggregation.
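
The concern above can be sketched as a toy model in Python. This is not the
PostgreSQL planner: the GroupingSet class, its hashable/sortable flags, and
the function name are all hypothetical, and preferring "hashed" when both
strategies apply is an arbitrary choice. The only assumption taken from the
message is that a single "normal aggregate" must use one strategy for every
grouping set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupingSet:
    """Hypothetical description of one grouping set's key columns."""
    name: str
    hashable: bool
    sortable: bool


def single_strategy_for_final_agg(sets):
    """Return "hashed" or "sorted" if one strategy covers every set, else None.

    None models the case from the message: some sets can only be sorted and
    others can only be hashed, so only a mixed strategy would work and this
    toy model reports that a fallback to one-phase aggregation is needed.
    """
    if all(s.hashable for s in sets):
        return "hashed"
    if all(s.sortable for s in sets):
        return "sorted"
    return None


both_ok = [GroupingSet("s1", True, True), GroupingSet("s2", True, True)]
sort_only = [GroupingSet("s1", False, True), GroupingSet("s2", True, True)]
conflict = [GroupingSet("s1", False, True), GroupingSet("s2", True, False)]
```

Here `conflict` mixes a non-hashable set with a non-sortable set, so
`single_strategy_for_final_agg(conflict)` returns None in this model.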
