Re: [HACKERS] Partition-wise aggregation/grouping - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: [HACKERS] Partition-wise aggregation/grouping
Date
Msg-id 23356.1493363038@localhost
Whole thread Raw
In response to Re: [HACKERS] Partition-wise aggregation/grouping  (Jeevan Chalke <jeevan.chalke@enterprisedb.com>)
Responses Re: [HACKERS] Partition-wise aggregation/grouping
List pgsql-hackers
Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote:

> On Thu, Apr 27, 2017 at 4:53 PM, Antonin Houska <ah@cybertec.at> wrote:
>
> > Robert Haas <robertmhaas@gmail.com> wrote:
> > > Well, I'm confused. I see that there's a relationship between what
> > > Antonin is trying to do and what Jeevan is trying to do, but I can't
> > > figure out whether one is a subset of the other, whether they're both
> > > orthogonal, or something else. This plan looks similar to what I
> > > would expect Jeevan's patch to produce,

> >  The point is that the patch Jeevan wanted to work on is actually a subset of
> >  [1] combined with [2].

> Seems like, as you are targeting every relation whether or not it is
> partitioned.

Yes.

> With my patch, I am getting following plan where we push entire
> aggregation below append.
>
> QUERY PLAN
> ------------------------------------------
> Append
> -> HashAggregate
> Group Key: b_1.j
> -> Hash Join
> Hash Cond: (b_1.j = c_1.k)
> -> Seq Scan on b_1
> -> Hash
> -> Seq Scan on c_1
> -> HashAggregate
> Group Key: b_2.j
> -> Hash Join
> Hash Cond: (b_2.j = c_2.k)
> -> Seq Scan on b_2
> -> Hash
> -> Seq Scan on c_2
> (15 rows)

I think this is not generic enough because the result of the Append plan can
be joined to another relation. As such a join can duplicate the
already-aggregated values, the aggregates should not be finalized below the
top-level plan.

> Antonin, I have tried applying your patch on master but it doesn't get
> apply. Can you please provide the HEAD and any other changes required
> to be applied first?

I've lost that information. I'll post a new version to the [1] thread asap.

> How the plan look like when GROUP BY key does not match with the
> partitioning key i.e. GROUP BY b.v ?

EXPLAIN (COSTS false)
SELECT          b.v, avg(b.v + c.v)
FROM            b               JOIN               c ON b.j = c.k
GROUP BY        b.v;
                  QUERY PLAN
------------------------------------------------Finalize HashAggregate  Group Key: b_1.v  ->  Append        ->  Partial
HashAggregate             Group Key: b_1.v              ->  Hash Join                    Hash Cond: (b_1.j = c_1.k)
              ->  Seq Scan on b_1                    ->  Hash                          ->  Seq Scan on c_1        ->
PartialHashAggregate              Group Key: b_2.v              ->  Hash Join                    Hash Cond: (b_2.j =
c_2.k)                   ->  Seq Scan on b_2                    ->  Hash                          ->  Seq Scan on c_2 


> >  [1] https://www.postgresql.org/message-id/9666.1491295317%40localhost
> >
> >  [2] https://commitfest.postgresql.org/14/994/


--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



pgsql-hackers by date:

Previous
From: Konstantin Knizhnik
Date:
Subject: [HACKERS] Bug in prepared statement cache invalidation?
Next
From: Kyotaro HORIGUCHI
Date:
Subject: [HACKERS] PQhost may return socket dir for network connection