Re: Relids in upper relations - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Relids in upper relations
Date
Msg-id 4081.1476052415@sss.pgh.pa.us
Whole thread Raw
In response to Re: Relids in upper relations  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Relids in upper relations  (David Rowley <david.rowley@2ndquadrant.com>)
Re: Relids in upper relations  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Oct 5, 2016 at 9:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think this is fundamentally wrongheaded.  If we go that route,
>> the only valid relids for any upper path would be the union of all
>> baserel RTIs, ...

> Hmm, but this is only true if the upper steps are always done last.
> Hackers on this list have been hoping to reorder joins with aggregates
> since at least 2008 - probably sooner, but that's when I started
> reading this mailing list.   A simple example is:

> SELECT order_line.order_id, order.customer_id, SUM(order_line.amount)
> FROM order_line, order WHERE order_line.order_id = order.order_id
> GROUP BY 1,2;

> Doing the aggregation step first is likely to be much faster than
> doing the join first here,

Please provide some reason to believe that.  It's the nature of an
aggregate that it's sensitive to the number of rows going through it,
with only a tiny number of exceptions (and SUM ain't one).  So you could
only push it down past joins that won't change the number of rows the
aggregate will process, and how is that going to make it any faster?

I'm also dubious that doing the aggregate first could even be correct
in your example, because I sure don't see how you'd group by
order.customer_id before joining.  But that's an artifact of this
example not a general point.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: autonomous transactions
Next
From: David Rowley
Date:
Subject: Re: Relids in upper relations