Thread: [HACKERS] Print correct startup cost for the group aggregate.

[HACKERS] Print correct startup cost for the group aggregate.

From
Rushabh Lathia
Date:
Hi,

While reading through cost_agg() I found that the startup cost for the
group aggregate is not assigned correctly. Because of this, the explain
plan does not print the correct startup cost.

Without patch:

postgres=# explain select aid, sum(abalance) from pgbench_accounts where filler like '%foo%' group by aid;
                                     QUERY PLAN                                     
-------------------------------------------------------------------------------------
 GroupAggregate  (cost=81634.33..85102.04 rows=198155 width=12)
   Group Key: aid
   ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
         Sort Key: aid
         ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155 width=8)
               Filter: (filler ~~ '%foo%'::text)
(6 rows)

With patch:

postgres=# explain select aid, sum(abalance) from pgbench_accounts where filler like '%foo%' group by aid;
                                     QUERY PLAN                                     
-------------------------------------------------------------------------------------
 GroupAggregate  (cost=82129.72..85102.04 rows=198155 width=12)
   Group Key: aid
   ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
         Sort Key: aid
         ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155 width=8)
               Filter: (filler ~~ '%foo%'::text)
(6 rows)

PFA patch to correct the same.

Regards,
Rushabh Lathia
Attachment

Re: [HACKERS] Print correct startup cost for the group aggregate.

From
Ashutosh Bapat
Date:
On Thu, Mar 2, 2017 at 6:06 PM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
> Hi,
>
> While reading through the cost_agg() I found that startup cost for the
> group aggregate is not correctly assigned. Due to this explain plan is
> not printing the correct startup cost.
>
> Without patch:
>
> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
> filler like '%foo%' group by aid;
>                                      QUERY PLAN
> -------------------------------------------------------------------------------------
>  GroupAggregate  (cost=81634.33..85102.04 rows=198155 width=12)
>    Group Key: aid
>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
>          Sort Key: aid
>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155
> width=8)
>                Filter: (filler ~~ '%foo%'::text)
> (6 rows)
>
> With patch:
>
> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
> filler like '%foo%' group by aid;
>                                      QUERY PLAN
> -------------------------------------------------------------------------------------
>  GroupAggregate  (cost=82129.72..85102.04 rows=198155 width=12)
>    Group Key: aid
>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
>          Sort Key: aid
>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155
> width=8)
>                Filter: (filler ~~ '%foo%'::text)
> (6 rows)
>

The reason why startup_cost = input_startup_cost and not
input_total_cost for aggregation by sorting is that we don't need the
whole input before the Group/Agg plan can produce its first row. But I
think setting startup_cost = input_startup_cost is also not exactly
correct. Before the plan can produce one row, it has to transit
through all the rows belonging to the group to which the first row
belongs. On average it has to scan (total number of rows)/(number of
groups) rows before producing the first aggregated row. startup_cost
would be input_startup_cost + cost to scan (total number of
rows)/(number of groups) rows + cost of transiting over those many
rows. Total cost = startup_cost + cost of scanning and transiting
through the remaining input rows.
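The proposal above can be sketched as arithmetic (a hedged
illustration in Python with hypothetical names and a made-up per-row
transition cost; not actual planner code):

```python
def proposed_group_agg_costs(input_startup, input_total, input_rows,
                             num_groups, transition_cost_per_row):
    """Sketch of the proposed costing: charge the first group's worth of
    input rows (and their transition-function calls) to startup cost."""
    # average cost to pull one row out of the input path
    per_row_input_cost = (input_total - input_startup) / input_rows
    rows_in_first_group = input_rows / num_groups  # average group size
    startup = (input_startup
               + rows_in_first_group * (per_row_input_cost
                                        + transition_cost_per_row))
    total = (startup
             + (input_rows - rows_in_first_group) * (per_row_input_cost
                                                     + transition_cost_per_row))
    return startup, total

# Sort node costs from the plans above; aid is effectively unique here,
# so the first "group" is about one row and startup barely moves.
# The 0.0025 per-row transition cost is an invented illustration value.
s, t = proposed_group_agg_costs(81634.33, 82129.72, 198155, 198155, 0.0025)
```

The total cost comes out the same as before; the proposal only moves
the first group's share of the work into the startup cost.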

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] Print correct startup cost for the group aggregate.

From
Ashutosh Bapat
Date:
On Thu, Mar 2, 2017 at 6:48 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> On Thu, Mar 2, 2017 at 6:06 PM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
>> Hi,
>>
>> While reading through the cost_agg() I found that startup cost for the
>> group aggregate is not correctly assigned. Due to this explain plan is
>> not printing the correct startup cost.
>>
>> Without patch:
>>
>> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
>> filler like '%foo%' group by aid;
>>                                      QUERY PLAN
>> -------------------------------------------------------------------------------------
>>  GroupAggregate  (cost=81634.33..85102.04 rows=198155 width=12)
>>    Group Key: aid
>>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
>>          Sort Key: aid
>>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155
>> width=8)
>>                Filter: (filler ~~ '%foo%'::text)
>> (6 rows)
>>
>> With patch:
>>
>> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
>> filler like '%foo%' group by aid;
>>                                      QUERY PLAN
>> -------------------------------------------------------------------------------------
>>  GroupAggregate  (cost=82129.72..85102.04 rows=198155 width=12)
>>    Group Key: aid
>>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
>>          Sort Key: aid
>>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155
>> width=8)
>>                Filter: (filler ~~ '%foo%'::text)
>> (6 rows)
>>
>
> The reason why startup_cost = input_startup_cost and not
> input_total_cost for aggregation by sorting is we don't need the whole
> input before the Group/Agg plan can produce the first row. But I think
> setting startup_cost = input_startup_cost is also not exactly correct.
> Before the plan can produce one row, it has to transit through all the
> rows belonging to the group to which the first row belongs. On an
> average it has to scan (total number of rows)/(number of groups)
> before producing the first aggregated row. startup_cost will be
> input_startup_cost + cost to scan (total number of rows)/(number of
> groups) rows + cost of transiting over those many rows. Total cost =
> startup_cost + cost of scanning and transiting through the remaining
> number of input rows.

The comment below from cost_agg() may be the reason why we don't
bother including the cost of aggregating the first group in the
startup cost:

    * Note: in this cost model, AGG_SORTED and AGG_HASHED have exactly the
    * same total CPU cost, but AGG_SORTED has lower startup cost.  If the
    * input path is already sorted appropriately, AGG_SORTED should be
    * preferred (since it has no risk of memory overflow).  This will happen
    * as long as the computed total costs are indeed exactly equal --- but if
    * there's roundoff error we might do the wrong thing.  So be sure that
    * the computations below form the same intermediate values in the same
    * order.

Particularly the last sentence implies that if we compute intermediate
values differently for the hash and sorted aggregates, we risk
floating point rounding errors. Including the cost of aggregating the
first group in the startup cost does exactly that. Probably having a
consistent total cost is more important than having a correct startup
cost, as long as the sorted aggregate's startup cost stays lower than
the hash aggregate's when the input is sorted per the grouping key.
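The roundoff hazard that comment warns about is easy to reproduce:
summing the same terms in a different order can yield different IEEE
754 doubles (an illustrative Python snippet, not planner code):

```python
# The same three cost terms, summed in two different groupings:
a, b, c = 0.1, 0.2, 0.3
left_to_right = (a + b) + c   # how one code path might accumulate the total
regrouped = a + (b + c)       # a mathematically equal, differently ordered sum
# The two totals differ in the last bit -- exactly the kind of
# discrepancy that could make two "equal" total costs compare unequal.
```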
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] Print correct startup cost for the group aggregate.

From
Robert Haas
Date:
On Thu, Mar 2, 2017 at 6:48 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> On Thu, Mar 2, 2017 at 6:06 PM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
>> While reading through the cost_agg() I found that startup cost for the
>> group aggregate is not correctly assigned. Due to this explain plan is
>> not printing the correct startup cost.
>>
>> Without patch:
>>
>> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
>> filler like '%foo%' group by aid;
>>                                      QUERY PLAN
>> -------------------------------------------------------------------------------------
>>  GroupAggregate  (cost=81634.33..85102.04 rows=198155 width=12)
>>    Group Key: aid
>>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
>>          Sort Key: aid
>>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155
>> width=8)
>>                Filter: (filler ~~ '%foo%'::text)
>> (6 rows)
>>
>> With patch:
>>
>> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
>> filler like '%foo%' group by aid;
>>                                      QUERY PLAN
>> -------------------------------------------------------------------------------------
>>  GroupAggregate  (cost=82129.72..85102.04 rows=198155 width=12)
>>    Group Key: aid
>>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
>>          Sort Key: aid
>>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155
>> width=8)
>>                Filter: (filler ~~ '%foo%'::text)
>> (6 rows)
>>
>
> The reason why startup_cost = input_startup_cost and not
> input_total_cost for aggregation by sorting is we don't need the whole
> input before the Group/Agg plan can produce the first row. But I think
> setting startup_cost = input_startup_cost is also not exactly correct.
> Before the plan can produce one row, it has to transit through all the
> rows belonging to the group to which the first row belongs. On an
> average it has to scan (total number of rows)/(number of groups)
> before producing the first aggregated row. startup_cost will be
> input_startup_cost + cost to scan (total number of rows)/(number of
> groups) rows + cost of transiting over those many rows. Total cost =
> startup_cost + cost of scanning and transiting through the remaining
> number of input rows.

While that idea has some merit, I think it's inconsistent with current
practice.  cost_seqscan(), for example, doesn't include the cost of
reading the first page in the startup cost, even though that certainly
must be done before returning the first row.  I think there have been
previous discussions of switching over to the practice for which you
are advocating here, but my impression (without researching) is that
the current practice is more like what Rushabh did.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Print correct startup cost for the group aggregate.

From
Ashutosh Bapat
Date:
On Sat, Mar 4, 2017 at 2:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Mar 2, 2017 at 6:48 PM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> On Thu, Mar 2, 2017 at 6:06 PM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
>>> While reading through the cost_agg() I found that startup cost for the
>>> group aggregate is not correctly assigned. Due to this explain plan is
>>> not printing the correct startup cost.
>>>
>>> Without patch:
>>>
>>> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
>>> filler like '%foo%' group by aid;
>>>                                      QUERY PLAN
>>> -------------------------------------------------------------------------------------
>>>  GroupAggregate  (cost=81634.33..85102.04 rows=198155 width=12)
>>>    Group Key: aid
>>>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
>>>          Sort Key: aid
>>>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155
>>> width=8)
>>>                Filter: (filler ~~ '%foo%'::text)
>>> (6 rows)
>>>
>>> With patch:
>>>
>>> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
>>> filler like '%foo%' group by aid;
>>>                                      QUERY PLAN
>>> -------------------------------------------------------------------------------------
>>>  GroupAggregate  (cost=82129.72..85102.04 rows=198155 width=12)
>>>    Group Key: aid
>>>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
>>>          Sort Key: aid
>>>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155
>>> width=8)
>>>                Filter: (filler ~~ '%foo%'::text)
>>> (6 rows)
>>>
>>
>> The reason why startup_cost = input_startup_cost and not
>> input_total_cost for aggregation by sorting is we don't need the whole
>> input before the Group/Agg plan can produce the first row. But I think
>> setting startup_cost = input_startup_cost is also not exactly correct.
>> Before the plan can produce one row, it has to transit through all the
>> rows belonging to the group to which the first row belongs. On an
>> average it has to scan (total number of rows)/(number of groups)
>> before producing the first aggregated row. startup_cost will be
>> input_startup_cost + cost to scan (total number of rows)/(number of
>> groups) rows + cost of transiting over those many rows. Total cost =
>> startup_cost + cost of scanning and transiting through the remaining
>> number of input rows.
>
> While that idea has some merit, I think it's inconsistent with current
> practice.  cost_seqscan(), for example, doesn't include the cost of
> reading the first page in the startup cost, even though that certainly
> must be done before returning the first row.

OTOH, while costing a merge join, initial_cost_mergejoin() seems to
include in the startup_cost the cost of scanning the outer and inner
relations up to the first matching tuple on both sides. Different
policies at different places.
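A rough sketch of that merge-join policy (hypothetical names; the real
initial_cost_mergejoin() derives the skipped fractions from sort-key
selectivity estimates):

```python
def mergejoin_startup_sketch(outer_startup, outer_run_cost, outer_skip_fraction,
                             inner_startup, inner_run_cost, inner_skip_fraction):
    """Rough analogue of the merge-join policy: both inputs must be
    scanned past their non-matching prefixes before the first join row,
    so that scanning work is charged to startup cost."""
    return (outer_startup + outer_run_cost * outer_skip_fraction
            + inner_startup + inner_run_cost * inner_skip_fraction)
```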

> I think there have been
> previous discussions of switching over to the practice for which you
> are advocating here, but my impression (without researching) is that
> the current practice is more like what Rushabh did.
>

I am not sure Rushabh's approach is correct. Here's the excerpt from my mail.

>> The reason why startup_cost = input_startup_cost and not
>> input_total_cost for aggregation by sorting is we don't need the whole
>> input before the Group/Agg plan can produce the first row.

With Rushabh's approach the startup cost of aggregation by sorting
would be way higher than it is right now. Secondly, it would match
that of hash aggregation and thus force hash aggregation to be chosen
in almost all the cases.
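Plugging the Sort node's costs from the plans above into the competing
definitions makes the concern concrete (a sketch using numbers from
this thread's EXPLAIN output):

```python
# Startup cost of the GroupAggregate under each definition, using the
# Sort node's costs from the EXPLAIN output earlier in this thread.
sort_startup, sort_total = 81634.33, 82129.72

current_startup = sort_startup  # unpatched: input_startup_cost
patched_startup = sort_total    # Rushabh's patch: input_total_cost

# A hash aggregate cannot emit anything before consuming its whole
# input, so its startup cost is also at least the input's total cost;
# the patched sorted-aggregate startup no longer undercuts it.
```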
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] Print correct startup cost for the group aggregate.

From
Rushabh Lathia
Date:
Thanks Ashutosh & Robert for the explanation.

On Mon, Mar 6, 2017 at 10:02 AM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
On Sat, Mar 4, 2017 at 2:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Mar 2, 2017 at 6:48 PM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> On Thu, Mar 2, 2017 at 6:06 PM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
>>> While reading through the cost_agg() I found that startup cost for the
>>> group aggregate is not correctly assigned. Due to this explain plan is
>>> not printing the correct startup cost.
>>>
>>> Without patch:
>>>
>>> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
>>> filler like '%foo%' group by aid;
>>>                                      QUERY PLAN
>>> -------------------------------------------------------------------------------------
>>>  GroupAggregate  (cost=81634.33..85102.04 rows=198155 width=12)
>>>    Group Key: aid
>>>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
>>>          Sort Key: aid
>>>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155
>>> width=8)
>>>                Filter: (filler ~~ '%foo%'::text)
>>> (6 rows)
>>>
>>> With patch:
>>>
>>> postgres=# explain select aid, sum(abalance) from pgbench_accounts where
>>> filler like '%foo%' group by aid;
>>>                                      QUERY PLAN
>>> -------------------------------------------------------------------------------------
>>>  GroupAggregate  (cost=82129.72..85102.04 rows=198155 width=12)
>>>    Group Key: aid
>>>    ->  Sort  (cost=81634.33..82129.72 rows=198155 width=8)
>>>          Sort Key: aid
>>>          ->  Seq Scan on pgbench_accounts  (cost=0.00..61487.89 rows=198155
>>> width=8)
>>>                Filter: (filler ~~ '%foo%'::text)
>>> (6 rows)
>>>
>>
>> The reason why startup_cost = input_startup_cost and not
>> input_total_cost for aggregation by sorting is we don't need the whole
>> input before the Group/Agg plan can produce the first row. But I think
>> setting startup_cost = input_startup_cost is also not exactly correct.
>> Before the plan can produce one row, it has to transit through all the
>> rows belonging to the group to which the first row belongs. On an
>> average it has to scan (total number of rows)/(number of groups)
>> before producing the first aggregated row. startup_cost will be
>> input_startup_cost + cost to scan (total number of rows)/(number of
>> groups) rows + cost of transiting over those many rows. Total cost =
>> startup_cost + cost of scanning and transiting through the remaining
>> number of input rows.
>
> While that idea has some merit, I think it's inconsistent with current
> practice.  cost_seqscan(), for example, doesn't include the cost of
> reading the first page in the startup cost, even though that certainly
> must be done before returning the first row.

OTOH, while costing for merge join, initial_cost_mergejoin() seems to
consider the cost of scanning outer and inner relations up to the first
matching tuple on both sides in the startup_cost. Different policies
at different places.

> I think there have been
> previous discussions of switching over to the practice for which you
> are advocating here, but my impression (without researching) is that
> the current practice is more like what Rushabh did.
>

I am not sure Rushabh's approach is correct. Here's the excerpt from my mail.

>> The reason why startup_cost = input_startup_cost and not
>> input_total_cost for aggregation by sorting is we don't need the whole
>> input before the Group/Agg plan can produce the first row.

 
With Rushabh's approach the startup cost of aggregation by sorting
would be way higher than it is right now. Secondly, it would match that
of hash aggregation and thus force hash aggregation to be chosen in
almost all the cases.

I understood your reasoning for why startup_cost = input_startup_cost
and not input_total_cost for aggregation by sorting. But what I didn't
understand is how a higher startup cost for aggregation by sorting
would force hash aggregation to be chosen? I am not clear about this
part.




Regards,
Rushabh Lathia

Re: [HACKERS] Print correct startup cost for the group aggregate.

From
Ashutosh Bapat
Date:
>
>
> I understood you reasoning of why startup_cost = input_startup_cost and not
> input_total_cost for aggregation by sorting. But what I didn't understand is
> how come higher startup cost for aggregation by sorting would force hash
> aggregation to be chosen? I am not clear about this part.

See this comment in cost_agg():

    * Note: in this cost model, AGG_SORTED and AGG_HASHED have exactly the
    * same total CPU cost, but AGG_SORTED has lower startup cost.  If the
    * input path is already sorted appropriately, AGG_SORTED should be
    * preferred (since it has no risk of memory overflow).

AFAIU, if the input is already sorted, aggregation by sorting and
aggregation by hashing will have almost the same total cost, with the
startup cost of AGG_SORTED being lower than that of AGG_HASHED.
Because of the lower startup cost, AGG_SORTED gets chosen by
add_path() over the AGG_HASHED path.
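A toy version of that comparison (ignoring pathkeys, parameterization,
and the fuzzy cost comparison the real add_path() uses):

```python
def pick_path(sorted_agg, hashed_agg):
    """Toy tie-break: each path is a (startup_cost, total_cost) pair.
    With equal total costs, the lower startup cost wins, which is how
    AGG_SORTED beats AGG_HASHED on already-sorted input."""
    s_startup, s_total = sorted_agg
    h_startup, h_total = hashed_agg
    if s_total != h_total:
        return "AGG_SORTED" if s_total < h_total else "AGG_HASHED"
    return "AGG_SORTED" if s_startup <= h_startup else "AGG_HASHED"

# Equal totals (numbers from this thread's plans), sorted aggregation
# has the lower startup cost and wins the tie-break.
choice = pick_path((81634.33, 85102.04), (85102.04, 85102.04))
```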

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] Print correct startup cost for the group aggregate.

From
Robert Haas
Date:
On Sun, Mar 5, 2017 at 11:32 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
>> I think there have been
>> previous discussions of switching over to the practice for which you
>> are advocating here, but my impression (without researching) is that
>> the current practice is more like what Rushabh did.
>
> I am not sure Rushabh's approach is correct. Here's the excerpt from my mail.
>
>>> The reason why startup_cost = input_startup_cost and not
>>> input_total_cost for aggregation by sorting is we don't need the whole
>>> input before the Group/Agg plan can produce the first row.
>
> With Rushabh's approach the startup cost of aggregation by sorting
> would be way higher than it is right now. Secondly, it would match that
> of hash aggregation and thus force hash aggregation to be chosen in
> almost all the cases.

You're right.  I'm wrong.  I take it all back.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company