Thread: [HACKERS] AGG_HASHED cost estimate

[HACKERS] AGG_HASHED cost estimate

From
Jeff Janes
Date:

In costsize.c, there is this comment block:

+        * Note: in this cost model, AGG_SORTED and AGG_HASHED have exactly the
+        * same total CPU cost, but AGG_SORTED has lower startup cost.  If the
+        * input path is already sorted appropriately, AGG_SORTED should be
+        * preferred (since it has no risk of memory overflow).  This will happen
+        * as long as the computed total costs are indeed exactly equal --- but
+        * if there's roundoff error we might do the wrong thing.  So be sure
+        * that the computations below form the same intermediate values in the
+        * same order.

But, why should they have the same cost in the first place?  A sorted aggregation just has to do an equality comparison on each adjacent pair, which is costed as (cpu_operator_cost * numGroupCols) * input_tuples.   A hash aggregation has to do a hashcode computation for each input, which apparently is also costed at (cpu_operator_cost * numGroupCols) * input_tuples.  But it also has to do the equality comparison between the input tuple and any aggregation it is about to be aggregated into, to make sure the hashcode is not a false collision.  That should be another (cpu_operator_cost * numGroupCols) * (input_tuples - numGroups), shouldn't it?  Currently, that is apparently assumed to be free.  
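
To spell that out, the charges look something like this -- a paraphrase of the cost_agg arithmetic, not the actual costsize.c code, with the last line being the charge that seems to be missing:

    /* AGG_SORTED: one comparison per grouping column per input tuple */
    total_cost += (cpu_operator_cost * numGroupCols) * input_tuples;

    /* AGG_HASHED: one hash computation per grouping column per input tuple */
    startup_cost += (cpu_operator_cost * numGroupCols) * input_tuples;

    /* AGG_HASHED, apparently not charged: equality re-check against the
     * matched group, to rule out false hash collisions */
    startup_cost += (cpu_operator_cost * numGroupCols) * (input_tuples - numGroups);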

Is there something here I am overlooking?

Cheers,

Jeff

Re: [HACKERS] AGG_HASHED cost estimate

From
Ashutosh Bapat
Date:
On Thu, Apr 20, 2017 at 7:47 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>
> In costsize.c, there is this comment block:
>
> +        * Note: in this cost model, AGG_SORTED and AGG_HASHED have exactly
> the
> +        * same total CPU cost, but AGG_SORTED has lower startup cost.  If
> the
> +        * input path is already sorted appropriately, AGG_SORTED should be
> +        * preferred (since it has no risk of memory overflow).  This will
> happen
> +        * as long as the computed total costs are indeed exactly equal ---
> but
> +        * if there's roundoff error we might do the wrong thing.  So be
> sure
> +        * that the computations below form the same intermediate values in
> the
> +        * same order.
>
> But, why should they have the same cost in the first place?  A sorted
> aggregation just has to do an equality comparison on each adjacent pair,
> which is costed as (cpu_operator_cost * numGroupCols) * input_tuples.   A
> hash aggregation has to do a hashcode computation for each input, which
> apparently is also costed at (cpu_operator_cost * numGroupCols) *
> input_tuples.

I suspect we don't cost this.

> But it also has to do the equality comparison between the
> input tuple and any aggregation it is about to be aggregated into, to make
> sure the hashcode is not a false collision.  That should be another
> (cpu_operator_cost * numGroupCols) * (input_tuples - numGroups), shouldn't
> it?

but cost this without numGroups.
    /*
     * The transCost.per_tuple component of aggcosts should be charged once
     * per input tuple, corresponding to the costs of evaluating the aggregate
     * transfns and their input expressions (with any startup cost of course
     * charged but once).  The finalCost component is charged once per output
     * tuple, corresponding to the costs of evaluating the finalfns.
     *
     * If we are grouping, we charge an additional cpu_operator_cost per
     * grouping column per input tuple for grouping comparisons.
     *

The reason may be that hashing isn't as costly as a comparison. I
don't know how true that is.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] AGG_HASHED cost estimate

From
Dilip Kumar
Date:
On Thu, Apr 20, 2017 at 4:16 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> but cost this without numGroups.
>
>     /*
>      * The transCost.per_tuple component of aggcosts should be charged once
>      * per input tuple, corresponding to the costs of evaluating the aggregate
>      * transfns and their input expressions (with any startup cost of course
>      * charged but once).  The finalCost component is charged once per output
>      * tuple, corresponding to the costs of evaluating the finalfns.
>      *
>      * If we are grouping, we charge an additional cpu_operator_cost per
>      * grouping column per input tuple for grouping comparisons.
>      *
>
> The reason may be that hashing isn't as costly as a comparison. I
> don't know how true that is.

Earlier in the GatherMerge thread[1], Rushabh mentioned that HashAggregate
was getting picked in cases where a grouping aggregate with GatherMerge
was actually faster at execution time, and he suspected a problem with
the costing of HashAggregate. Maybe this is one of those cases?

[1]https://www.postgresql.org/message-id/CAGPqQf2164iV6k-_M75qEZWiCfRarA_SKSmHjc0Uh1rEf5RJrA%40mail.gmail.com


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] AGG_HASHED cost estimate

From
Jeff Janes
Date:
On Thu, Apr 20, 2017 at 3:46 AM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
On Thu, Apr 20, 2017 at 7:47 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>
> In costsize.c, there is this comment block:
>
> +        * Note: in this cost model, AGG_SORTED and AGG_HASHED have exactly
> the
> +        * same total CPU cost, but AGG_SORTED has lower startup cost.  If
> the
> +        * input path is already sorted appropriately, AGG_SORTED should be
> +        * preferred (since it has no risk of memory overflow).  This will
> happen
> +        * as long as the computed total costs are indeed exactly equal ---
> but
> +        * if there's roundoff error we might do the wrong thing.  So be
> sure
> +        * that the computations below form the same intermediate values in
> the
> +        * same order.
>
> But, why should they have the same cost in the first place?  A sorted
> aggregation just has to do an equality comparison on each adjacent pair,
> which is costed as (cpu_operator_cost * numGroupCols) * input_tuples.   A
> hash aggregation has to do a hashcode computation for each input, which
> apparently is also costed at (cpu_operator_cost * numGroupCols) *
> input_tuples.

I suspect we don't cost this.

> But it also has to do the equality comparison between the
> input tuple and any aggregation it is about to be aggregated into, to make
> sure the hashcode is not a false collision.  That should be another
> (cpu_operator_cost * numGroupCols) * (input_tuples - numGroups), shouldn't
> it?

but cost this without numGroups.

    /*
     * The transCost.per_tuple component of aggcosts should be charged once
     * per input tuple, corresponding to the costs of evaluating the aggregate
     * transfns and their input expressions (with any startup cost of course
     * charged but once).  The finalCost component is charged once per output
     * tuple, corresponding to the costs of evaluating the finalfns.
     *
     * If we are grouping, we charge an additional cpu_operator_cost per
     * grouping column per input tuple for grouping comparisons.
     *


The hash join code uses cpu_operator_cost * num_hashclauses (in initial_cost_hashjoin), so I was going by analogy with that in interpreting what kind of grouping comparison it was referring to here--the initial hash comparison or the final full-data comparison.  But yeah, I can see how it was probably meant the other way.  Regardless, it seems like something is getting overlooked.  The way final_cost_hashjoin charges for the actual data comparison is via pg_proc.procost, rather than just assuming 1.0.  I don't know if we would want to go to that effort in cost_agg or not; I assume that there was a reason the code was put in final_cost_hashjoin rather than initial_cost_hashjoin.  Anyway, assuming 1.0 is going to be a lot closer to reality than assuming 0.0 is, if we don't want to go through the work of looking up the actual procost.
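
(For reference, my recollection of that mechanism, paraphrased rather than quoted from costsize.c: final_cost_hashjoin runs the hash clauses through cost_qual_eval(), and the per-tuple figure that comes back already folds in the procost of each comparison operator, roughly:

    QualCost    hash_qual_cost;

    cost_qual_eval(&hash_qual_cost, hashclauses, root);
    /* hash_qual_cost.per_tuple reflects procost * cpu_operator_cost
     * for each hash clause's equality operator */

so nothing like a flat 1.0 is assumed there.)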

The initial_cost_hashjoin also throws in an addition of cpu_tuple_cost, "to model the costs of inserting the row into the hashtable". Based on the gprof and perf output of some very simple aggregates, I would say that cpu_tuple_cost is if anything an underestimate, and that it applies to all the hash table look ups, whether they end up inserting (about numGroups) or finding an existing one (approximately input_tuples - numGroups).  Currently in AGG_HASHED that is charged only for numGroups, although I don't know if that charge is for inserting into the hash table, or for walking the hash table at the end, projecting out tuples.   That it is charged to total_cost rather than startup_cost suggests it is meant to apply to walking the hash table at the end, rather than inserting into it.  Maybe it should be charged both on the way in and on the way out?
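
In code terms, the sketch I have in mind would be roughly this (not tested, just the shape of the charges):

    /* charge the hashtable probe/insert for every input tuple ... */
    startup_cost += cpu_tuple_cost * input_tuples;

    /* ... and keep a per-output-group charge for walking the finished
     * hash table and projecting the result tuples */
    total_cost += cpu_tuple_cost * numGroups;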

The reason may be that hashing isn't as costly as a comparison. I
don't know how true that is.

In a simple test case aggregating on md5(random()::text) strings, according to gprof, it spends about 3 times more time in texteq than it does in hash_any.  (And ~10% of those hash_any calls are not part of the AGG_HASHED node, but of other code paths.)  But according to perf, texteq takes only about 40% more time than hash_any.  I think I'd probably go with perf over gprof here.

Both gprof and perf agree that tuplehash_insert and ExecStoreMinimalTuple are quite a bit more expensive than either texteq or hash_any.  This is with large hash tables (25 million tuples hashed to 3 million aggregates) and I think a lot of the time goes to CPU cache misses, so they might not be so bad if the hash tables were smaller.  I don't know how to model this, though, if we need it to be accurate over both regimes.


Cheers,

Jeff

Re: [HACKERS] AGG_HASHED cost estimate

From
Ashutosh Bapat
Date:
On Thu, Apr 20, 2017 at 11:35 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

> Regardless, it seems like something is
> getting overlooked.

I agree with this.

> The way final_cost_hashjoin charges for the actual data
> comparison is via pg_proc.procost, rather than just assuming 1.0.  I don't
> know if we would want to go to that effort in cost_agg or not;

But aren't we already doing that as (cpu_operator_cost * numGroupCols)
* input_tuples?  Maybe we should use the procost of cur_eq_funcs instead
of a bare cpu_operator_cost.
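
Something like this, perhaps (only a sketch; eq_func_oids is a hypothetical array of the grouping equality functions, get_func_cost() being the existing lsyscache helper):

    Cost        cmp_cost = 0.0;
    int         i;

    /* sum the declared procost of each grouping column's equality function */
    for (i = 0; i < numGroupCols; i++)
        cmp_cost += get_func_cost(eq_func_oids[i]) * cpu_operator_cost;

    startup_cost += cmp_cost * input_tuples;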

> I assume that
> there was a reason the code was put in final_cost_hashjoin rather than
> initial_cost_hashjoin.

I think this is part of final_cost_hashjoin because it might need a
pg_proc cache lookup.  The lookup can be avoided if the initial cost is
already higher than the existing path's cost.


>
> The initial_cost_hashjoin also throws in an addition of cpu_tuple_cost, "to
> model the costs of inserting the row into the hashtable". Based on the gprof
> and perf output of some very simple aggregates, I would say that
> cpu_tuple_cost is if anything an underestimate, and that it applies to all
> the hash table look ups, whether they end up inserting (about numGroups) or
> finding an existing one (approximately input_tuples - numGroups).  Currently
> in AGG_HASHED that is charged only for numGroups, although I don't know if
> that charge is for inserting into the hash table, or for walking the hash
> table at the end, projecting out tuples.   That it is charged to total_cost
> rather than startup_cost suggests it is meant to apply to walking the hash
> table at the end, rather than inserting into it.

Yes. It's for final projection.

> Maybe it should be charged
> both on the way in and on the way out?

Hash lookup and insertion are costed as

startup_cost += (cpu_operator_cost * numGroupCols) * input_tuples;

Maybe that needs to change.
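
Perhaps in this direction (only a sketch, pulling together the points raised so far):

    /* hash each input tuple, as now */
    startup_cost += (cpu_operator_cost * numGroupCols) * input_tuples;
    /* equality re-check for tuples that land in an existing group */
    startup_cost += (cpu_operator_cost * numGroupCols) * (input_tuples - numGroups);
    /* hashtable lookup/insert per input tuple */
    startup_cost += cpu_tuple_cost * input_tuples;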


>
> Both gprof and perf agree that tuplehash_insert and ExecStoreMinimalTuple
> are quite a bit more expensive than either texteq or hash_any.  This is with
> large hash tables (25 million tuples hashed to 3 million aggregates) and I
> think a lot of the time goes to CPU cache misses, so they might not be so
> bad if the hash tables were smaller.  I don't know how to model this,
> though, if we need it to be accurate over both regimes.

I have not seen our cost model account for CPU cache behaviour; it
assumes optimal performance there.  But maybe we want to start
modelling it.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company