Re: [HACKERS] AGG_HASHED cost estimate - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: [HACKERS] AGG_HASHED cost estimate
Msg-id CAMkU=1zuRcvc+6t=MezThR1NATfngVpWXNBaCKnpuD1Mw=PQYw@mail.gmail.com
In response to Re: [HACKERS] AGG_HASHED cost estimate  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: [HACKERS] AGG_HASHED cost estimate  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
On Thu, Apr 20, 2017 at 3:46 AM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
On Thu, Apr 20, 2017 at 7:47 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>
> In costsize.c, there is this comment block:
>
> +        * Note: in this cost model, AGG_SORTED and AGG_HASHED have exactly the
> +        * same total CPU cost, but AGG_SORTED has lower startup cost.  If the
> +        * input path is already sorted appropriately, AGG_SORTED should be
> +        * preferred (since it has no risk of memory overflow).  This will happen
> +        * as long as the computed total costs are indeed exactly equal --- but
> +        * if there's roundoff error we might do the wrong thing.  So be sure
> +        * that the computations below form the same intermediate values in the
> +        * same order.
>
> But why should they have the same cost in the first place?  A sorted
> aggregation just has to do an equality comparison on each adjacent pair,
> which is costed as (cpu_operator_cost * numGroupCols) * input_tuples.  A
> hash aggregation has to do a hashcode computation for each input tuple,
> which apparently is also costed at (cpu_operator_cost * numGroupCols) *
> input_tuples.

I suspect we don't cost this.

> But it also has to do the equality comparison between the
> input tuple and the existing entry it is about to be aggregated into, to make
> sure the hashcode match is not a false collision.  That should be another
> (cpu_operator_cost * numGroupCols) * (input_tuples - numGroups), shouldn't
> it?

but we do cost this, just without the numGroups subtraction:

    /*
     * The transCost.per_tuple component of aggcosts should be charged once
     * per input tuple, corresponding to the costs of evaluating the aggregate
     * transfns and their input expressions (with any startup cost of course
     * charged but once).  The finalCost component is charged once per output
     * tuple, corresponding to the costs of evaluating the finalfns.
     *
     * If we are grouping, we charge an additional cpu_operator_cost per
     * grouping column per input tuple for grouping comparisons.
     *


The hash join code uses cpu_operator_cost * num_hashclauses (in initial_cost_hashjoin), so I was going by analogy with that when interpreting which kind of grouping comparison the comment refers to--the initial hash computation or the final full-data comparison.  But yeah, I can see how it was probably meant the other way.  Regardless, it seems like something is getting overlooked.  The way final_cost_hashjoin charges for the actual data comparison is via pg_proc.procost, rather than just assuming 1.0.  I don't know whether we would want to go to that effort in cost_agg or not; I assume there was a reason that code was put in final_cost_hashjoin rather than initial_cost_hashjoin.  Either way, assuming 1.0 is going to be a lot closer to reality than assuming 0.0 is, if we don't want to go through the work of looking up the actual procost.
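
Concretely, the kind of change I have in mind would look something like this (just a sketch against the AGG_HASHED branch of cost_agg, assuming a flat 1.0 per comparison rather than looking up procost; variable names follow the existing code):

    /* existing charge: one hashcode computation per input tuple */
    startup_cost += (cpu_operator_cost * numGroupCols) * input_tuples;

    /*
     * proposed: one equality recheck per input tuple that lands on an
     * existing entry, to verify the hashcode match is not a collision
     */
    startup_cost += (cpu_operator_cost * numGroupCols) *
        (input_tuples - numGroups);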

The initial_cost_hashjoin code also throws in an additional cpu_tuple_cost, "to model the costs of inserting the row into the hashtable".  Based on the gprof and perf output of some very simple aggregates, I would say that cpu_tuple_cost is, if anything, an underestimate there, and that it applies to all the hash table lookups, whether they end up inserting a new entry (about numGroups of them) or finding an existing one (about input_tuples - numGroups).  Currently AGG_HASHED charges it only for numGroups, and I don't know whether that charge is meant for inserting into the hash table or for walking it at the end to project out tuples.  That it is charged to total_cost rather than startup_cost suggests it is meant for walking the hash table at the end, rather than for inserting into it.  Maybe it should be charged both on the way in and on the way out?
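
If we wanted to charge it on both sides, that might look like this (again just a sketch; the existing charge stays on total_cost since it is paid while emitting output):

    /* proposed: hashtable lookup/insert, once per input tuple */
    startup_cost += cpu_tuple_cost * input_tuples;

    /* existing: walking the table and projecting, once per output group */
    total_cost += cpu_tuple_cost * numGroups;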

The reason may be that hashing isn't as costly as a comparison. I
don't know how true that is.

In a simple test case aggregating on md5(random()::text) strings, according to gprof it spends about 3 times more time in texteq than in hash_any.  (And ~10% of those hash_any calls come not from the AGG_HASHED code but from other code paths.)  But according to perf, it is only about 40% more time in texteq.  I think I'd go with perf over gprof here.

Both gprof and perf agree that tuplehash_insert and ExecStoreMinimalTuple are quite a bit more expensive than either texteq or hash_any.  This is with large hash tables (25 million tuples hashed into 3 million aggregates), and I think a lot of that time goes to CPU cache misses, so they might not be so bad with smaller hash tables.  I don't know how to model this, though, if we need it to be accurate over both regimes.
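
For scale: with the default cpu_operator_cost of 0.0025 and cpu_tuple_cost of 0.01, and assuming a single grouping column, that test case (25 million input tuples, 3 million groups) would pick up roughly

    equality rechecks:  0.0025 * (25e6 - 3e6) =  55,000
    hashtable inserts:  0.01 * 25e6           = 250,000

on top of the 0.0025 * 25e6 = 62,500 already charged for hashing, so these are not rounding-error adjustments.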


Cheers,

Jeff
