Re: significant slowdown of HashAggregate between 9.6 and 10 - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: significant slowdown of HashAggregate between 9.6 and 10
Date
Msg-id 20200603193101.bjoegualnuxtjq4m@development
Whole thread Raw
In response to Re: significant slowdown of HashAggregate between 9.6 and 10  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: significant slowdown of HashAggregate between 9.6 and 10
List pgsql-hackers
Hi,

Not sure what's the root cause, but I can reproduce it. Timings for 9.6,
10 and master (all built from git with the same options) without explain
analyze look like this:

9.6
-----------------
Time: 1971.314 ms
Time: 1995.875 ms
Time: 1997.408 ms
Time: 2069.913 ms
Time: 2004.196 ms

10
-----------------------------
Time: 2815.434 ms (00:02.815)
Time: 2862.589 ms (00:02.863)
Time: 2841.126 ms (00:02.841)
Time: 2803.040 ms (00:02.803)
Time: 2805.527 ms (00:02.806)

master
-----------------------------
Time: 3479.233 ms (00:03.479)
Time: 3537.901 ms (00:03.538)
Time: 3459.314 ms (00:03.459)
Time: 3542.810 ms (00:03.543)
Time: 3482.141 ms (00:03.482)

So there seems to be +40% between 9.6 and 10, and further +25% between
10 and master. However, plain hashagg, measured e.g. like this:

   select count(*) cx from foo group by b, c, d having count(*) = 1;

does not indicate any slowdown at all, so I think you're right it has
something to do with the looping.

Profiles from those versions look like this:

9.6
---------------------------------------------------------
Samples
Overhead  Shared Objec  Symbol
   14.19%  postgres      [.] ExecMakeFunctionResultNoSets
   13.65%  postgres      [.] finalize_aggregates
   12.54%  postgres      [.] hash_seq_search
    6.70%  postgres      [.] finalize_aggregate.isra.0
    5.71%  postgres      [.] ExecEvalParamExec
    5.54%  postgres      [.] ExecEvalAggref
    5.00%  postgres      [.] ExecStoreMinimalTuple
    4.34%  postgres      [.] ExecAgg
    4.08%  postgres      [.] ExecQual
    2.67%  postgres      [.] slot_deform_tuple
    2.24%  postgres      [.] pgstat_init_function_usage
    2.22%  postgres      [.] check_stack_depth
    2.14%  postgres      [.] MemoryContextReset
    1.89%  postgres      [.] hash_search_with_hash_value
    1.72%  postgres      [.] project_aggregates
    1.68%  postgres      [.] pgstat_end_function_usage
    1.59%  postgres      [.] slot_getattr


10
------------------------------------------------------------
Samples
Overhead  Shared Object       Symbol
   15.18%  postgres            [.] slot_deform_tuple
   13.09%  postgres            [.] agg_retrieve_hash_table
   12.02%  postgres            [.] ExecInterpExpr
    7.47%  postgres            [.] finalize_aggregates
    7.38%  postgres            [.] tuplehash_iterate
    5.13%  postgres            [.] prepare_projection_slot
    4.86%  postgres            [.] finalize_aggregate.isra.0
    4.05%  postgres            [.] bms_is_member
    3.97%  postgres            [.] slot_getallattrs
    3.59%  postgres            [.] ExecStoreMinimalTuple
    2.85%  postgres            [.] project_aggregates
    1.95%  postgres            [.] ExecClearTuple
    1.71%  libc-2.30.so        [.] __memset_avx2_unaligned_erms
    1.69%  postgres            [.] ExecEvalParamExec
    1.58%  postgres            [.] MemoryContextReset
    1.17%  postgres            [.] slot_getattr
    1.03%  postgres            [.] slot_getsomeattrs


master
--------------------------------------------------------------
Samples
Overhead  Shared Object       Symbol
   17.07%  postgres            [.] agg_retrieve_hash_table
   15.46%  postgres            [.] tuplehash_iterate
   11.83%  postgres            [.] tts_minimal_getsomeattrs
    9.39%  postgres            [.] ExecInterpExpr
    6.94%  postgres            [.] prepare_projection_slot
    4.85%  postgres            [.] finalize_aggregates
    4.27%  postgres            [.] bms_is_member
    3.80%  postgres            [.] finalize_aggregate.isra.0
    3.80%  postgres            [.] tts_minimal_store_tuple
    2.22%  postgres            [.] project_aggregates
    2.07%  postgres            [.] tts_virtual_clear
    2.07%  postgres            [.] MemoryContextReset
    1.78%  postgres            [.] tts_minimal_clear
    1.61%  postgres            [.] ExecEvalParamExec
    1.46%  postgres            [.] slot_getsomeattrs_int
    1.34%  libc-2.30.so        [.] __memset_avx2_unaligned_erms

Not sure what to think about this. Seems slot_deform_tuple got way more
expensive between 9.6 and 10, for some reason. 


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Parallel copy
Next
From: Bruce Momjian
Date:
Subject: Re: Internal key management system