hashagg slowdown due to spill changes - Mailing list pgsql-hackers
From:      Andres Freund
Subject:   hashagg slowdown due to spill changes
Date:
Msg-id:    20200606041146.slqfg7cuptx27tuy@alap3.anarazel.de
Responses: Re: hashagg slowdown due to spill changes
List:      pgsql-hackers
Hi,

While preparing my pgcon talk I noticed that our hash-agg performance degraded noticeably. Looks to me like it's due to the spilling-hashagg changes.

Sample benchmark:

config (largely just to reduce variance):
  -c huge_pages=on -c shared_buffers=32GB -c jit=0 -c max_parallel_workers_per_gather=0

data prep:
  CREATE TABLE fewgroups_many_rows AS
      SELECT (random() * 4)::int cat, (random() * 10000)::int val
      FROM generate_series(1, 100000000);
  VACUUM (FREEZE, ANALYZE) fewgroups_many_rows;

test prep:
  CREATE EXTENSION IF NOT EXISTS pg_prewarm;
  SELECT pg_prewarm('fewgroups_many_rows', 'buffer');

test:
  SELECT cat, count(*) FROM fewgroups_many_rows GROUP BY 1;

Using best-of-three timing:

  12      12221.031 ms
  master  13855.129 ms

While not the end of the world, that's a definitely noticeable and reproducible slowdown (~12%). I don't think this is actually an inherent cost, but a question of how the code ended up being organized.

Here's a perf diff of profiles for both versions:

  # Baseline  Delta Abs  Shared Object  Symbol
  # ........  .........  .............  .........................................
                 +6.70%  postgres       [.] LookupTupleHashEntryHash
                 +6.37%  postgres       [.] prepare_hash_slot
                 +4.74%  postgres       [.] TupleHashTableHash_internal.isra.0
      20.36%     -2.89%  postgres       [.] ExecInterpExpr
       6.31%     -2.73%  postgres       [.] lookup_hash_entries
                 +2.36%  postgres       [.] lookup_hash_entry
                 +2.14%  postgres       [.] ExecJustAssignScanVar
       2.28%     +1.97%  postgres       [.] ExecScan
       2.54%     +1.93%  postgres       [.] MemoryContextReset
       3.84%     -1.86%  postgres       [.] SeqNext
      10.19%     -1.50%  postgres       [.] tts_buffer_heap_getsomeattrs
                 +1.42%  postgres       [.] hash_bytes_uint32
                 +1.39%  postgres       [.] TupleHashTableHash
                 +1.10%  postgres       [.] tts_virtual_clear
       3.36%     -0.74%  postgres       [.] ExecAgg
                 +0.45%  postgres       [.] CheckForSerializableConflictOutNeeded
       0.25%     +0.44%  postgres       [.] hashint4
       5.80%     -0.35%  postgres       [.] tts_minimal_getsomeattrs
       1.91%     -0.33%  postgres       [.] heap_getnextslot
       4.86%     -0.32%  postgres       [.] heapgettup_pagemode
       1.46%     -0.32%  postgres       [.] tts_minimal_clear

While some of this is likely just noise, it's pretty clear that we spend a substantial amount of additional time below lookup_hash_entries(). And looking at the code, I'm not too surprised:

Before there was basically one call from nodeAgg.c to execGrouping.c for each tuple and hash table. Now it's a lot more complicated:

1) nodeAgg.c:       prepare_hash_slot()
2) execGrouping.c:  TupleHashTableHash()
3) nodeAgg.c:       lookup_hash_entry()
4) execGrouping.c:  LookupTupleHashEntryHash()

For each of these, data needs to be peeled out of one or more of AggState / AggStatePerHashData / TupleHashTable. There's no way the compiler can know that nothing inside those structs changes across the calls, so it has to reload their contents repeatedly. From looking at the profiles, that's where most of the time is going.

There's also the issue that the signalling of whether or not to insert got unnecessarily complicated. There are several checks:

1) lookup_hash_entry():              p_isnew = aggstate->hash_spill_mode ? NULL : &isnew;
2) LookupTupleHashEntry_internal():  if (isnew)
3) lookup_hash_entry():              if (entry == NULL) and if (isnew)
4) lookup_hash_entries():            if (!in_hash_table)

Not performance related: I am a bit confused as to why the new per-hash stuff in lookup_hash_entries() isn't in lookup_hash_entry()? I assume that's because of agg_refill_hash_table()?
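Coming back to the reload issue: to make it concrete, here's a minimal standalone sketch. The toy names (toy_hash(), toy_lookup(), struct aggstate) only stand in for the TupleHashTableHash() / LookupTupleHashEntryHash() calls listed above, this is not the actual PostgreSQL code or signatures. It just mirrors the shape of the split flow, with two module-crossing calls per tuple:

  #include <stdint.h>
  #include <stdio.h>

  struct htable   { uint32_t nbuckets; uint32_t *buckets; };
  struct aggstate { struct htable *ht; int spill_mode; };

  /* stand-ins for the execGrouping.c entry points (toy logic, no collision handling) */
  static uint32_t
  toy_hash(struct htable *ht, uint32_t key)
  {
      return (key * 2654435761u) % ht->nbuckets;
  }

  static uint32_t *
  toy_lookup(struct htable *ht, uint32_t key, uint32_t hash)
  {
      (void) key;
      return &ht->buckets[hash];
  }

  /*
   * Shape of the current flow: two separate calls per tuple, with the caller
   * peeling state->ht out of its state struct before each one.  In PostgreSQL
   * the callees live in a different translation unit, so the compiler cannot
   * assume state->ht (or anything reachable from it) is unchanged after the
   * first call.
   */
  static uint32_t *
  process_tuple_split(struct aggstate *state, uint32_t key)
  {
      uint32_t hash = toy_hash(state->ht, key);   /* first read of state->ht */
      return toy_lookup(state->ht, key, hash);    /* second read: a reload once the callee is opaque */
  }

  int
  main(void)
  {
      uint32_t        buckets[16] = {0};
      struct htable   ht = {16, buckets};
      struct aggstate state = {&ht, 0};

      *process_tuple_split(&state, 7) += 1;
      printf("bucket for key 7: %u\n", *process_tuple_split(&state, 7));
      return 0;
  }

Once the callees are opaque to the caller, as the execGrouping.c functions are from nodeAgg.c's perspective, the value of state->ht cannot be kept across the first call and has to be re-read from memory before the second.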
Why isn't the flow more like this:

1) prepare_hash_slot()
2) if (aggstate->hash_spill_mode) goto 3; else goto 4
3) entry = LookupTupleHashEntry(&hash); if (!entry) hashagg_spill_tuple();
4) InsertTupleHashEntry(&hash, &isnew); if (isnew) initialize(entry)

That way there's again exactly one call to execGrouping.c, there's no need for nodeAgg.c to separately compute the hash, and there are far fewer branches...

Doing things this way might perhaps make agg_refill_hash_table() a tiny bit more complicated, but it'll also avoid the slowdown for the vast majority of cases where we're not spilling.

Greetings,

Andres Freund
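For illustration, here's roughly what that consolidated shape could look like, again as a standalone toy: lookup_or_insert() and the struct names are made up, and it only gestures at the real LookupTupleHashEntry() behaviour (no collision handling, no growth, no per-group transition state beyond a counter):

  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>
  #include <stdio.h>

  struct entry  { bool used; uint32_t key; int64_t count; };
  struct htable { uint32_t nbuckets; struct entry *buckets; };

  /*
   * One combined entry point: the hash stays internal to the hash-table
   * module; if isnew is NULL we only probe (spill mode), otherwise we insert
   * on a miss and report whether the entry is new.
   */
  static struct entry *
  lookup_or_insert(struct htable *ht, uint32_t key, bool *isnew)
  {
      uint32_t      hash = (key * 2654435761u) % ht->nbuckets;
      struct entry *e = &ht->buckets[hash];

      if (e->used && e->key == key)
      {
          if (isnew)
              *isnew = false;
          return e;
      }
      if (isnew == NULL)
          return NULL;            /* spill mode: report the miss, don't insert */

      e->used = true;
      e->key = key;
      *isnew = true;
      return e;
  }

  /* Per-tuple flow with exactly one call into the hash-table module. */
  static void
  process_tuple(struct htable *ht, bool spill_mode, uint32_t key)
  {
      bool          isnew;
      struct entry *e = lookup_or_insert(ht, key, spill_mode ? NULL : &isnew);

      if (e == NULL)
      {
          /* spill mode and the group isn't in the table: spill the tuple */
          return;
      }
      if (!spill_mode && isnew)
          e->count = 0;           /* initialize the new group's state */
      e->count++;                 /* advance the aggregate */
  }

  int
  main(void)
  {
      struct entry  buckets[64] = {{0}};
      struct htable ht = {64, buckets};

      for (uint32_t i = 0; i < 10; i++)
          process_tuple(&ht, false, i % 3);

      struct entry *e = lookup_or_insert(&ht, 0, NULL);
      printf("count for key 0: %lld\n", e ? (long long) e->count : -1LL);
      return 0;
  }

The point is that hash computation, probing and insertion all stay behind a single call, and the spill decision boils down to passing a NULL isnew pointer plus one branch on the result in the caller.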