But the same advance in v12 which makes it harder to fool with your test case also opens the possibility of fixing your real case.
I think the much more interesting part is the long time after query processing: the last row was processed in 13 ms, but Execution Time was 69 ms, so some cleanup takes 56 ms. That is pretty long.
Most of the time is not after the clock stops, but before the stepwise ANALYZE clock starts. If you just do an EXPLAIN rather than EXPLAIN ANALYZE, that is also slow. The giant hash table is created during the planning step, or somewhere around there. I notice that EXPLAIN ANALYZE output doesn't count it in what it labels as the planning step, but it is a step that EXPLAIN without ANALYZE does execute, which to me makes it a planning step.
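To put rough numbers on the two gaps being discussed, here is a minimal Python sketch. It assumes the v12 text format of EXPLAIN ANALYZE ("Planning Time:" and "Execution Time:" lines), and it assumes you measured the statement's wall-clock time yourself, for example with psql's \timing. The function name, the sample plan text, and the 500 ms wall-clock figure are all made up for illustration; only the 13 ms and 69 ms values come from this thread.

```python
import re

NUM = r"(\d+(?:\.\d+)?)"

def timing_gaps(explain_text, wall_clock_ms=None):
    """Hypothetical helper: split EXPLAIN ANALYZE timings into buckets."""
    # The first "actual time=start..end" in the output is the top plan node.
    node = re.search(r"actual time=" + NUM + r"\.\." + NUM, explain_text)
    planning = re.search(r"Planning Time: " + NUM + r" ms", explain_text)
    execution = re.search(r"Execution Time: " + NUM + r" ms", explain_text)
    if node is None or execution is None:
        raise ValueError("not EXPLAIN ANALYZE text output")

    top_node_ms = float(node.group(2))
    execution_ms = float(execution.group(1))
    planning_ms = float(planning.group(1)) if planning else 0.0

    gaps = {
        "top_node_ms": top_node_ms,
        "execution_ms": execution_ms,
        # Time inside the reported Execution Time but after the last row.
        "after_last_row_ms": execution_ms - top_node_ms,
    }
    if wall_clock_ms is not None:
        # Time the statement took that neither reported figure covers.
        gaps["unreported_ms"] = wall_clock_ms - planning_ms - execution_ms
    return gaps

# Illustrative input only: 13 ms and 69 ms are from the thread,
# everything else (plan shape, costs, planning time) is invented.
sample = """\
HashAggregate  (cost=0.00..1.00 rows=1 width=8) (actual time=0.020..13.000 rows=1 loops=1)
Planning Time: 0.100 ms
Execution Time: 69.000 ms
"""

gaps = timing_gaps(sample, wall_clock_ms=500.0)
print(gaps)
```

If unreported_ms comes out much larger than after_last_row_ms on the real query, that would be consistent with most of the cost sitting outside both reported figures.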
For me, "perf top" shows kernel's __do_page_fault as the top function. tuplehash_iterate does show up at 20% (which I think is overattributed, considering how little the speedup is when dropping ANALYZE), but everything else just looks like kernel memory management code.