Hi,
On Wed, Oct 28, 2020 at 05:37:40PM +0800, bucoo@sohu.com wrote:
>Hi
>Here is patch for parallel distinct union aggregate and grouping sets support using batch hash agg.
>Please review.
>
>how to use:
>set enable_batch_hashagg = on
>
>how to work:
>like batch sort, but not sort each batch, just save hash value in each rows
>
>unfinished work:
>not support rescan yet. welcome to add. Actually I don't really understand how rescan works in parallel mode.
>
>other:
>patch 1 base on branch master(80f8eb79e24d9b7963eaf17ce846667e2c6b6e6f)
>patch 1 and 2 see https://www.postgresql.org/message-id/2020101922424962544053%40sohu.com
>patch 3:
> extpand shared tuple store and add batch store module.
> By the way, use atomic operations instead LWLock for shared tuple store get next read page.
>patch 4:
> using batch hash agg support parallels
>
Thanks for the patch!
Two generic comments:
1) It's better to always include the whole patch series - including the
parts that have not changed. Otherwise people have to scavenge the
thread and search for all the pieces, which may be a source of issues.
Also, it confuses the patch tester [1] which tries to apply patches from
a single message, so it will fail for this one.
2) I suggest you try to describe the goal of these patches, using some
example queries, explain output etc. Right now the reviewers have to
reverse engineer the patches and deduce what the intention was, which
may be causing unnecessary confusion etc. If this was my patch, I'd try
to create a couple examples (CREATE TABLE + SELECT + EXPLAIN) showing
how the patch changes the query plan, showing speedup etc.
I'd like to do a review and some testing, and this would make it much
easier for me.
kind regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services