Thread: short circuit suggestion in find_hash_columns()
Hi,
I was looking at find_hash_columns() in nodeAgg.c.
It seems the first loop tries to determine the max column number needed, along with whether all columns are needed.
The loop can be re-written as shown in the patch.
In typical cases, the loop would not need to run all scanDesc->natts iterations.
In the best case, the loop would terminate after two iterations.
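The patch itself is not reproduced in this thread, so the following is only a minimal sketch of one way such a short circuit could look, not the submitted change. It assumes a plain bool array in place of the Bitmapset lookup bms_is_member(colno, aggstate->colnos_needed), and the names ColumnScanResult and scan_needed_columns are hypothetical. Scanning from the highest column downward lets the loop stop as soon as both answers are known.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two AggState fields the loop fills in. */
typedef struct ColumnScanResult
{
    int  max_colno_needed;  /* highest needed column, 1-based; 0 if none */
    bool all_cols_needed;   /* true only if every column is needed */
    int  iterations;        /* loop iterations performed, for illustration */
} ColumnScanResult;

/*
 * needed[i] is true when column i + 1 is needed. The array is an assumption
 * standing in for bms_is_member(colno, aggstate->colnos_needed).
 */
static ColumnScanResult
scan_needed_columns(const bool *needed, int natts)
{
    ColumnScanResult r = {0, true, 0};
    bool found_max = false;
    int  colno;

    assert(natts >= 0);

    /* Walk downward so the first needed column seen is the maximum. */
    for (colno = natts; colno >= 1; colno--)
    {
        r.iterations++;

        if (needed[colno - 1])
        {
            if (!found_max)
            {
                r.max_colno_needed = colno;
                found_max = true;
            }
        }
        else
            r.all_cols_needed = false;

        /* Both answers are settled; lower columns cannot change them. */
        if (found_max && !r.all_cols_needed)
            break;
    }

    return r;
}
```

With this shape, the two-iteration best case arises when the last column is needed and the one before it is not. When every column is needed, or none is, the loop still visits all natts columns.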
Please share your comments.
Thanks
On Sat, 10 Jul 2021 at 03:15, Zhihong Yu <zyu@yugabyte.com> wrote:
> I was looking at find_hash_columns() in nodeAgg.c
>
> It seems the first loop tries to determine the max column number needed, along with whether all columns are needed.
>
> The loop can be re-written as shown in the patch.
This runs during ExecInitAgg(). Do you have a test case where you're
seeing any performance gains from this change?
David
On Fri, Jul 9, 2021 at 8:28 AM David Rowley <dgrowleyml@gmail.com> wrote:
> On Sat, 10 Jul 2021 at 03:15, Zhihong Yu <zyu@yugabyte.com> wrote:
> > I was looking at find_hash_columns() in nodeAgg.c
> >
> > It seems the first loop tries to determine the max column number needed, along with whether all columns are needed.
> >
> > The loop can be re-written as shown in the patch.
> This runs during ExecInitAgg(). Do you have a test case where you're
> seeing any performance gains from this change?
> David
Hi,
I tried varying the related tests but haven't seen much difference in performance.
Let me spend more time (possibly in off hours) on this.
Cheers