Thread: short circuit suggestion in find_hash_columns()

short circuit suggestion in find_hash_columns()

From
Zhihong Yu
Date:
Hi,
I was looking at find_hash_columns() in nodeAgg.c

It seems the first loop tries to determine the max column number needed, along with whether all columns are needed.

The loop can be re-written as shown in the patch.

In normal cases, we don't need to perform scanDesc->natts iterations.
In the best-case scenario, the loop would terminate after two iterations.
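
The attached patch is not reproduced in this archive, so the following is only a minimal sketch of what such a short circuit might look like, not the patch itself. It assumes a plain bool array `needed[]` in place of the Bitmapset that nodeAgg.c uses, and the names `ColScan`, `scan_full()` and `scan_short_circuit()` are hypothetical. The short-circuit variant walks the columns from the highest number down and stops once it knows both the maximum needed column and that at least one column is not needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result holder; mirrors the two values the loop computes. */
typedef struct ColScan
{
	int			max_colno_needed;	/* 0 if no column is needed */
	bool		all_cols_needed;
} ColScan;

/* Full scan: always performs natts iterations. */
static ColScan
scan_full(const bool *needed, int natts)
{
	ColScan		r = {0, true};

	assert(natts >= 0);
	for (int i = 0; i < natts; i++)
	{
		int			colno = i + 1;

		if (needed[i])
			r.max_colno_needed = colno;
		else
			r.all_cols_needed = false;
	}
	return r;
}

/*
 * Short-circuit scan (assumed form): iterate from the last column down.
 * The first needed column seen is the maximum; once an unneeded column
 * has also been seen, both answers are final and the loop can stop.
 */
static ColScan
scan_short_circuit(const bool *needed, int natts)
{
	ColScan		r = {0, true};

	assert(natts >= 0);
	for (int colno = natts; colno >= 1; colno--)
	{
		if (needed[colno - 1])
		{
			if (r.max_colno_needed == 0)
				r.max_colno_needed = colno;
		}
		else
			r.all_cols_needed = false;

		if (r.max_colno_needed != 0 && !r.all_cols_needed)
			break;
	}
	return r;
}
```

With `needed = {true, true, false, true}` the short-circuit version stops after two iterations (columns 4 and 3). When every column is needed it still has to visit all natts columns, so the saving depends on the column mix.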

Please share your comments.

Thanks

Re: short circuit suggestion in find_hash_columns()

From
David Rowley
Date:
On Sat, 10 Jul 2021 at 03:15, Zhihong Yu <zyu@yugabyte.com> wrote:
> I was looking at find_hash_columns() in nodeAgg.c
>
> It seems the first loop tries to determine the max column number needed, along with whether all columns are needed.
>
> The loop can be re-written as shown in the patch.

This runs during ExecInitAgg().  Do you have a test case where you're
seeing any performance gains from this change?

David



Re: short circuit suggestion in find_hash_columns()

From
Zhihong Yu
Date:


On Fri, Jul 9, 2021 at 8:28 AM David Rowley <dgrowleyml@gmail.com> wrote:
> On Sat, 10 Jul 2021 at 03:15, Zhihong Yu <zyu@yugabyte.com> wrote:
> > I was looking at find_hash_columns() in nodeAgg.c
> >
> > It seems the first loop tries to determine the max column number needed, along with whether all columns are needed.
> >
> > The loop can be re-written as shown in the patch.
>
> This runs during ExecInitAgg().  Do you have a test case where you're
> seeing any performance gains from this change?
>
> David

Hi,
I tried varying the related tests but haven't seen much difference in performance.

Let me spend more time (possibly in off hours) on this.

Cheers