Home > mailing lists

Re: Questions regarding distinct operation implementation - Mailing list pgsql-hackers

From	Ankit Kumar Pandey
Subject	Re: Questions regarding distinct operation implementation
Date	November 24, 2022 17:57:11
Msg-id	60471090-3d24-565f-cdbb-5150a8895780@gmail.com Whole thread Raw
In response to	Questions regarding distinct operation implementation (Ankit Kumar Pandey <itsankitkp@gmail.com>)
Responses	Re: Questions regarding distinct operation implementation
List	pgsql-hackers

Tree view

On 23/11/22 23:48, Ankit Kumar Pandey wrote:

Hello,

I have questions regarding distinct operation and would be glad if someone could help me out.
Consider the following table (mytable):
id, name
1, A
1, A
2, B
3, A
1, A
If we do select avg(id) over (partition by name) from mytable, partition logic goes like this:
for A: 1, 1, 3, 1
If we want to implement something like this select avg(distinct id) over (partition by name) from mytable
and remove duplicate by storing last datum of aggregate column (id) and comparing it with current value. It fails here because aggregate column is not sorted within the partition.
Questions:
1. Is sorting prerequisite for finding distinct values?
2. Is it okay to sort aggregate column (within partition) for distinct to work in case of window function?
3. Is an alternative way exists to handle this scenario (because here sort is of no use in aggregation)?

Thanks
-- 
Regards,
Ankit Kumar Pandey

Hi,

After little more digging, I can see that aggregation on Window functions are of running type, it would be bit more effective if a lookup hashtable is created where every value in current aggregate column get inserted. Whenever frame moves ahead, a lookup if performed for presence of duplicate.

On performance standpoint, this might be bad idea though.

Please let me know any opinions on this.

-- 
Regards,
Ankit Kumar Pandey

pgsql-hackers by date:

From: Simon Riggs
Date: 24 November 2022, 17:42:31
Subject: Re: Add 64-bit XIDs into PostgreSQL 15

From: Cary Huang
Date: 24 November 2022, 18:15:39
Subject: Re: Patch: Global Unique Index

Re: Questions regarding distinct operation implementation - Mailing list pgsql-hackers

Previous

Next