Thread: count (DISTINCT field) OVER ()
I would like to implement the equivalent of "count (DISTINCT field) OVER ()": SELECT id, name, similarity(name, 'Tooneyvara') as delta, count (id) OVER() AS cnt FROM vtown WHERE similarity(name, 'Tooneyvara') > 0.1 ORDER BY delta DESC produces result: 1787 Toomyvara 0.5 4 1787 Toomevara 0.4 4 1700 Ardcroney 0.105 4 1788 Townsfield 0.1 4 What I would like is a "3" in the cnt column (ignoring id duplicates). How can I achieve this? The best I can come up with is: SELECT id, name,delta, count (*) OVER() FROM ( SELECT DISTINCT ON (id) id, name, similarity(name, 'Tooneyvara') as delta FROM vtown WHERE similarity(name, 'Tooneyvara') > 0.1 ORDER BY id, delta DESC ) AS x ORDER by delta DESC -- Best Regards, Tarlika Elisabeth Schmitz
On Nov 9, 2011, at 18:52, Tarlika Elisabeth Schmitz <postgresql6@numerixtechnology.de> wrote: > I would like to implement the equivalent of "count (DISTINCT field) OVER > ()": > > > SELECT > id, name, similarity(name, 'Tooneyvara') as delta, > count (id) OVER() AS cnt > FROM vtown > WHERE > similarity(name, 'Tooneyvara') > 0.1 > ORDER BY delta DESC > > produces result: > 1787 Toomyvara 0.5 4 > 1787 Toomevara 0.4 4 > 1700 Ardcroney 0.105 4 > 1788 Townsfield 0.1 4 > > What I would like is a "3" in the cnt column (ignoring id duplicates). > > > How can I achieve this? > > > The best I can come up with is: > > SELECT id, name,delta, count (*) OVER() > FROM ( > SELECT DISTINCT ON (id) > id, name, similarity(name, 'Tooneyvara') as delta > FROM vtown > WHERE > similarity(name, 'Tooneyvara') > 0.1 > ORDER BY id, delta DESC > ) AS x > ORDER by delta DESC > > -- > > Best Regards, > Tarlika Elisabeth Schmitz > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general 1. Write your main query in a WITH (CTE) 2. Query #1 with appropriate GROUP BY clause (CTE) 3. In the main statement JOIN 1 and 2 David J.
Tarlika Elisabeth Schmitz, 10.11.2011 00:52: > I would like to implement the equivalent of "count (DISTINCT field) OVER ()": > > > SELECT > id, name, similarity(name, 'Tooneyvara') as delta, > count (id) OVER() AS cnt > FROM vtown > WHERE > similarity(name, 'Tooneyvara')> 0.1 > ORDER BY delta DESC > > produces result: > 1787 Toomyvara 0.5 4 > 1787 Toomevara 0.4 4 > 1700 Ardcroney 0.105 4 > 1788 Townsfield 0.1 4 > > What I would like is a "3" in the cnt column (ignoring id duplicates). > > This should do it: SELECT id, name, delta, sum(case when rn = 1 then rn else null end) over() as distinct_id_count FROM ( SELECT id, name, similarity(name, 'Tooneyvara') as delta, row_number() OVER(partition by id) AS rn FROM vtown WHERE similarity(name, 'Tooneyvara') > 0.1 ) t ORDER BY delta DESC
On Thu, 10 Nov 2011 10:02:36 +0100 Thomas Kellerer <spam_eater@gmx.net> wrote: >Tarlika Elisabeth Schmitz, 10.11.2011 00:52: >> I would like to implement the equivalent of "count (DISTINCT id) >> OVER ()": >> >>[...] >> >> produces result: >> id, name, delta, cnt >> 1787 Toomyvara 0.5 4 >> 1787 Toomevara 0.4 4 >> 1700 Ardcroney 0.105 4 >> 1788 Townsfield 0.1 4 >> > >This should do it: > >SELECT id, > name, > delta, > sum(case when rn = 1 then rn else null end) over() as > distinct_id_count >FROM ( > SELECT > id, name, similarity(name, 'Tooneyvara') as delta, > row_number() OVER(partition by id) AS rn > FROM vtown > WHERE > similarity(name, 'Tooneyvara') > 0.1 >) t >ORDER BY delta DESC > I like you suggestion, Thomas. It is not that dissimilar from but cleaner than my original SELECT ... FROM (SELECT DISTINCT ON(id) attempt. It's also very slightly faster. Here's another, slightly shorter, variation of your suggestion: SELECT id, name, delta, max(rank) OVER() as cnt FROM ( SELECT id, name, similarity(name, 'Tooneyvara') as delta, dense_rank() OVER(ORDER BY id) AS rank FROM vtown WHERE similarity(name, 'Tooneyvara') > 0.1 ) t ORDER BY delta DESC
Tarlika Elisabeth Schmitz, 10.11.2011 11:24: >> SELECT id, >> name, >> delta, >> sum(case when rn = 1 then rn else null end) over() as >> distinct_id_count >>FROM ( >> SELECT >> id, name, similarity(name, 'Tooneyvara') as delta, >> row_number() OVER(partition by id) AS rn >> FROM vtown >> WHERE >> similarity(name, 'Tooneyvara')> 0.1 >> ) t >> ORDER BY delta DESC >> > > > I like you suggestion, Thomas. It is not that dissimilar from but > cleaner than my original SELECT ... FROM (SELECT DISTINCT ON(id) > attempt. > It's also very slightly faster. > > Here's another, slightly shorter, variation of your suggestion: > > SELECT id, name, delta, > max(rank) OVER() as cnt > FROM ( > SELECT > id, name, similarity(name, 'Tooneyvara') as delta, > dense_rank() OVER(ORDER BY id) AS rank > FROM vtown > WHERE > similarity(name, 'Tooneyvara')> 0.1 > ) t > ORDER BY delta DESC Nice trick with the dense_rank(), never thought of that. Regards Thomas