[GENERAL] DISTINCT vs GROUP BY - was Re: is (not) distinct from - Mailing list pgsql-general

From George Neuner
Subject [GENERAL] DISTINCT vs GROUP BY - was Re: is (not) distinct from
Msg-id g7rhbcpnpn4jqokptb3tvc9vl7j13q6c1k@4ax.com
In response to [GENERAL] is (not) distinct from  (Johann Spies <johann.spies@gmail.com>)
Responses Re: [GENERAL] DISTINCT vs GROUP BY - was Re: is (not) distinct from  ("Sven R. Kunze" <srkunze@mail.de>)
Re: [GENERAL] DISTINCT vs GROUP BY - was Re: is (not) distinct from  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-general
On Wed, 01 Mar 2017 11:12:29 -0500, Tom Lane <tgl@sss.pgh.pa.us>
wrote:

>This is a great example of "select distinct" being used as a band-aid
>over a fundamental misunderstanding of SQL.  It's good advice to never use
>"distinct" unless you know exactly why your query is generating duplicate
>rows in the first place.

On that note:

I know most people here don't pay much - or any - attention to
SQL Server, but there was an interesting article recently about
significant performance differences between DISTINCT and GROUP BY
when they are used to remove duplicates.

https://sqlperformance.com/2017/01/t-sql-queries/surprises-assumptions-group-by-distinct


Now I'm wondering whether something similar might be lurking in PostgreSQL?

[Yeah, I know - test it and find out!

Thing is, the queries used in the article are not simple.  Although
it does not say so explicitly, the article hints that - at least for
SQL Server - a simple case involving a string column is probably
insufficient, and that complex scenarios are required to produce
significant differences.
]


I'll get around to doing some testing soon.  For now, I am just asking
whether anyone has ever run into something like this.
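
As a starting point, here is a minimal sketch of the simple case: the
same de-duplication written once with DISTINCT and once with GROUP BY,
with a check that both return the same rows.  The table "orders", its
columns, and the data are hypothetical, and SQLite (via Python's
sqlite3 module) stands in for PostgreSQL only so that the sketch runs
on its own.  Its plans say nothing about what PostgreSQL's planner
does; on PostgreSQL one would run each query under EXPLAIN (ANALYZE,
BUFFERS) instead, and per the article a case this simple may well show
no difference at all.

```python
import sqlite3

# Hypothetical table and data, invented purely for illustration.
# SQLite is an assumption made for self-containment, not the target system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, item TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("ann", "pen"),
        ("ann", "pen"),
        ("bob", "ink"),
        ("ann", "ink"),
        ("bob", "ink"),
    ],
)

# Two ways of removing duplicate (customer, item) pairs.
distinct_sql = "SELECT DISTINCT customer, item FROM orders"
group_by_sql = "SELECT customer, item FROM orders GROUP BY customer, item"


def run_sorted(sql):
    """Run a query and return its rows sorted, so results compare reliably."""
    return sorted(conn.execute(sql).fetchall())


def plan(sql):
    """Return SQLite's plan steps (the last column of EXPLAIN QUERY PLAN)."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


distinct_rows = run_sorted(distinct_sql)
group_by_rows = run_sorted(group_by_sql)

print("same rows:", distinct_rows == group_by_rows)
print("DISTINCT plan:", plan(distinct_sql))
print("GROUP BY plan:", plan(group_by_sql))
```

The two queries are interchangeable here only because every selected
column appears in the GROUP BY list and there are no aggregates.  The
interesting comparisons, as the article suggests, would need joins or
subqueries layered on top.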

George
