
Re: Faster distinct query? - Mailing list pgsql-general

From	Geoff Winkless
Subject	Re: Faster distinct query?
Date	September 23, 2021 18:36:48
Msg-id	CAEzk6fdsP_CfOe3bD3otw+J0rXBVQ2-Z+z_rKwnwbtpo9Fu_aQ@mail.gmail.com
In response to	Faster distinct query? (Israel Brewster <ijbrewster@alaska.edu>)
Responses	Re: Faster distinct query?
List	pgsql-general


On Wed, 22 Sept 2021 at 21:05, Israel Brewster <ijbrewster@alaska.edu> wrote:

> I was wondering if there was any way to improve the performance of this query:
>
> SELECT station,array_agg(distinct(channel)) as channels FROM data GROUP BY station;

If you have tables of possible stations and channels (and if not, why not?), then an EXISTS query, something like

SELECT stations.name, ARRAY_AGG(channels.name)
FROM stations, channels
WHERE EXISTS
(SELECT FROM data WHERE data.channel=channels.name AND data.station=stations.name)
GROUP BY stations.name

will usually be much faster, because it can stop scanning after the first match in the index.
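
For the EXISTS probe to stop at the first match, data needs an index that covers both columns. Below is a rough sketch of one possible setup, assuming the column names from the original query (station, channel). The index name and the way the stations and channels tables are built are hypothetical and not taken from the thread. The planner may still pick a different strategy (a hash semi-join, for instance), so it is worth checking the plan.

```sql
-- Assumed schema: data(station, channel, ...) as in the original query.
-- Hypothetical index name; it lets each EXISTS probe be one index lookup.
CREATE INDEX IF NOT EXISTS data_station_channel_idx
    ON data (station, channel);

-- Hypothetical lookup tables, built once from the existing rows.
-- These initial scans cost about as much as the original DISTINCT query.
CREATE TABLE stations AS SELECT DISTINCT station AS name FROM data;
CREATE TABLE channels AS SELECT DISTINCT channel AS name FROM data;

-- Inspect the plan. Note that ANALYZE actually executes the query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT stations.name, ARRAY_AGG(channels.name)
FROM stations, channels
WHERE EXISTS
    (SELECT FROM data
     WHERE data.channel = channels.name
       AND data.station = stations.name)
GROUP BY stations.name;
```

This only pays off when the number of station/channel combinations is small compared to the number of rows in data, since the query does one probe per combination.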

Geoff
