Home > mailing lists

Re: Faster distinct query? - Mailing list pgsql-general

From	Michael Lewis
Subject	Re: Faster distinct query?
Date	September 22, 2021 23:20:46
Msg-id	CAHOFxGrvxWFTxNQZNe-cWgf__5ErEZ5JUoFzHNXGGnK9=oqX-A@mail.gmail.com Whole thread Raw
In response to	Faster distinct query? (Israel Brewster <ijbrewster@alaska.edu>)
Responses	Re: Faster distinct query? Re: Faster distinct query?
List	pgsql-general

Tree view

In the future, please share the plan returned by explain analyze, and some data about how many rows in the involved tables, what type of system you are running it on, any changes from default configs, etc.

How does the below work? It should do a very simple index scan only, then aggregate the relative few rows after the fact.

select station, array_agg(distinct(channel)) as channels

FROM(

SELECT station,channel FROM data GROUP BY station,channel

) AS sub

group by station;

If there is correlation between station & channel, then you might look at creating a multivariate statistics object and analyzing the table so the planner can make better choices, knowing that channel is dependent on station perhaps. I wouldn't necessarily think that it would help this query, but perhaps others. Also, you might try creating only dependencies, only ndistinct type, or some combination other than all 3 types.

ref- https://www.postgresql.org/docs/current/sql-createstatistics.html

pgsql-general by date:

From: "David G. Johnston"
Date: 22 September 2021, 23:20:28
Subject: Re: Faster distinct query?

From: "David G. Johnston"
Date: 22 September 2021, 23:26:55
Subject: Re: Faster distinct query?

Re: Faster distinct query? - Mailing list pgsql-general

Previous

Next