Re: Faster distinct query? - Mailing list pgsql-general

From David G. Johnston
Subject Re: Faster distinct query?
Date
Msg-id CAKFQuwYJbD2Xy+jNx30zdLmm7Xk6ZcP1_2N_hJqyqAcoYfB8SQ@mail.gmail.com
In response to Faster distinct query?  (Israel Brewster <ijbrewster@alaska.edu>)
Responses Re: Faster distinct query?  (Israel Brewster <ijbrewster@alaska.edu>)
Re: Faster distinct query?  (Mladen Gogala <gogala.mladen@gmail.com>)
List pgsql-general
On Wed, Sep 22, 2021 at 1:05 PM Israel Brewster <ijbrewster@alaska.edu> wrote:
> To work around the issue, I created a materialized view that I can update periodically, and of course I can query said view in no time flat. However, I’m concerned that as the dataset grows, the time it takes to refresh the view will also grow (correct me if I am wrong there).

I'd probably turn that index into a foreign key that simply ensures every (station, channel) pair that appears in the data table also appears in the lookup table.  Grouping and array-ifying the lookup table would then be trivial, since it holds only one row per pair.  Either modify the application code or add a trigger to populate the lookup table as needed.
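
A rough sketch of what that could look like, assuming PostgreSQL 11 or later (older releases spell it EXECUTE PROCEDURE). The table, column, and function names here (data, station_channels, register_station_channel, and the column types) are placeholders for illustration, not the actual schema from the thread:

```sql
-- Hypothetical lookup table: one row per (station, channel) pair.
CREATE TABLE station_channels (
    station integer NOT NULL,
    channel text    NOT NULL,
    PRIMARY KEY (station, channel)
);

-- Hypothetical stand-in for the large data table. The foreign key
-- guarantees every pair in "data" also exists in the lookup table.
CREATE TABLE data (
    station  integer          NOT NULL,
    channel  text             NOT NULL,
    datetime timestamptz      NOT NULL,
    value    double precision,
    FOREIGN KEY (station, channel)
        REFERENCES station_channels (station, channel)
);

-- Trigger option: register a previously unseen pair just before the
-- data row is inserted, so the foreign key check succeeds.
CREATE FUNCTION register_station_channel() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO station_channels (station, channel)
    VALUES (NEW.station, NEW.channel)
    ON CONFLICT DO NOTHING;
    RETURN NEW;
END;
$$;

CREATE TRIGGER data_register_station_channel
    BEFORE INSERT ON data
    FOR EACH ROW
    EXECUTE FUNCTION register_station_channel();

-- The "distinct" query now only has to read the small lookup table.
SELECT station, array_agg(channel ORDER BY channel) AS channels
FROM station_channels
GROUP BY station;
```

This sketch only handles inserts; if station or channel can be updated in place, the trigger would need to cover UPDATE as well. The per-row trigger also adds some cost to every insert, which is the trade-off against populating the lookup table from application code.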

The parentheses around channel in "array_agg(distinct(channel))" are unnecessary.  DISTINCT is not a function; the parentheses invoke composite-type (row constructor) syntax, which is ignored in the single-column case unless you write the otherwise optional ROW keyword, i.e., distinct ROW(channel).
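
To illustrate the difference, here is a small self-contained query over made-up values (the channel names are invented for the example). The first two columns should be identical text arrays, while the third should come back as an array of one-column records:

```sql
SELECT array_agg(DISTINCT channel)      AS plain,          -- text[]
       array_agg(DISTINCT (channel))    AS parenthesized,  -- same as plain
       array_agg(DISTINCT ROW(channel)) AS as_rows         -- record[]
FROM (VALUES ('BHZ'), ('BHZ'), ('BHN')) AS t(channel);
```
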
David J.
