Re: Faster distinct query? - Mailing list pgsql-general

From Rob Sargent
Subject Re: Faster distinct query?
Date
Msg-id 670727da-f703-bbd3-f239-9e208130b1f0@gmail.com
Whole thread Raw
In response to Re: Faster distinct query?  (Israel Brewster <ijbrewster@alaska.edu>)
Responses Re: Faster distinct query?  (Israel Brewster <ijbrewster@alaska.edu>)
List pgsql-general
On 9/23/21 10:16 AM, Israel Brewster wrote:
On Sep 23, 2021, at 4:34 AM, Ryan Booz <ryan@timescale.com> wrote:

Heh, I honestly forgot about the recursive CTE. Certainly worth a try and wouldn't require installing other extensions.

This is what depesz is referring to: https://wiki.postgresql.org/wiki/Loose_indexscan

Thanks for the pointer. Will definitely have to spend some time wrapping my brain around that one - I’ve done some CTE’s before, but not recursive that I can recall. Should be fun!

If it helps matters any, my structure is currently the following:

table “stations” listing station details (name, latitude, longitude, etc) with a smallint primary key “id"
table “data” with many (many!) data columns (mostly doubles), a station column that is a smallint referencing the stations table, and a channel column which is a varchar containing the *name* of the channel the data came in on.

I will readily accept that this may not be the best structure for the DB. For example, perhaps the channel column should be normalized out as has been mentioned a couple of times as an option. This would make sense, and would certainly simplify this portion of the project.

If I do go with a lookup table updated by a trigger, what would be the best option for the query the trigger runs - an upset (ON CONFLICT DO NOTHING)? Or a query followed by an insert if needed? The normal case would be that the entry already exists (millions of hits vs only the occasional insert needed).


I would look into pre-loading the lookup table (and pre-emptive maintenance).  Add the foreign key, but not the trigger.

pgsql-general by date:

Previous
From: Israel Brewster
Date:
Subject: Re: Faster distinct query?
Next
From: "David G. Johnston"
Date:
Subject: Re: Get COUNT results from two different columns