Home > mailing lists

Re: Faster distinct query? - Mailing list pgsql-general

From	Israel Brewster
Subject	Re: Faster distinct query?
Date	September 24, 2021 18:03:21
Msg-id	2795367B-80ED-4B44-9354-5C2A6EE15D40@alaska.edu Whole thread Raw
In response to	Re: Faster distinct query? (Michael Lewis <mlewis@entrata.com>)
List	pgsql-general

Tree view

On Sep 23, 2021, at 8:55 PM, Michael Lewis <mlewis@entrata.com> wrote:

It’s only when I add in the AND data.channels=channels.channel that the query time blows up to 5+ minutes. I personally don’t understand why there would be such a large difference between the two queries - something wrong with my indexes?

Did you do the manual vacuum as suggested by Tom?

I ran a VACUUM ANALYZE, yes.

Maybe at least check pg_stat_user_tables for the last vacuum or autovacuum timestamp, and how many records have changed since then.

volcano_seismology=# SELECT seq_scan,seq_tup_read,idx_scan,idx_tup_fetch,n_tup_ins,n_tup_upd,n_tup_del,n_tup_hot_upd,n_live_tup,n_dead_tup,n_mod_since_analyze,n_ins_since_vacuum,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze,vacuum_count,autovacuum_count,analyze_count,autoanalyze_count FROM pg_stat_user_tables WHERE relname='data';
-[ RECORD 1 ]-------+------------------------------
seq_scan   | 95
seq_tup_read   | 25899340540
idx_scan   | 728372
idx_tup_fetch     | 51600217033
n_tup_ins     | 840283699
n_tup_upd     | 66120702
n_tup_del     | 2375651
n_tup_hot_upd     | 0
n_live_tup   | 839266956
n_dead_tup   | 66585751
n_mod_since_analyze | 58896052
n_ins_since_vacuum  | 24890460
last_vacuum     | 2021-09-22 21:32:11.367855+00
last_autovacuum     | 2021-09-14 07:13:23.745862+00
last_analyze   | 2021-09-22 21:32:21.071092+00
last_autoanalyze   | 2021-09-21 11:54:36.924762+00
vacuum_count   | 1
autovacuum_count   | 1
analyze_count     | 1
autoanalyze_count   | 2

Note that the update count was due to a (hopefully) one-time process where I had to change the value of a bunch of records. Generally this *should be* an insert-once-read-many database.

The system is deciding on an index scan because that should be very fast, but if it needs to look at the actual table table to determine if the process executing that query should still be allowed to see that tuple (row version) then the index only scan slows down a lot I believe. The huge number of heap fetches that Tom pointed out mean that the system doesn't know that ALL processes should be able to see those tuples. As someone else suggested lowering the max freeze age, you might just do a manual "vacuum freeze" as well to allow future auto vacuum processes to quickly skip over those pages as well.

Ok, ran a VACUUM (ANALYZE, FREEZE), and am now showing this from pg_stat_user_tables:

However, adding the AND data.channels=channels.channel to the query still makes it take around 5 minutes (https://explain.depesz.com/s/7hb1). So, again, running VACUUM didn’t appear to help any.

Also perhaps interestingly, if I again modify the query to only match on channel, not station, it is again fast (though not quite as fast): https://explain.depesz.com/s/HLb8

So, basically, I can quickly get a list of all channels for which I have data, or all stations for which I have data, but getting a list of all channels for each station is slow.

I've heard of the loose indexscan before mentioned on this thread, but I'm not seeing how to implement that for multiple columns. Anyone have an example or pseudo code perhaps?

This is my stumbling block to implementing this option as well. That said, with the lookup table in place, perhaps all the rest of this is a moot point?

---

Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320

Work: 907-474-5172
cell: 907-328-9145

pgsql-general by date:

From: "Clive Swan"
Date: 24 September 2021, 10:48:00
Subject: RE: Get COUNT results from two different columns

From: Garfield Lewis
Date: 24 September 2021, 21:43:54
Subject: Using XMLNAMESPACES with XMLEMENT

Re: Faster distinct query? - Mailing list pgsql-general

Previous

Next