On Fri, Nov 12, 2021 at 09:12:38PM +0100, Jiří Fejfar wrote:
> * I know that PG is focused on OLTP rather than analytics, but we are happy
> with it overall and do not wish to use another engine for analytical
> queries... isn't there some "PG analytical best practices" guide available?
It's a good question. Here are some ideas:
I don't think we know what version you're using - that's important, and there are
other ideas here:
https://wiki.postgresql.org/wiki/Slow_Query_Questions
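You can check the server version like this:
| SELECT version();
| SHOW server_version;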
You said that your query was slow "probably after VACUUM ANALYZE".
Is it really faster without stats? You can do this to see if there really was
a better plan "before":
| begin; DELETE FROM pg_statistic WHERE starelid='thetable'::regclass; explain analyze ...; rollback;
Try enable_nestloop=off for analytic queries;
Test whether jit=off helps you or hurts you (you said that it's already disabled);
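Both can be changed per-session (or per-transaction with SET LOCAL), so you can
experiment without affecting other clients; fill in your own query:
| SET enable_nestloop = off;
| SET jit = off;
| EXPLAIN (ANALYZE, BUFFERS) SELECT ...;
| RESET enable_nestloop; RESET jit;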
You can do other things to improve estimates, by sacrificing planning time
(which for an analytic query is a small component of the total query time, and
which pays off at runtime if you get a better plan):
- FKs can help with estimates since pg9.6;
- CREATE STATISTICS;
- ALTER TABLE ... ALTER COLUMN ... SET STATISTICS, or increase default_statistics_target;
- increase from_collapse_limit and join_collapse_limit. But I don't think it
will help your current query plan.
- partitioning data increases planning time, and (if done well) can allow
improved execution plans;
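For example (table and column names here are made up - adapt to your schema, and
remember that none of these take effect until the table is re-ANALYZEd):
| CREATE STATISTICS mystats (dependencies) ON col1, col2 FROM thetable;
| ALTER TABLE thetable ALTER COLUMN col1 SET STATISTICS 2000;
| ANALYZE thetable;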
You can REINDEX or maybe CLUSTER during "off hours" to optimize indexes/tables.
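For example (index/table names are hypothetical; note that CLUSTER takes an
exclusive lock, and REINDEX ... CONCURRENTLY needs v12+):
| REINDEX INDEX CONCURRENTLY theindex;
| CLUSTER thetable USING theindex;
| ANALYZE thetable;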
BRIN indexes (WITH autosummarize) are very successful for us, here.
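A sketch (column name is hypothetical; BRIN works best when the column is
physically correlated with heap order, e.g. an append-only timestamp):
| CREATE INDEX ON thetable USING brin (ts_column)
|     WITH (pages_per_range = 64, autosummarize = on);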
You can monitor your slow queries using auto_explain and/or pg_stat_statements.
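For example, pg_stat_statements must be preloaded (needs a restart), while
auto_explain can be loaded into a single session; the 10s threshold is just an
illustration:
| ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';
| LOAD 'auto_explain';
| SET auto_explain.log_min_duration = '10s';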
You can reduce autovacuum_analyze_threshold to analyze more often.
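This can be done per-table, which is usually safer than changing it globally
(values here are illustrative):
| ALTER TABLE thetable SET (autovacuum_analyze_threshold = 500,
|     autovacuum_analyze_scale_factor = 0.02);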
I'd be interested to hear if others have more suggestions.
--
Justin