On Fri, Nov 12, 2021 at 09:12:38PM +0100, Jiří Fejfar wrote:
> * I know that PG is focused on OLTP rather than analytics, but we are happy
> with it overall and do not wish to use another engine for analytical
> queries... isn't there some "PG analytical best practices" guide available?
It's a good question. Here are some ideas:
I don't think we know what version you're using - that's important, and there are
other ideas here:
https://wiki.postgresql.org/wiki/Slow_Query_Questions
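You can check the server version like this:
| SELECT version();
| SHOW server_version;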
You said that your query was slow "probably after VACUUM ANALYZE".
Is it really faster without stats? You can do this to see if there really was
a better plan "before":
| begin; DELETE FROM pg_statistic WHERE starelid='thetable'::regclass; explain analyze ...; rollback;
Try enable_nestloop=off for analytic queries;
Test whether jit=off helps you or hurts you (you said that it's already disabled);
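Both can be changed per-session (or per-transaction with SET LOCAL), so you can
experiment without affecting other clients; fill in your own query:
| SET enable_nestloop = off;
| SET jit = off;
| EXPLAIN (ANALYZE, BUFFERS) SELECT ...;
| RESET enable_nestloop; RESET jit;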
You can do other things to improve estimates, by sacrificing planning time
(which for an analytic query is a small component of the total query time, and
which pays off at runtime if you get a better plan):
- FKs can help with estimates since pg9.6;
- CREATE STATISTICS;
- ALTER TABLE ... ALTER COLUMN ... SET STATISTICS, or increase default_statistics_target;
- increase from_collapse_limit and join_collapse_limit. But I don't think it
will help your current query plan.
- partitioning data increases planning time, and (if done well) can allow
improved execution plans;
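For example (table and column names here are made up - adapt to your schema, and
remember that none of these take effect until the table is re-ANALYZEd):
| CREATE STATISTICS mystats (dependencies) ON col1, col2 FROM thetable;
| ALTER TABLE thetable ALTER COLUMN col1 SET STATISTICS 2000;
| ANALYZE thetable;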
You can REINDEX or maybe CLUSTER during "off hours" to optimize indexes/tables.
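For example (index/table names are hypothetical; note that CLUSTER takes an
exclusive lock, and REINDEX ... CONCURRENTLY needs v12+):
| REINDEX INDEX CONCURRENTLY theindex;
| CLUSTER thetable USING theindex;
| ANALYZE thetable;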
BRIN indexes (WITH autosummarize) are very successful for us, here.
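A sketch (column name is hypothetical; BRIN works best when the column is
physically correlated with heap order, e.g. an append-only timestamp):
| CREATE INDEX ON thetable USING brin (ts_column)
|     WITH (pages_per_range = 64, autosummarize = on);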
You can monitor your slow queries using auto_explain and/or pg_stat_statements.
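For example, pg_stat_statements must be preloaded (needs a restart), while
auto_explain can be loaded into a single session; the 10s threshold is just an
illustration:
| ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';
| LOAD 'auto_explain';
| SET auto_explain.log_min_duration = '10s';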
You can reduce autovacuum_analyze_threshold to analyze more often.
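This can be done per-table, which is usually safer than changing it globally
(values here are illustrative):
| ALTER TABLE thetable SET (autovacuum_analyze_threshold = 500,
|     autovacuum_analyze_scale_factor = 0.02);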
I'd be interested to hear if others have more suggestions.
--
Justin