On Tue, 20 Oct 2020 at 22:38, Mats Julian Olsen <mats@duneanalytics.com> wrote:
>
> The crux of our issue is that the query planner chooses a nested loop join for this query, which makes this query (and other queries) take a very long time to complete. In contrast, by toggling `enable_nestloop` and `enable_seqscan` off we can take the total runtime down from 16 minutes to 2 minutes.
>
> 1) Vanilla plan (16 min): https://explain.depesz.com/s/NvDR
> 2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
> 3) enable_nestloop=off; enable_seqscan=off (2 min): https://explain.depesz.com/s/0WXx
>
> How can I get Postgres not to loop over 12M rows?
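(For anyone following along: the toggled plans above were presumably produced with session-local settings along these lines; the SELECT is a stand-in for the actual query, which isn't shown here.)

```sql
-- Diagnostic only: disable nested loops / seq scans for one transaction,
-- then compare the EXPLAIN output. Not something to leave on in production.
BEGIN;
SET LOCAL enable_nestloop = off;
SET LOCAL enable_seqscan = off;
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;  -- stand-in for the actual query
ROLLBACK;
```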
You'll likely want to look at what random_page_cost is set to. If the planner is preferring nested loops then it may be too low. You'll also want to check that effective_cache_size is set to something realistic; higher values of it make the planner prefer index-driven nested loop plans like this one.
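A minimal way to check and experiment (the values shown are illustrative, not recommendations):

```sql
-- Inspect the current values.
SHOW random_page_cost;
SHOW effective_cache_size;

-- Raise random_page_cost for this session only and see whether the plan changes.
-- 2.0 is just a test value, not a recommendation.
SET random_page_cost = 2.0;
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;  -- stand-in for the actual query
```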
random_page_cost is 1.1 and effective_cache_size is '60GB' (listed in the gist). Is random_page_cost perhaps too low?
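(One way to confirm the effective values and where they were set, rather than relying on the gist, is the pg_settings view:)

```sql
-- Show the live values and the source that set them
-- (postgresql.conf, ALTER SYSTEM, session, default, ...).
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('random_page_cost', 'effective_cache_size');
```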
You may also want to reduce max_parallel_workers_per_gather. It looks like you're not getting your parallel workers as often as you'd like. If the planner chooses a plan thinking it's going to get some workers and gets none, then that plan may be inferior to the one the planner would have chosen had it known the workers would be unavailable.
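You can see whether that's happening by comparing "Workers Planned" with "Workers Launched" in the EXPLAIN ANALYZE output; a sketch (the cap of 2 is an illustrative value to experiment with):

```sql
-- In the EXPLAIN (ANALYZE) output, look for lines like:
--   Workers Planned: 4
--   Workers Launched: 0   <- costed for 4 extra workers, got none

-- If launches regularly fall short, cap the planner's assumption,
-- per session first before changing it globally.
SET max_parallel_workers_per_gather = 2;
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;  -- stand-in for the actual query
```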