
Re: Query plan prefers hash join when nested loop is much faster - Mailing list pgsql-general

From	David Rowley
Subject	Re: Query plan prefers hash join when nested loop is much faster
Date	August 25, 2020 13:36:30
Msg-id	CAApHDvpgtNhKQVPuNCrisKuCB+33BtNK_K2Cn0Cst2muJJPshg@mail.gmail.com
In response to	Re: Query plan prefers hash join when nested loop is much faster (iulian dragos <iulian.dragos@databricks.com>)
Responses	Re: Query plan prefers hash join when nested loop is much faster
List	pgsql-general

On Tue, 25 Aug 2020 at 22:10, iulian dragos
<iulian.dragos@databricks.com> wrote:
> Thanks for the tip! Indeed, `n_distinct` isn't right. I found it in pg_stats set at 131736.0, but the actual number
> is much higher: 210104361. I tried to set it manually, but the plan is still the same (both the actual number and a
> percentage, -0.4, as you suggested):

You'll need to run ANALYZE on the table after doing the ALTER TABLE to
change the n_distinct.  The ANALYZE writes the value to pg_statistic.
ALTER TABLE only takes it as far as pg_attribute's attoptions.
ANALYZE reads that column to see if the n_distinct estimate should be
overwritten before writing out pg_statistic.
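
As a rough sketch of that sequence: the table and column names below (my_table, my_column) are placeholders, since the thread does not name the real ones, and the value is the actual distinct count quoted above.

```sql
-- my_table and my_column are hypothetical names, not taken from the thread.

-- Step 1: store the override. This only records it in pg_attribute.attoptions.
ALTER TABLE my_table ALTER COLUMN my_column SET (n_distinct = 210104361);

-- The override is visible here straight away...
SELECT attoptions
FROM pg_attribute
WHERE attrelid = 'my_table'::regclass
  AND attname = 'my_column';

-- Step 2: ANALYZE reads the override and writes it out to pg_statistic.
ANALYZE my_table;

-- ...but pg_stats (a view over pg_statistic) only reflects it after ANALYZE.
SELECT n_distinct
FROM pg_stats
WHERE tablename = 'my_table'
  AND attname = 'my_column';
```

If the last query still shows the old estimate, the planner is still working from the old statistics, which would explain an unchanged plan.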

Just remember if you're hardcoding a positive value that it'll stay
fixed until you change it. If the table is likely to grow, then you
might want to reconsider using a positive value and consider using a
negative value as mentioned in the doc link.
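
A minimal sketch of the negative form, again with placeholder names (my_table, my_column) and using the -0.4 mentioned in the quoted message. A negative n_distinct is read as minus the fraction of rows that are distinct, so -0.4 means the distinct count is estimated as 0.4 times the table's row count and scales as the table grows.

```sql
-- my_table and my_column are hypothetical names, not taken from the thread.

-- Estimate distinct values as 40% of the row count rather than a fixed number.
ALTER TABLE my_table ALTER COLUMN my_column SET (n_distinct = -0.4);

-- As with a positive value, the setting only takes effect after ANALYZE.
ANALYZE my_table;

-- To drop the override later and return to ANALYZE's own estimate:
ALTER TABLE my_table ALTER COLUMN my_column RESET (n_distinct);
ANALYZE my_table;
```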

David
