Re: Inaccurate (sometimes wildly so) row estimates for simple join - Mailing list pgsql-general

From Greg Nolle
Subject Re: Inaccurate (sometimes wildly so) row estimates for simple join
Date
Msg-id CA+wHfn-iS3D5gLQG8fJrJvOZvT7v-dWKJuLPg+qPj=3uv_7RXg@mail.gmail.com
Whole thread Raw
In response to Re: Inaccurate (sometimes wildly so) row estimates for simple join  (Michael Lewis <mlewis@entrata.com>)
List pgsql-general
On 22 May 2020 at 15:30:42, Michael Lewis (mlewis@entrata.com) wrote:
On Fri, May 22, 2020 at 7:27 AM Greg Nolle <greg.nolle@voidbridge.com> wrote:
The crux seems to be that test_b does not have an even distribution for a_id values: it only has records for two of the values in the referenced table. This is how our real dataset is too and isn’t something we can really change.

How does it perform with extended statistics created for these two columns?

I forgot to mention that I had already tried that actually, and it sadly made no difference. I tried both dependencies and ndistinct modes and the estimates remained the same. Having read through the docs I think that’s to be expected since dependencies is designed to help when you have multiple predicates and ndistinct when you’re doing grouping. Neither of which is the case with my query.

For good measure I just tried mcv mode on PG12 with the test script I posted and that also made no difference.

pgsql-general by date:

Previous
From: Michael Lewis
Date:
Subject: Re: Inaccurate (sometimes wildly so) row estimates for simple join
Next
From: Tom Lane
Date:
Subject: Re: btree_gist extension - gbt_cash_union return type