Re: Selectivity and row count estimates for JSONB columns - Mailing list pgsql-general

From Tom Lane
Subject Re: Selectivity and row count estimates for JSONB columns
Date
Msg-id 144197.1635967450@sss.pgh.pa.us
Whole thread Raw
In response to Re: Selectivity and row count estimates for JSONB columns  (Joel Perren <joel.perren@gmail.com>)
List pgsql-general
Joel Perren <joel.perren@gmail.com> writes:
> - generic_restriction_selectivity() returns the default value (0.01) for
> data types that Postgres doesn't collect standard MCV and/or histogram
> statistics for. I think this is what happens with Table B which (quite
> correctly) does not have these statistics in pg_stats

There's nothing "correct" about that.  JSONB does have sorting support
(admittedly with a pretty arbitrary sort order), so I'd expect ANALYZE
to collect a histogram as well as MCV values for it.  Perhaps for your
Table B it's omitting the histogram because the MCV list captures the
entire contents of the column?  Or you've got auto-analyze disabled
for some reason?

Anyway, assuming you're using v14, the idea of matchingsel() is to
apply the given restriction clause to all the MCV and histogram
entries to see how many satisfy it [1].  In principle this'll produce
a far better estimate than any fixed default could provide.  If
you're coming out with a crappy estimate, you might be able to
improve matters by increasing the column's statistics target so
that more MCV and histogram entries are collected.

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/12237.1582833074%40sss.pgh.pa.us



pgsql-general by date:

Previous
From: Bryn Llewellyn
Date:
Subject: Re: to_date() and to_timestamp() with negative years
Next
From: Adrian Klaver
Date:
Subject: Re: to_date() and to_timestamp() with negative years