Re: Showing applied extended statistics in explain - Mailing list pgsql-hackers

From Dmitry Dolgov
Subject Re: Showing applied extended statistics in explain
Date
Msg-id 20210729140954.e5ptavuk6zpwysc6@localhost
Whole thread Raw
In response to Re: Showing applied extended statistics in explain  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Showing applied extended statistics in explain
List pgsql-hackers
> On Tue, Jul 27, 2021 at 10:20:34PM +0200, Tomas Vondra wrote:
>
> >> 1) The information is stashed in multiple lists added to a Plan. Maybe
> >> there's a better place, and maybe we need to invent a better way to
> >> track the info (a new node stashed in a single List).
> >>
> >> ...
> >>
> >> 3) It does not work for functional dependencies, because we effectively
> >> "merge" all functional dependencies and apply the entries. Not sure how
> >> to display this, but I think it should show the individual dependencies
> >> actually applied.
> >>
> >> ...
> >>
> >> 5) It includes just statistics name + clauses, but maybe we should
> >> include additional info (e.g estimate for that combination of clauses).
> >
> > Yes, a new node would be nice to have. The other questions above are
> > somewhat related to what it should contain, and I guess it depends on
> > the use case this patch targets. E.g. for the case "help to figure out
> > if an extended statistics was applied" even "merged" functional
> > dependencies would be fine I believe.
>
> What do you mean by "merged" functional dependencies? I guess we could
> say "these clauses were estimated using dependencies" without listing
> which exact dependencies were applied.

Yes, that's exactly what I was thinking. From the original email I've
got an impression that in case of functional dependencies you plan to
display the info only with the individual dependencies (when
implemented) or nothing at all. By "merged" I wanted to refer to the
statement about "merge" all functional dependencies and apply the
entries.

> > More advanced plan troubleshooting may benefit from estimates.
>
> I'm sorry, I don't know what you mean by this. Can you elaborate?

Yeah, sorry for not being clear. The idea was that the question about including
"additional info" strongly depends on which use cases the patch tries to
address, and I follow up on that further. There is no more hidden detailed
meaning here :)

> > I've got few more questions after reading the patch:
> >
> > * Is there any particular reason behind choosing only some scan nodes to
> >   display extended stats? E.g. BitmapHeapScan is missing.
> >
>
> All nodes that may apply extended stats to estimate quals should include
> this info. I guess BitmapHeapScan may do that for the filter, right?
>

Yes, something like this (stats output added by me):

     Bitmap Heap Scan on public.tenk1
       Output: unique1
       Recheck Cond: (tenk1.unique1 < 1000)
       Filter: (tenk1.stringu1 = 'xxx'::name)
       Statistics: public.s Clauses: ((unique1 < 1000) AND (stringu1 = 'xxx'::name))
       ->  Bitmap Index Scan on tenk1_unique1
             Index Cond: (tenk1.unique1 < 1000)



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Out-of-memory error reports in libpq
Next
From: Masahiko Sawada
Date:
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum