Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment - Mailing list pgsql-hackers

From Ilia Evdokimov
Subject Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
Date
Msg-id 3ba16aa7-3cb2-4289-b773-522ae2f03a77@tantorlabs.com
Whole thread Raw
In response to Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
List pgsql-hackers


On 26.03.2025 22:37, Robert Haas wrote:
I also don't think that we should be too concerned about bloating the
EXPLAIN output in the context of a patch that only affects Memoize.
Memoize nodes are not incredibly common in the query plans that I see,
so even if we added another line or three to the output for each one,
I don't think that would create major problems. On the other hand,
maybe there's an argument that what this patch touches is only the tip
of the iceberg, and that we're eventually going to want the same kinds
of things for Nested Loop and Hash Joins and Merge Joins, and maybe
even more detail that can be displayed in say 3 lines. In that case,
there's a double concern. On the one hand, that really would make the
output a whole lot more verbose, and on the other hand, it might
generate a fair amount of work to maintain it across future planner
changes. I can see deciding to reject changes of that sort on the
grounds that we're not prepared to maintain it, or deciding to gate it
behind a new option on the grounds that it is so verbose that even
people who say EXPLAIN VERBOSE are going to be sad if they get all
that crap by default. I'm not saying that we SHOULD make those
decisions -- I think exposing more detail here could be pretty useful
to people trying to solve query plan problems, including me, so I hope
we don't just kick that idea straight to the curb without due thought
-- but I would understand them.

The part I'm least sure about with respect to the proposed patch is
the actual stuff being displayed. I don't have the experience to know
whether it's useful for tracking down issues. If it's not, then I
agree we shouldn't display it. If it is, then I'm tentatively in favor
of showing it in standard EXPLAIN, possibly only with VERBOSE, with
the caveats from the previous paragraph: if more-common node types are
also going to have a bunch of stuff like this, then we need to think
more carefully. If Memoize is exceptional in needing additional
information displayed, then I think it's fine.

--
Robert Haas
EDB: http://www.enterprisedb.com


I understand the concerns raised about the risk of opening the door to more diagnostic detail across various plan nodes. However, in Hash Join, Merge Join, and Nested Loop, EXPLAIN typically reveals at least some of the planner’s expectations. For example, Hash Join shows the number of batches and originally expected buckets, giving insight into whether the hash table fit in memory. Merge Join shows unexpected Sort nodes when presorted inputs were assumed. Nested Loop reflects planner assumptions via loops and row estimates. In other words, these nodes expose at least some information about what the planner thought would happen.

Memoize is unique in that it shows runtime statistics (hits, misses, evictions), but reveals nothing about the planner’s expectations. We don’t see how many distinct keys were estimated or how many entries the planner thought would fit in memory. This makes it very difficult to understand whether Memoize was a good choice or not, or how to fix it when it performs poorly.

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: [PATCH] PGSERVICEFILE as part of a normal connection string
Next
From: Peter Eisentraut
Date:
Subject: Re: Thread-safe nl_langinfo() and localeconv()