Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment - Mailing list pgsql-hackers
From | Ilia Evdokimov |
---|---|
Subject | Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment |
Date | |
Msg-id | 3ba16aa7-3cb2-4289-b773-522ae2f03a77@tantorlabs.com Whole thread Raw |
In response to | Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment |
List | pgsql-hackers |
I also don't think that we should be too concerned about bloating the EXPLAIN output in the context of a patch that only affects Memoize. Memoize nodes are not incredibly common in the query plans that I see, so even if we added another line or three to the output for each one, I don't think that would create major problems. On the other hand, maybe there's an argument that what this patch touches is only the tip of the iceberg, and that we're eventually going to want the same kinds of things for Nested Loop and Hash Joins and Merge Joins, and maybe even more detail that can be displayed in say 3 lines. In that case, there's a double concern. On the one hand, that really would make the output a whole lot more verbose, and on the other hand, it might generate a fair amount of work to maintain it across future planner changes. I can see deciding to reject changes of that sort on the grounds that we're not prepared to maintain it, or deciding to gate it behind a new option on the grounds that it is so verbose that even people who say EXPLAIN VERBOSE are going to be sad if they get all that crap by default. I'm not saying that we SHOULD make those decisions -- I think exposing more detail here could be pretty useful to people trying to solve query plan problems, including me, so I hope we don't just kick that idea straight to the curb without due thought -- but I would understand them. The part I'm least sure about with respect to the proposed patch is the actual stuff being displayed. I don't have the experience to know whether it's useful for tracking down issues. If it's not, then I agree we shouldn't display it. If it is, then I'm tentatively in favor of showing it in standard EXPLAIN, possibly only with VERBOSE, with the caveats from the previous paragraph: if more-common node types are also going to have a bunch of stuff like this, then we need to think more carefully. If Memoize is exceptional in needing additional information displayed, then I think it's fine. -- Robert Haas EDB: http://www.enterprisedb.com
I understand the concerns raised about the risk of opening the door to more diagnostic detail across various plan nodes. However, in Hash Join, Merge Join, and Nested Loop, EXPLAIN typically reveals at least some of the planner’s expectations. For example, Hash Join shows the number of batches and originally expected buckets, giving insight into whether the hash table fit in memory. Merge Join shows unexpected Sort nodes when presorted inputs were assumed. Nested Loop reflects planner assumptions via loops and row estimates. In other words, these nodes expose at least some information about what the planner thought would happen.
Memoize is unique in that it shows runtime statistics (hits, misses, evictions), but reveals nothing about the planner’s expectations. We don’t see how many distinct keys were estimated or how many entries the planner thought would fit in memory. This makes it very difficult to understand whether Memoize was a good choice or not, or how to fix it when it performs poorly.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
pgsql-hackers by date: