Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment - Mailing list pgsql-hackers

From Ilia Evdokimov
Subject Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
Date
Msg-id 67b5c84a-dedd-4103-af7e-0117046102fe@tantorlabs.com
Whole thread Raw
In response to Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment  (Andrei Lepikhov <lepihov@gmail.com>)
Responses Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
List pgsql-hackers
Then we need to decide clearly what exactly to display in EXPLAIN for 
the Memoize node: absolute values (estimated distinct keys and estimated 
cache capacity) or ratios (hit_ratio and evict_ratio). Ratios have the 
advantage of quickly reflecting the overall effectiveness of Memoize. 
However, absolute values have a significant advantage as they explicitly 
reveal the reason of Memoize's poor performance, making problem 
diagnosis simpler.

With absolute values, users can directly understand the underlying 
reason for poor performance. For example: insufficient memory (capacity 
< distinct keys), inaccurate planner statistics (distinct keys 
significantly different from actual values), poorly ordered keys 
(capacity ~ distinct keys, but frequent evictions as seen in the 
Evictions parameter), or Memoize simply not being beneficial (capacity ~ 
distinct keys ~ calls). Ratios, by contrast, only reflect the final 
outcome without clearly indicating the cause or the specific steps 
needed to resolve the issue.

Thus, absolute values do more than just inform users that a problem 
exists; they provide actionable details that enable users to directly 
address the problem (increase work_mem, refresh statistics, create 
extended statistics, or disable Memoize entirely). Additionally, no 
other plan nodes in PostgreSQL currently use a similar ratio-based 
approach - everywhere else absolute values are consistently shown (e.g., 
number of rows, buckets, batches, memory used, etc.). Using absolute 
values in Memoize maintains consistency with existing practice.

I've updated the patch to v5, since the new parameter est_unique_keys in 
make_memoize() is now placed near est_entries, which is more logical and 
readable than putting it at the end.

Any thoughts?

--
Best Regards,
Ilia Evdokimov,
Tantor Labs LLC.

Attachment

pgsql-hackers by date:

Previous
From: Andrei Lepikhov
Date:
Subject: Re: POC, WIP: OR-clause support for indexes
Next
From: Alexander Pyhalov
Date:
Subject: Re: SQLFunctionCache and generic plans