Thanks for taking a look and thinking about this problem.
On Fri, 20 Dec 2024 at 03:49, Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
> static void
> plan_consider_jit(Plan *plan)
> {
>     plan->jit = false;
>
>     if (jit_enabled)
>     {
>         Cost    total_cost;
>
>         total_cost = plan->total_cost * plan->plan_rows;
>
>         if (total_cost > jit_above_cost)
>             plan->jit = true;
>     }
> }
Unfortunately, it's not that easy. "plan_rows" isn't the estimated
number of times the node will be rescanned; it's the number of rows we
expect the node to return. Multiplying that by plan->total_cost
produces a nonsensical value.
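For illustration only, here's roughly what that function would need to
look like if a rescan estimate were available to multiply by instead of
plan_rows. The "est_loops" parameter is hypothetical; nothing in the
planner currently passes such a value down to this point:

static void
plan_consider_jit(Plan *plan, double est_loops)
{
    plan->jit = false;

    if (jit_enabled)
    {
        /* expected cost of all executions of this node */
        Cost    total_run_cost = plan->total_cost * est_loops;

        if (total_run_cost > jit_above_cost)
            plan->jit = true;
    }
}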
I made an attempt in [1] to get the nloop estimate down to where it's
needed so we could multiply the total_cost by that. IIRC, a few people
weren't happy about the churn that caused. I didn't come up with an
alternative.
Generally, I think that going with the number of expected evaluations
of the expression is a more flexible option. The total_cost *
est_nloops of a plan node really isn't a great indication of how much
effort will be spent evaluating the expression. The join filter on a
non-parameterized nested loop is likely the upper extreme, and
something like a join filter on a parameterized nested loop or a
Bitmap Heap Scan recheck condition is probably the lower extreme,
where the expression evaluation cost just isn't that well correlated
with the total node cost.
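As a rough sketch of what I mean (not existing code -- both the
per-evaluation cost of the expression and the evaluation-count estimate
would need new planner infrastructure to produce them):

static bool
expr_consider_jit(Cost per_eval_cost, double est_evals)
{
    if (!jit_enabled)
        return false;

    /* total effort we expect to spend evaluating this expression */
    return per_eval_cost * est_evals > jit_above_cost;
}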
I'll restate that we don't need perfection right away, but if we can
come up with infrastructure that can be incrementally improved, that
would be the best direction to move in. My best guess at how that
might work was in the email you replied to. However, I admit to not
having spent much time relooking at it, or even thinking about it, in
the past few years.
David
[1] https://postgr.es/m/CAApHDvoq5VhV%3D2euyjgBN2bC8Bds9Dtr0bG7R%3DreeefJWKJRXA%40mail.gmail.com