On Fri, 24 Apr 2020 at 15:26, Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
> Actually I have a different opinion to handle this issue, to execute the
> a > (select avg(a) from tinner where x = touer.x); The drawback of current
> path is because it may calculates the same touer.x value multi-times. So
> if we cache the values we have calculated before, we can avoid the cost.
> Material path may be the one we can reference but it assumes all the tuples
> in the tuplestore matches the input params, which is not the fact here.
>
> But what if the input params doesn't change? If so we can use Material path
> to optimize this case. But since we don't know if the if the input params changed
> or not during plan time, we just add the path (let's assume we can add it with some
> rules or cost calculation). If the input params is not changed, we use the cached
> values, if the input params changed, we can ReScan the Material node. To optimize
> the the cache invalidation frequent issue like (1, 2, 1, 2, 1, 2) case, we may consider
> a sort path to change the input values to (1, 1, 1, 2, 2, 2). But overall it is a big effort.
This does not seem quite right to me. What you need is some sort of
parameterized materialize. Materialize just reads its subnode and
stores the entire thing input and reuses it any time that it
rescanned.
You likely need something more like what is mentioned in [1]. There's
also a bunch of code from Heikki in the initial email in that thread.
Heikki put it in nodeSubplan.c. I think it should be a node of its
own.
David
[1] https://www.postgresql.org/message-id/CAKJS1f-kAk1cGVvzg9TXCLhPsxx_oFVOrTGSR5yTRXKWntTVFA@mail.gmail.com