Re: Projection pushdown to index access method - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Projection pushdown to index access method |
Date | |
Msg-id | CA+TgmobizmMY1DXNqLriOOaVBBUrFruh_CAzkMh-6556mk873A@mail.gmail.com Whole thread Raw |
In response to | Projection pushdown to index access method (Chris Cleveland <ccleveland@dieselpoint.com>) |
Responses |
Re: Projection pushdown to index access method
|
List | pgsql-hackers |
On Tue, Sep 19, 2023 at 12:35 PM Chris Cleveland <ccleveland@dieselpoint.com> wrote: > I'm working on an index access method. I have a function which can appear in a projection list which should be evaluatedby the access method itself. Example: > > SELECT title, my_special_function(body) > FROM books > WHERE book_id <===> 42; > > "<===>" is the operator that invokes the access method. The value returned by my_special_function() gets calculated duringthe index scan, and depends on information that exists only in the index. > > How do I get the system to pull the value from the index instead of trying to calculate it? I don't see how you can do this in general, because there's no guarantee that the plan will be an Index Scan or Index Only Scan instead of a Seq Scan or Bitmap Heap/Index Scan. > So far, I have created a CustomScan and set it using set_rel_pathlist_hook. The hook function gives us a PlannerInfo, RelOptInfo,Index, and RangeTblEntry. So far as I can tell, only RelOptInfo.reltarget.exprs gives us any info on the SELECTexpressions, but unfortunately, the exprs are Var nodes that contain the (title, body) columns from above, and do notsay anything about my_special_function(). So what does the EXPLAIN plan look like? I'm not quite sure what's happening here, but the planner likes to make plans that just fetch attributes from all the relations being joined (here, there's just one) and then perform the calculation of any expressions at the very end, as the final step, or at least as the final step at that subquery level. And if it plans to ask your custom scan for title, body, and book_id and then compute my_special_function(body) after the fact, the thing you want to happen is not going to happen. If the planner can be induced to ask your custom scan for my_special_function(body), then I *think* you should be able to arrange to get that value any way you like and just return it. But I don't quite know how to induce the planner to do that -- and especially if this query involved more than one table, because of the planner's tendency to postpone expression evaluation until after joins are done. > I know this is possible, because the docs for PathTarget say this: > > PathTarget > * > * This struct contains what we need to know during planning about the > * targetlist (output columns) that a Path will compute. Each RelOptInfo > * includes a default PathTarget, which its individual Paths may simply > * reference. However, in some cases a Path may compute outputs different > * from other Paths, and in that case we make a custom PathTarget for it. > * For example, an indexscan might return index expressions that would > * otherwise need to be explicitly calculated. It's just worth keeping in mind that the planner and the executor are very tightly bound together here. This may be one of those cases getting the executor to do what you want is the easy part, and getting the planner to produce a plan that tells it to do that is the hard part. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: