RFC: extensible planner state - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | RFC: extensible planner state |
Date | |
Msg-id | CA+TgmoYxfg90rw13+JcYwn4dwSC+agw7o8-A+fA3M0fh96pg8w@mail.gmail.com |
Responses |
Re: RFC: extensible planner state
Re: RFC: extensible planner state |
List | pgsql-hackers |
I've been working on planner extensibility for some time now, and am still not quite ready to make a full-fledged proposal, but see partially-fledged proposals here and here:

http://postgr.es/m/CA+TgmoZY+baV-T-5ifDn6P=L=aV-VkVBrPmi0TQkcEq-5Finww@mail.gmail.com
http://postgr.es/m/CA+TgmoZxQO8svE_vtNCkEubnCYrnrCEnhftdbkdZ496Nfhg=wQ@mail.gmail.com

While trying to build out a real, working example based on those patches, I ran into the problem that it's rather difficult for multiple planner hooks to coordinate with each other. For example, you might want to do some calculation once per query, or once per RelOptInfo, and that's somewhat difficult to arrange right now. I tried having my planner hook push an item onto a state stack before calling standard_planner() and pop it afterward, so that any hooks called during planning can look at the top of the state stack. But that doesn't quite work, because plan_cluster_use_sort() and plan_create_index_workers() can provide a backdoor into the planner code, allowing get_relation_info() to be called not in reference to the most recent call to planner().

My first instinct was to invent QSRC_DUMMY and have those functions use that, which as far as I can see is an adequate solution to that immediate problem, since get_relation_info() can now identify those cases cleanly. But that still requires the extension to do a lot of bookkeeping just for the privilege of storing some per-query private state, and it seems to me that you might well want to store some private state per-RelOptInfo or possibly per-PlannerInfo, which seems to require an even-more-unreasonable amount of effort. An extension might be able to spin up a hash table keyed by pointer address or maybe some identifying properties of a RelOptInfo, but I think it's going to be slow, fragile, and ugly.
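For illustration only, the state-stack approach described above, and the way a backdoor entry point defeats it, could be sketched roughly as follows. This is a self-contained toy, not PostgreSQL code: every Demo/demo_ name is hypothetical, demo_standard_planner() merely stands in for standard_planner(), and demo_backdoor_entry() stands in for a caller such as plan_cluster_use_sort() that reaches planner code without going through the planner hook.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-query private state kept by one extension. */
typedef struct DemoQueryState
{
    int         query_id;           /* identifies the planner call that owns it */
    struct DemoQueryState *prev;    /* next-older entry on the stack */
} DemoQueryState;

static DemoQueryState *demo_state_stack = NULL;

/* What the inner hook last observed: 0 = never ran, -1 = found no state. */
static int  demo_last_seen_query_id = 0;

/* Stand-in for a hook invoked deep inside planning (assumed, for the demo). */
static void
demo_relation_info_hook(void)
{
    demo_last_seen_query_id =
        demo_state_stack ? demo_state_stack->query_id : -1;
}

/* Stand-in for standard_planner(): planning eventually reaches the hook. */
static void
demo_standard_planner(void)
{
    demo_relation_info_hook();
}

/* Stand-in for the extension's planner hook: push, plan, pop. */
static void
demo_planner_hook(int query_id)
{
    DemoQueryState state;

    state.query_id = query_id;
    state.prev = demo_state_stack;
    demo_state_stack = &state;      /* push */

    demo_standard_planner();

    demo_state_stack = state.prev;  /* pop */
}

/*
 * Stand-in for a backdoor caller: it reaches the inner hook without the
 * planner hook having pushed anything for it.
 */
static void
demo_backdoor_entry(void)
{
    demo_relation_info_hook();
}
```

Calling demo_planner_hook(1) lets the inner hook find query 1's state, but demo_backdoor_entry() finds either nothing at all or, if some unrelated planner call happens to be active, that other call's state.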
So what I'd like to propose instead is something along the lines of the private-ExplainState-data system:

http://postgr.es/m/CA+TgmoYSzg58hPuBmei46o8D3SKX+SZoO4K_aGQGwiRzvRApLg@mail.gmail.com
https://git.postgresql.org/pg/commitdiff/c65bc2e1d14a2d4daed7c1921ac518f2c5ac3d17

The attached (untested) patch shows how this could work, allowing extensible state in each PlannerGlobal, PlannerInfo, and RelOptInfo, which seem like the logical places to me. I have use cases for the first and the third at present, so the second could be omitted on suspicion of being unuseful, but I bet it isn't. As compared with c65bc2e1d14a2d4daed7c1921ac518f2c5ac3d17, I reduced the initial allocation size to 4 from 16 and made the getter functions static inline, out of the feeling that you're not likely to have more than one ExplainState and the speed of EXPLAIN doesn't matter much, but you might store and access private per-RelOptInfo state a lot of times in one query planner invocation.

I'm not altogether convinced this is the right design. It seems slightly unwieldy, and having to allocate an extra array, even if a small one, for every RelOptInfo that has private state seems like it could add a noticeable amount of overhead. On the other hand, I strongly suspect that assuming that there's only ever one planner extension in operation is short-sighted. The fact that we have none right now seems to me to be evidence of the absence of infrastructure rather than the absence of demand. If that is correct then I don't quite see how to do better than this.

But I'm interested in hearing what other people think. If people like this design, I will propose it here or on another thread for commit, after suitable testing and polishing. If people do not like this design, then I would like to know what alternative they would prefer.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com
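As a rough illustration of the kind of mechanism the message describes (a small per-object array of opaque pointers indexed by a per-extension ID, starting at 4 slots, with an inline getter), here is a self-contained sketch. It is not the attached patch and not PostgreSQL's API: all Demo names are hypothetical, DemoRelOptInfo stands in for RelOptInfo, and plain realloc() is used where real backend code would presumably use memory-context allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_EXTENSIONS 16  /* arbitrary cap, for this sketch only */
#define DEMO_INITIAL_SLOTS  4   /* initial allocation size named in the email */

/* Hypothetical stand-in for RelOptInfo: only the extension-state fields. */
typedef struct DemoRelOptInfo
{
    void      **extension_state;            /* NULL until first set */
    int         extension_state_allocated;  /* slots in the array */
} DemoRelOptInfo;

static const char *demo_extension_names[DEMO_MAX_EXTENSIONS];
static int  demo_extension_count = 0;

/*
 * Map an extension name to a small integer ID, assigning one on first use.
 * The name pointer is stored as-is, so it must outlive the registry.
 * Returns -1 if the (arbitrary) cap is exceeded.
 */
static int
DemoGetExtensionId(const char *name)
{
    for (int i = 0; i < demo_extension_count; i++)
        if (strcmp(demo_extension_names[i], name) == 0)
            return i;
    if (demo_extension_count >= DEMO_MAX_EXTENSIONS)
        return -1;
    demo_extension_names[demo_extension_count] = name;
    return demo_extension_count++;
}

/* Cheap getter: a bounds check plus an array load, so it can be inlined. */
static inline void *
DemoGetRelExtensionState(const DemoRelOptInfo *rel, int id)
{
    if (id < 0 || id >= rel->extension_state_allocated)
        return NULL;
    return rel->extension_state[id];
}

/* Setter: grows the array on demand (doubling), zero-filling new slots. */
static void
DemoSetRelExtensionState(DemoRelOptInfo *rel, int id, void *opaque)
{
    assert(id >= 0);
    if (id >= rel->extension_state_allocated)
    {
        int         oldsize = rel->extension_state_allocated;
        int         newsize = oldsize > 0 ? oldsize : DEMO_INITIAL_SLOTS;
        void      **newarr;

        while (newsize <= id)
            newsize *= 2;
        newarr = realloc(rel->extension_state, newsize * sizeof(void *));
        if (newarr == NULL)
            abort();
        memset(newarr + oldsize, 0, (newsize - oldsize) * sizeof(void *));
        rel->extension_state = newarr;
        rel->extension_state_allocated = newsize;
    }
    rel->extension_state[id] = opaque;
}
```

In this sketch an object that never has private state attached pays only for the two fields, and the extra array is allocated on the first set, which is the per-RelOptInfo overhead the message worries about.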
Attachment