Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments - Mailing list pgsql-hackers
From | Noah Misch |
---|---|
Subject | Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments |
Date | |
Msg-id | 20170613064012.GC1664286@rfd.leadboat.com Whole thread Raw |
In response to | Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments (Thomas Munro <thomas.munro@enterprisedb.com>) |
Responses |
Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments
|
List | pgsql-hackers |
On Mon, Jun 12, 2017 at 04:04:23PM +1200, Thomas Munro wrote: > On Sun, Jun 11, 2017 at 11:11 PM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: > > On Sun, Jun 11, 2017 at 6:25 PM, Noah Misch <noah@leadboat.com> wrote: > >> This fourth point is not necessarily a defect: I wonder if RangeTblEntry is > >> the right place for enrtuples. It's a concept regularly seen in planner data > >> structures but not otherwise seen at parse tree level. > > > > I agree that this is strange. Perhaps > > set_namedtuplestore_size_estimates should instead look up the > > EphemeralNamedRelation by rte->enrname to find its way to > > enr->md.enrtuples, but I'm not sure off the top of my head how it > > should get its hands on the QueryEnvironment required to do that. I > > will look into this on Monday, but other ideas/clues welcome... > > Here's some background: If you look at the interface changes > introduced by 18ce3a4 you will see that it is now possible for a > QueryEnvironment object to be injected into the parser and executor. > Currently the only use for it is to inject named tuplestores into the > system via SPI_register_relation or SPI_register_trigger_data. That's > to support SQL standard transition tables, but anyone can now use SPI > to expose data to SQL via an ephemeral named relation in this way. > (In future we could imagine other kinds of objects like table > variables, anonymous functions or streams). > > The QueryEnvironment is used by the parser to resolve names and build > RangeTblEntry objects. The planner doesn't currently need it, because > all the information it needs is in the RangeTblEntry, including the > offending row estimate. Then the executor needs it to get its hands > on the tuplestores. So the question is: how can we get it into > costsize.c, so that it can look up the EphermalNamedRelationMetaData > object by name, instead of trafficking statistics through parser data > structures? > > Here are a couple of ways forward that I can see: > > 1. Figure out how to get the QueryEnvironment through more of these > stack frames (possibly inside other objects), so that > set_namedtuplestore_size_estimates can look up enrtuples by enrname: > > set_namedtuplestore_size_estimates <-- would need QueryEnvironment > set_namedtuplestore_pathlist > set_rel_size > set_base_rel_sizes > make_one_rel > query_planner > grouping_planner > subquery_planner > standard_planner > planner > pg_plan_query > pg_plan_queries <-- doesn't receive QueryEnvironment > BuildCachedPlan <-- receives QueryEnvironment > GetCachedPlan > _SPI_execute_plan > SPI_execute_plan_with_paramlist > > 2. Rip the row estimation out for now, use a bogus hard coded > estimate like we do in some other cases, and revisit later. See > attached (including changes from my previous message). > Unsurprisingly, a query plan changes. > > Thoughts? I'm not sufficiently familiar with the relevant code to judge this one. Let's see if planner experts voice an opinion. Absent more opinions, the current design stands.
pgsql-hackers by date: