Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments - Mailing list pgsql-hackers

From Noah Misch
Subject Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments
Date
Msg-id 20170613064012.GC1664286@rfd.leadboat.com
Whole thread Raw
In response to Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: [HACKERS] RTE_NAMEDTUPLESTORE, enrtuples and comments
List pgsql-hackers
On Mon, Jun 12, 2017 at 04:04:23PM +1200, Thomas Munro wrote:
> On Sun, Jun 11, 2017 at 11:11 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
> > On Sun, Jun 11, 2017 at 6:25 PM, Noah Misch <noah@leadboat.com> wrote:
> >> This fourth point is not necessarily a defect: I wonder if RangeTblEntry is
> >> the right place for enrtuples.  It's a concept regularly seen in planner data
> >> structures but not otherwise seen at parse tree level.
> >
> > I agree that this is strange.  Perhaps
> > set_namedtuplestore_size_estimates should instead look up the
> > EphemeralNamedRelation by rte->enrname to find its way to
> > enr->md.enrtuples, but I'm not sure off the top of my head how it
> > should get its hands on the QueryEnvironment required to do that.  I
> > will look into this on Monday, but other ideas/clues welcome...
> 
> Here's some background:  If you look at the interface changes
> introduced by 18ce3a4 you will see that it is now possible for a
> QueryEnvironment object to be injected into the parser and executor.
> Currently the only use for it is to inject named tuplestores into the
> system via SPI_register_relation or SPI_register_trigger_data.  That's
> to support SQL standard transition tables, but anyone can now use SPI
> to expose data to SQL via an ephemeral named relation in this way.
> (In future we could imagine other kinds of objects like table
> variables, anonymous functions or streams).
> 
> The QueryEnvironment is used by the parser to resolve names and build
> RangeTblEntry objects.  The planner doesn't currently need it, because
> all the information it needs is in the RangeTblEntry, including the
> offending row estimate.  Then the executor needs it to get its hands
> on the tuplestores.  So the question is: how can we get it into
> costsize.c, so that it can look up the EphermalNamedRelationMetaData
> object by name, instead of trafficking statistics through parser data
> structures?
> 
> Here are a couple of ways forward that I can see:
> 
> 1.  Figure out how to get the QueryEnvironment through more of these
> stack frames (possibly inside other objects), so that
> set_namedtuplestore_size_estimates can look up enrtuples by enrname:
> 
>   set_namedtuplestore_size_estimates <-- would need QueryEnvironment
>   set_namedtuplestore_pathlist
>   set_rel_size
>   set_base_rel_sizes
>   make_one_rel
>   query_planner
>   grouping_planner
>   subquery_planner
>   standard_planner
>   planner
>   pg_plan_query
>   pg_plan_queries <-- doesn't receive QueryEnvironment
>   BuildCachedPlan <-- receives QueryEnvironment
>   GetCachedPlan
>   _SPI_execute_plan
>   SPI_execute_plan_with_paramlist
> 
> 2.  Rip the row estimation out for now, use a bogus hard coded
> estimate like we do in some other cases, and revisit later.  See
> attached (including changes from my previous message).
> Unsurprisingly, a query plan changes.
> 
> Thoughts?

I'm not sufficiently familiar with the relevant code to judge this one.  Let's
see if planner experts voice an opinion.  Absent more opinions, the current
design stands.



pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: [HACKERS] Get stuck when dropping a subscription during synchronizing table
Next
From: Dean Rasheed
Date:
Subject: Re: [HACKERS] PG10 Partitioned tables and relation_is_updatable()