Re: Runtime checking of MCV (Was: ... histogram bucket numdistinct statistics) - Mailing list pgsql-hackers

From Csaba Nagy
Subject Re: Runtime checking of MCV (Was: ... histogram bucket numdistinct statistics)
Date
Msg-id 1213167761.14929.96.camel@PCD12478
Whole thread Raw
In response to Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 2008-06-10 at 19:03 -0400, Tom Lane wrote:
> Given such an MCV list, the planner will always make the right choice
> of whether to do index or seqscan ... as long as it knows the value
> being searched for, that is.  Parameterized plans have a hard time here,
> but that's not really the fault of the statistics.

This is maybe the best example where multiple (sub)plans could be glued
together with some kind of plan fork node, so that the actual plan to be
executed would be decided based on the parameter values and checking the
statistics at runtime instead of plan time for parameterized plans... so
the planner creates alternative (sub)plans (e.g. seqscan vs index scan)
for the cases where the parameters are MCV or not, and then place them
in different branches of a runtime check of the parameter values vs the
statistics. Of course the number of branches must be limited, this would
be the challenge of such a feature... to cover the parameter space with
the minimal number of plan branches so that disastrous plans for special
parameter values are avoided. It would also be possible perhaps to
gradually grow the alternative counts as a reaction to the actual
parameter values used by queries, so that only the parameter space
actually in use by queries is covered.

In fact I would be interested in experimenting with this. Would it be
possible to add new planner behavior as external code ? I would expect
not, as the planner is in charge also for the correctness of the results
and any external code would put that correctness at risk I guess... in
any case, I'll go and check the source.

BTW, there was a discussion about global prepared statements/caching of
query plans, is there any advance on that ? Thorough  planning would
make the most sense in that context, possibly by using a special syntax
for the application to signal the need for such planning for the most
problematic (not necessarily the most used though) queries.

Cheers,
Csaba.




pgsql-hackers by date:

Previous
From: "Heikki Linnakangas"
Date:
Subject: Re: Overhauling GUCS
Next
From: "Heikki Linnakangas"
Date:
Subject: B-tree "finish incomplete split" bug