Re: Planner hints in Postgresql - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Planner hints in Postgresql
Date
Msg-id CAFj8pRBpAyP50Rw18s6pmPXSrU0cdtrrfaaPB+oEHp7DOD+KdQ@mail.gmail.com
Whole thread Raw
In response to Re: Planner hints in Postgresql  (Atri Sharma <atri.jiit@gmail.com>)
Responses Re: Planner hints in Postgresql  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-hackers



2014-03-17 19:35 GMT+01:00 Atri Sharma <atri.jiit@gmail.com>:



On Mon, Mar 17, 2014 at 11:50 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Mon, Mar 17, 2014 at 12:57 PM, Atri Sharma <atri.jiit@gmail.com> wrote:
>
>> There's a big difference between saying to the planner, "Use plan X"
>> vs "Here's some information describing the data supporting choosing
>> plan X intelligently".  The latter allows for better plans in the face
>> of varied/changing data, integrates with the planner in natural way,
>> and encourages users to understand how the planner works.
>
> +1
>
> I was thinking of varying the 'weight' of a user defined plan by an fixed
> experimental factor to tell the planner to give higher/lower preference to
> this plan, but after your idea above, I think Stephen's point of introducing
> a GUC for the factor is the only way possible and I agree with him on the
> point that eventually the user will figure out a way to force usage of his
> plan using the GUC.

GUC is not the answer beyond the "broad brush" mostly debugging level
features they already support.   What do you do if your plan
simultaneously needs and does not need nestloops?

A query plan is a complicated thing that is the result of detail
analysis of the data.  I bet there are less than 100 users on the
planet with the architectural knowledge of the planner to submit a
'plan'.  What users do have is knowledge of the data that the database
can't effectively gather for some reason.  Looking at my query above,
what it would need (assuming the planner could not be made to look
through length()) would be something like:

SELECT * FROM foo WHERE
  length(bar) <= 1000 WITH SELECTIVITY 0.999
  AND length(bar) >= 2 WITH SELECTIVITY 0.999;



Wont this have scaling issues and  issues over time as the data in the table changes?

Suppose I make a view with the above query. With time, as the data in the table changes, the selectivity values wont be good for planning. This may potentially lead to a lot of changes in the view definition and other places where this query was used.



In general, I think I step back on my point that specifying the selectivity is a bad idea.

Could this also work (for the time being) for cross-column statistics?


It is another issue.

I don't believe so SELECTIVITY can work well too. Slow queries are usually related to some strange points in data. I am thinking so well concept should be based on validity of estimations. Some plans are based on totally wrong estimation, but should be fast due less sensitivity to bad estimations. So well concept is penalization some risk plans - or use brute force - like COLUMN store engine does. Their plan is usually simply and tolerant to bad estimations.

Pavel
 
Regards,

Atri



--
Regards,
 
Atri
l'apprenant

pgsql-hackers by date:

Previous
From: Atri Sharma
Date:
Subject: Re: Planner hints in Postgresql
Next
From: Tom Lane
Date:
Subject: Re: First-draft release notes for next week's releases