Re: Prepared statements and generic plans - Mailing list pgsql-hackers
From | David G. Johnston |
---|---|
Subject | Re: Prepared statements and generic plans |
Date | |
Msg-id | CAKFQuwZCzOzWkXQOH5wE1=GuC66x=d54BDRQM0bdmVbLSz64EA@mail.gmail.com Whole thread Raw |
In response to | Prepared statements and generic plans (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: Prepared statements and generic plans
|
List | pgsql-hackers |
In Postgres 9.2 we improved the logic of when generic plans are used by
EXECUTE. We weren't sure how well it would work, and the docs included
a vague description on when generic plans are chosen.
I have gotten a few questions lately about how prepared statements are
handled with multiple executions so I have updated the PREPARE manual
page with the attached patch to more clearly explain generic plans and
when they are chosen.
I would like to apply this to the 9.6 docs.
While putting the proposed patch in context I came across this.
"""
Prepared statements have the largest performance advantage when a single session is being used to execute a large number of similar statements. The performance difference will be particularly significant if the statements are complex to plan or rewrite, for example, if the query involves a join of many tables or requires the application of several rules. If the statement is relatively simple to plan and rewrite but relatively expensive to execute, the performance advantage of prepared statements will be less noticeable."""
Until and unless the generic plan is used the "...if statements are complex to plan..." doesn't make sense; and no where in the description itself do we introduce the generic plan concept. This is inconsistent but I'm not addressing it below though its worth considering before a patch to this area is committed.
As to the patch...
Isn't this backwards? <note> [change]
"""
otherwise it <switching to a generic plan> occurs only after five or more executions produce [execution /strike plans] plus planning costs that are, on average, [roughly equal to /strike cheaper] than the generic plan.
"""
I'd maybe go with something like this:
All executions of a prepared statement having zero parameters will use the same plan so the planning time taken during the first execution will be spread across all subsequent executions. For statements having parameters the first five executions will result in value-specific plans as previously described. However, on the sixth execution a generic plan will also be computed and if the average planning + execution cost of all previous value-specific plans is about equal to the execution cost of the generic plan the generic plan will be chosen for that and all subsequent executions.
<existing next paragraph>
If we are getting generic plans significantly cheaper than the value-specific plans I suspect there is a problem...so any comparison that indicates "less-than" is prone to cause confusion. The original is worded well on this point: "...generic plan appears to be not much more expensive..." but lacks detail elsewhere.
This part:
! A generic plan assumes each value supplied to <command>EXECUTE</command>
! is one of the column's distinct values and that column values are
! uniformly distributed. For example, if statistics records three
! distinct column values, a generic plan assumes a column equality
! comparison will match 33% of processed rows. Column statistics
! also allows generic plans to accurately compute the selectivity of
! unique columns. Comparisons on non-uniformly-distributed columns and
! specification of non-existent values affects the average plan cost,
! and hence if and when a generic plan is chosen. [elided the last sentence, placed in the first paragraph]
I'm not sure of the specific goal here but this level detail seems a bit out-of-place in the SQL Command documentation. So, do we want this user-facing and if so do we want it here?
So while looking to find a home for this detail I came across the fact that Chapter 66 doesn't have a table of contents....quite possibly because it is presently a single section chapter...
It seems like this content would find a nice home there as well (though we should fix any TOC bug before adding the new section...)
IMO there are now three possibilities for the existing paragraph:
Status-quo
First portion of Bruce's patch
Mine, above
I think the second part of Bruce's should be, possibly with additional detail and maybe some concrete examples, added to Chapter 66 instead. A pointer to that section in the SQL Command docs could be made - though I'm not sure how much pointing into internals we do from the user-facing areas of the docs. In any case the material would be available to those who wish to see it.
The pertinent points, 5 value-specific executions and comparing the average of all previous executions to the generic plan generated on the 6th execution, both do seem worth upsetting the status-quo for. Beyond that it is style - except I think Bruce got the comparison direction and magnitude wrong as noted above.
This leaves Bruce's second alteration: which probably should follow the rest over to chapter 66. The point of the existing sentence is to give the casual user the means to detect the current type of plan and I think that is all that is needed here.
David J.
pgsql-hackers by date: