Thread: Advice on setting cost for function
The documentation in CREATE FUNCTION is fine as far is it goes regarding the COST setting, but that isn’t very far, and I haven’t had any luck finding good advice.
How do I determine what value to set as the COST of a function? I understand that it’s a relative, abstract, per-row sort of thing, but that’s not quite enough to go on to do it well.
If I have a function that looks up a single value from an index, should that be lower than 100? 20, say?
If I have a function that calls multiple other functions, each of which does a reasonable amount of work, should I set the caller to a higher COST, or will Postgres use the costs of the functions it calls?
I can imagine that a SQL function which is just inlined might ignore the COST estimate. Is that so? Are there other such considerations?
In general, a section in the CREATE FUNCTION documentation of two or three paragraphs with a few examples and general guidelines would be very helpful. I would be happy to write it if someone will explain it to me.
guyren@relevantlogic.com writes: > The documentation in CREATE FUNCTION is fine as far is it goes regarding the COST setting, but that isn’t very far, andI haven’t had any luck finding good advice. Fair complaint. A quick data dump: procost = 1 is supposed to represent the cost of a simple built-in function --- think addition or comparison. Expensive C functions, such as tsvector parsing, are supposed to have procosts like 10 or 100. Keep in mind that procost is actually scaled by cpu_operator_cost (0.0025 typically) before being folded into a plan cost estimate. PL functions are inevitably going to be way more expensive than a simple built-in function. It's likely that the default procost setting of 100 is a drastic underestimate in most cases, and that a saner default might be more like 1000. I'm hesitant to change that for fear of causing surprising plan changes, though. Both the CREATE FUNCTION man page and the pg_proc catalog documentation claim that for a set-returning function, procost is per-output-row. That may have been true when written but it seems to be horsepucky now; it's only charged once regardless. We should change the docs. > If I have a function that looks up a single value from an index, should that be lower than 100? 20, say? Well, if you accept the planner's default estimate that a single-row indexed lookup costs around 8 cost units (twice random_page_cost), then a function embodying that should also cost that much; dividing by cpu_operator_cost leads to the conclusion that its procost should be 3200. There are lots of reasons to be more optimistic than that, but for sure I wouldn't think that a function that embodies database access should have procost less than 100. > I can imagine that a SQL function which is just inlined might ignore the COST estimate. Is that so? Are there other suchconsiderations? Yeah, once it gets inlined its procost marking is no longer a factor; we'll look at the substituted expression instead. > If I have a function that calls multiple other functions, each of which does a reasonable amount of work, should I setthe caller to a higher COST, or will Postgres use the costs of the functions it calls? Nope, you'll need to adjust the cost of the calling function. Except for the inlined case, PG will just take the cost marking at face value. I believe that the PostGIS people just recently increased the cost markings on all their expensive functions to better reflect reality. You might want to go dig in their git repo to see what they did (I don't think those changes are released yet). > In general, a section in the CREATE FUNCTION documentation of two or three paragraphs with a few examples and general guidelineswould be very helpful. I would be happy to write it if someone will explain it to me. Have at it ... regards, tom lane