Thread: SQL Functions and plan time
I define an SQL function: CREATE FUNCTION foo(timestamptz) AS ' ...' LANGUAGE 'SQL'; Does the plan for the body get built at creation or execution time? I could have sworn it was execution time, but timings on one function suggest creation? For those new to functions, plpgsql always prepares its plan at "compile" time so doesn't have actual values available to refine its plan. -- Richard Huxton
Richard Huxton <dev@archonet.com> writes: > Does the plan for the body get built at creation or execution time? I could > have sworn it was execution time, but timings on one function suggest > creation? It's planned once per query. There's been some talk of building a cross-query plan cache for SQL functions, like plpgsql has, but nothing done yet. regards, tom lane
> It's planned once per query. There's been some talk of building a > cross-query plan cache for SQL functions, like plpgsql has, but > nothing done yet. Several weeks ago I posted a note about a plpgsql function that fails if an index that existed when it was created is subsequently dropped. (I think this should be considered a bug, the existence or nonexistence of indexes should affect performance, not whether queries fail.) Would a cross-query plan cache for SQL functions create similar problems? -- Mike Nolan
nolan@celery.tssi.com writes: > Several weeks ago I posted a note about a plpgsql function that fails > if an index that existed when it was created is subsequently dropped. No surprise... > Would a cross-query plan cache for SQL functions create similar problems? Yeah. We need to think about a mechanism for invalidating plans before we go too much further down the road of caching plans. regards, tom lane
On Tuesday 08 Jul 2003 6:39 am, Tom Lane wrote: > Richard Huxton <dev@archonet.com> writes: > > Does the plan for the body get built at creation or execution time? I > > could have sworn it was execution time, but timings on one function > > suggest creation? > > It's planned once per query. There's been some talk of building a > cross-query plan cache for SQL functions, like plpgsql has, but > nothing done yet. So why am I getting substantially different times for identical queries (except for parameter substitution) [41] LOG: query: SELECT zzz2('2003-07-07 17:00:00+01','2003-07-07 17:20:00+01'); [42-1] LOG: query: [42-2] DELETE FROM stats_telcon WHERE st_hour >= $1 AND st_hour < $2; ... [43] LOG: duration: 7.524765 sec [44] LOG: query: DELETE FROM stats_telcon WHERE st_hour>='2003-07-07 17:00:00+01' AND st_hour<'2003-07-07 17:20:00+01'; [45] LOG: duration: 0.032860 sec [46-1] LOG: query: INSERT INTO stats_telcon ... [47] LOG: duration: 1.810267 sec I don't think I'm getting caching issues here - these aren't the first run. -- Richard Huxton
> So why am I getting substantially different times for identical queries > (except for parameter substitution) This sounds like the question I asked two weeks ago. What I concluded was that though they may be identical queries, they are not identical in terms of what happens when you execute them. Why? Because the tuples have been updated, meaning that the affected rows are in different physical locations than they used to be in both the table and in any indexes. Imagine that you have to pick up 10 items at the grocery store. You have a list of which aisles they're in. Tomorrow you go back to the store to pick up the same 10 items. However, overnight the store has moved them all to different locations. Even though you have an updated list, it will likely take you a different amount of time to pick up those 10 items. Running a vacuum analyze and rebuilding indexes between runs should produce more consistent timings, if consistent timings are important. Readonly queries running on an otherwise idle server should produce more consistent timings, subject to caching issues. -- Mike Nolan
On Tuesday 08 Jul 2003 9:34 am, nolan@celery.tssi.com wrote: > > So why am I getting substantially different times for identical queries > > (except for parameter substitution) > > This sounds like the question I asked two weeks ago. > > What I concluded was that though they may be identical queries, they are > not identical in terms of what happens when you execute them. > > Why? Because the tuples have been updated, meaning that the affected > rows are in different physical locations than they used to be in both > the table and in any indexes. The "SELECT" part, which is the determining factor in all of this is reading unchanged data from tables unchanged since a vacuum full/analyse. Besides, the timings are consistent. > Running a vacuum analyze and rebuilding indexes between runs should > produce more consistent timings, if consistent timings are important. > > Readonly queries running on an otherwise idle server should produce > more consistent timings, subject to caching issues. Yep - which is why I'm puzzled. It's the readonly part of this that's taking the extra time. The DELETE/INSERT are to another table - I'm summarising activity logs into an hourly stats table. -- Richard Huxton
Richard Huxton <dev@archonet.com> writes: > So why am I getting substantially different times for identical queries > (except for parameter substitution) > [41] LOG: query: SELECT zzz2('2003-07-07 17:00:00+01','2003-07-07 > 17:20:00+01'); > [42-1] LOG: query: > [42-2] DELETE FROM stats_telcon WHERE st_hour >= $1 AND st_hour < $2; > ... > [43] LOG: duration: 7.524765 sec > [44] LOG: query: DELETE FROM stats_telcon WHERE st_hour>='2003-07-07 > 17:00:00+01' AND st_hour<'2003-07-07 17:20:00+01'; > [45] LOG: duration: 0.032860 sec They're not the same query from the planner's viewpoint: one has constants from which it can infer the number of rows to be fetched, the other has only parameter symbols. My guess is that the parameterized query is getting stuck with a seqscan plan, but it's hard to be sure without more data. regards, tom lane
On Tuesday 08 Jul 2003 4:33 pm, Tom Lane wrote: > Richard Huxton <dev@archonet.com> writes: > > So why am I getting substantially different times for identical queries > > (except for parameter substitution) > > > > [41] LOG: query: SELECT zzz2('2003-07-07 17:00:00+01','2003-07-07 > > 17:20:00+01'); > > [42-1] LOG: query: > > [42-2] DELETE FROM stats_telcon WHERE st_hour >= $1 AND st_hour < $2; > > ... > > [43] LOG: duration: 7.524765 sec > > > > [44] LOG: query: DELETE FROM stats_telcon WHERE st_hour>='2003-07-07 > > 17:00:00+01' AND st_hour<'2003-07-07 17:20:00+01'; > > [45] LOG: duration: 0.032860 sec > > They're not the same query from the planner's viewpoint: one has > constants from which it can infer the number of rows to be fetched, > the other has only parameter symbols. > > My guess is that the parameterized query is getting stuck with a seqscan > plan, but it's hard to be sure without more data. That was my guess, but I couldn't think of a way to get an EXPLAIN out of the function. I turned the plan debugging on for both but I'll need some free time to format it up and figure out what's happening. Can I ask why, since the plan is constructed at query-time the parameters aren't substitued *before* planning? -- Richard Huxton
Richard Huxton <dev@archonet.com> writes: > Can I ask why, since the plan is constructed at query-time the parameters > aren't substitued *before* planning? Because then the plan couldn't be re-used. A SQL function may be executed many times in a query, so the plan has to be reusable. (Or, if you prefer, we have query-level caching of SQL function plans.) regards, tom lane
On Tuesday 08 Jul 2003 7:48 pm, Tom Lane wrote: > Richard Huxton <dev@archonet.com> writes: > > Can I ask why, since the plan is constructed at query-time the parameters > > aren't substitued *before* planning? > > Because then the plan couldn't be re-used. A SQL function may be > executed many times in a query, so the plan has to be reusable. > (Or, if you prefer, we have query-level caching of SQL function plans.) Ah - I see. Fair enough Thanks Tom. -- Richard Huxton