Re: SQLFunctionCache and generic plans - Mailing list pgsql-hackers
From | Alexander Pyhalov |
---|---|
Subject | Re: SQLFunctionCache and generic plans |
Date | |
Msg-id | 448c63cf9b4f917c01802309b9c29eb9@postgrespro.ru Whole thread Raw |
In response to | Re: SQLFunctionCache and generic plans (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Hi. Tom Lane писал(а) 2025-02-27 23:40: > Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes: >> Now sql functions plans are actually saved. The most of it is a >> simplified version of plpgsql plan cache. Perhaps, I've missed >> something. > > A couple of thoughts about v6: > > * I don't think it's okay to just summarily do this: > > /* It's stale; unlink and delete */ > fcinfo->flinfo->fn_extra = NULL; > MemoryContextDelete(fcache->fcontext); > fcache = NULL; > > when fmgr_sql sees that the cache is stale. If we're > doing a nested call of a recursive SQL function, this'd be > cutting the legs out from under the outer recursion level. > plpgsql goes to some lengths to do reference-counting of > function cache entries, and I think you need the same here. > I've looked for original bug report 7881 ( https://www.postgresql.org/message-id/E1U5ytP-0006E3-KB%40wrigleys.postgresql.org ). It's interesting, but it seems that plan cache is not affected by it as when fcinfo xid mismatches we destroy fcinfo, not plan itself (cached plan survives and still can be used). I thought about another case - when recursive function is invalidated during its execution. But I haven't found such case. If function is modified during function execution in another backend, the original backend uses old snapshot during function execution and still sees the old function definition. Looked at the following case - create or replace function f (int) returns setof int language sql as $$ select i from t where t.j = $1 union all select f(i) from t where t.j = $1 $$; and changed function definition to create or replace function f (int) returns setof int language sql as $$ select i from t where t.j = $1 and i > 0 union all select f(i) from t where t.j = $1 $$; during execution of the function. As expected, backend still sees the old definition and uses cached plan. > * I don't like much of anything about 0004. It's messy and it > gives up all the benefit of plan caching in some pretty-common > cases (anywhere where the user was sloppy about what data type > is being returned). I wonder if we couldn't solve that with > more finesse by applying check_sql_fn_retval() to the query tree > after parse analysis, but before we hand it to plancache.c? > This is different from what happens now because we'd be applying > it before not after query rewrite, but I don't think that > query rewrite ever changes the targetlist results. Another > point is that the resultTargetList output might change subtly, > but I don't think we care there either: I believe we only examine > that output for its result data types and resjunk markers. > (This is nonobvious and inadequately documented, but for sure we > couldn't try to execute that tlist --- it's never passed through > the planner.) I'm also not fond of the last patch. So tried to fix it in a way you've suggested - we call check_sql_fn_retval() before creating cached plans. It fixes issue with revalidation happening after modifying query results. One test now changes behavior. What's happening is that after moving extension to another schema, cached plan is invalidated. But revalidation happens and rebuilds the plan. As we've saved analyzed parse tree, not the raw one, we refer to public.dep_req2() not by name, but by oid. Oid didn't change, so cached plan is rebuilt and used. Don't know, should we fix it and if should, how. > > * One diff that caught my eye was > > - if (!ActiveSnapshotSet() && > - plansource->raw_parse_tree && > - analyze_requires_snapshot(plansource->raw_parse_tree)) > + if (!ActiveSnapshotSet() && StmtPlanRequiresRevalidation(plansource)) > > Because StmtPlanRequiresRevalidation uses > stmt_requires_parse_analysis, this is basically throwing away the > distinction between stmt_requires_parse_analysis and > analyze_requires_snapshot. I do not think that's okay, for the > reasons explained in analyze_requires_snapshot. We should expend the > additional notational overhead to keep those things separate. > > * I'm also not thrilled by teaching StmtPlanRequiresRevalidation > how to do something equivalent to stmt_requires_parse_analysis for > Query trees. The entire point of the current division of labor is > for it *not* to know that, but keep the knowledge of the properties > of different statement types in parser/analyze.c. So it looks to me > like we need to add a function to parser/analyze.c that does this. > Not quite sure what to call it though. > querytree_requires_parse_analysis() would be a weird name, because > if it's a Query tree then it's already been through parse analysis. > Maybe querytree_requires_revalidation()? I've created querytree_requires_revalidation() in analyze.c and used it in both StmtPlanRequiresRevalidation() and BuildingPlanRequiresSnapshot(). Both are essentially the same, but this allows to preserve the distinction between stmt_requires_parse_analysis and analyze_requires_snapshot. I've checked old performance results: create or replace function fx2(int) returns int as $$ select 2 * $1; $$ language sql immutable; create or replace function fx3 (int) returns int immutable begin atomic select $1 + $1; end; create or replace function fx4(int) returns numeric as $$ select $1 + $1; $$ language sql immutable; postgres=# do $$ begin for i in 1..1000000 loop perform fx((random()*100)::int); end loop; end; $$; DO Time: 2896.729 ms (00:02.897) postgres=# do $$ begin for i in 1..1000000 loop perform fx((random()*100)::int); end loop; end; $$; DO Time: 2943.926 ms (00:02.944) postgres=# do $$ begin for i in 1..1000000 loop perform fx3((random()*100)::int); end loop; end; $$; DO Time: 2941.629 ms (00:02.942) postgres=# do $$ begin for i in 1..1000000 loop perform fx4((random()*100)::int); end loop; end; $$; DO Time: 2952.383 ms (00:02.952) Here we see the only distinction - fx4() is now also fast, as we use cached plan for it. -- Best regards, Alexander Pyhalov, Postgres Professional
Attachment
pgsql-hackers by date: