Thread: SQLFunctionCache and generic plans
Hello,

It has been brought to my attention that SQL functions always use generic plans. Take this function for example:

create or replace function test_plpgsql(p1 oid) returns text as $$
BEGIN
RETURN (SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1);
END;
$$ language plpgsql;

As expected, the PlanCache takes care of generating parameter-specific plans, and correctly prunes the redundant OR depending on whether we call the function with a NULL value or not:

ro=# select test_plpgsql(NULL);
LOG: duration: 0.030 ms plan:
Query Text: (SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1)
Result (cost=0.04..0.05 rows=1 width=64)
  InitPlan 1 (returns $0)
    -> Limit (cost=0.00..0.04 rows=1 width=64)
       -> Seq Scan on pg_class (cost=0.00..18.12 rows=412 width=64)
LOG: duration: 0.662 ms plan:
Query Text: select test_plpgsql(NULL);
Result (cost=0.00..0.26 rows=1 width=32)

ro=# select test_plpgsql(1);
LOG: duration: 0.075 ms plan:
Query Text: (SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1)
Result (cost=8.29..8.30 rows=1 width=64)
  InitPlan 1 (returns $0)
    -> Limit (cost=0.27..8.29 rows=1 width=64)
       -> Index Scan using pg_class_oid_index on pg_class (cost=0.27..8.29 rows=1 width=64)
          Index Cond: (oid = '1'::oid)
LOG: duration: 0.675 ms plan:
Query Text: select test_plpgsql(1);
Result (cost=0.00..0.26 rows=1 width=32)

But writing the same function in SQL:

create or replace function test_sql(p1 oid) returns text as $$
SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1
$$ language sql;

we end up with a generic plan:

ro=# select test_sql(1);
LOG: duration: 0.287 ms plan:
Query Text: SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1
Query Parameters: $1 = '1'
Limit (cost=0.00..6.39 rows=1 width=32)
  -> Seq Scan on pg_class (cost=0.00..19.16 rows=3 width=32)
     Filter: ((oid = $1) OR ($1 IS NULL))

This is due to the fact that SQL functions are planned once for the whole query using a specific SQLFunctionCache instead
of using the whole PlanCache machinery. The following comment can be found in functions.c, about the SQLFunctionCache:

 * Note that currently this has only the lifespan of the calling query.
 * Someday we should rewrite this code to use plancache.c to save parse/plan
 * results for longer than that.

I would be interested in working on this, primarily to avoid this problem of having generic query plans for SQL functions, but having a longer-lived cache as well would be nice to have. Is there any reason not to, or pitfalls we would like to avoid?

Best regards,

--
Ronan Dunklau
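For reference, the parameter-specific planning that plpgsql benefits from comes down to plancache.c's custom-vs-generic decision. The sketch below is a simplified, hypothetical Python model of that heuristic (the try-five-custom-plans rule is real; the cost values and the replanning-overhead constant here are purely illustrative, not the actual numbers from cached_plan_cost()):

```python
class PlanCacheEntry:
    """Simplified model of a plancache.c CachedPlanSource's plan choice."""

    CUSTOM_PLAN_TRIES = 5  # plancache.c always tries custom plans 5 times first

    def __init__(self, generic_cost):
        self.generic_cost = generic_cost
        self.num_custom_plans = 0
        self.total_custom_cost = 0.0

    def record_custom_plan(self, plan_cost, planning_overhead=1000.0):
        # Custom plans are charged for the cost of replanning on every call
        # (the real code does this in cached_plan_cost()).
        self.num_custom_plans += 1
        self.total_custom_cost += plan_cost + planning_overhead

    def choose_custom_plan(self):
        # Build custom plans for the first few executions to learn what
        # parameter-specific planning buys us.
        if self.num_custom_plans < self.CUSTOM_PLAN_TRIES:
            return True
        avg_custom = self.total_custom_cost / self.num_custom_plans
        # Keep using custom plans only while the generic plan is at least as
        # expensive as the average custom plan including replanning overhead.
        return self.generic_cost >= avg_custom
```

Under this model, a query like the test_plpgsql() example (where the pruned custom plan is vastly cheaper than the generic seq scan) keeps getting custom plans, while a query whose custom plans buy nothing settles on the cached generic plan after five executions.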
Ronan Dunklau <ronan.dunklau@aiven.io> writes:
> The following comment can be found in functions.c, about the SQLFunctionCache:
> * Note that currently this has only the lifespan of the calling query.
> * Someday we should rewrite this code to use plancache.c to save parse/plan
> * results for longer than that.

> I would be interested in working on this, primarily to avoid this problem of
> having generic query plans for SQL functions but maybe having a longer lived
> cache as well would be nice to have.
> Is there any reason not too, or pitfalls we would like to avoid ?

AFAIR it's just lack of round tuits. There would probably be some semantic side-effects, though if you pay attention you could likely make things better while you are at it. The existing behavior of parsing and planning all the statements at once is not very desirable --- for instance, it doesn't work to do

CREATE TABLE foo AS ...;
SELECT * FROM foo;

I think if we're going to nuke this code and start over, we should try to make that sort of case work.

regards, tom lane
Hi, Alexander!

On Tue, Sep 3, 2024 at 10:33 AM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
> Tom Lane wrote 2023-02-07 18:29:
> > Ronan Dunklau <ronan.dunklau@aiven.io> writes:
> >> The following comment can be found in functions.c, about the
> >> SQLFunctionCache:
> >> * Note that currently this has only the lifespan of the calling query.
> >> * Someday we should rewrite this code to use plancache.c to save parse/plan
> >> * results for longer than that.
> >> I would be interested in working on this, primarily to avoid this problem of
> >> having generic query plans for SQL functions but maybe having a longer lived
> >> cache as well would be nice to have.
> >> Is there any reason not too, or pitfalls we would like to avoid ?
> >
> > AFAIR it's just lack of round tuits. There would probably be some
> > semantic side-effects, though if you pay attention you could likely
> > make things better while you are at it. The existing behavior of
> > parsing and planning all the statements at once is not very desirable
> > --- for instance, it doesn't work to do
> > CREATE TABLE foo AS ...;
> > SELECT * FROM foo;
> > I think if we're going to nuke this code and start over, we should
> > try to make that sort of case work.
> >
> > regards, tom lane
>
> Hi.
>
> I've tried to make SQL functions use the CachedPlan machinery. The main goal
> was to allow SQL functions to use custom plans (the work was started from
> the question of why an SQL function is so slow compared to a plpgsql one).
> It turned out that the plpgsql function used a custom plan and eliminated
> scans of all irrelevant sections, but exec-time pruning didn't cope with
> pruning when a ScalarArrayOpExpr filters data using an int[] parameter.
>
> In the current prototype there are two restrictions. The first one is that
> the CachedPlan has the lifetime of a query - it's not saved for future use,
> as we don't have something like plpgsql's hashtable for long-lived function
> storage.
> Second - SQL language functions in sql_body form (with stored queryTree_list)
> are handled in the old way, as we currently lack tools to make cached plans
> from query trees.
>
> Currently this change solves the issue of inefficient plans for queries
> over partitioned tables. For example, a function like
>
> CREATE OR REPLACE FUNCTION public.test_get_records(ids integer[])
>  RETURNS SETOF test
>  LANGUAGE sql
> AS $function$
>  select *
>  from test
>  where id = any (ids)
> $function$;
>
> for the hash-distributed table test can perform pruning at plan time and
> can have a plan like
>
> Append (cost=0.00..51.88 rows=26 width=36)
>   -> Seq Scan on test_0 test_1 (cost=0.00..25.88 rows=13 width=36)
>      Filter: (id = ANY ('{1,2}'::integer[]))
>   -> Seq Scan on test_2 (cost=0.00..25.88 rows=13 width=36)
>      Filter: (id = ANY ('{1,2}'::integer[]))
>
> instead of
>
> Append (cost=0.00..155.54 rows=248 width=36)
>   -> Seq Scan on test_0 test_1 (cost=0.00..38.58 rows=62 width=36)
>      Filter: (id = ANY ($1))
>   -> Seq Scan on test_1 test_2 (cost=0.00..38.58 rows=62 width=36)
>      Filter: (id = ANY ($1))
>   -> Seq Scan on test_2 test_3 (cost=0.00..38.58 rows=62 width=36)
>      Filter: (id = ANY ($1))
>   -> Seq Scan on test_3 test_4 (cost=0.00..38.58 rows=62 width=36)
>      Filter: (id = ANY ($1))
>
> This patch definitely requires more work, and I share it to get some
> early feedback.
>
> What should we do with "pre-parsed" SQL functions (when prosrc is
> empty)? How should we create cached plans when we don't have raw
> parsetrees?
> Currently we can create cached plans without raw parsetrees, but this
> means that plan revalidation doesn't work, choose_custom_plan()
> always returns false and we get a generic plan. Perhaps, we need some form
> of GetCachedPlan(), which ignores raw_parse_tree?

I don't think you need a new form of GetCachedPlan(). Instead, it seems that StmtPlanRequiresRevalidation() should be revised.
As I got from the comments and the d8b2fcc9d4 commit message, the primary goal was to skip revalidation of utility statements. Skipping revalidation was a positive side effect, as long as we didn't support custom plans for them anyway. But as you're going to change this, StmtPlanRequiresRevalidation() needs to be revised.

I also think it's not necessary to implement a long-lived plan cache in the initial patch. The work could be split into two patches. The first could implement a query-lifetime plan cache. This is beneficial already by itself, as you've shown by example. The second could implement a long-lived plan cache.

I appreciate your work in this direction. I hope you got the feedback to go ahead and work on the remaining issues.

------
Regards,
Alexander Korotkov
Supabase
Hi
On Tue, Dec 31, 2024 at 4:36 PM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
Hi.
>> What should we do with "pre-parsed" SQL functions (when prosrc is
>> empty)? How should we create cached plans when we don't have raw
>> parsetrees?
>> Currently we can create cached plans without raw parsetrees, but this
>> means that plan revalidation doesn't work, choose_custom_plan()
>> always returns false and we get generic plan. Perhaps, we need some
>> form
>> of GetCachedPlan(), which ignores raw_parse_tree?
>
> I don't think you need a new form of GetCachedPlan(). Instead, it
> seems that StmtPlanRequiresRevalidation() should be revised. As I got
> from comments and the d8b2fcc9d4 commit message, the primary goal was
> to skip revalidation of utility statements. Skipping revalidation was
> a positive side effect, as long as we didn't support custom plans for
> them anyway. But as you're going to change this,
> StmtPlanRequiresRevalidation() needs to be revised.
>
Thanks for feedback.
I've modified StmtPlanRequiresRevalidation() so that it looks at the
queries' command type. Not sure if that's enough or if I have to recreate
something closer to the stmt_requires_parse_analysis() logic. I was
wondering if this could trigger plan revalidation in
RevalidateCachedQuery(). I suppose not - as we plan in the executor, we
shouldn't catch a userid change or see changes in related objects.
Revalidation would kill our plan, destroying resultDesc.

Also, while looking at this, I fixed the processing of INSTEAD OF rules
(which could lead to a NULL execution_state).
--
There are a lot of failures found by the tester.
Please, can you check it?
regards
Pavel
Best regards,
Alexander Pyhalov,
Postgres Professional
Hi
On Thu, Jan 30, 2025 at 9:50 AM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
Alexander Pyhalov wrote 2025-01-29 17:35:
> Tom Lane wrote 2025-01-17 21:27:
>> Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes:
>>> I've rebased patch on master. Tests pass here.
>>
>> The cfbot still doesn't like it; my guess is that you built without
>> --with-libxml and so didn't notice the effects on xml.out.
>
> Hi. Thank you for review.
>
> I've updated patch.
Sorry, I missed one local patch to fix memory bloat during replanning. Also
fixed a compiler warning.
Did you do some performance checks?
I tried some worst case
CREATE OR REPLACE FUNCTION fx(int)
RETURNS int AS $$
SELECT $1 + $1
$$ LANGUAGE SQL IMMUTABLE;
CREATE OR REPLACE FUNCTION fx2(int)
RETURNS int AS $$
SELECT $1 * 2
$$ LANGUAGE SQL IMMUTABLE;
do $$
begin
for i in 1..1000000 loop
perform fx(i); -- or fx2
end loop;
end;
$$;
DO
The patched version reduces the difference between executing fx and fx2, but the patched version is about 10% slower than unpatched.
The overhead of the plan cache looks significant for simple cases (and a lot of SQL functions are very simple).
Regards
Pavel
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Pavel Stehule <pavel.stehule@gmail.com> writes:
> Did you do some performance checks?

This is a good question to ask ...

> I tried some worst case
> CREATE OR REPLACE FUNCTION fx(int)
> RETURNS int AS $$
> SELECT $1 + $1
> $$ LANGUAGE SQL IMMUTABLE;

... but I don't think tests like this will give helpful answers. That function is simple enough to be inlined:

regression=# explain verbose select fx(f1) from int4_tbl;
                          QUERY PLAN
---------------------------------------------------------------
 Seq Scan on public.int4_tbl (cost=0.00..1.06 rows=5 width=4)
   Output: (f1 + f1)
(2 rows)

So functions.c shouldn't have any involvement at all in the actually-executed PERFORM expression, and whatever difference you measured must have been noise. (If the effect *is* real, we'd better find out why.)

You need to test with a non-inline-able function. Looking at the inlining conditions in inline_function(), one simple hack is to make the function return SETOF. That'll only exercise the returns-set path in functions.c though, so it'd be advisable to check other inline-blocking conditions too.

regards, tom lane
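A few of the inline-blocking conditions Tom alludes to can be sketched as a predicate. This is a hypothetical, partial Python model (the attribute names are invented; the real checklist lives in inline_function() in src/backend/optimizer/util/clauses.c and is considerably longer):

```python
def can_inline(fn):
    """Rough approximation of a few inline_function() blockers."""
    if fn["returns_set"]:        # RETURNS SETOF blocks scalar inlining
        return False
    if fn["security_definer"]:   # SECURITY DEFINER changes execution context
        return False
    if fn["has_set_config"]:     # a SET clause (proconfig) blocks inlining
        return False
    if fn["num_statements"] != 1:  # body must be a single simple SELECT
        return False
    return True
```

So, per Tom's suggestion, declaring the benchmark function RETURNS SETOF is the simplest way to force it through functions.c rather than the inliner.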
Hi
I did multiple benchmark runs, and it still looks like the proposed patch doesn't help and has significant overhead.
testcase:
create or replace function fx(int) returns int as $$ select $1 + $1; $$ language sql immutable;
create or replace function fx2(int) returns int as $$ select 2 * $1; $$ language sql immutable;
I tested
do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int); -- or fx2
end loop;
end;
$$;
Results (master, patched):
fx: 17067 ms, 22165 ms
fx2: 2234 ms, 2311 ms
The execution of dynamic SQL:

(2025-02-03 18:47:33) postgres=# do $$
begin
for i in 1..1000000 loop
execute 'select $1 + $1' using (random()*100)::int;
end loop;
end;
$$;
DO
Time: 13412.990 ms (00:13.413)
In the profiler I see significant parser overhead, so it looks like there is some extra overhead and the plan cache is not used.
Please, can somebody recheck my tests?
Regards
Pavel
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I did multiple benchmarking, and still looks so the proposed patch doesn't
> help and has significant overhead

Yeah, your fx() test case is clearly worse. For me,

HEAD:

regression=# do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int); -- or fx2
end loop;
end;
$$;
DO
Time: 5229.184 ms (00:05.229)

PATCH:

regression=# do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int); -- or fx2
end loop;
end;
$$;
DO
Time: 6934.413 ms (00:06.934)

Adding some debug printout shows me that BuildCachedPlan is called to construct a custom plan on every single execution, which is presumably because the patch doesn't make any attempt to carry plancache state across successive executions of the same query. If we were saving that state it would have soon switched to a generic plan and then won big. So, even though I thought we could leave that for later, it seems like maybe we have to have it before we'll have a committable patch.

There might be some residual inefficiency in there though. In the unpatched code we'd be calling pg_parse_query and pg_plan_query once per execution. You'd think that would cost more than BuildCachedPlan, which can skip the raw-parsing part.

Even more interesting, the patch gets slower yet if we use a new-style SQL function:

regression=# create or replace function fx3 (int) returns int immutable
regression-# begin atomic select $1 + $1; end;
CREATE FUNCTION
Time: 0.813 ms
regression=# do $$
begin
for i in 1..1000000 loop
perform fx3((random()*100)::int); -- or fx2
end loop;
end;
$$;
DO
Time: 8007.062 ms (00:08.007)

That makes no sense either, because with a new-style SQL function we should be skipping parse analysis as well. But wait: HEAD takes

Time: 6632.709 ms (00:06.633)

to do the same thing. So somehow the new-style SQL function stuff is very materially slower in this use-case, with or without this patch. I do not understand why.
Definitely some performance investigation needs to be done here. Even without cross-query plan caching, I don't see why the patch isn't better than it is. It ought to be at least competitive with the unpatched code. (I've not read the v5 patch yet, so I have no theories.)

regards, tom lane
I wrote:
> But wait: HEAD takes
> Time: 6632.709 ms (00:06.633)
> to do the same thing. So somehow the new-style SQL function
> stuff is very materially slower in this use-case, with or
> without this patch. I do not understand why.

"perf" tells me that in the fx3 test, a full third of the runtime is spent inside stringToNode(), with about three-quarters of that going into pg_strtok(). This makes no sense to me: we'll be reading the prosqlbody of fx3(), sure, but that's not enormously long (about 1200 bytes). And pg_strtok() doesn't look remarkably slow. There's no way this should be taking more time than raw parsing + parse analysis, even for such a trivial query as "select $1 + $1".

There's been some talk of getting rid of our existing nodetree storage format in favor of something more efficient. Maybe we should put a higher priority on getting that done. But anyway, that seems orthogonal to the current patch.

> Even without cross-query plan caching, I don't see why the
> patch isn't better than it is. It ought to be at least
> competitive with the unpatched code.

This remains true.

regards, tom lane
Hi
I can confirm a 60% speedup for execution of the functions fx and fx3 - both functions are very primitive, so for real code the benefit can be higher.
Unfortunately, there is about a 5% slowdown for inlined code, and for plain plpgsql code too.
I tested fx4
create or replace function fx4(int) returns int immutable as $$ begin return $1 + $1; end $$ language plpgsql;
and fx2
create or replace function fx2(int) returns int as $$ select 2 * $1; $$
language sql immutable;
and execution of the patched code is about 5% slower. It is strange that this patch has a negative impact on plpgsql execution.
Regards
Pavel
Pavel Stehule wrote 2025-02-26 22:34:
> hI
>
> I can confirm 60% speedup for execution of function fx and fx3 - both
> functions are very primitive, so for real code the benefit can be
> higher
>
> Unfortunately, there is about 5% slowdown for inlined code, and for
> just plpgsql code too.
>
> I tested fx4
>
> create or replace function fx4(int) returns int immutable as $$ begin
> return $1 + $1; end $$ language plpgsql;
>
> and fx2
>
> create or replace function fx2(int) returns int as $$ select 2 * $1; $$
> language sql immutable;
>
> and execution of patched code is about 5% slower. It is strange so
> this patch has a negative impact on plpgsql execution.
>
> Regards
>
> Pavel

Hi. I've tried to reproduce the slowdown and couldn't.

create or replace function fx4(int) returns int immutable as $$ begin
return $1 + $1; end $$ language plpgsql;

do $$
begin
for i in 1..5000000 loop
perform fx4((random()*100)::int); -- or fx2
end loop;
end;
$$;

OLD results:
Time: 8268.614 ms (00:08.269)
Time: 8178.836 ms (00:08.179)
Time: 8306.694 ms (00:08.307)

New (patched) results:
Time: 7743.945 ms (00:07.744)
Time: 7803.109 ms (00:07.803)
Time: 7736.735 ms (00:07.737)

Not sure why the new one is faster (perhaps some noise?) - looking at perf flamegraphs I don't see anything evident.

create or replace function fx2(int) returns int as $$ select 2 * $1; $$
language sql immutable;

do $$
begin
for i in 1..5000000 loop
perform fx2((random()*100)::int); -- or fx2
end loop;
end;
$$;

OLD results:
Time: 5346.471 ms (00:05.346)
Time: 5359.222 ms (00:05.359)
Time: 5316.747 ms (00:05.317)

New (patched) results:
Time: 5188.363 ms (00:05.188)
Time: 5225.322 ms (00:05.225)
Time: 5203.667 ms (00:05.204)

--
Best regards,
Alexander Pyhalov,
Postgres Professional
On Thu, Feb 27, 2025 at 1:25 PM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
Pavel Stehule wrote 2025-02-26 22:34:
> hI
>
> I can confirm 60% speedup for execution of function fx and fx3 - both
> functions are very primitive, so for real code the benefit can be
> higher
>
> Unfortunately, there is about 5% slowdown for inlined code, and for
> just plpgsql code too.
>
> I tested fx4
>
> create or replace function fx4(int) returns int immutable as $$ begin
> return $1 + $1; end $$ language plpgsql;
>
> and fx2
>
> create or replace function fx2(int) returns int as $$ select 2 * $1;
> $$
> language sql immutable;
>
> and execution of patched code is about 5% slower. It is strange so
> this patch has a negative impact on plpgsql execution.
>
> Regards
>
> Pavel
Hi. I've tried to reproduce the slowdown and couldn't.
create or replace function fx4(int) returns int immutable as $$ begin
return $1 + $1; end $$ language plpgsql;
do $$
begin
for i in 1..5000000 loop
perform fx4((random()*100)::int); -- or fx2
end loop;
end;
$$;
OLD results:
Time: 8268.614 ms (00:08.269)
Time: 8178.836 ms (00:08.179)
Time: 8306.694 ms (00:08.307)
New (patched) results:
Time: 7743.945 ms (00:07.744)
Time: 7803.109 ms (00:07.803)
Time: 7736.735 ms (00:07.737)
Not sure why the new one is faster (perhaps some noise?) - looking at perf
flamegraphs I don't see anything evident.
create or replace function fx2(int) returns int as $$ select 2 * $1; $$
language sql immutable;
do $$
begin
for i in 1..5000000 loop
perform fx2((random()*100)::int); -- or fx2
end loop;
end;
$$;
OLD results:
Time: 5346.471 ms (00:05.346)
Time: 5359.222 ms (00:05.359)
Time: 5316.747 ms (00:05.317)
New (patched) results:
Time: 5188.363 ms (00:05.188)
Time: 5225.322 ms (00:05.225)
Time: 5203.667 ms (00:05.204)
I'll try to get profiles.
Regards
Pavel
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Pavel Stehule <pavel.stehule@gmail.com> writes:
> čt 27. 2. 2025 v 13:25 odesílatel Alexander Pyhalov <
> a.pyhalov@postgrespro.ru> napsal:
>>> Unfortunately, there is about 5% slowdown for inlined code, and for
>>> just plpgsql code too.
>> Hi. I've tried to reproduce slowdown and couldn't.
> I'll try to get profiles.

I tried to reproduce this too. What I got on my usual development workstation (RHEL8/gcc 8.5.0 on x86_64) was:

fx2 example: v6 patch about 2.4% slower than HEAD
fx4 example: v6 patch about 7.3% slower than HEAD

I was quite concerned after that result, but then I tried it on another machine (macOS/clang 16.0.0 on Apple M1) and got:

fx2 example: v6 patch about 0.2% slower than HEAD
fx4 example: v6 patch about 0.7% faster than HEAD

(These are average-of-three-runs tests on --disable-cassert builds; I trust you guys were not doing performance tests on assert-enabled builds?)

So taken together, our results are all over the map, anywhere from 7% speedup to 7% slowdown. My usual rule of thumb is that you can see up to 2% variation in this kind of microbenchmark even when "nothing has changed", just due to random build details like whether critical loops cross a cacheline or not. 7% is pretty well above that threshold, but maybe it's just random build variation anyway.

Furthermore, since neither example involves functions.c at all (fx2 would be inlined, and fx4 isn't SQL-language), it's hard to see how the patch would directly affect either example unless it were adding overhead to plancache.c. And I don't see any changes there that would amount to meaningful overhead for the existing use-case with a raw parse tree.

So right at the moment I'm inclined to write this off as measurement noise. Perhaps it'd be worth checking a few more platforms, though.

regards, tom lane
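Tom's average-of-three-runs comparison and 2% noise rule of thumb can be encoded in a small helper. This is an illustrative sketch (the function name and interface are invented for this message, not any PostgreSQL tooling):

```python
def classify_delta(head_runs_ms, patched_runs_ms, noise_pct=2.0):
    """Compare average-of-N-runs timings; deltas within noise_pct are
    treated as build/layout noise rather than a real change."""
    head = sum(head_runs_ms) / len(head_runs_ms)
    patched = sum(patched_runs_ms) / len(patched_runs_ms)
    delta_pct = (patched - head) / head * 100.0
    if abs(delta_pct) <= noise_pct:
        verdict = "noise"
    elif delta_pct > 0:
        verdict = "slowdown"
    else:
        verdict = "speedup"
    return delta_pct, verdict
```

Feeding it Alexander's fx4 numbers from earlier in the thread (OLD: 8268.614/8178.836/8306.694 ms, patched: 7743.945/7803.109/7736.735 ms) yields a delta of about -6%, i.e. above the noise threshold, which is exactly why the conflicting results across machines are puzzling.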
On Thu, Feb 27, 2025 at 8:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
> > čt 27. 2. 2025 v 13:25 odesílatel Alexander Pyhalov <
> > a.pyhalov@postgrespro.ru> napsal:
> >>> Unfortunately, there is about 5% slowdown for inlined code, and for
> >>> just plpgsql code too.
> >> Hi. I've tried to reproduce slowdown and couldn't.
> > I'll try to get profiles.
> I tried to reproduce this too. What I got on my usual development
> workstation (RHEL8/gcc 8.5.0 on x86_64) was:
> fx2 example: v6 patch about 2.4% slower than HEAD
> fx4 example: v6 patch about 7.3% slower than HEAD
> I was quite concerned after that result, but then I tried it on
> another machine (macOS/clang 16.0.0 on Apple M1) and got:
> fx2 example: v6 patch about 0.2% slower than HEAD
> fx4 example: v6 patch about 0.7% faster than HEAD
> (These are average-of-three-runs tests on --disable-cassert
> builds; I trust you guys were not doing performance tests on
> assert-enabled builds?)
> So taken together, our results are all over the map, anywhere
> from 7% speedup to 7% slowdown. My usual rule of thumb is that

Where do you see 7% speedup? Few lines up you wrote 0.7% faster.

> you can see up to 2% variation in this kind of microbenchmark even
> when "nothing has changed", just due to random build details like
> whether critical loops cross a cacheline or not. 7% is pretty
> well above that threshold, but maybe it's just random build
> variation anyway.
> Furthermore, since neither example involves functions.c at all
> (fx2 would be inlined, and fx4 isn't SQL-language), it's hard
> to see how the patch would directly affect either example unless
> it were adding overhead to plancache.c. And I don't see any
> changes there that would amount to meaningful overhead for the
> existing use-case with a raw parse tree.
> So right at the moment I'm inclined to write this off as
> measurement noise. Perhaps it'd be worth checking a few
> more platforms, though.
> regards, tom lane
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes:
> Now sql functions plans are actually saved. The most of it is a
> simplified version of plpgsql plan cache. Perhaps, I've missed
> something.

A couple of thoughts about v6:

* I don't think it's okay to just summarily do this:

    /* It's stale; unlink and delete */
    fcinfo->flinfo->fn_extra = NULL;
    MemoryContextDelete(fcache->fcontext);
    fcache = NULL;

when fmgr_sql sees that the cache is stale. If we're doing a nested call of a recursive SQL function, this'd be cutting the legs out from under the outer recursion level. plpgsql goes to some lengths to do reference-counting of function cache entries, and I think you need the same here.

* I don't like much of anything about 0004. It's messy and it gives up all the benefit of plan caching in some pretty-common cases (anywhere where the user was sloppy about what data type is being returned). I wonder if we couldn't solve that with more finesse by applying check_sql_fn_retval() to the query tree after parse analysis, but before we hand it to plancache.c? This is different from what happens now because we'd be applying it before not after query rewrite, but I don't think that query rewrite ever changes the targetlist results. Another point is that the resultTargetList output might change subtly, but I don't think we care there either: I believe we only examine that output for its result data types and resjunk markers. (This is nonobvious and inadequately documented, but for sure we couldn't try to execute that tlist --- it's never passed through the planner.)

* One diff that caught my eye was

-    if (!ActiveSnapshotSet() &&
-        plansource->raw_parse_tree &&
-        analyze_requires_snapshot(plansource->raw_parse_tree))
+    if (!ActiveSnapshotSet() && StmtPlanRequiresRevalidation(plansource))

Because StmtPlanRequiresRevalidation uses stmt_requires_parse_analysis, this is basically throwing away the distinction between stmt_requires_parse_analysis and analyze_requires_snapshot.
I do not think that's okay, for the reasons explained in analyze_requires_snapshot. We should expend the additional notational overhead to keep those things separate.

* I'm also not thrilled by teaching StmtPlanRequiresRevalidation how to do something equivalent to stmt_requires_parse_analysis for Query trees. The entire point of the current division of labor is for it *not* to know that, but keep the knowledge of the properties of different statement types in parser/analyze.c. So it looks to me like we need to add a function to parser/analyze.c that does this. Not quite sure what to call it though. querytree_requires_parse_analysis() would be a weird name, because if it's a Query tree then it's already been through parse analysis. Maybe querytree_requires_revalidation()?

regards, tom lane
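The reference-counting lifecycle Tom describes - a stale entry must be unlinked but not freed while an outer recursion level still holds it - can be sketched as follows. This is a hypothetical Python model, not the plpgsql C implementation; all names are invented, and the `freed` flag stands in for MemoryContextDelete:

```python
class FunctionCacheEntry:
    """Model of a per-function cache entry with a pin count."""
    def __init__(self, plan):
        self.plan = plan
        self.use_count = 0   # number of in-flight calls pinning this entry
        self.stale = False
        self.freed = False   # stands in for MemoryContextDelete()

def acquire(cache, fn_oid, build_plan):
    # Look up (or build) the entry, then pin it for the duration of a call.
    entry = cache.get(fn_oid)
    if entry is None:
        entry = cache[fn_oid] = FunctionCacheEntry(build_plan())
    entry.use_count += 1
    return entry

def invalidate(cache, fn_oid):
    # Invalidation unlinks the entry immediately, but it may only be freed
    # once the last in-flight call (e.g. an outer recursion level) lets go.
    entry = cache.pop(fn_oid, None)
    if entry is not None:
        entry.stale = True
        if entry.use_count == 0:
            entry.freed = True

def release(entry):
    # Unpin at call exit; the last releaser of a stale entry frees it.
    entry.use_count -= 1
    if entry.stale and entry.use_count == 0:
        entry.freed = True
```

Contrast this with the quoted v6 code, which deletes fcache->fcontext unconditionally on staleness: in this model the inner recursion level's release leaves the entry alive for the outer one, and the rebuild happens only on the next fresh acquire.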
Pavel Stehule <pavel.stehule@gmail.com> writes:
> čt 27. 2. 2025 v 20:52 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
>> So taken together, our results are all over the map, anywhere
>> from 7% speedup to 7% slowdown. My usual rule of thumb is that

> Where do you see 7% speedup? Few lines up you wrote 0.7% faster.

Alexander got that on the fx4 case, according to his response a few messages ago [1]. It'd be good if someone else could reproduce that, because right now we have two "it's slower" results versus only one "it's faster".

regards, tom lane

[1] https://www.postgresql.org/message-id/e5724d1ba8398c7ff20ead1de73b4db4%40postgrespro.ru
Hi
On Thu, Feb 27, 2025 at 9:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
> > čt 27. 2. 2025 v 20:52 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
> >> So taken together, our results are all over the map, anywhere
> >> from 7% speedup to 7% slowdown. My usual rule of thumb is that
> > Where do you see 7% speedup? Few lines up you wrote 0.7% faster.
> Alexander got that on the fx4 case, according to his response a
> few messages ago [1]. It'd be good if someone else could reproduce
> that, because right now we have two "it's slower" results versus
> only one "it's faster".
ok
here is a profile from master
6.98% postgres postgres [.] hash_bytes
6.30% postgres postgres [.] palloc0
3.57% postgres postgres [.] SearchCatCacheInternal
3.29% postgres postgres [.] AllocSetAlloc
2.65% postgres plpgsql.so [.] exec_stmts
2.55% postgres postgres [.] expression_tree_walker_impl
2.34% postgres postgres [.] _SPI_execute_plan
2.13% postgres postgres [.] CheckExprStillValid
2.02% postgres postgres [.] fmgr_info_cxt_security
1.89% postgres postgres [.] ExecInitFunc
1.51% postgres postgres [.] ExecInterpExpr
1.48% postgres postgres [.] ResourceOwnerForget
1.44% postgres postgres [.] AllocSetReset
1.35% postgres postgres [.] MemoryContextCreate
1.30% postgres plpgsql.so [.] plpgsql_exec_function
1.29% postgres libc.so.6 [.] __memcmp_sse2
1.24% postgres postgres [.] MemoryContextDelete
1.13% postgres postgres [.] check_stack_depth
1.11% postgres postgres [.] AllocSetContextCreateInternal
1.09% postgres postgres [.] resolve_polymorphic_argtypes
1.08% postgres postgres [.] hash_search_with_hash_value
1.07% postgres postgres [.] standard_ExecutorStart
1.07% postgres postgres [.] ExprEvalPushStep
1.04% postgres postgres [.] ExecInitExprRec
0.95% postgres plpgsql.so [.] plpgsql_estate_setup
0.91% postgres postgres [.] ExecReadyInterpretedExp
and from patched
7.08% postgres postgres [.] hash_bytes
6.25% postgres postgres [.] palloc0
3.52% postgres postgres [.] SearchCatCacheInternal
3.30% postgres postgres [.] AllocSetAlloc
2.39% postgres postgres [.] expression_tree_walker_impl
2.37% postgres plpgsql.so [.] exec_stmts
2.15% postgres postgres [.] _SPI_execute_plan
2.10% postgres postgres [.] CheckExprStillValid
1.94% postgres postgres [.] fmgr_info_cxt_security
1.71% postgres postgres [.] ExecInitFunc
1.41% postgres postgres [.] AllocSetReset
1.40% postgres postgres [.] ExecInterpExpr
1.38% postgres postgres [.] ExprEvalPushStep
1.34% postgres postgres [.] ResourceOwnerForget
1.31% postgres postgres [.] MemoryContextDelete
1.24% postgres libc.so.6 [.] __memcmp_sse2
1.21% postgres postgres [.] MemoryContextCreate
1.18% postgres postgres [.] AllocSetContextCreateInternal
1.17% postgres postgres [.] hash_search_with_hash_value
1.13% postgres postgres [.] resolve_polymorphic_argtypes
1.13% postgres plpgsql.so [.] plpgsql_exec_function
1.03% postgres postgres [.] standard_ExecutorStart
0.98% postgres postgres [.] ExecInitExprRec
0.96% postgres postgres [.] check_stack_depth
It looks like there is only one significant difference:
ExprEvalPushStep: 1.07% vs 1.38%
Regards
Pavel
compiled without assertions with GCC 15 at -O2
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
stepping : 7
microcode : 0x2f
cpu MHz : 2691.102
cache size : 6144 KB
regards, tom lane
[1] https://www.postgresql.org/message-id/e5724d1ba8398c7ff20ead1de73b4db4%40postgrespro.ru
Hi
Fri 28. 2. 2025 at 7:29, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi
Thu 27. 2. 2025 at 21:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> Thu 27. 2. 2025 at 20:52, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So taken together, our results are all over the map, anywhere
>> from 7% speedup to 7% slowdown. My usual rule of thumb is that
> Where do you see 7% speedup? Few lines up you wrote 0.7% faster.
Alexander got that on the fx4 case, according to his response a
few messages ago [1]. It'd be good if someone else could reproduce
that, because right now we have two "it's slower" results versus
only one "it's faster".okhere is a profile from master6.98% postgres postgres [.] hash_bytes
6.30% postgres postgres [.] palloc0
3.57% postgres postgres [.] SearchCatCacheInternal
3.29% postgres postgres [.] AllocSetAlloc
2.65% postgres plpgsql.so [.] exec_stmts
2.55% postgres postgres [.] expression_tree_walker_impl
2.34% postgres postgres [.] _SPI_execute_plan
2.13% postgres postgres [.] CheckExprStillValid
2.02% postgres postgres [.] fmgr_info_cxt_security
1.89% postgres postgres [.] ExecInitFunc
1.51% postgres postgres [.] ExecInterpExpr
1.48% postgres postgres [.] ResourceOwnerForget
1.44% postgres postgres [.] AllocSetReset
1.35% postgres postgres [.] MemoryContextCreate
1.30% postgres plpgsql.so [.] plpgsql_exec_function
1.29% postgres libc.so.6 [.] __memcmp_sse2
1.24% postgres postgres [.] MemoryContextDelete
1.13% postgres postgres [.] check_stack_depth
1.11% postgres postgres [.] AllocSetContextCreateInternal
1.09% postgres postgres [.] resolve_polymorphic_argtypes
1.08% postgres postgres [.] hash_search_with_hash_value
1.07% postgres postgres [.] standard_ExecutorStart
1.07% postgres postgres [.] ExprEvalPushStep
1.04% postgres postgres [.] ExecInitExprRec
0.95% postgres plpgsql.so [.] plpgsql_estate_setup
0.91% postgres postgres [.] ExecReadyInterpretedExpr
and from patched
7.08% postgres postgres [.] hash_bytes
6.25% postgres postgres [.] palloc0
3.52% postgres postgres [.] SearchCatCacheInternal
3.30% postgres postgres [.] AllocSetAlloc
2.39% postgres postgres [.] expression_tree_walker_impl
2.37% postgres plpgsql.so [.] exec_stmts
2.15% postgres postgres [.] _SPI_execute_plan
2.10% postgres postgres [.] CheckExprStillValid
1.94% postgres postgres [.] fmgr_info_cxt_security
1.71% postgres postgres [.] ExecInitFunc
1.41% postgres postgres [.] AllocSetReset
1.40% postgres postgres [.] ExecInterpExpr
1.38% postgres postgres [.] ExprEvalPushStep
1.34% postgres postgres [.] ResourceOwnerForget
1.31% postgres postgres [.] MemoryContextDelete
1.24% postgres libc.so.6 [.] __memcmp_sse2
1.21% postgres postgres [.] MemoryContextCreate
1.18% postgres postgres [.] AllocSetContextCreateInternal
1.17% postgres postgres [.] hash_search_with_hash_value
1.13% postgres postgres [.] resolve_polymorphic_argtypes
1.13% postgres plpgsql.so [.] plpgsql_exec_function
1.03% postgres postgres [.] standard_ExecutorStart
0.98% postgres postgres [.] ExecInitExprRec
0.96% postgres postgres [.] check_stack_depth
It looks like there is only one significant difference:
ExprEvalPushStep: 1.07% vs 1.38%
Regards
Pavel
compiled without assertions with GCC 15 at -O2
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2630QM CPU @ 2.00GHz
stepping : 7
microcode : 0x2f
cpu MHz : 2691.102
cache size : 6144 KB
I tested the patches on another notebook with more recent cpu
vendor_id : GenuineIntel
cpu family : 6
model : 154
model name : 12th Gen Intel(R) Core(TM) i7-12700H
stepping : 3
microcode : 0x436
cpu MHz : 400.000
cache size : 24576 KB
And the differences are smaller - about 3%.
Regards
Pavel
regards, tom lane
[1] https://www.postgresql.org/message-id/e5724d1ba8398c7ff20ead1de73b4db4%40postgrespro.ru
Hi
Thu 6. 3. 2025 at 9:57, Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
Hi.
Tom Lane wrote 2025-02-27 23:40:
> Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes:
>> Now sql functions plans are actually saved. The most of it is a
>> simplified version of plpgsql plan cache. Perhaps, I've missed
>> something.
>
> A couple of thoughts about v6:
>
> * I don't think it's okay to just summarily do this:
>
> /* It's stale; unlink and delete */
> fcinfo->flinfo->fn_extra = NULL;
> MemoryContextDelete(fcache->fcontext);
> fcache = NULL;
>
> when fmgr_sql sees that the cache is stale. If we're
> doing a nested call of a recursive SQL function, this'd be
> cutting the legs out from under the outer recursion level.
> plpgsql goes to some lengths to do reference-counting of
> function cache entries, and I think you need the same here.
>
I've looked at the original bug report 7881 (
https://www.postgresql.org/message-id/E1U5ytP-0006E3-KB%40wrigleys.postgresql.org
).
It's interesting, but it seems that the plan cache is not affected by it:
when the fcinfo xid mismatches, we destroy the fcinfo, not the plan itself
(the cached plan survives and can still be used).
I thought about another case - when a recursive function is invalidated
during its execution. But I haven't found such a case. If a function is
modified during its execution in another backend, the original
backend uses the old snapshot during function execution and still sees
the old function definition. I looked at the following case -
create or replace function f (int) returns setof int language sql as $$
select i from t where t.j = $1
union all
select f(i) from t where t.j = $1
$$;
and changed the function definition to
create or replace function f (int) returns setof int language sql as $$
select i from t where t.j = $1 and i > 0
union all
select f(i) from t where t.j = $1
$$;
during execution of the function. As expected, the backend still sees
the old definition and uses the cached plan.
> * I don't like much of anything about 0004. It's messy and it
> gives up all the benefit of plan caching in some pretty-common
> cases (anywhere where the user was sloppy about what data type
> is being returned). I wonder if we couldn't solve that with
> more finesse by applying check_sql_fn_retval() to the query tree
> after parse analysis, but before we hand it to plancache.c?
> This is different from what happens now because we'd be applying
> it before not after query rewrite, but I don't think that
> query rewrite ever changes the targetlist results. Another
> point is that the resultTargetList output might change subtly,
> but I don't think we care there either: I believe we only examine
> that output for its result data types and resjunk markers.
> (This is nonobvious and inadequately documented, but for sure we
> couldn't try to execute that tlist --- it's never passed through
> the planner.)
I'm also not fond of the last patch. So I tried to fix it in the way you've
suggested - we call check_sql_fn_retval() before creating cached plans.
It fixes the issue with revalidation happening after modifying query
results.
One test now changes behavior. What's happening is that after moving the
extension to another schema, the cached plan is invalidated. But
revalidation
happens and rebuilds the plan. As we've saved the analyzed parse tree, not
the raw one, we refer to public.dep_req2() not by name, but by oid. The oid
didn't change, so the cached plan is rebuilt and used. I don't know whether
we should fix it and, if so, how.
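The scenario can be condensed into a self-contained sketch (the schema and function names below are made up for illustration, except dep_req2, which echoes the test_extensions naming; the real test goes through ALTER EXTENSION ... SET SCHEMA rather than ALTER FUNCTION):

```sql
-- Hypothetical reduction of the changed test: a SQL function whose
-- cached plan references a function that later moves to another schema.
CREATE SCHEMA other_schema;
CREATE FUNCTION public.dep_req2() RETURNS text
    LANGUAGE sql AS $$ SELECT 'req2' $$;
CREATE FUNCTION public.dep_req3b() RETURNS text
    LANGUAGE sql AS $$ SELECT public.dep_req2() || ' req3b' $$;

SELECT dep_req3b();   -- plans and caches the function body

-- Moving the callee invalidates the cached plan.  On revalidation the
-- saved analyzed parse tree refers to dep_req2 by OID, not by name, so
-- the rebuilt plan still resolves it even though "public.dep_req2" no
-- longer exists under that name.
ALTER FUNCTION public.dep_req2() SET SCHEMA other_schema;
SELECT dep_req3b();
```

Whether that last call should instead fail by name lookup, as it did when revalidation re-ran parse analysis on the raw parse tree, is exactly the open question.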
>
> * One diff that caught my eye was
>
> - if (!ActiveSnapshotSet() &&
> - plansource->raw_parse_tree &&
> - analyze_requires_snapshot(plansource->raw_parse_tree))
> + if (!ActiveSnapshotSet() && StmtPlanRequiresRevalidation(plansource))
>
> Because StmtPlanRequiresRevalidation uses
> stmt_requires_parse_analysis, this is basically throwing away the
> distinction between stmt_requires_parse_analysis and
> analyze_requires_snapshot. I do not think that's okay, for the
> reasons explained in analyze_requires_snapshot. We should expend the
> additional notational overhead to keep those things separate.
>
> * I'm also not thrilled by teaching StmtPlanRequiresRevalidation
> how to do something equivalent to stmt_requires_parse_analysis for
> Query trees. The entire point of the current division of labor is
> for it *not* to know that, but keep the knowledge of the properties
> of different statement types in parser/analyze.c. So it looks to me
> like we need to add a function to parser/analyze.c that does this.
> Not quite sure what to call it though.
> querytree_requires_parse_analysis() would be a weird name, because
> if it's a Query tree then it's already been through parse analysis.
> Maybe querytree_requires_revalidation()?
I've created querytree_requires_revalidation() in analyze.c and used it
in both
StmtPlanRequiresRevalidation() and BuildingPlanRequiresSnapshot(). Both
are essentially the same,
but this allows us to preserve the distinction between
stmt_requires_parse_analysis and
analyze_requires_snapshot.
I've checked old performance results:
create or replace function fx2(int) returns int as $$ select 2 * $1; $$
language sql immutable;
create or replace function fx3 (int) returns int immutable begin atomic
select $1 + $1; end;
create or replace function fx4(int) returns numeric as $$ select $1 +
$1; $$ language sql immutable;
postgres=# do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int);
end loop;
end;
$$;
DO
Time: 2896.729 ms (00:02.897)
postgres=# do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int);
end loop;
end;
$$;
DO
Time: 2943.926 ms (00:02.944)
postgres=# do $$ begin
for i in 1..1000000 loop
perform fx3((random()*100)::int);
end loop;
end;
$$;
DO
Time: 2941.629 ms (00:02.942)
postgres=# do $$
begin
for i in 1..1000000 loop
perform fx4((random()*100)::int);
end loop;
end;
$$;
DO
Time: 2952.383 ms (00:02.952)
Here we see the only distinction - fx4() is now also fast, as we use the
cached plan for it.
I checked these tests with
vendor_id : GenuineIntel
cpu family : 6
model : 154
model name : 12th Gen Intel(R) Core(TM) i7-12700H
stepping : 3
microcode : 0x436
cpu MHz : 400.000
cache size : 24576 KB
CREATE OR REPLACE FUNCTION fx(int)
RETURNS int AS $$
SELECT $1 + $1
$$ LANGUAGE SQL IMMUTABLE;
CREATE OR REPLACE FUNCTION fx2(int)
RETURNS int AS $$
SELECT $1 * 2
$$ LANGUAGE SQL IMMUTABLE;
create or replace function fx3 (int) returns int immutable begin atomic
select $1 + $1; end;
create or replace function fx4(int) returns numeric as $$ select $1 +
$1; $$ language sql immutable;
create or replace function fx5(int) returns int
as $$
begin
return $1 + $1;
end
$$ language plpgsql immutable;
create or replace function fx6(int) returns int
as $$
begin
return $1 + $1;
end
$$ language plpgsql volatile;
postgres=# do $$
begin
for i in 1..10000000 loop
perform fx6((random()*100)::int); -- or fx2
end loop;
end;
$$;
My results are
master vs patched: fx, fx2, fx3, fx4, fx5, fx6
36233, 7297,45693,40794, 11020,10897
19446, 7315,19777,20547, 11144,10954
I still see a small slowdown in today's fast cases, but it will probably not be very important - over 10M operations it is about 50ms,
so in the real world other factors will be stronger. The speedup in the slow cases is about 50%.
Regards
Pavel
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Hi
I am checking the last patches.
Maybe an interesting change is the change of the error message context:
QUERY: SELECT public.dep_req2() || ' req3b'.
-CONTEXT: SQL function "dep_req3b" during startup
+CONTEXT: SQL function "dep_req3b" statement 1
almost all SQL functions have just one statement, so showing the number of the statement looks useless
(until now, I had not seen a multi-statement SQL function),
and we lost the timing info "during startup". Maybe the error message can be enhanced to be more like plpgsql:
instead of statement numbers, the lines or positions could be displayed.
Changing the context text can be done in a separate patch - and for this moment, we
can use the old behaviour.
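For context, the "statement N" wording only becomes interesting in the rare multi-statement case; a contrived sketch (not from the regression suite) would be:

```sql
-- A SQL function containing two statements; only the last one's result
-- is returned.  A CONTEXT line like "statement 2" would point at the
-- SELECT, whereas "during startup" carried no positional information.
CREATE TABLE fn_log (note text);

CREATE FUNCTION log_and_lookup(p1 oid) RETURNS text
    LANGUAGE sql AS $$
    INSERT INTO fn_log VALUES ('lookup of ' || p1);
    SELECT relname FROM pg_class WHERE oid = p1;
$$;
```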
Regards
Pavel
Pavel Stehule <pavel.stehule@gmail.com> writes:
> Maybe interesting change is the change of error message context

> QUERY: SELECT public.dep_req2() || ' req3b'.
> -CONTEXT: SQL function "dep_req3b" during startup
> +CONTEXT: SQL function "dep_req3b" statement 1

I'm not hugely excited about that given that it's just happening in
one case. It might be useful to understand exactly why it's changing,
but I doubt it's something we need to "fix".

regards, tom lane
Tom Lane wrote 2025-03-13 21:29:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> Maybe interesting change is the change of error message context
>
>> QUERY: SELECT public.dep_req2() || ' req3b'.
>> -CONTEXT: SQL function "dep_req3b" during startup
>> +CONTEXT: SQL function "dep_req3b" statement 1
>
> I'm not hugely excited about that given that it's just happening in
> one case. It might be useful to understand exactly why it's changing,
> but I doubt it's something we need to "fix".
>
> regards, tom lane

Hi. What happens here is that dep_req3b() was already cached and planned. But when another extension, test_ext_req_schema2, is moved to another schema, the plan is invalidated. Note that the function is already planned, and the error happens not "during startup" but when we execute the first cached plan.

Now, when we are executing the first execution_state, init_execution_state() is called. RevalidateCachedQuery() tries to rewrite the query and gets an error while fcache->planning_stmt_number is 1. So it correctly reports that it's the first statement. Earlier this error was also raised in init_execution_state() (in pg_analyze_and_rewrite_withcb()), but it was considered a startup error (as fcache->func_state doesn't exist when the error is thrown).

-- Best regards, Alexander Pyhalov, Postgres Professional
I spent some time today going through the actual code in this patch. I realized that there's no longer any point in 0001: the later patches don't move or repeatedly call that bit of code, so it can be left as-is.

What I think we could stand to split out, though, is the changes in the plancache support. The new 0001 attached is just the plancache and analyze.c changes. That could be committed separately, although of course there's little point in pushing it till we're happy with the rest.

In general, this patch series is paying far too little attention to updating existing comments that it obsoletes or adding new ones explaining what's going on. For example, the introductory comment for struct SQLFunctionCache still says

 * Note that currently this has only the lifespan of the calling query.
 * Someday we should rewrite this code to use plancache.c to save parse/plan
 * results for longer than that.

and I wonder how much of the para after that is still accurate either. The new structs aren't adequately documented either IMO. We now have about three different structs that have something to do with caches by their names, but the reader is left to guess how they fit together. Another example is that the header comment for init_execution_state still describes an argument list it hasn't got anymore. I tried to clean up the comment situation in the plancache in 0001, but I've not done much of anything to functions.c.

I'm fairly confused why 0002 and 0003 are separate patches, and the commit messages for them do nothing to clarify that. It seems like you're expecting reviewers to review a very transitory state of affairs in 0002, and it's not clear why. Maybe the code is fine and you just need to explain the change sequence a bit more in the commit messages. 0002 could stand to explain the point of the new test cases, too, especially since one of them seems to be demonstrating the fixing of a pre-existing bug.
Something is very wrong in 0004: it should not be breaking that test case in test_extensions. It seems to me we should already have the necessary infrastructure for that, in that the plan ought to have a PlanInvalItem referencing public.dep_req2(), and the ALTER SET SCHEMA that gets done on that function should result in an invalidation. So it looks to me like that patch has somehow rearranged things so we miss an invalidation. I've not tried to figure out why. I'm also sad that 0004 doesn't appear to include any test cases showing it doing something right: without that, why do it at all?

regards, tom lane

From f975519041cbd278005f1d4035d4fe53a80cb665 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri, 14 Mar 2025 15:54:56 -0400
Subject: [PATCH v8 1/4] Support cached plans that work from a parse-analyzed
 Query.

Up to now, plancache.c dealt only with raw parse trees as the starting
point for a cached plan.  However, we'd like to use this infrastructure
for SQL functions, and in the case of a new-style SQL function we'll
only have the stored querytree, which corresponds to an
analyzed-but-not-rewritten Query.  Fortunately, we can make plancache.c
handle that scenario with only minor modifications; the biggest change
is in RevalidateCachedQuery() where we will need to apply only
pg_rewrite_query not pg_analyze_and_rewrite.

This patch just installs the infrastructure; there's no caller as yet.
Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop
---
 src/backend/parser/analyze.c        |  39 +++++++
 src/backend/utils/cache/plancache.c | 158 +++++++++++++++++++++-------
 src/include/parser/analyze.h        |   1 +
 src/include/utils/plancache.h       |  23 +++-
 4 files changed, 179 insertions(+), 42 deletions(-)

diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 76f58b3aca3..1f4d6adda52 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -591,6 +591,45 @@ analyze_requires_snapshot(RawStmt *parseTree)
     return stmt_requires_parse_analysis(parseTree);
 }
 
+/*
+ * query_requires_rewrite_plan()
+ *      Returns true if rewriting or planning is non-trivial for this Query.
+ *
+ * This is much like stmt_requires_parse_analysis(), but applies one step
+ * further down the pipeline.
+ *
+ * We do not provide an equivalent of analyze_requires_snapshot(): callers
+ * can assume that any rewriting or planning activity needs a snapshot.
+ */
+bool
+query_requires_rewrite_plan(Query *query)
+{
+    bool        result;
+
+    if (query->commandType != CMD_UTILITY)
+    {
+        /* All optimizable statements require rewriting/planning */
+        result = true;
+    }
+    else
+    {
+        /* This list should match stmt_requires_parse_analysis() */
+        switch (nodeTag(query->utilityStmt))
+        {
+            case T_DeclareCursorStmt:
+            case T_ExplainStmt:
+            case T_CreateTableAsStmt:
+            case T_CallStmt:
+                result = true;
+                break;
+            default:
+                result = false;
+                break;
+        }
+    }
+    return result;
+}
+
 /*
  * transformDeleteStmt -
  *    transforms a Delete Statement

diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 6c2979d5c82..5983927a4c2 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -14,7 +14,7 @@
  * Cache invalidation is driven off sinval events.
  * Any CachedPlanSource
  * that matches the event is marked invalid, as is its generic CachedPlan
  * if it has one.  When (and if) the next demand for a cached plan occurs,
- * parse analysis and rewrite is repeated to build a new valid query tree,
+ * parse analysis and/or rewrite is repeated to build a new valid query tree,
  * and then planning is performed as normal.  We also force re-analysis and
  * re-planning if the active search_path is different from the previous time
  * or, if RLS is involved, if the user changes or the RLS environment changes.
@@ -63,6 +63,7 @@
 #include "nodes/nodeFuncs.h"
 #include "optimizer/optimizer.h"
 #include "parser/analyze.h"
+#include "rewrite/rewriteHandler.h"
 #include "storage/lmgr.h"
 #include "tcop/pquery.h"
 #include "tcop/utility.h"
@@ -74,18 +75,6 @@
 #include "utils/syscache.h"
 
-/*
- * We must skip "overhead" operations that involve database access when the
- * cached plan's subject statement is a transaction control command or one
- * that requires a snapshot not to be set yet (such as SET or LOCK).  More
- * generally, statements that do not require parse analysis/rewrite/plan
- * activity never need to be revalidated, so we can treat them all like that.
- * For the convenience of postgres.c, treat empty statements that way too.
- */
-#define StmtPlanRequiresRevalidation(plansource) \
-    ((plansource)->raw_parse_tree != NULL && \
-     stmt_requires_parse_analysis((plansource)->raw_parse_tree))
-
 /*
  * This is the head of the backend's list of "saved" CachedPlanSources (i.e.,
  * those that are in long-lived storage and are examined for sinval events).
@@ -100,6 +89,8 @@ static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list);
 static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_list);
 
 static void ReleaseGenericPlan(CachedPlanSource *plansource);
+static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource);
+static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource);
 static List *RevalidateCachedQuery(CachedPlanSource *plansource,
                                    QueryEnvironment *queryEnv,
                                    bool release_generic);
@@ -166,7 +157,7 @@ InitPlanCache(void)
 }
 
 /*
- * CreateCachedPlan: initially create a plan cache entry.
+ * CreateCachedPlan: initially create a plan cache entry for a raw parse tree.
  *
  * Creation of a cached plan is divided into two steps, CreateCachedPlan and
  * CompleteCachedPlan.  CreateCachedPlan should be called after running the
@@ -220,6 +211,7 @@ CreateCachedPlan(RawStmt *raw_parse_tree,
     plansource = (CachedPlanSource *) palloc0(sizeof(CachedPlanSource));
     plansource->magic = CACHEDPLANSOURCE_MAGIC;
     plansource->raw_parse_tree = copyObject(raw_parse_tree);
+    plansource->analyzed_parse_tree = NULL;
     plansource->query_string = pstrdup(query_string);
     MemoryContextSetIdentifier(source_context, plansource->query_string);
     plansource->commandTag = commandTag;
@@ -255,6 +247,34 @@ CreateCachedPlan(RawStmt *raw_parse_tree,
     return plansource;
 }
 
+/*
+ * CreateCachedPlanForQuery: initially create a plan cache entry for a Query.
+ *
+ * This is used in the same way as CreateCachedPlan, except that the source
+ * query has already been through parse analysis, and the plancache will never
+ * try to re-do that step.
+ *
+ * Currently this is used only for new-style SQL functions, where we have a
+ * Query from the function's prosqlbody, but no source text.  The query_string
+ * is typically empty, but is required anyway.
+ */
+CachedPlanSource *
+CreateCachedPlanForQuery(Query *analyzed_parse_tree,
+                         const char *query_string,
+                         CommandTag commandTag)
+{
+    CachedPlanSource *plansource;
+    MemoryContext oldcxt;
+
+    /* Rather than duplicating CreateCachedPlan, just do this: */
+    plansource = CreateCachedPlan(NULL, query_string, commandTag);
+    oldcxt = MemoryContextSwitchTo(plansource->context);
+    plansource->analyzed_parse_tree = copyObject(analyzed_parse_tree);
+    MemoryContextSwitchTo(oldcxt);
+
+    return plansource;
+}
+
 /*
  * CreateOneShotCachedPlan: initially create a one-shot plan cache entry.
  *
@@ -289,6 +309,7 @@ CreateOneShotCachedPlan(RawStmt *raw_parse_tree,
     plansource = (CachedPlanSource *) palloc0(sizeof(CachedPlanSource));
     plansource->magic = CACHEDPLANSOURCE_MAGIC;
     plansource->raw_parse_tree = raw_parse_tree;
+    plansource->analyzed_parse_tree = NULL;
     plansource->query_string = query_string;
     plansource->commandTag = commandTag;
     plansource->param_types = NULL;
@@ -566,6 +587,42 @@ ReleaseGenericPlan(CachedPlanSource *plansource)
     }
 }
 
+/*
+ * We must skip "overhead" operations that involve database access when the
+ * cached plan's subject statement is a transaction control command or one
+ * that requires a snapshot not to be set yet (such as SET or LOCK).  More
+ * generally, statements that do not require parse analysis/rewrite/plan
+ * activity never need to be revalidated, so we can treat them all like that.
+ * For the convenience of postgres.c, treat empty statements that way too.
+ */
+static bool
+StmtPlanRequiresRevalidation(CachedPlanSource *plansource)
+{
+    if (plansource->raw_parse_tree != NULL)
+        return stmt_requires_parse_analysis(plansource->raw_parse_tree);
+    else if (plansource->analyzed_parse_tree != NULL)
+        return query_requires_rewrite_plan(plansource->analyzed_parse_tree);
+    /* empty query never needs revalidation */
+    return false;
+}
+
+/*
+ * Determine if creating a plan for this CachedPlanSource requires a snapshot.
+ * In fact this function matches StmtPlanRequiresRevalidation(), but we want + * to preserve the distinction between stmt_requires_parse_analysis() and + * analyze_requires_snapshot(). + */ +static bool +BuildingPlanRequiresSnapshot(CachedPlanSource *plansource) +{ + if (plansource->raw_parse_tree != NULL) + return analyze_requires_snapshot(plansource->raw_parse_tree); + else if (plansource->analyzed_parse_tree != NULL) + return query_requires_rewrite_plan(plansource->analyzed_parse_tree); + /* empty query never needs a snapshot */ + return false; +} + /* * RevalidateCachedQuery: ensure validity of analyzed-and-rewritten query tree. * @@ -592,7 +649,6 @@ RevalidateCachedQuery(CachedPlanSource *plansource, bool release_generic) { bool snapshot_set; - RawStmt *rawtree; List *tlist; /* transient query-tree list */ List *qlist; /* permanent query-tree list */ TupleDesc resultDesc; @@ -615,7 +671,10 @@ RevalidateCachedQuery(CachedPlanSource *plansource, /* * If the query is currently valid, we should have a saved search_path --- * check to see if that matches the current environment. If not, we want - * to force replan. + * to force replan. (We could almost ignore this consideration when + * working from an analyzed parse tree; but there are scenarios where + * planning can have search_path-dependent results, for example if it + * inlines an old-style SQL function.) */ if (plansource->is_valid) { @@ -662,9 +721,9 @@ RevalidateCachedQuery(CachedPlanSource *plansource, } /* - * Discard the no-longer-useful query tree. (Note: we don't want to do - * this any earlier, else we'd not have been able to release locks - * correctly in the race condition case.) + * Discard the no-longer-useful rewritten query tree. (Note: we don't + * want to do this any earlier, else we'd not have been able to release + * locks correctly in the race condition case.) 
*/ plansource->is_valid = false; plansource->query_list = NIL; @@ -711,25 +770,48 @@ RevalidateCachedQuery(CachedPlanSource *plansource, } /* - * Run parse analysis and rule rewriting. The parser tends to scribble on - * its input, so we must copy the raw parse tree to prevent corruption of - * the cache. + * Run parse analysis (if needed) and rule rewriting. */ - rawtree = copyObject(plansource->raw_parse_tree); - if (rawtree == NULL) - tlist = NIL; - else if (plansource->parserSetup != NULL) - tlist = pg_analyze_and_rewrite_withcb(rawtree, - plansource->query_string, - plansource->parserSetup, - plansource->parserSetupArg, - queryEnv); + if (plansource->raw_parse_tree != NULL) + { + /* Source is raw parse tree */ + RawStmt *rawtree; + + /* + * The parser tends to scribble on its input, so we must copy the raw + * parse tree to prevent corruption of the cache. + */ + rawtree = copyObject(plansource->raw_parse_tree); + if (plansource->parserSetup != NULL) + tlist = pg_analyze_and_rewrite_withcb(rawtree, + plansource->query_string, + plansource->parserSetup, + plansource->parserSetupArg, + queryEnv); + else + tlist = pg_analyze_and_rewrite_fixedparams(rawtree, + plansource->query_string, + plansource->param_types, + plansource->num_params, + queryEnv); + } + else if (plansource->analyzed_parse_tree != NULL) + { + /* Source is pre-analyzed query, so we only need to rewrite */ + Query *analyzed_tree; + + /* The rewriter scribbles on its input, too, so copy */ + analyzed_tree = copyObject(plansource->analyzed_parse_tree); + /* Acquire locks needed before rewriting ... */ + AcquireRewriteLocks(analyzed_tree, true, false); + /* ... 
and do it */ + tlist = pg_rewrite_query(analyzed_tree); + } else - tlist = pg_analyze_and_rewrite_fixedparams(rawtree, - plansource->query_string, - plansource->param_types, - plansource->num_params, - queryEnv); + { + /* Empty query, nothing to do */ + tlist = NIL; + } /* Release snapshot if we got one */ if (snapshot_set) @@ -963,8 +1045,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist, */ snapshot_set = false; if (!ActiveSnapshotSet() && - plansource->raw_parse_tree && - analyze_requires_snapshot(plansource->raw_parse_tree)) + BuildingPlanRequiresSnapshot(plansource)) { PushActiveSnapshot(GetTransactionSnapshot()); snapshot_set = true; @@ -1703,6 +1784,7 @@ CopyCachedPlan(CachedPlanSource *plansource) newsource = (CachedPlanSource *) palloc0(sizeof(CachedPlanSource)); newsource->magic = CACHEDPLANSOURCE_MAGIC; newsource->raw_parse_tree = copyObject(plansource->raw_parse_tree); + newsource->analyzed_parse_tree = copyObject(plansource->analyzed_parse_tree); newsource->query_string = pstrdup(plansource->query_string); MemoryContextSetIdentifier(source_context, newsource->query_string); newsource->commandTag = plansource->commandTag; diff --git a/src/include/parser/analyze.h b/src/include/parser/analyze.h index f1bd18c49f2..f29ed03b476 100644 --- a/src/include/parser/analyze.h +++ b/src/include/parser/analyze.h @@ -52,6 +52,7 @@ extern Query *transformStmt(ParseState *pstate, Node *parseTree); extern bool stmt_requires_parse_analysis(RawStmt *parseTree); extern bool analyze_requires_snapshot(RawStmt *parseTree); +extern bool query_requires_rewrite_plan(Query *query); extern const char *LCS_asString(LockClauseStrength strength); extern void CheckSelectLocking(Query *qry, LockClauseStrength strength); diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h index 199cc323a28..5930fcb50f0 100644 --- a/src/include/utils/plancache.h +++ b/src/include/utils/plancache.h @@ -25,7 +25,8 @@ #include "utils/resowner.h" -/* Forward declaration, 
to avoid including parsenodes.h here */ +/* Forward declarations, to avoid including parsenodes.h here */ +struct Query; struct RawStmt; /* possible values for plan_cache_mode */ @@ -45,12 +46,22 @@ extern PGDLLIMPORT int plan_cache_mode; /* * CachedPlanSource (which might better have been called CachedQuery) - * represents a SQL query that we expect to use multiple times. It stores - * the query source text, the raw parse tree, and the analyzed-and-rewritten + * represents a SQL query that we expect to use multiple times. It stores the + * query source text, the source parse tree, and the analyzed-and-rewritten * query tree, as well as adjunct data. Cache invalidation can happen as a * result of DDL affecting objects used by the query. In that case we discard * the analyzed-and-rewritten query tree, and rebuild it when next needed. * + * There are two ways in which the source query can be represented: either + * as a raw parse tree, or as an analyzed-but-not-rewritten parse tree. + * In the latter case we expect that cache invalidation need not affect + * the parse-analysis results, only the rewriting and planning steps. + * Only one of raw_parse_tree and analyzed_parse_tree can be non-NULL. + * (If both are NULL, the CachedPlanSource represents an empty query.) + * Note that query_string is typically just an empty string when the + * source query is an analyzed parse tree; also, param_types, num_params, + * parserSetup, and parserSetupArg will not be used. + * * An actual execution plan, represented by CachedPlan, is derived from the * CachedPlanSource when we need to execute the query. The plan could be * either generic (usable with any set of plan parameters) or custom (for a @@ -78,7 +89,7 @@ extern PGDLLIMPORT int plan_cache_mode; * though it may be useful if the CachedPlan can be discarded early.) 
* * A CachedPlanSource has two associated memory contexts: one that holds the - * struct itself, the query source text and the raw parse tree, and another + * struct itself, the query source text and the source parse tree, and another * context that holds the rewritten query tree and associated data. This * allows the query tree to be discarded easily when it is invalidated. * @@ -94,6 +105,7 @@ typedef struct CachedPlanSource { int magic; /* should equal CACHEDPLANSOURCE_MAGIC */ struct RawStmt *raw_parse_tree; /* output of raw_parser(), or NULL */ + struct Query *analyzed_parse_tree; /* analyzed parse tree, or NULL */ const char *query_string; /* source text of query */ CommandTag commandTag; /* command tag for query */ Oid *param_types; /* array of parameter type OIDs, or NULL */ @@ -196,6 +208,9 @@ extern void ReleaseAllPlanCacheRefsInOwner(ResourceOwner owner); extern CachedPlanSource *CreateCachedPlan(struct RawStmt *raw_parse_tree, const char *query_string, CommandTag commandTag); +extern CachedPlanSource *CreateCachedPlanForQuery(struct Query *analyzed_parse_tree, + const char *query_string, + CommandTag commandTag); extern CachedPlanSource *CreateOneShotCachedPlan(struct RawStmt *raw_parse_tree, const char *query_string, CommandTag commandTag); -- 2.43.5 From eb9d28bd28ad8741cacb6edca83d30b6da3d6589 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Fri, 14 Mar 2025 16:05:25 -0400 Subject: [PATCH v8 2/4] Use custom plan machinery for SQL function --- src/backend/executor/functions.c | 219 +++++++++++++++++----- src/test/regress/expected/rowsecurity.out | 51 +++++ src/test/regress/expected/rules.out | 35 ++++ src/test/regress/sql/rowsecurity.sql | 41 ++++ src/test/regress/sql/rules.sql | 24 +++ 5 files changed, 328 insertions(+), 42 deletions(-) diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c index 6aa8e9c4d8a..ae0425c050a 100644 --- a/src/backend/executor/functions.c +++ b/src/backend/executor/functions.c @@ 
-33,6 +33,7 @@ #include "utils/datum.h" #include "utils/lsyscache.h" #include "utils/memutils.h" +#include "utils/plancache.h" #include "utils/snapmgr.h" #include "utils/syscache.h" @@ -112,6 +113,12 @@ typedef struct JunkFilter *junkFilter; /* will be NULL if function returns VOID */ + /* Cached plans support */ + List *plansource_list; /* list of CachedPlanSources */ + List *cplan_list; /* list of cached plans */ + int planning_stmt_number; /* number of the statement we are + * currently planning */ + /* * func_state is a List of execution_state records, each of which is the * first for its original parsetree, with any additional records chained @@ -122,6 +129,8 @@ typedef struct MemoryContext fcontext; /* memory context holding this struct and all * subsidiary data */ + MemoryContext planning_context; /* memory context used for + * planning */ LocalTransactionId lxid; /* lxid in which cache was made */ SubTransactionId subxid; /* subxid in which cache was made */ @@ -138,10 +147,9 @@ static Node *sql_fn_make_param(SQLFunctionParseInfoPtr pinfo, int paramno, int location); static Node *sql_fn_resolve_param_name(SQLFunctionParseInfoPtr pinfo, const char *paramname, int location); -static List *init_execution_state(List *queryTree_list, - SQLFunctionCachePtr fcache, +static List *init_execution_state(SQLFunctionCachePtr fcache, bool lazyEvalOK); -static void init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK); +static void init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool *lazyEvalOK); static void postquel_start(execution_state *es, SQLFunctionCachePtr fcache); static bool postquel_getnext(execution_state *es, SQLFunctionCachePtr fcache); static void postquel_end(execution_state *es); @@ -461,45 +469,52 @@ sql_fn_resolve_param_name(SQLFunctionParseInfoPtr pinfo, * querytrees. The sublist structure denotes the original query boundaries.
*/ static List * -init_execution_state(List *queryTree_list, - SQLFunctionCachePtr fcache, +init_execution_state(SQLFunctionCachePtr fcache, bool lazyEvalOK) { List *eslist = NIL; + List *cplan_list = NIL; execution_state *lasttages = NULL; ListCell *lc1; + MemoryContext oldcontext; + + /* + * Invalidate func_state prior to resetting - otherwise the error callback + * can access it + */ + fcache->func_state = NIL; + MemoryContextReset(fcache->planning_context); + + oldcontext = MemoryContextSwitchTo(fcache->planning_context); - foreach(lc1, queryTree_list) + foreach(lc1, fcache->plansource_list) { - List *qtlist = lfirst_node(List, lc1); + CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1); execution_state *firstes = NULL; execution_state *preves = NULL; ListCell *lc2; + CachedPlan *cplan; + + /* Save statement number for error reporting */ + fcache->planning_stmt_number = foreach_current_index(lc1) + 1; + + /* + * Get a plan for the query. If paramLI is set, we can get a custom plan + */ + cplan = GetCachedPlan(plansource, + fcache->paramLI, + plansource->is_saved ? CurrentResourceOwner : NULL, + NULL); - foreach(lc2, qtlist) + /* Record cplan in the plan list, to be released on replanning */ + cplan_list = lappend(cplan_list, cplan); + + /* Create an execution state for each planned statement */ + foreach(lc2, cplan->stmt_list) { - Query *queryTree = lfirst_node(Query, lc2); - PlannedStmt *stmt; + PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2); execution_state *newes; - /* Plan the query if needed */ - if (queryTree->commandType == CMD_UTILITY) - { - /* Utility commands require no planning.
*/ - stmt = makeNode(PlannedStmt); - stmt->commandType = CMD_UTILITY; - stmt->canSetTag = queryTree->canSetTag; - stmt->utilityStmt = queryTree->utilityStmt; - stmt->stmt_location = queryTree->stmt_location; - stmt->stmt_len = queryTree->stmt_len; - stmt->queryId = queryTree->queryId; - } - else - stmt = pg_plan_query(queryTree, - fcache->src, - CURSOR_OPT_PARALLEL_OK, - NULL); - /* * Precheck all commands for validity in a function. This should * generally match the restrictions spi.c applies. @@ -541,7 +556,7 @@ init_execution_state(List *queryTree_list, newes->stmt = stmt; newes->qd = NULL; - if (queryTree->canSetTag) + if (stmt->canSetTag) lasttages = newes; preves = newes; @@ -573,6 +588,11 @@ init_execution_state(List *queryTree_list, fcache->lazyEval = lasttages->lazyEval = true; } + /* We've finished planning, reset planning statement number */ + fcache->planning_stmt_number = 0; + fcache->cplan_list = cplan_list; + + MemoryContextSwitchTo(oldcontext); return eslist; } @@ -580,7 +600,7 @@ init_execution_state(List *queryTree_list, * Initialize the SQLFunctionCache for a SQL function */ static void -init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) +init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool *lazyEvalOK) { FmgrInfo *finfo = fcinfo->flinfo; Oid foid = finfo->fn_oid; @@ -596,6 +616,7 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) ListCell *lc; Datum tmp; bool isNull; + List *plansource_list; /* * Create memory context that holds all the SQLFunctionCache data. 
It @@ -614,6 +635,10 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) */ fcache = (SQLFunctionCachePtr) palloc0(sizeof(SQLFunctionCache)); fcache->fcontext = fcontext; + /* Create separate context for planning */ + fcache->planning_context = AllocSetContextCreate(fcache->fcontext, + "SQL language functions planning context", + ALLOCSET_SMALL_SIZES); finfo->fn_extra = fcache; /* @@ -680,6 +705,7 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) * plancache.c. */ queryTree_list = NIL; + plansource_list = NIL; if (!isNull) { Node *n; @@ -695,8 +721,13 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) { Query *parsetree = lfirst_node(Query, lc); List *queryTree_sublist; + CachedPlanSource *plansource; AcquireRewriteLocks(parsetree, true, false); + + plansource = CreateCachedPlanForQuery(parsetree, fcache->src, CreateCommandTag((Node *) parsetree)); + plansource_list = lappend(plansource_list, plansource); + queryTree_sublist = pg_rewrite_query(parsetree); queryTree_list = lappend(queryTree_list, queryTree_sublist); } @@ -711,6 +742,10 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) { RawStmt *parsetree = lfirst_node(RawStmt, lc); List *queryTree_sublist; + CachedPlanSource *plansource; + + plansource = CreateCachedPlan(parsetree, fcache->src, CreateCommandTag(parsetree->stmt)); + plansource_list = lappend(plansource_list, plansource); queryTree_sublist = pg_analyze_and_rewrite_withcb(parsetree, fcache->src, @@ -751,6 +786,33 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) false, &resulttlist); + /* + * Queries could be rewritten by check_sql_fn_retval(). Now when they have + * their final form, we can complete plan cache entry creation. 
+ */ + if (plansource_list != NIL) + { + ListCell *qlc; + ListCell *plc; + + forboth(qlc, queryTree_list, plc, plansource_list) + { + List *queryTree_sublist = lfirst(qlc); + CachedPlanSource *plansource = lfirst(plc); + + /* Finish filling in the CachedPlanSource */ + CompleteCachedPlan(plansource, + queryTree_sublist, + NULL, + NULL, + 0, + (ParserSetupHook) sql_fn_parser_setup, + fcache->pinfo, + CURSOR_OPT_PARALLEL_OK | CURSOR_OPT_NO_SCROLL, + false); + } + } + /* * Construct a JunkFilter we can use to coerce the returned rowtype to the * desired form, unless the result type is VOID, in which case there's @@ -792,13 +854,10 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) * materialize mode, but to add more smarts in init_execution_state * about this, we'd probably need a three-way flag instead of bool. */ - lazyEvalOK = true; + *lazyEvalOK = true; } - /* Finally, plan the queries */ - fcache->func_state = init_execution_state(queryTree_list, - fcache, - lazyEvalOK); + fcache->plansource_list = plansource_list; /* Mark fcache with time of creation to show it's valid */ fcache->lxid = MyProc->vxid.lxid; @@ -971,7 +1030,12 @@ postquel_sub_params(SQLFunctionCachePtr fcache, prm->value = MakeExpandedObjectReadOnly(fcinfo->args[i].value, prm->isnull, get_typlen(argtypes[i])); - prm->pflags = 0; + + /* + * PARAM_FLAG_CONST is necessary to build efficient custom plan. + */ + prm->pflags = PARAM_FLAG_CONST; + prm->ptype = argtypes[i]; } } @@ -1024,6 +1088,33 @@ postquel_get_single_result(TupleTableSlot *slot, return value; } +/* + * Release plans. This function is called prior to planning + * statements with new parameters. When custom plans are generated + * for each function call in a statement, they can consume too much memory, so + * release them. Generic plans will survive it as plansource holds + * reference to a generic plan. 
+ */ +static void +release_plans(List *cplans) +{ + ListCell *lc; + + /* + * We keep a separate plan list so that we visit each plan here only + * once + */ + foreach(lc, cplans) + { + CachedPlan *cplan = lfirst(lc); + + ReleaseCachedPlan(cplan, cplan->is_saved ? CurrentResourceOwner : NULL); + } + + /* Clean up the list itself */ + list_free(cplans); +} + /* * fmgr_sql: function call manager for SQL functions */ @@ -1042,6 +1133,7 @@ fmgr_sql(PG_FUNCTION_ARGS) Datum result; List *eslist; ListCell *eslc; + bool build_cached_plans = false; /* * Setup error traceback support for ereport() @@ -1097,7 +1189,11 @@ fmgr_sql(PG_FUNCTION_ARGS) if (fcache == NULL) { - init_sql_fcache(fcinfo, PG_GET_COLLATION(), lazyEvalOK); + /* + * init_sql_fcache() can set lazyEvalOK in additional cases when it + * determines that materialize won't work. + */ + init_sql_fcache(fcinfo, PG_GET_COLLATION(), &lazyEvalOK); fcache = (SQLFunctionCachePtr) fcinfo->flinfo->fn_extra; } @@ -1131,12 +1227,37 @@ fmgr_sql(PG_FUNCTION_ARGS) break; } + /* + * We skip actual planning on the initial run, so in that case we have to + * build cached plans now. + */ + if (fcache->plansource_list != NIL && eslist == NIL) + build_cached_plans = true; + /* * Convert params to appropriate format if starting a fresh execution. (If * continuing execution, we can re-use prior params.) */ - if (is_first && es && es->status == F_EXEC_START) + if ((is_first && es && es->status == F_EXEC_START) || build_cached_plans) + { postquel_sub_params(fcache, fcinfo); + if (fcache->plansource_list) + { + /* replan the queries */ + fcache->func_state = init_execution_state(fcache, + lazyEvalOK); + /* restore execution state and eslist-related variables */ + eslist = fcache->func_state; + /* find the first non-NULL execution state */ + foreach(eslc, eslist) + { + es = (execution_state *) lfirst(eslc); + + if (es) + break; + } + } + } /* * Build tuplestore to hold results, if we don't have one already.
Note @@ -1391,6 +1512,10 @@ fmgr_sql(PG_FUNCTION_ARGS) es = es->next; } } + + /* Release plans when the function stops executing */ + release_plans(fcache->cplan_list); + fcache->cplan_list = NULL; } error_context_stack = sqlerrcontext.previous; @@ -1430,13 +1555,19 @@ sql_exec_error_callback(void *arg) } /* - * Try to determine where in the function we failed. If there is a query - * with non-null QueryDesc, finger it. (We check this rather than looking - * for F_EXEC_RUN state, so that errors during ExecutorStart or + * Try to determine where in the function we failed. If failure happens + * while building plans, look at planning_stmt_number. Else if there is a + * query with non-null QueryDesc, finger it. (We check this rather than + * looking for F_EXEC_RUN state, so that errors during ExecutorStart or * ExecutorEnd are blamed on the appropriate query; see postquel_start and * postquel_end.) */ - if (fcache->func_state) + if (fcache->planning_stmt_number) + { + errcontext("SQL function \"%s\" statement %d", + fcache->fname, fcache->planning_stmt_number); + } + else if (fcache->func_state) { execution_state *es; int query_num; @@ -1522,6 +1653,10 @@ ShutdownSQLFunction(Datum arg) tuplestore_end(fcache->tstore); fcache->tstore = NULL; + + /* Release plans when the function stops executing */ + release_plans(fcache->cplan_list); + fcache->cplan_list = NULL; + + /* execUtils will deregister the callback...
*/ fcache->shutdown_reg = false; } diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out index 87929191d06..438eaf69928 100644 --- a/src/test/regress/expected/rowsecurity.out +++ b/src/test/regress/expected/rowsecurity.out @@ -4695,6 +4695,57 @@ RESET ROLE; DROP FUNCTION rls_f(); DROP VIEW rls_v; DROP TABLE rls_t; +-- RLS changes invalidate cached function plans +create table rls_t (c text); +create table test_t (c text); +insert into rls_t values ('a'), ('b'), ('c'), ('d'); +insert into test_t values ('a'), ('b'); +alter table rls_t enable row level security; +grant select on rls_t to regress_rls_alice; +grant select on test_t to regress_rls_alice; +create policy p1 on rls_t for select to regress_rls_alice using (c = current_setting('rls_test.blah')); +-- Function changes row_security setting and so invalidates plan +create or replace function rls_f(text) + RETURNS text + LANGUAGE sql +BEGIN ATOMIC + select set_config('rls_test.blah', $1, true) || set_config('row_security', 'false', true) || string_agg(c, ',' order by c) from rls_t; +END; +-- Table owner bypasses RLS +select rls_f(c) from test_t order by rls_f; + rls_f +------------- + aoffa,b,c,d + boffa,b,c,d +(2 rows) + +set role regress_rls_alice; +-- For a casual user, changes in the row_security setting lead +-- to an error during query rewrite +select rls_f(c) from test_t order by rls_f; +ERROR: query would be affected by row-level security policy for table "rls_t" +CONTEXT: SQL function "rls_f" statement 1 +reset role; +set plan_cache_mode to force_generic_plan; +-- Table owner bypasses RLS, but the cached plan is invalidated +select rls_f(c) from test_t order by rls_f; + rls_f +------------- + aoffa,b,c,d + boffa,b,c,d +(2 rows) + +-- For a casual user, changes in the row_security setting lead +-- to plan invalidation and an error during query rewrite +set role regress_rls_alice; +select rls_f(c) from test_t order by rls_f; +ERROR: query would be affected by row-level security policy for
table "rls_t" +CONTEXT: SQL function "rls_f" statement 1 +reset role; +reset plan_cache_mode; +reset rls_test.blah; +drop function rls_f; +drop table rls_t, test_t; -- -- Clean up objects -- diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index 62f69ac20b2..b9fe71f391d 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -3878,3 +3878,38 @@ DROP TABLE ruletest_t3; DROP TABLE ruletest_t2; DROP TABLE ruletest_t1; DROP USER regress_rule_user1; +-- Test that SQL functions correctly handle DO NOTHING rule +CREATE TABLE some_data (i int, data text); +CREATE TABLE some_data_values (i int, data text); +CREATE FUNCTION insert_data(i int, data text) +RETURNS INT +AS $$ +INSERT INTO some_data VALUES ($1, $2); +SELECT 1; +$$ LANGUAGE SQL; +INSERT INTO some_data_values SELECT i , 'data'|| i FROM generate_series(1, 10) i; +CREATE RULE some_data_noinsert AS ON INSERT TO some_data DO INSTEAD NOTHING; +SELECT insert_data(i, data) FROM some_data_values; + insert_data +------------- + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 +(10 rows) + +SELECT * FROM some_data ORDER BY i; + i | data +---+------ +(0 rows) + +DROP RULE some_data_noinsert ON some_data; +DROP TABLE some_data_values; +DROP TABLE some_data; +DROP FUNCTION insert_data(int, text); diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql index f61dbbf9581..9fe8f4b059c 100644 --- a/src/test/regress/sql/rowsecurity.sql +++ b/src/test/regress/sql/rowsecurity.sql @@ -2307,6 +2307,47 @@ DROP FUNCTION rls_f(); DROP VIEW rls_v; DROP TABLE rls_t; +-- RLS changes invalidate cached function plans +create table rls_t (c text); +create table test_t (c text); + +insert into rls_t values ('a'), ('b'), ('c'), ('d'); +insert into test_t values ('a'), ('b'); +alter table rls_t enable row level security; +grant select on rls_t to regress_rls_alice; +grant select on test_t to regress_rls_alice; +create policy p1 on rls_t for select to 
regress_rls_alice using (c = current_setting('rls_test.blah')); + +-- Function changes row_security setting and so invalidates plan +create or replace function rls_f(text) + RETURNS text + LANGUAGE sql +BEGIN ATOMIC + select set_config('rls_test.blah', $1, true) || set_config('row_security', 'false', true) || string_agg(c, ',' order by c) from rls_t; +END; + +-- Table owner bypasses RLS +select rls_f(c) from test_t order by rls_f; +set role regress_rls_alice; +-- For a casual user, changes in the row_security setting lead +-- to an error during query rewrite +select rls_f(c) from test_t order by rls_f; +reset role; + +set plan_cache_mode to force_generic_plan; +-- Table owner bypasses RLS, but the cached plan is invalidated +select rls_f(c) from test_t order by rls_f; +-- For a casual user, changes in the row_security setting lead +-- to plan invalidation and an error during query rewrite +set role regress_rls_alice; +select rls_f(c) from test_t order by rls_f; +reset role; +reset plan_cache_mode; +reset rls_test.blah; + +drop function rls_f; +drop table rls_t, test_t; + -- -- Clean up objects -- diff --git a/src/test/regress/sql/rules.sql b/src/test/regress/sql/rules.sql index fdd3ff1d161..505449452ee 100644 --- a/src/test/regress/sql/rules.sql +++ b/src/test/regress/sql/rules.sql @@ -1432,3 +1432,27 @@ DROP TABLE ruletest_t2; DROP TABLE ruletest_t1; DROP USER regress_rule_user1; + +-- Test that SQL functions correctly handle DO NOTHING rule +CREATE TABLE some_data (i int, data text); +CREATE TABLE some_data_values (i int, data text); + +CREATE FUNCTION insert_data(i int, data text) +RETURNS INT +AS $$ +INSERT INTO some_data VALUES ($1, $2); +SELECT 1; +$$ LANGUAGE SQL; + +INSERT INTO some_data_values SELECT i , 'data'|| i FROM generate_series(1, 10) i; + +CREATE RULE some_data_noinsert AS ON INSERT TO some_data DO INSTEAD NOTHING; + +SELECT insert_data(i, data) FROM some_data_values; + +SELECT * FROM some_data ORDER BY i; + +DROP RULE some_data_noinsert ON some_data; +DROP TABLE
some_data_values; +DROP TABLE some_data; +DROP FUNCTION insert_data(int, text); -- 2.43.5 From c0dfeca675283aec80768d5cb82bc8dbcff50b13 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Fri, 14 Mar 2025 16:14:51 -0400 Subject: [PATCH v8 3/4] Introduce SQL functions plan cache --- src/backend/executor/functions.c | 657 ++++++++++++++---- .../expected/test_extensions.out | 2 +- src/tools/pgindent/typedefs.list | 2 + 3 files changed, 526 insertions(+), 135 deletions(-) diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c index ae0425c050a..678ca55a026 100644 --- a/src/backend/executor/functions.c +++ b/src/backend/executor/functions.c @@ -18,6 +18,8 @@ #include "access/xact.h" #include "catalog/pg_proc.h" #include "catalog/pg_type.h" +#include "commands/trigger.h" +#include "commands/event_trigger.h" #include "executor/functions.h" #include "funcapi.h" #include "miscadmin.h" @@ -138,6 +140,46 @@ typedef struct typedef SQLFunctionCache *SQLFunctionCachePtr; +/* + * Plan cache-related structures + */ +typedef struct SQLFunctionPlanKey +{ + Oid fn_oid; + Oid inputCollation; + Oid argtypes[FUNC_MAX_ARGS]; +} SQLFunctionPlanKey; + +typedef struct SQLFunctionPlanEntry +{ + SQLFunctionPlanKey key; + + /* Fields required to invalidate a cache entry */ + TransactionId fn_xmin; + ItemPointerData fn_tid; + + /* + * result_tlist is required to recreate function execution state as well + * as to validate a cache entry + */ + List *result_tlist; + + bool returnsTuple; /* True if this function returns tuple */ + List *plansource_list; /* List of CachedPlanSource for this + * function */ + + /* + * SQLFunctionParseInfoPtr is used as hooks arguments, so should persist + * across calls. Fortunately, if it doesn't, this means that argtypes or + * collation mismatches and we get new cache entry. 
+ */ + SQLFunctionParseInfoPtr pinfo; /* cached information about arguments */ + + MemoryContext entry_ctx; /* memory context for allocated fields of this + * entry */ +} SQLFunctionPlanEntry; + +static HTAB *sql_plan_cache_htab = NULL; /* non-export function prototypes */ static Node *sql_fn_param_ref(ParseState *pstate, ParamRef *pref); @@ -171,6 +213,48 @@ static bool sqlfunction_receive(TupleTableSlot *slot, DestReceiver *self); static void sqlfunction_shutdown(DestReceiver *self); static void sqlfunction_destroy(DestReceiver *self); +/* SQL-functions plan cache-related routines */ +static void compute_plan_entry_key(SQLFunctionPlanKey *hashkey, FunctionCallInfo fcinfo, Form_pg_proc procedureStruct); +static SQLFunctionPlanEntry *get_cached_plan_entry(SQLFunctionPlanKey *hashkey); +static void save_cached_plan_entry(SQLFunctionPlanKey *hashkey, HeapTuple procedureTuple, List *plansource_list, List *result_tlist,bool returnsTuple, SQLFunctionParseInfoPtr pinfo, MemoryContext alianable_context); +static void delete_cached_plan_entry(SQLFunctionPlanEntry *entry); + +static bool check_sql_fn_retval_matches(List *tlist, Oid rettype, TupleDesc rettupdesc, char prokind); +static bool target_entry_has_compatible_type(TargetEntry *tle, Oid res_type, int32 res_typmod); + +/* + * Fill array of arguments with actual function argument types oids + */ +static void +compute_argument_types(Oid *argOidVect, Form_pg_proc procedureStruct, Node *call_expr) +{ + int argnum; + int nargs; + + nargs = procedureStruct->pronargs; + if (nargs > 0) + { + memcpy(argOidVect, + procedureStruct->proargtypes.values, + nargs * sizeof(Oid)); + + for (argnum = 0; argnum < nargs; argnum++) + { + Oid argtype = argOidVect[argnum]; + + if (IsPolymorphicType(argtype)) + { + argtype = get_call_expr_argtype(call_expr, argnum); + if (argtype == InvalidOid) + ereport(ERROR, + (errcode(ERRCODE_DATATYPE_MISMATCH), + errmsg("could not determine actual type of argument declared %s", + 
format_type_be(argOidVect[argnum])))); + argOidVect[argnum] = argtype; + } + } + } +} /* * Prepare the SQLFunctionParseInfo struct for parsing a SQL function body @@ -204,31 +288,8 @@ prepare_sql_fn_parse_info(HeapTuple procedureTuple, pinfo->nargs = nargs = procedureStruct->pronargs; if (nargs > 0) { - Oid *argOidVect; - int argnum; - - argOidVect = (Oid *) palloc(nargs * sizeof(Oid)); - memcpy(argOidVect, - procedureStruct->proargtypes.values, - nargs * sizeof(Oid)); - - for (argnum = 0; argnum < nargs; argnum++) - { - Oid argtype = argOidVect[argnum]; - - if (IsPolymorphicType(argtype)) - { - argtype = get_call_expr_argtype(call_expr, argnum); - if (argtype == InvalidOid) - ereport(ERROR, - (errcode(ERRCODE_DATATYPE_MISMATCH), - errmsg("could not determine actual type of argument declared %s", - format_type_be(argOidVect[argnum])))); - argOidVect[argnum] = argtype; - } - } - - pinfo->argtypes = argOidVect; + pinfo->argtypes = (Oid *) palloc(nargs * sizeof(Oid)); + compute_argument_types(pinfo->argtypes, procedureStruct, call_expr); } /* @@ -596,6 +657,264 @@ init_execution_state(SQLFunctionCachePtr fcache, return eslist; } +/* + * Compute key for searching plan entry in backend cache + */ +static void +compute_plan_entry_key(SQLFunctionPlanKey *hashkey, FunctionCallInfo fcinfo, Form_pg_proc procedureStruct) +{ + MemSet(hashkey, 0, sizeof(SQLFunctionPlanKey)); + + hashkey->fn_oid = fcinfo->flinfo->fn_oid; + + /* set input collation, if known */ + hashkey->inputCollation = fcinfo->fncollation; + + if (procedureStruct->pronargs > 0) + { + /* get the argument types */ + compute_argument_types(hashkey->argtypes, procedureStruct, fcinfo->flinfo->fn_expr); + } +} + +/* + * Get cached plan by pre-computed key + */ +static SQLFunctionPlanEntry * +get_cached_plan_entry(SQLFunctionPlanKey *hashkey) +{ + SQLFunctionPlanEntry *plan_entry = NULL; + + if (sql_plan_cache_htab) + { + plan_entry = (SQLFunctionPlanEntry *) hash_search(sql_plan_cache_htab, + hashkey, + HASH_FIND, + 
NULL); + } + return plan_entry; +} + +/* + * Save function execution plan in cache + */ +static void +save_cached_plan_entry(SQLFunctionPlanKey *hashkey, HeapTuple procedureTuple, List *plansource_list, List *result_tlist,bool returnsTuple, SQLFunctionParseInfoPtr pinfo, MemoryContext alianable_context) +{ + MemoryContext oldcontext; + MemoryContext entry_context; + SQLFunctionPlanEntry *entry; + ListCell *lc; + bool found; + + if (sql_plan_cache_htab == NULL) + { + HASHCTL ctl; + + ctl.keysize = sizeof(SQLFunctionPlanKey); + ctl.entrysize = sizeof(SQLFunctionPlanEntry); + ctl.hcxt = CacheMemoryContext; + + sql_plan_cache_htab = hash_create("SQL function plan hash", + 100 /* arbitrary initial size */ , + &ctl, + HASH_ELEM | HASH_BLOBS | HASH_CONTEXT); + } + + entry = (SQLFunctionPlanEntry *) hash_search(sql_plan_cache_htab, + hashkey, + HASH_ENTER, + &found); + if (found) + elog(WARNING, "trying to insert a function that already exists"); + + /* + * Create long-lived memory context that holds entry fields + */ + entry_context = AllocSetContextCreate(CacheMemoryContext, + "SQL function plan entry context", + ALLOCSET_DEFAULT_SIZES); + + oldcontext = MemoryContextSwitchTo(entry_context); + + /* fill entry */ + memcpy(&entry->key, hashkey, sizeof(SQLFunctionPlanKey)); + + entry->entry_ctx = entry_context; + + /* Some generated data, like pinfo, should be reparented */ + MemoryContextSetParent(alianable_context, entry->entry_ctx); + + entry->pinfo = pinfo; + + /* Preserve list in long-lived context */ + if (plansource_list) + entry->plansource_list = list_copy(plansource_list); + else + entry->plansource_list = NULL; + + entry->result_tlist = copyObject(result_tlist); + + entry->returnsTuple = returnsTuple; + + /* Fill fields needed to invalidate cache entry */ + entry->fn_xmin = HeapTupleHeaderGetRawXmin(procedureTuple->t_data); + entry->fn_tid = procedureTuple->t_self; + + /* Save plans */ + foreach(lc, entry->plansource_list) + { + CachedPlanSource *plansource = 
(CachedPlanSource *) lfirst(lc); + + SaveCachedPlan(plansource); + } + MemoryContextSwitchTo(oldcontext); + +} + +/* + * Remove plan from cache + */ +static void +delete_cached_plan_entry(SQLFunctionPlanEntry *entry) +{ + ListCell *lc; + bool found; + + /* Release plans */ + foreach(lc, entry->plansource_list) + { + CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc); + + DropCachedPlan(plansource); + } + MemoryContextDelete(entry->entry_ctx); + + hash_search(sql_plan_cache_htab, &entry->key, HASH_REMOVE, &found); + Assert(found); +} + +/* + * Determine if TargetEntry is compatible to specified type + */ +static bool +target_entry_has_compatible_type(TargetEntry *tle, Oid res_type, int32 res_typmod) +{ + Var *var; + Node *cast_result; + bool result = true; + + /* Are types equivalent? */ + var = makeVarFromTargetEntry(1, tle); + + cast_result = coerce_to_target_type(NULL, + (Node *) var, + var->vartype, + res_type, res_typmod, + COERCION_ASSIGNMENT, + COERCE_IMPLICIT_CAST, + -1); + + /* + * If conversion is not possible or requires a cast, entry is incompatible + * with the type. + */ + if (cast_result == NULL || cast_result != (Node *) var) + result = false; + + if (cast_result && cast_result != (Node *) var) + pfree(cast_result); + pfree(var); + + return result; +} + +/* + * Check if result tlist would be changed by check_sql_fn_retval() + */ +static bool +check_sql_fn_retval_matches(List *tlist, Oid rettype, TupleDesc rettupdesc, char prokind) +{ + char fn_typtype; + int tlistlen; + + /* + * Count the non-junk entries in the result targetlist. 
+ */ + tlistlen = ExecCleanTargetListLength(tlist); + + fn_typtype = get_typtype(rettype); + + if (fn_typtype == TYPTYPE_BASE || + fn_typtype == TYPTYPE_DOMAIN || + fn_typtype == TYPTYPE_ENUM || + fn_typtype == TYPTYPE_RANGE || + fn_typtype == TYPTYPE_MULTIRANGE) + { + TargetEntry *tle; + + /* Something unexpected, invalidate cached plan */ + if (tlistlen != 1) + return false; + + tle = (TargetEntry *) linitial(tlist); + + return target_entry_has_compatible_type(tle, rettype, -1); + } + else if (fn_typtype == TYPTYPE_COMPOSITE || rettype == RECORDOID) + { + ListCell *lc; + int colindex; + int tupnatts; + + if (tlistlen == 1 && prokind != PROKIND_PROCEDURE) + { + TargetEntry *tle = (TargetEntry *) linitial(tlist); + + return target_entry_has_compatible_type(tle, rettype, -1); + } + + /* We consider results comnpatible if there's no tupledesc */ + if (rettupdesc == NULL) + return true; + + /* + * Verify that saved targetlist matches the return tuple type. + */ + tupnatts = rettupdesc->natts; + colindex = 0; + foreach(lc, tlist) + { + TargetEntry *tle = (TargetEntry *) lfirst(lc); + Form_pg_attribute attr; + + /* resjunk columns can simply be ignored */ + if (tle->resjunk) + continue; + + do + { + colindex++; + if (colindex > tupnatts) + return false; + + attr = TupleDescAttr(rettupdesc, colindex - 1); + } while (attr->attisdropped); + + if (!target_entry_has_compatible_type(tle, attr->atttypid, attr->atttypmod)) + return false; + } + + /* remaining columns in rettupdesc had better all be dropped */ + for (colindex++; colindex <= tupnatts; colindex++) + { + if (!TupleDescCompactAttr(rettupdesc, colindex - 1)->attisdropped) + return false; + } + } + return true; +} + /* * Initialize the SQLFunctionCache for a SQL function */ @@ -617,6 +936,10 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool *lazyEvalOK) Datum tmp; bool isNull; List *plansource_list; + SQLFunctionPlanEntry *cached_plan_entry = NULL; + SQLFunctionPlanKey plan_cache_entry_key; + bool 
use_plan_cache; + bool plan_cache_entry_valid; /* * Create memory context that holds all the SQLFunctionCache data. It @@ -674,15 +997,6 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool *lazyEvalOK) fcache->readonly_func = (procedureStruct->provolatile != PROVOLATILE_VOLATILE); - /* - * We need the actual argument types to pass to the parser. Also make - * sure that parameter symbols are considered to have the function's - * resolved input collation. - */ - fcache->pinfo = prepare_sql_fn_parse_info(procedureTuple, - finfo->fn_expr, - collation); - /* * And of course we need the function body text. */ @@ -695,122 +1009,200 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool *lazyEvalOK) Anum_pg_proc_prosqlbody, &isNull); + + use_plan_cache = true; + plan_cache_entry_valid = false; + /* - * Parse and rewrite the queries in the function text. Use sublists to - * keep track of the original query boundaries. - * - * Note: since parsing and planning is done in fcontext, we will generate - * a lot of cruft that lives as long as the fcache does. This is annoying - * but we'll not worry about it until the module is rewritten to use - * plancache.c. + * If function is trigger, we can see different rowtypes or transition + * table names. So don't use cache for such plans. 
*/ - queryTree_list = NIL; - plansource_list = NIL; - if (!isNull) - { - Node *n; - List *stored_query_list; + if (CALLED_AS_TRIGGER(fcinfo) || CALLED_AS_EVENT_TRIGGER(fcinfo)) + use_plan_cache = false; - n = stringToNode(TextDatumGetCString(tmp)); - if (IsA(n, List)) - stored_query_list = linitial_node(List, castNode(List, n)); - else - stored_query_list = list_make1(n); + if (use_plan_cache) + { + compute_plan_entry_key(&plan_cache_entry_key, fcinfo, procedureStruct); - foreach(lc, stored_query_list) + cached_plan_entry = get_cached_plan_entry(&plan_cache_entry_key); + if (cached_plan_entry) { - Query *parsetree = lfirst_node(Query, lc); - List *queryTree_sublist; - CachedPlanSource *plansource; - - AcquireRewriteLocks(parsetree, true, false); - - plansource = CreateCachedPlanForQuery(parsetree, fcache->src, CreateCommandTag((Node *) parsetree)); - plansource_list = lappend(plansource_list, plansource); - - queryTree_sublist = pg_rewrite_query(parsetree); - queryTree_list = lappend(queryTree_list, queryTree_sublist); + if (cached_plan_entry->fn_xmin == HeapTupleHeaderGetRawXmin(procedureTuple->t_data) && + ItemPointerEquals(&cached_plan_entry->fn_tid, &procedureTuple->t_self)) + { + /* + * Avoid using plan if returned result type doesn't match the + * expected one. check_sql_fn_retval() in this case would + * change query to match expected result type. But we've + * already planned query, possibly modified to match another + * result type. So discard the cached entry and replan. 
+ */ + if (check_sql_fn_retval_matches(cached_plan_entry->result_tlist, rettype, rettupdesc, procedureStruct->prokind)) + plan_cache_entry_valid = true; + } + if (!plan_cache_entry_valid) + delete_cached_plan_entry(cached_plan_entry); } } + + if (plan_cache_entry_valid) + { + plansource_list = cached_plan_entry->plansource_list; + resulttlist = copyObject(cached_plan_entry->result_tlist); + fcache->returnsTuple = cached_plan_entry->returnsTuple; + fcache->pinfo = cached_plan_entry->pinfo; + } else { - List *raw_parsetree_list; + MemoryContext alianable_context = fcontext; + + /* We need to preserve parse info */ + if (use_plan_cache) + { + alianable_context = AllocSetContextCreate(CurrentMemoryContext, + "SQL function plan entry alianable context", + ALLOCSET_DEFAULT_SIZES); + + MemoryContextSwitchTo(alianable_context); + } - raw_parsetree_list = pg_parse_query(fcache->src); + /* + * We need the actual argument types to pass to the parser. Also make + * sure that parameter symbols are considered to have the function's + * resolved input collation. + */ + fcache->pinfo = prepare_sql_fn_parse_info(procedureTuple, + finfo->fn_expr, + collation); + + if (use_plan_cache) + MemoryContextSwitchTo(fcontext); - foreach(lc, raw_parsetree_list) + /* + * Parse and rewrite the queries in the function text. Use sublists + * to keep track of the original query boundaries. + * + * Note: since parsing and planning is done in fcontext, we will + * generate a lot of cruft that lives as long as the fcache does. This + * is annoying but we'll not worry about it until the module is + * rewritten to use plancache.c. 
+ */ + + plansource_list = NIL; + + queryTree_list = NIL; + if (!isNull) { - RawStmt *parsetree = lfirst_node(RawStmt, lc); - List *queryTree_sublist; - CachedPlanSource *plansource; - - plansource = CreateCachedPlan(parsetree, fcache->src, CreateCommandTag(parsetree->stmt)); - plansource_list = lappend(plansource_list, plansource); - - queryTree_sublist = pg_analyze_and_rewrite_withcb(parsetree, - fcache->src, - (ParserSetupHook) sql_fn_parser_setup, - fcache->pinfo, - NULL); - queryTree_list = lappend(queryTree_list, queryTree_sublist); + Node *n; + List *stored_query_list; + + n = stringToNode(TextDatumGetCString(tmp)); + if (IsA(n, List)) + stored_query_list = linitial_node(List, castNode(List, n)); + else + stored_query_list = list_make1(n); + + foreach(lc, stored_query_list) + { + Query *parsetree = lfirst_node(Query, lc); + List *queryTree_sublist; + CachedPlanSource *plansource; + + AcquireRewriteLocks(parsetree, true, false); + + plansource = CreateCachedPlanForQuery(parsetree, fcache->src, CreateCommandTag((Node *) parsetree)); + plansource_list = lappend(plansource_list, plansource); + + queryTree_sublist = pg_rewrite_query(parsetree); + queryTree_list = lappend(queryTree_list, queryTree_sublist); + } } - } + else + { + List *raw_parsetree_list; - /* - * Check that there are no statements we don't want to allow. - */ - check_sql_fn_statements(queryTree_list); + raw_parsetree_list = pg_parse_query(fcache->src); - /* - * Check that the function returns the type it claims to. Although in - * simple cases this was already done when the function was defined, we - * have to recheck because database objects used in the function's queries - * might have changed type. We'd have to recheck anyway if the function - * had any polymorphic arguments. Moreover, check_sql_fn_retval takes - * care of injecting any required column type coercions. (But we don't - * ask it to insert nulls for dropped columns; the junkfilter handles - * that.) 
- * - * Note: we set fcache->returnsTuple according to whether we are returning - * the whole tuple result or just a single column. In the latter case we - * clear returnsTuple because we need not act different from the scalar - * result case, even if it's a rowtype column. (However, we have to force - * lazy eval mode in that case; otherwise we'd need extra code to expand - * the rowtype column into multiple columns, since we have no way to - * notify the caller that it should do that.) - */ - fcache->returnsTuple = check_sql_fn_retval(queryTree_list, - rettype, - rettupdesc, - procedureStruct->prokind, - false, - &resulttlist); + foreach(lc, raw_parsetree_list) + { + RawStmt *parsetree = lfirst_node(RawStmt, lc); + List *queryTree_sublist; + CachedPlanSource *plansource; + + plansource = CreateCachedPlan(parsetree, fcache->src, CreateCommandTag(parsetree->stmt)); + plansource_list = lappend(plansource_list, plansource); + + queryTree_sublist = pg_analyze_and_rewrite_withcb(parsetree, + fcache->src, + (ParserSetupHook) sql_fn_parser_setup, + fcache->pinfo, + NULL); + queryTree_list = lappend(queryTree_list, queryTree_sublist); + } + } - /* - * Queries could be rewritten by check_sql_fn_retval(). Now when they have - * their final form, we can complete plan cache entry creation. - */ - if (plansource_list != NIL) - { - ListCell *qlc; - ListCell *plc; + /* + * Check that there are no statements we don't want to allow. + */ + check_sql_fn_statements(queryTree_list); + + /* + * Check that the function returns the type it claims to. Although in + * simple cases this was already done when the function was defined, + * we have to recheck because database objects used in the function's + * queries might have changed type. We'd have to recheck anyway if + * the function had any polymorphic arguments. Moreover, + * check_sql_fn_retval takes care of injecting any required column + * type coercions. 
(But we don't ask it to insert nulls for dropped + * columns; the junkfilter handles that.) + * + * Note: we set fcache->returnsTuple according to whether we are + * returning the whole tuple result or just a single column. In the + * latter case we clear returnsTuple because we need not act different + * from the scalar result case, even if it's a rowtype column. + * (However, we have to force lazy eval mode in that case; otherwise + * we'd need extra code to expand the rowtype column into multiple + * columns, since we have no way to notify the caller that it should + * do that.) + */ - forboth(qlc, queryTree_list, plc, plansource_list) + fcache->returnsTuple = check_sql_fn_retval(queryTree_list, + rettype, + rettupdesc, + procedureStruct->prokind, + false, + &resulttlist); + + /* + * Queries could be rewritten by check_sql_fn_retval(). Now when they + * have their final form, we can complete plan cache entry creation. + */ + if (plansource_list != NIL) { - List *queryTree_sublist = lfirst(qlc); - CachedPlanSource *plansource = lfirst(plc); - - /* Finish filling in the CachedPlanSource */ - CompleteCachedPlan(plansource, - queryTree_sublist, - NULL, - NULL, - 0, - (ParserSetupHook) sql_fn_parser_setup, - fcache->pinfo, - CURSOR_OPT_PARALLEL_OK | CURSOR_OPT_NO_SCROLL, - false); + ListCell *qlc; + ListCell *plc; + + forboth(qlc, queryTree_list, plc, plansource_list) + { + List *queryTree_sublist = lfirst(qlc); + CachedPlanSource *plansource = lfirst(plc); + + /* Finish filling in the CachedPlanSource */ + CompleteCachedPlan(plansource, + queryTree_sublist, + NULL, + NULL, + 0, + (ParserSetupHook) sql_fn_parser_setup, + fcache->pinfo, + CURSOR_OPT_PARALLEL_OK | CURSOR_OPT_NO_SCROLL, + false); + } } + + /* If we can possibly use cached plan entry, save it. 
*/ + if (use_plan_cache) + save_cached_plan_entry(&plan_cache_entry_key, procedureTuple, plansource_list, resulttlist, fcache->returnsTuple,fcache->pinfo, alianable_context); } /* @@ -1110,9 +1502,6 @@ release_plans(List *cplans) ReleaseCachedPlan(cplan, cplan->is_saved ? CurrentResourceOwner : NULL); } - - /* Cleanup the list itself */ - list_free(cplans); } /* diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out index d5388a1fecf..72bae1bf254 100644 --- a/src/test/modules/test_extensions/expected/test_extensions.out +++ b/src/test/modules/test_extensions/expected/test_extensions.out @@ -651,7 +651,7 @@ LINE 1: SELECT public.dep_req2() || ' req3b' ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. QUERY: SELECT public.dep_req2() || ' req3b' -CONTEXT: SQL function "dep_req3b" during startup +CONTEXT: SQL function "dep_req3b" statement 1 DROP EXTENSION test_ext_req_schema3; ALTER EXTENSION test_ext_req_schema1 SET SCHEMA test_s_dep2; -- now ok SELECT test_s_dep2.dep_req1(); diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index 93339ef3c58..1671101cebb 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -2577,6 +2577,8 @@ SQLFunctionCache SQLFunctionCachePtr SQLFunctionParseInfo SQLFunctionParseInfoPtr +SQLFunctionPlanEntry +SQLFunctionPlanKey SQLValueFunction SQLValueFunctionOp SSL -- 2.43.5 From 8c6cee3bdae49c6282fd8a667a8bc9312e22df02 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Fri, 14 Mar 2025 16:24:51 -0400 Subject: [PATCH v8 4/4] Handle SQL functions which result type is adjuisted Query could be modified between rewrite and plan stages by check_sql_fn_retval(). We move this step earlier, so that cached plans were created with already modified tlist. 
In this case if later revalidation is considered by RevalidateCachedQuery(), modifications, done by check_sql_fn_retval(), will not be lost. We consider that rewriting query cannot ever changes the targetlist results. Note that test_extensions result has changed as cached query can be revalidated after extension is moved to another schema - function oid in the query still matches the existing one. --- src/backend/executor/functions.c | 76 ++++++++++--------- .../expected/test_extensions.out | 11 ++- 2 files changed, 44 insertions(+), 43 deletions(-) diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c index 678ca55a026..f745458d3d0 100644 --- a/src/backend/executor/functions.c +++ b/src/backend/executor/functions.c @@ -25,6 +25,7 @@ #include "miscadmin.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" +#include "parser/analyze.h" #include "parser/parse_coerce.h" #include "parser/parse_collate.h" #include "parser/parse_func.h" @@ -940,6 +941,7 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool *lazyEvalOK) SQLFunctionPlanKey plan_cache_entry_key; bool use_plan_cache; bool plan_cache_entry_valid; + List *query_list; /* * Create memory context that holds all the SQLFunctionCache data. It @@ -1090,32 +1092,17 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool *lazyEvalOK) plansource_list = NIL; - queryTree_list = NIL; + /* Construct a list of analyzed parsetrees. 
*/ + query_list = NIL; if (!isNull) { Node *n; - List *stored_query_list; n = stringToNode(TextDatumGetCString(tmp)); if (IsA(n, List)) - stored_query_list = linitial_node(List, castNode(List, n)); + query_list = linitial_node(List, castNode(List, n)); else - stored_query_list = list_make1(n); - - foreach(lc, stored_query_list) - { - Query *parsetree = lfirst_node(Query, lc); - List *queryTree_sublist; - CachedPlanSource *plansource; - - AcquireRewriteLocks(parsetree, true, false); - - plansource = CreateCachedPlanForQuery(parsetree, fcache->src, CreateCommandTag((Node *) parsetree)); - plansource_list = lappend(plansource_list, plansource); - - queryTree_sublist = pg_rewrite_query(parsetree); - queryTree_list = lappend(queryTree_list, queryTree_sublist); - } + query_list = list_make1(n); } else { @@ -1126,25 +1113,15 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool *lazyEvalOK) foreach(lc, raw_parsetree_list) { RawStmt *parsetree = lfirst_node(RawStmt, lc); - List *queryTree_sublist; - CachedPlanSource *plansource; - - plansource = CreateCachedPlan(parsetree, fcache->src, CreateCommandTag(parsetree->stmt)); - plansource_list = lappend(plansource_list, plansource); - - queryTree_sublist = pg_analyze_and_rewrite_withcb(parsetree, - fcache->src, - (ParserSetupHook) sql_fn_parser_setup, - fcache->pinfo, - NULL); - queryTree_list = lappend(queryTree_list, queryTree_sublist); + Query *query; + + query = parse_analyze_withcb(parsetree, fcache->src, (ParserSetupHook) sql_fn_parser_setup, fcache->pinfo, + NULL); + + query_list = lappend(query_list, query); } } - /* - * Check that there are no statements we don't want to allow. - */ - check_sql_fn_statements(queryTree_list); /* * Check that the function returns the type it claims to. Although in @@ -1154,7 +1131,10 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool *lazyEvalOK) * the function had any polymorphic arguments. 
Moreover, * check_sql_fn_retval takes care of injecting any required column * type coercions. (But we don't ask it to insert nulls for dropped - * columns; the junkfilter handles that.) + * columns; the junkfilter handles that.) As check_sql_fn_retval() can + * modify queries to match expected return types, we execute it prior + * to creating cached plans, so that if revalidation happens and + * triggers query rewriting, return type would be already correct. * * Note: we set fcache->returnsTuple according to whether we are * returning the whole tuple result or just a single column. In the @@ -1166,13 +1146,35 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool *lazyEvalOK) * do that.) */ - fcache->returnsTuple = check_sql_fn_retval(queryTree_list, + fcache->returnsTuple = check_sql_fn_retval(list_make1(query_list), rettype, rettupdesc, procedureStruct->prokind, false, &resulttlist); + queryTree_list = NIL; + + foreach(lc, query_list) + { + Query *parsetree = lfirst_node(Query, lc); + List *queryTree_sublist; + CachedPlanSource *plansource; + + AcquireRewriteLocks(parsetree, true, false); + + plansource = CreateCachedPlanForQuery(parsetree, fcache->src, CreateCommandTag((Node *) parsetree)); + plansource_list = lappend(plansource_list, plansource); + + queryTree_sublist = pg_rewrite_query(parsetree); + queryTree_list = lappend(queryTree_list, queryTree_sublist); + } + + /* + * Check that there are no statements we don't want to allow. + */ + check_sql_fn_statements(queryTree_list); + /* * Queries could be rewritten by check_sql_fn_retval(). Now when they * have their final form, we can complete plan cache entry creation. 
diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out index 72bae1bf254..ea3d4ca61d8 100644 --- a/src/test/modules/test_extensions/expected/test_extensions.out +++ b/src/test/modules/test_extensions/expected/test_extensions.out @@ -646,12 +646,11 @@ SELECT dep_req3(); (1 row) SELECT dep_req3b(); -- fails -ERROR: function public.dep_req2() does not exist -LINE 1: SELECT public.dep_req2() || ' req3b' - ^ -HINT: No function matches the given name and argument types. You might need to add explicit type casts. -QUERY: SELECT public.dep_req2() || ' req3b' -CONTEXT: SQL function "dep_req3b" statement 1 + dep_req3b +----------------- + req1 req2 req3b +(1 row) + DROP EXTENSION test_ext_req_schema3; ALTER EXTENSION test_ext_req_schema1 SET SCHEMA test_s_dep2; -- now ok SELECT test_s_dep2.dep_req1(); -- 2.43.5
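To make the lookup/validate/save cycle that the patch adds in init_sql_fcache() concrete, here is a minimal standalone sketch. All names here (PlanKey, cache_lookup, and so on) are illustrative stand-ins, and the table is a toy direct-mapped array; the real patch uses dynahash (hash_create with HASH_BLOBS), a SQLFunctionPlanKey, and saved CachedPlanSource lists. The staleness test, though, mirrors the patch: an entry is reusable only while the pg_proc tuple it was built from (identified by xmin and TID) is still current.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ARGS  8
#define NBUCKETS  64

typedef struct PlanKey
{
    uint32_t fn_oid;              /* pg_proc OID of the function */
    uint32_t input_collation;     /* resolved input collation */
    uint32_t argtypes[MAX_ARGS];  /* resolved (non-polymorphic) arg types */
} PlanKey;

typedef struct PlanEntry
{
    int      used;
    PlanKey  key;
    uint32_t fn_xmin;             /* xmin of the pg_proc tuple we planned from */
    uint32_t fn_tid;              /* (simplified) TID of that tuple */
    const char *plans;            /* stand-in for the saved plan list */
} PlanEntry;

static PlanEntry cache[NBUCKETS];

static uint32_t
key_hash(const PlanKey *k)
{
    /* FNV-1a over the zero-padded key bytes, as HASH_BLOBS would do */
    const unsigned char *p = (const unsigned char *) k;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < sizeof(PlanKey); i++)
        h = (h ^ p[i]) * 16777619u;
    return h;
}

/* HASH_FIND equivalent: NULL when there is no entry for this signature */
static PlanEntry *
cache_lookup(const PlanKey *k)
{
    PlanEntry *e = &cache[key_hash(k) % NBUCKETS];
    return (e->used && memcmp(&e->key, k, sizeof(PlanKey)) == 0) ? e : NULL;
}

/* HASH_ENTER equivalent (direct-mapped here, so a collision just evicts) */
static void
cache_save(const PlanKey *k, uint32_t xmin, uint32_t tid, const char *plans)
{
    PlanEntry *e = &cache[key_hash(k) % NBUCKETS];
    e->used = 1;
    e->key = *k;
    e->fn_xmin = xmin;
    e->fn_tid = tid;
    e->plans = plans;
}

/*
 * The validity check from the patched init_sql_fcache(): reuse the entry
 * only if the function was not CREATE OR REPLACEd since we planned it.
 */
static int
cache_entry_current(const PlanEntry *e, uint32_t cur_xmin, uint32_t cur_tid)
{
    return e->fn_xmin == cur_xmin && e->fn_tid == cur_tid;
}
```

A second call with a different resolved argument type (the polymorphic case) builds a different key and therefore misses the cache, which is exactly why compute_plan_entry_key resolves argument types before hashing.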
Tom Lane wrote on 2025-03-14 23:52:
> I spent some time today going through the actual code in this patch.
> I realized that there's no longer any point in 0001: the later patches
> don't move or repeatedly-call that bit of code, so it can be left
> as-is.
>
> What I think we could stand to split out, though, is the changes in
> the plancache support.  The new 0001 attached is just the plancache
> and analyze.c changes.  That could be committed separately, although
> of course there's little point in pushing it till we're happy with
> the rest.

Hi. Sorry I didn't reply immediately; I was busy with other tasks.

> In general, this patch series is paying far too little attention to
> updating existing comments that it obsoletes or adding new ones
> explaining what's going on.  For example, the introductory comment
> for struct SQLFunctionCache still says
>
>  * Note that currently this has only the lifespan of the calling query.
>  * Someday we should rewrite this code to use plancache.c to save
>  * parse/plan results for longer than that.
>
> and I wonder how much of the para after that is still accurate either.
> The new structs aren't adequately documented either IMO.  We now have
> about three different structs that have something to do with caches
> by their names, but the reader is left to guess how they fit together.
> Another example is that the header comment for init_execution_state
> still describes an argument list it hasn't got anymore.
>
> I tried to clean up the comment situation in the plancache in 0001,
> but I've not done much of anything to functions.c.

I've added some comments to functions.c and updated the comments you pointed out.

> I'm fairly confused why 0002 and 0003 are separate patches, and the
> commit messages for them do nothing to clarify that.  It seems like
> you're expecting reviewers to review a very transitory state of
> affairs in 0002, and it's not clear why.
> Maybe the code is fine
> and you just need to explain the change sequence a bit more
> in the commit messages.  0002 could stand to explain the point
> of the new test cases, too, especially since one of them seems to
> be demonstrating the fixing of a pre-existing bug.

I've also merged the patches introducing the plan cache for SQL functions and the session-level plan cache support; they were mostly separate for historic reasons.

> Something is very wrong in 0004: it should not be breaking that
> test case in test_extensions.  It seems to me we should already
> have the necessary infrastructure for that, in that the plan
> ought to have a PlanInvalItem referencing public.dep_req2(),
> and the ALTER SET SCHEMA that gets done on that function should
> result in an invalidation.  So it looks to me like that patch
> has somehow rearranged things so we miss an invalidation.
> I've not tried to figure out why.

The plan is invalidated in both cases (before and after the patch). What happens here is that earlier, when revalidation happened, we couldn't find the renamed function. Now the function in the Query is identified by its OID, which didn't change, so we can still find the function by OID and rebuild the cached plan.

> I'm also sad that 0004
> doesn't appear to include any test cases showing it doing
> something right: without that, why do it at all?

I've added a sample that is fixed by this patch. It can happen that a plan is adjusted and saved; later it's invalidated, and when revalidation happens we miss the modifications added by check_sql_fn_retval(). Another interesting issue is that a cached plan is checked for validity before the function starts executing (just as, previously, planning happened before the function started executing). So once we discard cached plans, the plan for the second query in a function is not invalidated immediately, only on its second execution. And after the plan is rebuilt, it becomes wrong.

--
Best regards,
Alexander Pyhalov,
Postgres Professional
Alexander Pyhalov wrote on 2025-03-28 15:22:
> Tom Lane wrote on 2025-03-14 23:52:
> [...]
>> I'm also sad that 0004
>> doesn't appear to include any test cases showing it doing
>> something right: without that, why do it at all?
>
> I've added a sample that is fixed by this patch. It can happen that a
> plan is adjusted and saved; later it's invalidated, and when
> revalidation happens we miss the modifications added by
> check_sql_fn_retval(). Another interesting issue is that a cached
> plan is checked for validity before the function starts executing.
> So once we discard cached plans, the plan for the second query in a
> function is not invalidated immediately, only on its second
> execution. And after the plan is rebuilt, it becomes wrong.

After writing some comments and looking at it once again, I've found that one assumption is wrong: a function can be discarded from the cache during its own execution. For example,

create or replace function recurse(anyelement) returns record as $$
begin
    if ($1 > 0) then
        if (mod($1, 2) = 0) then
            execute format($query$
                create or replace function sql_recurse(anyelement) returns record as $q$
                    select recurse($1);
                    select ($1,2);
                $q$ language sql;
            $query$);
        end if;
        return sql_recurse($1 - 1);
    else
        return row($1, 1::int);
    end if;
end;
$$ language plpgsql;

create or replace function sql_recurse(anyelement) returns record as $$
    select recurse($1);
    select ($1,2);
$$ language sql;

create table t1 (i int);
insert into t1 values(2),(3),(4);
select sql_recurse(i) from t1;

leads to dropping cached plans while they are still needed. I'll look into how best to handle this.

One more interesting note: since we don't use raw_parse_tree, it seems we don't need plansource->parserSetup and plansource->parserSetupArg, so we can avoid caching the complete parse info.

--
Best regards,
Alexander Pyhalov,
Postgres Professional
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes:
> After writing some comments, looking at it once again, I've found that
> one assumption is wrong - function can be discarded from cache during
> its execution.

Yeah.  You really need a use-count on the shared cache object.

I've been working on pulling out plpgsql's code that manages its
function cache into a new module that can be shared with functions.c.
That code is quite battle-tested and I don't see a good argument for
reinventing the logic.  It's not fit to share yet, but I hope to have
something in a day or so.

> Also one interesting note is as we don't use raw_parse_tree, it seems we
> don't need plansource->parserSetup and plansource->parserSetupArg. It
> seems we can avoid caching complete parse info.

Well, you do need those when dealing with an old-style function
(raw parse trees).

			regards, tom lane
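The use-count discipline Tom describes can be sketched as follows: an entry pinned by an in-progress execution must not be freed by invalidation; instead, invalidation marks it dead and the last release frees it. This is a minimal self-contained illustration with hypothetical names, not the actual funccache.c API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical shared cache entry carrying a use count. */
typedef struct CacheEntry
{
    int  use_count;  /* number of in-progress executions pinning us */
    bool dead;       /* invalidated while still in use? */
} CacheEntry;

/* Pin the entry for the duration of one function execution. */
void
entry_acquire(CacheEntry *e)
{
    e->use_count++;
}

/* Invalidation: free immediately only if nobody has it pinned. */
void
entry_invalidate(CacheEntry **ep)
{
    CacheEntry *e = *ep;

    if (e->use_count > 0)
        e->dead = true;     /* defer the free to the last release */
    else
    {
        free(e);
        *ep = NULL;
    }
}

/* Unpin; returns true if this release actually freed the entry. */
bool
entry_release(CacheEntry **ep)
{
    CacheEntry *e = *ep;

    assert(e->use_count > 0);
    e->use_count--;
    if (e->use_count == 0 && e->dead)
    {
        free(e);
        *ep = NULL;
        return true;
    }
    return false;
}
```

With this shape, the recursive CREATE OR REPLACE example above merely marks the entry dead while it is executing; the cached plans stay alive until the recursion unwinds and the final entry_release() runs.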
I spent some time reading and reworking this code, and have arrived
at a patch set that I'm pretty happy with.  I'm not sure it's quite
committable but it's close:

0001: Same as in v8, extend plancache.c to allow caching starting
from a Query.

0002: Add a post-rewrite callback hook in plancache.c.  There is no
chance of getting check_sql_fn_retval to work correctly without that
in the raw-parsetree case: we can't apply the transformation on what
goes into the plancache, and we have to be able to re-apply it if
plancache regenerates the plan from the raw parse trees.

0003: As I mentioned yesterday, I think we should use the same cache
management logic that plpgsql does, and the best way for that to
happen is to share code.  So this invents a new module "funccache"
that extracts the logic plpgsql was using.  I did have to add one
feature that plpgsql doesn't have, to allow part of the cache key to
be the output rowtype.  Otherwise cases like this one from
rangefuncs.sql won't work:

select * from array_to_set(array['one', 'two']) as t(f1 int,f2 text);
select * from array_to_set(array['one', 'two']) as t(f1 numeric(4,2),f2 text);

These have to have separate cached plans because check_sql_fn_retval
will modify the plans differently.

0004: Restructure check_sql_fn_retval so that we can use it in the
callback hook envisioned in 0002.  There's an edge-case semantics
change as explained in the commit message; perhaps that will be
controversial?

0005: This extracts the RLS test case you had and commits it with the
old non-failing behavior, just so that we can see that the new code
does it differently.  (I didn't adopt your test from rules.sql,
because AFAICS it works the same with or without this patch set.
What was the point of that one again?)

0006: The guts of the patch.  I couldn't break this down any further.

One big difference from what you had is that there is only one path
of control: we always use the plan cache.  The hack you had to not
use it for triggers was only needed because you didn't include the
right cache key items to distinguish different trigger usages, but
the code coming from plpgsql has that right.

Also, the memory management is done a bit differently.  The
"fcontext" memory context holding the SQLFunctionCache struct is now
discarded at the end of each execution of the SQL function, which
considerably alleviates worries about leaking memory there.  I
invented a small "SQLFunctionLink" struct that is what fn_extra
points at, and it survives as long as the FmgrInfo does, so that's
what saves us from redoing hash key computations in most cases.

I also moved some code around -- notably, init_execution_state now
builds plans for only one CachedPlanState at a time, and we don't try
to construct the output JunkFilter until we plan the last
CachedPlanState.  Because of this change, there's no longer a List of
execution_state sublists, but only a sublist matching the current
CachedPlan.  We track where we are in the function using an integer
counter of the CachedPlanStates instead.

There's more stuff that could be done, but I feel that all of this
could be left for later:

* I really wanted to do what I mentioned upthread and change things
so we don't even parse the later queries until we've executed the
ones before that.  However that seems to be a bit of a mess to make
happen, and the patch is large/complex enough already.

* The execution_state sublist business seems quite vestigial now: we
could probably drop it in favor of one set of those fields and a
counter.  But that would involve a lot of notational churn, much of
it in code that doesn't need changes otherwise, and in the end it
would not buy much except removing a tiny amount of transient space
usage.  Maybe some other day.

* There's some duplication of effort between cache key computation
and the callers, particularly that for SQL functions we end up doing
get_call_result_type() twice during the initial call.  This could
probably be fixed with more refactoring, but it's not really
expensive enough to get too excited about.

I redid Pavel's tests from [1], and got these results in non-assert
builds:

        master          v10 patch
fx:     50077.251 ms    21221.104 ms
fx2:     8578.874 ms     8576.935 ms
fx3:    66331.186 ms    21173.215 ms
fx4:    56233.003 ms    22757.320 ms
fx5:    13248.177 ms    12370.693 ms
fx6:    13103.840 ms    12245.266 ms

We get substantial wins on all of fx, fx3, fx4.  fx2 is the case that
gets inlined and never reaches functions.c, so the lack of change
there is expected.  What I found odd is that I saw a small speedup
(~6%) on fx5 and fx6; those functions are in plpgsql so they really
shouldn't change either.  The only thing I can think of is that I
made the hash key computation a tiny bit faster by omitting unused
argtypes[] entries.  That does avoid hashing several hundred bytes
typically, but it's hard to believe it'd amount to any visible
savings overall.

Anyway, PFA v10.

			regards, tom lane

[1] https://www.postgresql.org/message-id/CAFj8pRDWDeF2cC%2BpCjLHJno7KnK5kdtjYN-f933RHS7UneArFw%40mail.gmail.com

From 5d2f5e092ef326f72100d6d47ba1b5cb207e62ba Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat, 29 Mar 2025 14:50:45 -0400
Subject: [PATCH v10 1/6] Support cached plans that work from a parse-analyzed
 Query.

Up to now, plancache.c dealt only with raw parse trees as the
starting point for a cached plan.  However, we'd like to use this
infrastructure for SQL functions, and in the case of a new-style SQL
function we'll only have the stored querytree, which corresponds to
an analyzed-but-not-rewritten Query.

Fortunately, we can make plancache.c handle that scenario with only
minor modifications; the biggest change is in RevalidateCachedQuery()
where we will need to apply only pg_rewrite_query not
pg_analyze_and_rewrite.

This patch just installs the infrastructure; there's no caller as yet.
Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop --- src/backend/parser/analyze.c | 39 +++++++ src/backend/utils/cache/plancache.c | 158 +++++++++++++++++++++------- src/include/parser/analyze.h | 1 + src/include/utils/plancache.h | 23 +++- 4 files changed, 179 insertions(+), 42 deletions(-) diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c index 76f58b3aca3..1f4d6adda52 100644 --- a/src/backend/parser/analyze.c +++ b/src/backend/parser/analyze.c @@ -591,6 +591,45 @@ analyze_requires_snapshot(RawStmt *parseTree) return stmt_requires_parse_analysis(parseTree); } +/* + * query_requires_rewrite_plan() + * Returns true if rewriting or planning is non-trivial for this Query. + * + * This is much like stmt_requires_parse_analysis(), but applies one step + * further down the pipeline. + * + * We do not provide an equivalent of analyze_requires_snapshot(): callers + * can assume that any rewriting or planning activity needs a snapshot. + */ +bool +query_requires_rewrite_plan(Query *query) +{ + bool result; + + if (query->commandType != CMD_UTILITY) + { + /* All optimizable statements require rewriting/planning */ + result = true; + } + else + { + /* This list should match stmt_requires_parse_analysis() */ + switch (nodeTag(query->utilityStmt)) + { + case T_DeclareCursorStmt: + case T_ExplainStmt: + case T_CreateTableAsStmt: + case T_CallStmt: + result = true; + break; + default: + result = false; + break; + } + } + return result; +} + /* * transformDeleteStmt - * transforms a Delete Statement diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c index 6c2979d5c82..5983927a4c2 100644 --- a/src/backend/utils/cache/plancache.c +++ b/src/backend/utils/cache/plancache.c @@ -14,7 +14,7 @@ * Cache invalidation is driven off sinval events. 
Any CachedPlanSource * that matches the event is marked invalid, as is its generic CachedPlan * if it has one. When (and if) the next demand for a cached plan occurs, - * parse analysis and rewrite is repeated to build a new valid query tree, + * parse analysis and/or rewrite is repeated to build a new valid query tree, * and then planning is performed as normal. We also force re-analysis and * re-planning if the active search_path is different from the previous time * or, if RLS is involved, if the user changes or the RLS environment changes. @@ -63,6 +63,7 @@ #include "nodes/nodeFuncs.h" #include "optimizer/optimizer.h" #include "parser/analyze.h" +#include "rewrite/rewriteHandler.h" #include "storage/lmgr.h" #include "tcop/pquery.h" #include "tcop/utility.h" @@ -74,18 +75,6 @@ #include "utils/syscache.h" -/* - * We must skip "overhead" operations that involve database access when the - * cached plan's subject statement is a transaction control command or one - * that requires a snapshot not to be set yet (such as SET or LOCK). More - * generally, statements that do not require parse analysis/rewrite/plan - * activity never need to be revalidated, so we can treat them all like that. - * For the convenience of postgres.c, treat empty statements that way too. - */ -#define StmtPlanRequiresRevalidation(plansource) \ - ((plansource)->raw_parse_tree != NULL && \ - stmt_requires_parse_analysis((plansource)->raw_parse_tree)) - /* * This is the head of the backend's list of "saved" CachedPlanSources (i.e., * those that are in long-lived storage and are examined for sinval events). 
@@ -100,6 +89,8 @@ static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list); static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_list); static void ReleaseGenericPlan(CachedPlanSource *plansource); +static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource); +static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource); static List *RevalidateCachedQuery(CachedPlanSource *plansource, QueryEnvironment *queryEnv, bool release_generic); @@ -166,7 +157,7 @@ InitPlanCache(void) } /* - * CreateCachedPlan: initially create a plan cache entry. + * CreateCachedPlan: initially create a plan cache entry for a raw parse tree. * * Creation of a cached plan is divided into two steps, CreateCachedPlan and * CompleteCachedPlan. CreateCachedPlan should be called after running the @@ -220,6 +211,7 @@ CreateCachedPlan(RawStmt *raw_parse_tree, plansource = (CachedPlanSource *) palloc0(sizeof(CachedPlanSource)); plansource->magic = CACHEDPLANSOURCE_MAGIC; plansource->raw_parse_tree = copyObject(raw_parse_tree); + plansource->analyzed_parse_tree = NULL; plansource->query_string = pstrdup(query_string); MemoryContextSetIdentifier(source_context, plansource->query_string); plansource->commandTag = commandTag; @@ -255,6 +247,34 @@ CreateCachedPlan(RawStmt *raw_parse_tree, return plansource; } +/* + * CreateCachedPlanForQuery: initially create a plan cache entry for a Query. + * + * This is used in the same way as CreateCachedPlan, except that the source + * query has already been through parse analysis, and the plancache will never + * try to re-do that step. + * + * Currently this is used only for new-style SQL functions, where we have a + * Query from the function's prosqlbody, but no source text. The query_string + * is typically empty, but is required anyway. 
+ */ +CachedPlanSource * +CreateCachedPlanForQuery(Query *analyzed_parse_tree, + const char *query_string, + CommandTag commandTag) +{ + CachedPlanSource *plansource; + MemoryContext oldcxt; + + /* Rather than duplicating CreateCachedPlan, just do this: */ + plansource = CreateCachedPlan(NULL, query_string, commandTag); + oldcxt = MemoryContextSwitchTo(plansource->context); + plansource->analyzed_parse_tree = copyObject(analyzed_parse_tree); + MemoryContextSwitchTo(oldcxt); + + return plansource; +} + /* * CreateOneShotCachedPlan: initially create a one-shot plan cache entry. * @@ -289,6 +309,7 @@ CreateOneShotCachedPlan(RawStmt *raw_parse_tree, plansource = (CachedPlanSource *) palloc0(sizeof(CachedPlanSource)); plansource->magic = CACHEDPLANSOURCE_MAGIC; plansource->raw_parse_tree = raw_parse_tree; + plansource->analyzed_parse_tree = NULL; plansource->query_string = query_string; plansource->commandTag = commandTag; plansource->param_types = NULL; @@ -566,6 +587,42 @@ ReleaseGenericPlan(CachedPlanSource *plansource) } } +/* + * We must skip "overhead" operations that involve database access when the + * cached plan's subject statement is a transaction control command or one + * that requires a snapshot not to be set yet (such as SET or LOCK). More + * generally, statements that do not require parse analysis/rewrite/plan + * activity never need to be revalidated, so we can treat them all like that. + * For the convenience of postgres.c, treat empty statements that way too. + */ +static bool +StmtPlanRequiresRevalidation(CachedPlanSource *plansource) +{ + if (plansource->raw_parse_tree != NULL) + return stmt_requires_parse_analysis(plansource->raw_parse_tree); + else if (plansource->analyzed_parse_tree != NULL) + return query_requires_rewrite_plan(plansource->analyzed_parse_tree); + /* empty query never needs revalidation */ + return false; +} + +/* + * Determine if creating a plan for this CachedPlanSource requires a snapshot. 
+ * In fact this function matches StmtPlanRequiresRevalidation(), but we want + * to preserve the distinction between stmt_requires_parse_analysis() and + * analyze_requires_snapshot(). + */ +static bool +BuildingPlanRequiresSnapshot(CachedPlanSource *plansource) +{ + if (plansource->raw_parse_tree != NULL) + return analyze_requires_snapshot(plansource->raw_parse_tree); + else if (plansource->analyzed_parse_tree != NULL) + return query_requires_rewrite_plan(plansource->analyzed_parse_tree); + /* empty query never needs a snapshot */ + return false; +} + /* * RevalidateCachedQuery: ensure validity of analyzed-and-rewritten query tree. * @@ -592,7 +649,6 @@ RevalidateCachedQuery(CachedPlanSource *plansource, bool release_generic) { bool snapshot_set; - RawStmt *rawtree; List *tlist; /* transient query-tree list */ List *qlist; /* permanent query-tree list */ TupleDesc resultDesc; @@ -615,7 +671,10 @@ RevalidateCachedQuery(CachedPlanSource *plansource, /* * If the query is currently valid, we should have a saved search_path --- * check to see if that matches the current environment. If not, we want - * to force replan. + * to force replan. (We could almost ignore this consideration when + * working from an analyzed parse tree; but there are scenarios where + * planning can have search_path-dependent results, for example if it + * inlines an old-style SQL function.) */ if (plansource->is_valid) { @@ -662,9 +721,9 @@ RevalidateCachedQuery(CachedPlanSource *plansource, } /* - * Discard the no-longer-useful query tree. (Note: we don't want to do - * this any earlier, else we'd not have been able to release locks - * correctly in the race condition case.) + * Discard the no-longer-useful rewritten query tree. (Note: we don't + * want to do this any earlier, else we'd not have been able to release + * locks correctly in the race condition case.) 
*/ plansource->is_valid = false; plansource->query_list = NIL; @@ -711,25 +770,48 @@ RevalidateCachedQuery(CachedPlanSource *plansource, } /* - * Run parse analysis and rule rewriting. The parser tends to scribble on - * its input, so we must copy the raw parse tree to prevent corruption of - * the cache. + * Run parse analysis (if needed) and rule rewriting. */ - rawtree = copyObject(plansource->raw_parse_tree); - if (rawtree == NULL) - tlist = NIL; - else if (plansource->parserSetup != NULL) - tlist = pg_analyze_and_rewrite_withcb(rawtree, - plansource->query_string, - plansource->parserSetup, - plansource->parserSetupArg, - queryEnv); + if (plansource->raw_parse_tree != NULL) + { + /* Source is raw parse tree */ + RawStmt *rawtree; + + /* + * The parser tends to scribble on its input, so we must copy the raw + * parse tree to prevent corruption of the cache. + */ + rawtree = copyObject(plansource->raw_parse_tree); + if (plansource->parserSetup != NULL) + tlist = pg_analyze_and_rewrite_withcb(rawtree, + plansource->query_string, + plansource->parserSetup, + plansource->parserSetupArg, + queryEnv); + else + tlist = pg_analyze_and_rewrite_fixedparams(rawtree, + plansource->query_string, + plansource->param_types, + plansource->num_params, + queryEnv); + } + else if (plansource->analyzed_parse_tree != NULL) + { + /* Source is pre-analyzed query, so we only need to rewrite */ + Query *analyzed_tree; + + /* The rewriter scribbles on its input, too, so copy */ + analyzed_tree = copyObject(plansource->analyzed_parse_tree); + /* Acquire locks needed before rewriting ... */ + AcquireRewriteLocks(analyzed_tree, true, false); + /* ... 
and do it */ + tlist = pg_rewrite_query(analyzed_tree); + } else - tlist = pg_analyze_and_rewrite_fixedparams(rawtree, - plansource->query_string, - plansource->param_types, - plansource->num_params, - queryEnv); + { + /* Empty query, nothing to do */ + tlist = NIL; + } /* Release snapshot if we got one */ if (snapshot_set) @@ -963,8 +1045,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist, */ snapshot_set = false; if (!ActiveSnapshotSet() && - plansource->raw_parse_tree && - analyze_requires_snapshot(plansource->raw_parse_tree)) + BuildingPlanRequiresSnapshot(plansource)) { PushActiveSnapshot(GetTransactionSnapshot()); snapshot_set = true; @@ -1703,6 +1784,7 @@ CopyCachedPlan(CachedPlanSource *plansource) newsource = (CachedPlanSource *) palloc0(sizeof(CachedPlanSource)); newsource->magic = CACHEDPLANSOURCE_MAGIC; newsource->raw_parse_tree = copyObject(plansource->raw_parse_tree); + newsource->analyzed_parse_tree = copyObject(plansource->analyzed_parse_tree); newsource->query_string = pstrdup(plansource->query_string); MemoryContextSetIdentifier(source_context, newsource->query_string); newsource->commandTag = plansource->commandTag; diff --git a/src/include/parser/analyze.h b/src/include/parser/analyze.h index f1bd18c49f2..f29ed03b476 100644 --- a/src/include/parser/analyze.h +++ b/src/include/parser/analyze.h @@ -52,6 +52,7 @@ extern Query *transformStmt(ParseState *pstate, Node *parseTree); extern bool stmt_requires_parse_analysis(RawStmt *parseTree); extern bool analyze_requires_snapshot(RawStmt *parseTree); +extern bool query_requires_rewrite_plan(Query *query); extern const char *LCS_asString(LockClauseStrength strength); extern void CheckSelectLocking(Query *qry, LockClauseStrength strength); diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h index 199cc323a28..5930fcb50f0 100644 --- a/src/include/utils/plancache.h +++ b/src/include/utils/plancache.h @@ -25,7 +25,8 @@ #include "utils/resowner.h" -/* Forward declaration, 
to avoid including parsenodes.h here */ +/* Forward declarations, to avoid including parsenodes.h here */ +struct Query; struct RawStmt; /* possible values for plan_cache_mode */ @@ -45,12 +46,22 @@ extern PGDLLIMPORT int plan_cache_mode; /* * CachedPlanSource (which might better have been called CachedQuery) - * represents a SQL query that we expect to use multiple times. It stores - * the query source text, the raw parse tree, and the analyzed-and-rewritten + * represents a SQL query that we expect to use multiple times. It stores the + * query source text, the source parse tree, and the analyzed-and-rewritten * query tree, as well as adjunct data. Cache invalidation can happen as a * result of DDL affecting objects used by the query. In that case we discard * the analyzed-and-rewritten query tree, and rebuild it when next needed. * + * There are two ways in which the source query can be represented: either + * as a raw parse tree, or as an analyzed-but-not-rewritten parse tree. + * In the latter case we expect that cache invalidation need not affect + * the parse-analysis results, only the rewriting and planning steps. + * Only one of raw_parse_tree and analyzed_parse_tree can be non-NULL. + * (If both are NULL, the CachedPlanSource represents an empty query.) + * Note that query_string is typically just an empty string when the + * source query is an analyzed parse tree; also, param_types, num_params, + * parserSetup, and parserSetupArg will not be used. + * * An actual execution plan, represented by CachedPlan, is derived from the * CachedPlanSource when we need to execute the query. The plan could be * either generic (usable with any set of plan parameters) or custom (for a @@ -78,7 +89,7 @@ extern PGDLLIMPORT int plan_cache_mode; * though it may be useful if the CachedPlan can be discarded early.) 
* * A CachedPlanSource has two associated memory contexts: one that holds the - * struct itself, the query source text and the raw parse tree, and another + * struct itself, the query source text and the source parse tree, and another * context that holds the rewritten query tree and associated data. This * allows the query tree to be discarded easily when it is invalidated. * @@ -94,6 +105,7 @@ typedef struct CachedPlanSource { int magic; /* should equal CACHEDPLANSOURCE_MAGIC */ struct RawStmt *raw_parse_tree; /* output of raw_parser(), or NULL */ + struct Query *analyzed_parse_tree; /* analyzed parse tree, or NULL */ const char *query_string; /* source text of query */ CommandTag commandTag; /* command tag for query */ Oid *param_types; /* array of parameter type OIDs, or NULL */ @@ -196,6 +208,9 @@ extern void ReleaseAllPlanCacheRefsInOwner(ResourceOwner owner); extern CachedPlanSource *CreateCachedPlan(struct RawStmt *raw_parse_tree, const char *query_string, CommandTag commandTag); +extern CachedPlanSource *CreateCachedPlanForQuery(struct Query *analyzed_parse_tree, + const char *query_string, + CommandTag commandTag); extern CachedPlanSource *CreateOneShotCachedPlan(struct RawStmt *raw_parse_tree, const char *query_string, CommandTag commandTag); -- 2.43.5 From 0394a86c27cf5e82ce0574f951392428f80fe4b7 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat, 29 Mar 2025 14:51:37 -0400 Subject: [PATCH v10 2/6] Provide a post-rewrite callback hook in plancache.c. SQL-language functions sometimes want to modify the targetlist of the query that returns their result. If they're to use the plan cache, it needs to be possible to do that over again when a replan occurs. Invent a callback hook to make that happen. I chose to provide a separate function SetPostRewriteHook to install such hooks. An alternative API could be to add two more arguments to CompleteCachedPlan. 
I didn't do so because I felt that few callers will want this, but there's a case that that way would be cleaner. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop --- src/backend/utils/cache/plancache.c | 33 +++++++++++++++++++++++++++++ src/include/utils/plancache.h | 8 +++++++ src/tools/pgindent/typedefs.list | 1 + 3 files changed, 42 insertions(+) diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c index 5983927a4c2..3b681647060 100644 --- a/src/backend/utils/cache/plancache.c +++ b/src/backend/utils/cache/plancache.c @@ -219,6 +219,8 @@ CreateCachedPlan(RawStmt *raw_parse_tree, plansource->num_params = 0; plansource->parserSetup = NULL; plansource->parserSetupArg = NULL; + plansource->postRewrite = NULL; + plansource->postRewriteArg = NULL; plansource->cursor_options = 0; plansource->fixed_result = false; plansource->resultDesc = NULL; @@ -316,6 +318,8 @@ CreateOneShotCachedPlan(RawStmt *raw_parse_tree, plansource->num_params = 0; plansource->parserSetup = NULL; plansource->parserSetupArg = NULL; + plansource->postRewrite = NULL; + plansource->postRewriteArg = NULL; plansource->cursor_options = 0; plansource->fixed_result = false; plansource->resultDesc = NULL; @@ -485,6 +489,29 @@ CompleteCachedPlan(CachedPlanSource *plansource, plansource->is_valid = true; } +/* + * SetPostRewriteHook: set a hook to modify post-rewrite query trees + * + * Some callers have a need to modify the query trees between rewriting and + * planning. In the initial call to CompleteCachedPlan, it's assumed such + * work was already done on the querytree_list. However, if we're forced + * to replan, it will need to be done over. The caller can set this hook + * to provide code to make that happen. + * + * postRewriteArg is just passed verbatim to the hook. As with parserSetupArg, + * it is caller's responsibility that the referenced data remains + * valid for as long as the CachedPlanSource exists. 
+ */ +void +SetPostRewriteHook(CachedPlanSource *plansource, + PostRewriteHook postRewrite, + void *postRewriteArg) +{ + Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC); + plansource->postRewrite = postRewrite; + plansource->postRewriteArg = postRewriteArg; +} + /* * SaveCachedPlan: save a cached plan permanently * @@ -813,6 +840,10 @@ RevalidateCachedQuery(CachedPlanSource *plansource, tlist = NIL; } + /* Apply post-rewrite callback if there is one */ + if (plansource->postRewrite != NULL) + plansource->postRewrite(tlist, plansource->postRewriteArg); + /* Release snapshot if we got one */ if (snapshot_set) PopActiveSnapshot(); @@ -1800,6 +1831,8 @@ CopyCachedPlan(CachedPlanSource *plansource) newsource->num_params = plansource->num_params; newsource->parserSetup = plansource->parserSetup; newsource->parserSetupArg = plansource->parserSetupArg; + newsource->postRewrite = plansource->postRewrite; + newsource->postRewriteArg = plansource->postRewriteArg; newsource->cursor_options = plansource->cursor_options; newsource->fixed_result = plansource->fixed_result; if (plansource->resultDesc) diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h index 5930fcb50f0..07ec5318db7 100644 --- a/src/include/utils/plancache.h +++ b/src/include/utils/plancache.h @@ -40,6 +40,9 @@ typedef enum /* GUC parameter */ extern PGDLLIMPORT int plan_cache_mode; +/* Optional callback to editorialize on rewritten parse trees */ +typedef void (*PostRewriteHook) (List *querytree_list, void *arg); + #define CACHEDPLANSOURCE_MAGIC 195726186 #define CACHEDPLAN_MAGIC 953717834 #define CACHEDEXPR_MAGIC 838275847 @@ -112,6 +115,8 @@ typedef struct CachedPlanSource int num_params; /* length of param_types array */ ParserSetupHook parserSetup; /* alternative parameter spec method */ void *parserSetupArg; + PostRewriteHook postRewrite; /* see SetPostRewriteHook */ + void *postRewriteArg; int cursor_options; /* cursor options used for planning */ bool fixed_result; /* disallow 
change in result tupdesc? */ TupleDesc resultDesc; /* result type; NULL = doesn't return tuples */ @@ -223,6 +228,9 @@ extern void CompleteCachedPlan(CachedPlanSource *plansource, void *parserSetupArg, int cursor_options, bool fixed_result); +extern void SetPostRewriteHook(CachedPlanSource *plansource, + PostRewriteHook postRewrite, + void *postRewriteArg); extern void SaveCachedPlan(CachedPlanSource *plansource); extern void DropCachedPlan(CachedPlanSource *plansource); diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index b66cecd8799..ff75a508876 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -2266,6 +2266,7 @@ PortalHashEnt PortalStatus PortalStrategy PostParseColumnRefHook +PostRewriteHook PostgresPollingStatusType PostingItem PreParseColumnRefHook -- 2.43.5 From 8911e13329809ce2c44aecd82a460e54ad164d25 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat, 29 Mar 2025 16:56:23 -0400 Subject: [PATCH v10 3/6] Factor out plpgsql's management of its function cache. SQL-language functions need precisely this same functionality to manage a long-lived cache of functions. Rather than duplicating or reinventing that code, let's split it out into a new module funccache.c so that it is available for any language that wants to use it. This is mostly an exercise in moving and renaming code, and should not change any behavior. I have added one feature that plpgsql doesn't use but SQL functions will need: the cache lookup key can include the output tuple descriptor when the function returns composite. 
Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop --- src/backend/utils/cache/Makefile | 1 + src/backend/utils/cache/funccache.c | 612 ++++++++++++++++++++++++++++ src/backend/utils/cache/meson.build | 1 + src/include/utils/funccache.h | 134 ++++++ src/pl/plpgsql/src/pl_comp.c | 433 ++------------------ src/pl/plpgsql/src/pl_funcs.c | 9 +- src/pl/plpgsql/src/pl_handler.c | 15 +- src/pl/plpgsql/src/plpgsql.h | 45 +- src/tools/pgindent/typedefs.list | 5 + 9 files changed, 811 insertions(+), 444 deletions(-) create mode 100644 src/backend/utils/cache/funccache.c create mode 100644 src/include/utils/funccache.h diff --git a/src/backend/utils/cache/Makefile b/src/backend/utils/cache/Makefile index 5105018cb79..77b3e1a037b 100644 --- a/src/backend/utils/cache/Makefile +++ b/src/backend/utils/cache/Makefile @@ -16,6 +16,7 @@ OBJS = \ attoptcache.o \ catcache.o \ evtcache.o \ + funccache.o \ inval.o \ lsyscache.o \ partcache.o \ diff --git a/src/backend/utils/cache/funccache.c b/src/backend/utils/cache/funccache.c new file mode 100644 index 00000000000..203d17f2459 --- /dev/null +++ b/src/backend/utils/cache/funccache.c @@ -0,0 +1,612 @@ +/*------------------------------------------------------------------------- + * + * funccache.c + * Function cache management. + * + * funccache.c manages a cache of function execution data. The cache + * is used by SQL-language and PL/pgSQL functions, and could be used by + * other function languages. Each cache entry is specific to the execution + * of a particular function (identified by OID) with specific input data + * types; so a polymorphic function could have many associated cache entries. + * Trigger functions similarly have a cache entry per trigger. These rules + * allow the cached data to be specific to the particular data types the + * function call will be dealing with. 
+ * + * + * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * IDENTIFICATION + * src/backend/utils/cache/funccache.c + * + *------------------------------------------------------------------------- + */ +#include "postgres.h" + +#include "commands/event_trigger.h" +#include "commands/trigger.h" +#include "common/hashfn.h" +#include "funcapi.h" +#include "catalog/pg_proc.h" +#include "utils/funccache.h" +#include "utils/hsearch.h" +#include "utils/syscache.h" + + +/* + * Hash table for cached functions + */ +static HTAB *cfunc_hashtable = NULL; + +typedef struct CachedFunctionHashEntry +{ + CachedFunctionHashKey key; /* hash key, must be first */ + CachedFunction *function; /* points to data of language-specific size */ +} CachedFunctionHashEntry; + +#define FUNCS_PER_USER 128 /* initial table size */ + +static uint32 cfunc_hash(const void *key, Size keysize); +static int cfunc_match(const void *key1, const void *key2, Size keysize); + + +/* + * Initialize the hash table on first use. + * + * The hash table will be in TopMemoryContext regardless of caller's context. + */ +static void +cfunc_hashtable_init(void) +{ + HASHCTL ctl; + + /* don't allow double-initialization */ + Assert(cfunc_hashtable == NULL); + + ctl.keysize = sizeof(CachedFunctionHashKey); + ctl.entrysize = sizeof(CachedFunctionHashEntry); + ctl.hash = cfunc_hash; + ctl.match = cfunc_match; + cfunc_hashtable = hash_create("Cached function hash", + FUNCS_PER_USER, + &ctl, + HASH_ELEM | HASH_FUNCTION | HASH_COMPARE); +} + +/* + * cfunc_hash: hash function for cfunc hash table + * + * We need special hash and match functions to deal with the optional + * presence of a TupleDesc in the hash keys. As long as we have to do + * that, we might as well also be smart about not comparing unused + * elements of the argtypes arrays. 
+ */ +static uint32 +cfunc_hash(const void *key, Size keysize) +{ + const CachedFunctionHashKey *k = (const CachedFunctionHashKey *) key; + uint32 h; + + Assert(keysize == sizeof(CachedFunctionHashKey)); + /* Hash all the fixed fields except callResultType */ + h = DatumGetUInt32(hash_any((const unsigned char *) k, + offsetof(CachedFunctionHashKey, callResultType))); + /* Incorporate input argument types */ + if (k->nargs > 0) + h = hash_combine(h, + DatumGetUInt32(hash_any((const unsigned char *) k->argtypes, + k->nargs * sizeof(Oid)))); + /* Incorporate callResultType if present */ + if (k->callResultType) + h = hash_combine(h, hashRowType(k->callResultType)); + return h; +} + +/* + * cfunc_match: match function to use with cfunc_hash + */ +static int +cfunc_match(const void *key1, const void *key2, Size keysize) +{ + const CachedFunctionHashKey *k1 = (const CachedFunctionHashKey *) key1; + const CachedFunctionHashKey *k2 = (const CachedFunctionHashKey *) key2; + + Assert(keysize == sizeof(CachedFunctionHashKey)); + /* Compare all the fixed fields except callResultType */ + if (memcmp(k1, k2, offsetof(CachedFunctionHashKey, callResultType)) != 0) + return 1; /* not equal */ + /* Compare input argument types (we just verified that nargs matches) */ + if (k1->nargs > 0 && + memcmp(k1->argtypes, k2->argtypes, k1->nargs * sizeof(Oid)) != 0) + return 1; /* not equal */ + /* Compare callResultType */ + if (k1->callResultType) + { + if (k2->callResultType) + { + if (!equalRowTypes(k1->callResultType, k2->callResultType)) + return 1; /* not equal */ + } + else + return 1; /* not equal */ + } + else + { + if (k2->callResultType) + return 1; /* not equal */ + } + return 0; /* equal */ +} + +/* + * Look up the CachedFunction for the given hash key. + * Returns NULL if not present. 
+ */ +static CachedFunction * +cfunc_hashtable_lookup(CachedFunctionHashKey *func_key) +{ + CachedFunctionHashEntry *hentry; + + if (cfunc_hashtable == NULL) + return NULL; + + hentry = (CachedFunctionHashEntry *) hash_search(cfunc_hashtable, + func_key, + HASH_FIND, + NULL); + if (hentry) + return hentry->function; + else + return NULL; +} + +/* + * Insert a hash table entry. + */ +static void +cfunc_hashtable_insert(CachedFunction *function, + CachedFunctionHashKey *func_key) +{ + CachedFunctionHashEntry *hentry; + bool found; + + if (cfunc_hashtable == NULL) + cfunc_hashtable_init(); + + hentry = (CachedFunctionHashEntry *) hash_search(cfunc_hashtable, + func_key, + HASH_ENTER, + &found); + if (found) + elog(WARNING, "trying to insert a function that already exists"); + + /* + * If there's a callResultType, copy it into TopMemoryContext. If we're + * unlucky enough for that to fail, leave the entry with null + * callResultType, which will probably never match anything. + */ + if (func_key->callResultType) + { + MemoryContext oldcontext = MemoryContextSwitchTo(TopMemoryContext); + + hentry->key.callResultType = NULL; + hentry->key.callResultType = CreateTupleDescCopy(func_key->callResultType); + MemoryContextSwitchTo(oldcontext); + } + + hentry->function = function; + + /* Set back-link from function to hashtable key */ + function->fn_hashkey = &hentry->key; +} + +/* + * Delete a hash table entry. + */ +static void +cfunc_hashtable_delete(CachedFunction *function) +{ + CachedFunctionHashEntry *hentry; + TupleDesc tupdesc; + + /* do nothing if not in table */ + if (function->fn_hashkey == NULL) + return; + + /* + * We need to free the callResultType if present, which is slightly tricky + * because it has to be valid during the hashtable search. Fortunately, + * because we have the hashkey back-link, we can grab that pointer before + * deleting the hashtable entry. 
+ */ + tupdesc = function->fn_hashkey->callResultType; + + hentry = (CachedFunctionHashEntry *) hash_search(cfunc_hashtable, + function->fn_hashkey, + HASH_REMOVE, + NULL); + if (hentry == NULL) + elog(WARNING, "trying to delete function that does not exist"); + + /* Remove back link, which no longer points to allocated storage */ + function->fn_hashkey = NULL; + + /* Release the callResultType if present */ + if (tupdesc) + FreeTupleDesc(tupdesc); +} + +/* + * Compute the hashkey for a given function invocation + * + * The hashkey is returned into the caller-provided storage at *hashkey. + * Note however that if a callResultType is incorporated, we've not done + * anything about copying that. + */ +static void +compute_function_hashkey(FunctionCallInfo fcinfo, + Form_pg_proc procStruct, + CachedFunctionHashKey *hashkey, + Size cacheEntrySize, + bool includeResultType, + bool forValidator) +{ + /* Make sure pad bytes within fixed part of the struct are zero */ + memset(hashkey, 0, offsetof(CachedFunctionHashKey, argtypes)); + + /* get function OID */ + hashkey->funcOid = fcinfo->flinfo->fn_oid; + + /* get call context */ + hashkey->isTrigger = CALLED_AS_TRIGGER(fcinfo); + hashkey->isEventTrigger = CALLED_AS_EVENT_TRIGGER(fcinfo); + + /* record cacheEntrySize so multiple languages can share hash table */ + hashkey->cacheEntrySize = cacheEntrySize; + + /* + * If DML trigger, include trigger's OID in the hash, so that each trigger + * usage gets a different hash entry, allowing for e.g. different relation + * rowtypes or transition table names. In validation mode we do not know + * what relation or transition table names are intended to be used, so we + * leave trigOid zero; the hash entry built in this case will never be + * used for any actual calls. + * + * We don't currently need to distinguish different event trigger usages + * in the same way, since the special parameter variables don't vary in + * type in that case. 
+ */ + if (hashkey->isTrigger && !forValidator) + { + TriggerData *trigdata = (TriggerData *) fcinfo->context; + + hashkey->trigOid = trigdata->tg_trigger->tgoid; + } + + /* get input collation, if known */ + hashkey->inputCollation = fcinfo->fncollation; + + /* + * We include only input arguments in the hash key, since output argument + * types can be deduced from those, and it would require extra cycles to + * include the output arguments. But we have to resolve any polymorphic + * argument types to the real types for the call. + */ + if (procStruct->pronargs > 0) + { + hashkey->nargs = procStruct->pronargs; + memcpy(hashkey->argtypes, procStruct->proargtypes.values, + procStruct->pronargs * sizeof(Oid)); + cfunc_resolve_polymorphic_argtypes(procStruct->pronargs, + hashkey->argtypes, + NULL, /* all args are inputs */ + fcinfo->flinfo->fn_expr, + forValidator, + NameStr(procStruct->proname)); + } + + /* + * While regular OUT arguments are sufficiently represented by the + * resolved input arguments, a function returning composite has additional + * variability: ALTER TABLE/ALTER TYPE could affect what it returns. Also, + * a function returning RECORD may depend on a column definition list to + * determine its output rowtype. If the caller needs the exact result + * type to be part of the hash lookup key, we must run + * get_call_result_type() to find that out. + */ + if (includeResultType) + { + Oid resultTypeId; + TupleDesc tupdesc; + + switch (get_call_result_type(fcinfo, &resultTypeId, &tupdesc)) + { + case TYPEFUNC_COMPOSITE: + case TYPEFUNC_COMPOSITE_DOMAIN: + hashkey->callResultType = tupdesc; + break; + default: + /* scalar result, or indeterminate rowtype */ + break; + } + } +} + +/* + * This is the same as the standard resolve_polymorphic_argtypes() function, + * except that: + * 1. We go ahead and report the error if we can't resolve the types. + * 2. 
We treat RECORD-type input arguments (not output arguments) as if + * they were polymorphic, replacing their types with the actual input + * types if we can determine those. This allows us to create a separate + * function cache entry for each named composite type passed to such an + * argument. + * 3. In validation mode, we have no inputs to look at, so assume that + * polymorphic arguments are integer, integer-array or integer-range. + */ +void +cfunc_resolve_polymorphic_argtypes(int numargs, + Oid *argtypes, char *argmodes, + Node *call_expr, bool forValidator, + const char *proname) +{ + int i; + + if (!forValidator) + { + int inargno; + + /* normal case, pass to standard routine */ + if (!resolve_polymorphic_argtypes(numargs, argtypes, argmodes, + call_expr)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("could not determine actual argument " + "type for polymorphic function \"%s\"", + proname))); + /* also, treat RECORD inputs (but not outputs) as polymorphic */ + inargno = 0; + for (i = 0; i < numargs; i++) + { + char argmode = argmodes ? 
argmodes[i] : PROARGMODE_IN; + + if (argmode == PROARGMODE_OUT || argmode == PROARGMODE_TABLE) + continue; + if (argtypes[i] == RECORDOID || argtypes[i] == RECORDARRAYOID) + { + Oid resolvedtype = get_call_expr_argtype(call_expr, + inargno); + + if (OidIsValid(resolvedtype)) + argtypes[i] = resolvedtype; + } + inargno++; + } + } + else + { + /* special validation case (no need to do anything for RECORD) */ + for (i = 0; i < numargs; i++) + { + switch (argtypes[i]) + { + case ANYELEMENTOID: + case ANYNONARRAYOID: + case ANYENUMOID: /* XXX dubious */ + case ANYCOMPATIBLEOID: + case ANYCOMPATIBLENONARRAYOID: + argtypes[i] = INT4OID; + break; + case ANYARRAYOID: + case ANYCOMPATIBLEARRAYOID: + argtypes[i] = INT4ARRAYOID; + break; + case ANYRANGEOID: + case ANYCOMPATIBLERANGEOID: + argtypes[i] = INT4RANGEOID; + break; + case ANYMULTIRANGEOID: + argtypes[i] = INT4MULTIRANGEOID; + break; + default: + break; + } + } + } +} + +/* + * delete_function - clean up as much as possible of a stale function cache + * + * We can't release the CachedFunction struct itself, because of the + * possibility that there are fn_extra pointers to it. We can release + * the subsidiary storage, but only if there are no active evaluations + * in progress. Otherwise we'll just leak that storage. Since the + * case would only occur if a pg_proc update is detected during a nested + * recursive call on the function, a leak seems acceptable. + * + * Note that this can be called more than once if there are multiple fn_extra + * pointers to the same function cache. Hence be careful not to do things + * twice. 
+ */ +static void +delete_function(CachedFunction *func) +{ + /* remove function from hash table (might be done already) */ + cfunc_hashtable_delete(func); + + /* release the function's storage if safe and not done already */ + if (func->use_count == 0 && + func->dcallback != NULL) + { + func->dcallback(func); + func->dcallback = NULL; + } +} + +/* + * Compile a cached function, if no existing cache entry is suitable. + * + * fcinfo is the current call information. + * + * function should be NULL or the result of a previous call of + * cached_function_compile() for the same fcinfo. The caller will + * typically save the result in fcinfo->flinfo->fn_extra, or in a + * field of a struct pointed to by fn_extra, to re-use in later + * calls within the same query. + * + * ccallback and dcallback are function-language-specific callbacks to + * compile and delete a cached function entry. dcallback can be NULL + * if there's nothing for it to do. + * + * cacheEntrySize is the function-language-specific size of the cache entry + * (which embeds a CachedFunction struct and typically has many more fields + * after that). + * + * If includeResultType is true and the function returns composite, + * include the actual result descriptor in the cache lookup key. + * + * If forValidator is true, we're only compiling for validation purposes, + * and so some checks are skipped. + * + * Note: it's important for this to fall through quickly if the function + * has already been compiled. + * + * Note: this function leaves the "use_count" field as zero. The caller + * is expected to increment the use_count and decrement it when done with + * the cache entry. 
+ */ +CachedFunction * +cached_function_compile(FunctionCallInfo fcinfo, + CachedFunction *function, + CachedFunctionCompileCallback ccallback, + CachedFunctionDeleteCallback dcallback, + Size cacheEntrySize, + bool includeResultType, + bool forValidator) +{ + Oid funcOid = fcinfo->flinfo->fn_oid; + HeapTuple procTup; + Form_pg_proc procStruct; + CachedFunctionHashKey hashkey; + bool function_valid = false; + bool hashkey_valid = false; + + /* + * Lookup the pg_proc tuple by Oid; we'll need it in any case + */ + procTup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcOid)); + if (!HeapTupleIsValid(procTup)) + elog(ERROR, "cache lookup failed for function %u", funcOid); + procStruct = (Form_pg_proc) GETSTRUCT(procTup); + + /* + * Do we already have a cache entry for the current FmgrInfo? If not, try + * to find one in the hash table. + */ +recheck: + if (!function) + { + /* Compute hashkey using function signature and actual arg types */ + compute_function_hashkey(fcinfo, procStruct, &hashkey, + cacheEntrySize, includeResultType, + forValidator); + hashkey_valid = true; + + /* And do the lookup */ + function = cfunc_hashtable_lookup(&hashkey); + } + + if (function) + { + /* We have a compiled function, but is it still valid? */ + if (function->fn_xmin == HeapTupleHeaderGetRawXmin(procTup->t_data) && + ItemPointerEquals(&function->fn_tid, &procTup->t_self)) + function_valid = true; + else + { + /* + * Nope, so remove it from hashtable and try to drop associated + * storage (if not done already). + */ + delete_function(function); + + /* + * If the function isn't in active use then we can overwrite the + * func struct with new data, allowing any other existing fn_extra + * pointers to make use of the new definition on their next use. + * If it is in use then just leave it alone and make a new one. 
+ * (The active invocations will run to completion using the + * previous definition, and then the cache entry will just be + * leaked; doesn't seem worth adding code to clean it up, given + * what a corner case this is.) + * + * If we found the function struct via fn_extra then it's possible + * a replacement has already been made, so go back and recheck the + * hashtable. + */ + if (function->use_count != 0) + { + function = NULL; + if (!hashkey_valid) + goto recheck; + } + } + } + + /* + * If the function wasn't found or was out-of-date, we have to compile it. + */ + if (!function_valid) + { + /* + * Calculate hashkey if we didn't already; we'll need it to store the + * completed function. + */ + if (!hashkey_valid) + compute_function_hashkey(fcinfo, procStruct, &hashkey, + cacheEntrySize, includeResultType, + forValidator); + + /* + * Create the new function struct, if not done already. The function + * structs are never thrown away, so keep them in TopMemoryContext. + */ + Assert(cacheEntrySize >= sizeof(CachedFunction)); + if (function == NULL) + { + function = (CachedFunction *) + MemoryContextAllocZero(TopMemoryContext, cacheEntrySize); + } + else + { + /* re-using a previously existing struct, so clear it out */ + memset(function, 0, cacheEntrySize); + } + + /* + * Fill in the CachedFunction part. fn_hashkey and use_count remain + * zeroes for now. + */ + function->fn_xmin = HeapTupleHeaderGetRawXmin(procTup->t_data); + function->fn_tid = procTup->t_self; + function->dcallback = dcallback; + + /* + * Do the hard, language-specific part. + */ + ccallback(fcinfo, procTup, &hashkey, function, forValidator); + + /* + * Add the completed struct to the hash table. 
+ */ + cfunc_hashtable_insert(function, &hashkey); + } + + ReleaseSysCache(procTup); + + /* + * Finally return the compiled function + */ + return function; +} diff --git a/src/backend/utils/cache/meson.build b/src/backend/utils/cache/meson.build index 104b28737d7..a1784dce585 100644 --- a/src/backend/utils/cache/meson.build +++ b/src/backend/utils/cache/meson.build @@ -4,6 +4,7 @@ backend_sources += files( 'attoptcache.c', 'catcache.c', 'evtcache.c', + 'funccache.c', 'inval.c', 'lsyscache.c', 'partcache.c', diff --git a/src/include/utils/funccache.h b/src/include/utils/funccache.h new file mode 100644 index 00000000000..e0112ebfa11 --- /dev/null +++ b/src/include/utils/funccache.h @@ -0,0 +1,134 @@ +/*------------------------------------------------------------------------- + * + * funccache.h + * Function cache definitions. + * + * See funccache.c for comments. + * + * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * src/include/utils/funccache.h + * + *------------------------------------------------------------------------- + */ +#ifndef FUNCCACHE_H +#define FUNCCACHE_H + +#include "access/htup_details.h" +#include "fmgr.h" +#include "storage/itemptr.h" + +struct CachedFunctionHashKey; /* forward references */ +struct CachedFunction; + +/* + * Callback that cached_function_compile() invokes when it's necessary to + * compile a cached function. The callback must fill in *function (except + * for the fields of struct CachedFunction), or throw an error if trouble. 
+ * fcinfo: current call information + * procTup: function's pg_proc row from catcache + * hashkey: hash key that will be used for the function + * function: pre-zeroed workspace, of size passed to cached_function_compile() + * forValidator: passed through from cached_function_compile() + */ +typedef void (*CachedFunctionCompileCallback) (FunctionCallInfo fcinfo, + HeapTuple procTup, + const struct CachedFunctionHashKey *hashkey, + struct CachedFunction *function, + bool forValidator); + +/* + * Callback called when discarding a cache entry. Free any free-able + * subsidiary data of cfunc, but not the struct CachedFunction itself. + */ +typedef void (*CachedFunctionDeleteCallback) (struct CachedFunction *cfunc); + +/* + * Hash lookup key for functions. This must account for all aspects + * of a specific call that might lead to different data types or + * collations being used within the function. + */ +typedef struct CachedFunctionHashKey +{ + Oid funcOid; + + bool isTrigger; /* true if called as a DML trigger */ + bool isEventTrigger; /* true if called as an event trigger */ + + /* be careful that pad bytes in this struct get zeroed! */ + + /* + * We include the language-specific size of the function's cache entry in + * the cache key. This covers the case where CREATE OR REPLACE FUNCTION + * is used to change the implementation language, and the new language + * also uses funccache.c but needs a different-sized cache entry. + */ + Size cacheEntrySize; + + /* + * For a trigger function, the OID of the trigger is part of the hash key + * --- we want to compile the trigger function separately for each trigger + * it is used with, in case the rowtype or transition table names are + * different. Zero if not called as a DML trigger. + */ + Oid trigOid; + + /* + * We must include the input collation as part of the hash key too, + * because we have to generate different plans (with different Param + * collations) for different collation settings. 
+ */ + Oid inputCollation; + + /* Number of arguments (counting input arguments only, ie pronargs) */ + int nargs; + + /* If you change anything below here, fix hashing code in funccache.c! */ + + /* + * If relevant, the result descriptor for a function returning composite. + */ + TupleDesc callResultType; + + /* + * Input argument types, with any polymorphic types resolved to actual + * types. Only the first nargs entries are valid. + */ + Oid argtypes[FUNC_MAX_ARGS]; +} CachedFunctionHashKey; + +/* + * Representation of a compiled function. This struct contains just the + * fields that funccache.c needs to deal with. It will typically be + * embedded in a larger struct containing function-language-specific data. + */ +typedef struct CachedFunction +{ + /* back-link to hashtable entry, or NULL if not in hash table */ + CachedFunctionHashKey *fn_hashkey; + /* xmin and ctid of function's pg_proc row; used to detect invalidation */ + TransactionId fn_xmin; + ItemPointerData fn_tid; + /* deletion callback */ + CachedFunctionDeleteCallback dcallback; + + /* this field changes when the function is used: */ + uint64 use_count; +} CachedFunction; + +extern CachedFunction *cached_function_compile(FunctionCallInfo fcinfo, + CachedFunction *function, + CachedFunctionCompileCallback ccallback, + CachedFunctionDeleteCallback dcallback, + Size cacheEntrySize, + bool includeResultType, + bool forValidator); +extern void cfunc_resolve_polymorphic_argtypes(int numargs, + Oid *argtypes, + char *argmodes, + Node *call_expr, + bool forValidator, + const char *proname); + +#endif /* FUNCCACHE_H */ diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c index 6fdba95962d..1a091d0c55f 100644 --- a/src/pl/plpgsql/src/pl_comp.c +++ b/src/pl/plpgsql/src/pl_comp.c @@ -52,20 +52,6 @@ PLpgSQL_function *plpgsql_curr_compile; /* A context appropriate for short-term allocs during compilation */ MemoryContext plpgsql_compile_tmp_cxt; -/* ---------- - * Hash table for compiled 
functions - * ---------- - */ -static HTAB *plpgsql_HashTable = NULL; - -typedef struct plpgsql_hashent -{ - PLpgSQL_func_hashkey key; - PLpgSQL_function *function; -} plpgsql_HashEnt; - -#define FUNCS_PER_USER 128 /* initial table size */ - /* ---------- * Lookup table for EXCEPTION condition names * ---------- @@ -86,11 +72,11 @@ static const ExceptionLabelMap exception_label_map[] = { * static prototypes * ---------- */ -static PLpgSQL_function *do_compile(FunctionCallInfo fcinfo, - HeapTuple procTup, - PLpgSQL_function *function, - PLpgSQL_func_hashkey *hashkey, - bool forValidator); +static void plpgsql_compile_callback(FunctionCallInfo fcinfo, + HeapTuple procTup, + const CachedFunctionHashKey *hashkey, + CachedFunction *cfunc, + bool forValidator); static void plpgsql_compile_error_callback(void *arg); static void add_parameter_name(PLpgSQL_nsitem_type itemtype, int itemno, const char *name); static void add_dummy_return(PLpgSQL_function *function); @@ -105,19 +91,6 @@ static PLpgSQL_type *build_datatype(HeapTuple typeTup, int32 typmod, Oid collation, TypeName *origtypname); static void plpgsql_start_datums(void); static void plpgsql_finish_datums(PLpgSQL_function *function); -static void compute_function_hashkey(FunctionCallInfo fcinfo, - Form_pg_proc procStruct, - PLpgSQL_func_hashkey *hashkey, - bool forValidator); -static void plpgsql_resolve_polymorphic_argtypes(int numargs, - Oid *argtypes, char *argmodes, - Node *call_expr, bool forValidator, - const char *proname); -static PLpgSQL_function *plpgsql_HashTableLookup(PLpgSQL_func_hashkey *func_key); -static void plpgsql_HashTableInsert(PLpgSQL_function *function, - PLpgSQL_func_hashkey *func_key); -static void plpgsql_HashTableDelete(PLpgSQL_function *function); -static void delete_function(PLpgSQL_function *func); /* ---------- * plpgsql_compile Make an execution tree for a PL/pgSQL function. 
@@ -132,97 +105,24 @@ static void delete_function(PLpgSQL_function *func); PLpgSQL_function * plpgsql_compile(FunctionCallInfo fcinfo, bool forValidator) { - Oid funcOid = fcinfo->flinfo->fn_oid; - HeapTuple procTup; - Form_pg_proc procStruct; PLpgSQL_function *function; - PLpgSQL_func_hashkey hashkey; - bool function_valid = false; - bool hashkey_valid = false; - - /* - * Lookup the pg_proc tuple by Oid; we'll need it in any case - */ - procTup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcOid)); - if (!HeapTupleIsValid(procTup)) - elog(ERROR, "cache lookup failed for function %u", funcOid); - procStruct = (Form_pg_proc) GETSTRUCT(procTup); - - /* - * See if there's already a cache entry for the current FmgrInfo. If not, - * try to find one in the hash table. - */ - function = (PLpgSQL_function *) fcinfo->flinfo->fn_extra; - -recheck: - if (!function) - { - /* Compute hashkey using function signature and actual arg types */ - compute_function_hashkey(fcinfo, procStruct, &hashkey, forValidator); - hashkey_valid = true; - - /* And do the lookup */ - function = plpgsql_HashTableLookup(&hashkey); - } - - if (function) - { - /* We have a compiled function, but is it still valid? */ - if (function->fn_xmin == HeapTupleHeaderGetRawXmin(procTup->t_data) && - ItemPointerEquals(&function->fn_tid, &procTup->t_self)) - function_valid = true; - else - { - /* - * Nope, so remove it from hashtable and try to drop associated - * storage (if not done already). - */ - delete_function(function); - - /* - * If the function isn't in active use then we can overwrite the - * func struct with new data, allowing any other existing fn_extra - * pointers to make use of the new definition on their next use. - * If it is in use then just leave it alone and make a new one. - * (The active invocations will run to completion using the - * previous definition, and then the cache entry will just be - * leaked; doesn't seem worth adding code to clean it up, given - * what a corner case this is.) 
- * - * If we found the function struct via fn_extra then it's possible - * a replacement has already been made, so go back and recheck the - * hashtable. - */ - if (function->use_count != 0) - { - function = NULL; - if (!hashkey_valid) - goto recheck; - } - } - } /* - * If the function wasn't found or was out-of-date, we have to compile it + * funccache.c manages re-use of existing PLpgSQL_function caches. + * + * In PL/pgSQL we use fn_extra directly as the pointer to the long-lived + * function cache entry; we have no need for any query-lifespan cache. + * Also, we don't need to make the cache key depend on composite result + * type (at least for now). */ - if (!function_valid) - { - /* - * Calculate hashkey if we didn't already; we'll need it to store the - * completed function. - */ - if (!hashkey_valid) - compute_function_hashkey(fcinfo, procStruct, &hashkey, - forValidator); - - /* - * Do the hard part. - */ - function = do_compile(fcinfo, procTup, function, - &hashkey, forValidator); - } - - ReleaseSysCache(procTup); + function = (PLpgSQL_function *) + cached_function_compile(fcinfo, + fcinfo->flinfo->fn_extra, + plpgsql_compile_callback, + plpgsql_delete_callback, + sizeof(PLpgSQL_function), + false, + forValidator); /* * Save pointer in FmgrInfo to avoid search on subsequent calls @@ -244,8 +144,8 @@ struct compile_error_callback_arg /* * This is the slow part of plpgsql_compile(). * - * The passed-in "function" pointer is either NULL or an already-allocated - * function struct to overwrite. + * The passed-in "cfunc" struct is expected to be zeroes, except + * for the CachedFunction fields, which we don't touch here. * * While compiling a function, the CurrentMemoryContext is the * per-function memory context of the function we are compiling. That @@ -263,13 +163,14 @@ struct compile_error_callback_arg * NB: this code is not re-entrant. We assume that nothing we do here could * result in the invocation of another plpgsql function. 
*/ -static PLpgSQL_function * -do_compile(FunctionCallInfo fcinfo, - HeapTuple procTup, - PLpgSQL_function *function, - PLpgSQL_func_hashkey *hashkey, - bool forValidator) +static void +plpgsql_compile_callback(FunctionCallInfo fcinfo, + HeapTuple procTup, + const CachedFunctionHashKey *hashkey, + CachedFunction *cfunc, + bool forValidator) { + PLpgSQL_function *function = (PLpgSQL_function *) cfunc; Form_pg_proc procStruct = (Form_pg_proc) GETSTRUCT(procTup); bool is_dml_trigger = CALLED_AS_TRIGGER(fcinfo); bool is_event_trigger = CALLED_AS_EVENT_TRIGGER(fcinfo); @@ -320,21 +221,6 @@ do_compile(FunctionCallInfo fcinfo, * reasons. */ plpgsql_check_syntax = forValidator; - - /* - * Create the new function struct, if not done already. The function - * structs are never thrown away, so keep them in TopMemoryContext. - */ - if (function == NULL) - { - function = (PLpgSQL_function *) - MemoryContextAllocZero(TopMemoryContext, sizeof(PLpgSQL_function)); - } - else - { - /* re-using a previously existing struct, so clear it out */ - memset(function, 0, sizeof(PLpgSQL_function)); - } plpgsql_curr_compile = function; /* @@ -349,8 +235,6 @@ do_compile(FunctionCallInfo fcinfo, function->fn_signature = format_procedure(fcinfo->flinfo->fn_oid); MemoryContextSetIdentifier(func_cxt, function->fn_signature); function->fn_oid = fcinfo->flinfo->fn_oid; - function->fn_xmin = HeapTupleHeaderGetRawXmin(procTup->t_data); - function->fn_tid = procTup->t_self; function->fn_input_collation = fcinfo->fncollation; function->fn_cxt = func_cxt; function->out_param_varno = -1; /* set up for no OUT param */ @@ -400,10 +284,14 @@ do_compile(FunctionCallInfo fcinfo, numargs = get_func_arg_info(procTup, &argtypes, &argnames, &argmodes); - plpgsql_resolve_polymorphic_argtypes(numargs, argtypes, argmodes, - fcinfo->flinfo->fn_expr, - forValidator, - plpgsql_error_funcname); + /* + * XXX can't we get rid of this in favor of using funccache.c's + * results? 
But why are we considering argmodes here not there?? + */ + cfunc_resolve_polymorphic_argtypes(numargs, argtypes, argmodes, + fcinfo->flinfo->fn_expr, + forValidator, + plpgsql_error_funcname); in_arg_varnos = (int *) palloc(numargs * sizeof(int)); out_arg_variables = (PLpgSQL_variable **) palloc(numargs * sizeof(PLpgSQL_variable *)); @@ -819,11 +707,6 @@ do_compile(FunctionCallInfo fcinfo, if (plpgsql_DumpExecTree) plpgsql_dumptree(function); - /* - * add it to the hash table - */ - plpgsql_HashTableInsert(function, hashkey); - /* * Pop the error context stack */ @@ -834,14 +717,13 @@ do_compile(FunctionCallInfo fcinfo, MemoryContextSwitchTo(plpgsql_compile_tmp_cxt); plpgsql_compile_tmp_cxt = NULL; - return function; } /* ---------- * plpgsql_compile_inline Make an execution tree for an anonymous code block. * - * Note: this is generally parallel to do_compile(); is it worth trying to - * merge the two? + * Note: this is generally parallel to plpgsql_compile_callback(); is it worth + * trying to merge the two? * * Note: we assume the block will be thrown away so there is no need to build * persistent data structures. @@ -2437,242 +2319,3 @@ plpgsql_add_initdatums(int **varnos) datums_last = plpgsql_nDatums; return n; } - - -/* - * Compute the hashkey for a given function invocation - * - * The hashkey is returned into the caller-provided storage at *hashkey. - */ -static void -compute_function_hashkey(FunctionCallInfo fcinfo, - Form_pg_proc procStruct, - PLpgSQL_func_hashkey *hashkey, - bool forValidator) -{ - /* Make sure any unused bytes of the struct are zero */ - MemSet(hashkey, 0, sizeof(PLpgSQL_func_hashkey)); - - /* get function OID */ - hashkey->funcOid = fcinfo->flinfo->fn_oid; - - /* get call context */ - hashkey->isTrigger = CALLED_AS_TRIGGER(fcinfo); - hashkey->isEventTrigger = CALLED_AS_EVENT_TRIGGER(fcinfo); - - /* - * If DML trigger, include trigger's OID in the hash, so that each trigger - * usage gets a different hash entry, allowing for e.g. 
different relation - * rowtypes or transition table names. In validation mode we do not know - * what relation or transition table names are intended to be used, so we - * leave trigOid zero; the hash entry built in this case will never be - * used for any actual calls. - * - * We don't currently need to distinguish different event trigger usages - * in the same way, since the special parameter variables don't vary in - * type in that case. - */ - if (hashkey->isTrigger && !forValidator) - { - TriggerData *trigdata = (TriggerData *) fcinfo->context; - - hashkey->trigOid = trigdata->tg_trigger->tgoid; - } - - /* get input collation, if known */ - hashkey->inputCollation = fcinfo->fncollation; - - if (procStruct->pronargs > 0) - { - /* get the argument types */ - memcpy(hashkey->argtypes, procStruct->proargtypes.values, - procStruct->pronargs * sizeof(Oid)); - - /* resolve any polymorphic argument types */ - plpgsql_resolve_polymorphic_argtypes(procStruct->pronargs, - hashkey->argtypes, - NULL, - fcinfo->flinfo->fn_expr, - forValidator, - NameStr(procStruct->proname)); - } -} - -/* - * This is the same as the standard resolve_polymorphic_argtypes() function, - * except that: - * 1. We go ahead and report the error if we can't resolve the types. - * 2. We treat RECORD-type input arguments (not output arguments) as if - * they were polymorphic, replacing their types with the actual input - * types if we can determine those. This allows us to create a separate - * function cache entry for each named composite type passed to such an - * argument. - * 3. In validation mode, we have no inputs to look at, so assume that - * polymorphic arguments are integer, integer-array or integer-range. 
- */ -static void -plpgsql_resolve_polymorphic_argtypes(int numargs, - Oid *argtypes, char *argmodes, - Node *call_expr, bool forValidator, - const char *proname) -{ - int i; - - if (!forValidator) - { - int inargno; - - /* normal case, pass to standard routine */ - if (!resolve_polymorphic_argtypes(numargs, argtypes, argmodes, - call_expr)) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("could not determine actual argument " - "type for polymorphic function \"%s\"", - proname))); - /* also, treat RECORD inputs (but not outputs) as polymorphic */ - inargno = 0; - for (i = 0; i < numargs; i++) - { - char argmode = argmodes ? argmodes[i] : PROARGMODE_IN; - - if (argmode == PROARGMODE_OUT || argmode == PROARGMODE_TABLE) - continue; - if (argtypes[i] == RECORDOID || argtypes[i] == RECORDARRAYOID) - { - Oid resolvedtype = get_call_expr_argtype(call_expr, - inargno); - - if (OidIsValid(resolvedtype)) - argtypes[i] = resolvedtype; - } - inargno++; - } - } - else - { - /* special validation case (no need to do anything for RECORD) */ - for (i = 0; i < numargs; i++) - { - switch (argtypes[i]) - { - case ANYELEMENTOID: - case ANYNONARRAYOID: - case ANYENUMOID: /* XXX dubious */ - case ANYCOMPATIBLEOID: - case ANYCOMPATIBLENONARRAYOID: - argtypes[i] = INT4OID; - break; - case ANYARRAYOID: - case ANYCOMPATIBLEARRAYOID: - argtypes[i] = INT4ARRAYOID; - break; - case ANYRANGEOID: - case ANYCOMPATIBLERANGEOID: - argtypes[i] = INT4RANGEOID; - break; - case ANYMULTIRANGEOID: - argtypes[i] = INT4MULTIRANGEOID; - break; - default: - break; - } - } - } -} - -/* - * delete_function - clean up as much as possible of a stale function cache - * - * We can't release the PLpgSQL_function struct itself, because of the - * possibility that there are fn_extra pointers to it. We can release - * the subsidiary storage, but only if there are no active evaluations - * in progress. Otherwise we'll just leak that storage. 
Since the - * case would only occur if a pg_proc update is detected during a nested - * recursive call on the function, a leak seems acceptable. - * - * Note that this can be called more than once if there are multiple fn_extra - * pointers to the same function cache. Hence be careful not to do things - * twice. - */ -static void -delete_function(PLpgSQL_function *func) -{ - /* remove function from hash table (might be done already) */ - plpgsql_HashTableDelete(func); - - /* release the function's storage if safe and not done already */ - if (func->use_count == 0) - plpgsql_free_function_memory(func); -} - -/* exported so we can call it from _PG_init() */ -void -plpgsql_HashTableInit(void) -{ - HASHCTL ctl; - - /* don't allow double-initialization */ - Assert(plpgsql_HashTable == NULL); - - ctl.keysize = sizeof(PLpgSQL_func_hashkey); - ctl.entrysize = sizeof(plpgsql_HashEnt); - plpgsql_HashTable = hash_create("PLpgSQL function hash", - FUNCS_PER_USER, - &ctl, - HASH_ELEM | HASH_BLOBS); -} - -static PLpgSQL_function * -plpgsql_HashTableLookup(PLpgSQL_func_hashkey *func_key) -{ - plpgsql_HashEnt *hentry; - - hentry = (plpgsql_HashEnt *) hash_search(plpgsql_HashTable, - func_key, - HASH_FIND, - NULL); - if (hentry) - return hentry->function; - else - return NULL; -} - -static void -plpgsql_HashTableInsert(PLpgSQL_function *function, - PLpgSQL_func_hashkey *func_key) -{ - plpgsql_HashEnt *hentry; - bool found; - - hentry = (plpgsql_HashEnt *) hash_search(plpgsql_HashTable, - func_key, - HASH_ENTER, - &found); - if (found) - elog(WARNING, "trying to insert a function that already exists"); - - hentry->function = function; - /* prepare back link from function to hashtable key */ - function->fn_hashkey = &hentry->key; -} - -static void -plpgsql_HashTableDelete(PLpgSQL_function *function) -{ - plpgsql_HashEnt *hentry; - - /* do nothing if not in table */ - if (function->fn_hashkey == NULL) - return; - - hentry = (plpgsql_HashEnt *) hash_search(plpgsql_HashTable, - 
function->fn_hashkey, - HASH_REMOVE, - NULL); - if (hentry == NULL) - elog(WARNING, "trying to delete function that does not exist"); - - /* remove back link, which no longer points to allocated storage */ - function->fn_hashkey = NULL; -} diff --git a/src/pl/plpgsql/src/pl_funcs.c b/src/pl/plpgsql/src/pl_funcs.c index 6b5394fc5fa..bc7a61feb4d 100644 --- a/src/pl/plpgsql/src/pl_funcs.c +++ b/src/pl/plpgsql/src/pl_funcs.c @@ -718,7 +718,7 @@ plpgsql_free_function_memory(PLpgSQL_function *func) int i; /* Better not call this on an in-use function */ - Assert(func->use_count == 0); + Assert(func->cfunc.use_count == 0); /* Release plans associated with variable declarations */ for (i = 0; i < func->ndatums; i++) @@ -767,6 +767,13 @@ plpgsql_free_function_memory(PLpgSQL_function *func) func->fn_cxt = NULL; } +/* Deletion callback used by funccache.c */ +void +plpgsql_delete_callback(CachedFunction *cfunc) +{ + plpgsql_free_function_memory((PLpgSQL_function *) cfunc); +} + /********************************************************************** * Debug functions for analyzing the compiled code diff --git a/src/pl/plpgsql/src/pl_handler.c b/src/pl/plpgsql/src/pl_handler.c index 1bf12232862..e9a72929947 100644 --- a/src/pl/plpgsql/src/pl_handler.c +++ b/src/pl/plpgsql/src/pl_handler.c @@ -202,7 +202,6 @@ _PG_init(void) MarkGUCPrefixReserved("plpgsql"); - plpgsql_HashTableInit(); RegisterXactCallback(plpgsql_xact_cb, NULL); RegisterSubXactCallback(plpgsql_subxact_cb, NULL); @@ -247,7 +246,7 @@ plpgsql_call_handler(PG_FUNCTION_ARGS) save_cur_estate = func->cur_estate; /* Mark the function as busy, so it can't be deleted from under us */ - func->use_count++; + func->cfunc.use_count++; /* * If we'll need a procedure-lifespan resowner to execute any CALL or DO @@ -284,7 +283,7 @@ plpgsql_call_handler(PG_FUNCTION_ARGS) PG_FINALLY(); { /* Decrement use-count, restore cur_estate */ - func->use_count--; + func->cfunc.use_count--; func->cur_estate = save_cur_estate; /* Be sure to 
release the procedure resowner if any */ @@ -334,7 +333,7 @@ plpgsql_inline_handler(PG_FUNCTION_ARGS) func = plpgsql_compile_inline(codeblock->source_text); /* Mark the function as busy, just pro forma */ - func->use_count++; + func->cfunc.use_count++; /* * Set up a fake fcinfo with just enough info to satisfy @@ -398,8 +397,8 @@ plpgsql_inline_handler(PG_FUNCTION_ARGS) ResourceOwnerDelete(simple_eval_resowner); /* Function should now have no remaining use-counts ... */ - func->use_count--; - Assert(func->use_count == 0); + func->cfunc.use_count--; + Assert(func->cfunc.use_count == 0); /* ... so we can free subsidiary storage */ plpgsql_free_function_memory(func); @@ -415,8 +414,8 @@ plpgsql_inline_handler(PG_FUNCTION_ARGS) ResourceOwnerDelete(simple_eval_resowner); /* Function should now have no remaining use-counts ... */ - func->use_count--; - Assert(func->use_count == 0); + func->cfunc.use_count--; + Assert(func->cfunc.use_count == 0); /* ... so we can free subsidiary storage */ plpgsql_free_function_memory(func); diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h index b67847b5111..41e52b8ce71 100644 --- a/src/pl/plpgsql/src/plpgsql.h +++ b/src/pl/plpgsql/src/plpgsql.h @@ -21,6 +21,7 @@ #include "commands/trigger.h" #include "executor/spi.h" #include "utils/expandedrecord.h" +#include "utils/funccache.h" #include "utils/typcache.h" @@ -941,40 +942,6 @@ typedef struct PLpgSQL_stmt_dynexecute List *params; /* USING expressions */ } PLpgSQL_stmt_dynexecute; -/* - * Hash lookup key for functions - */ -typedef struct PLpgSQL_func_hashkey -{ - Oid funcOid; - - bool isTrigger; /* true if called as a DML trigger */ - bool isEventTrigger; /* true if called as an event trigger */ - - /* be careful that pad bytes in this struct get zeroed! 
*/ - - /* - * For a trigger function, the OID of the trigger is part of the hash key - * --- we want to compile the trigger function separately for each trigger - * it is used with, in case the rowtype or transition table names are - * different. Zero if not called as a DML trigger. - */ - Oid trigOid; - - /* - * We must include the input collation as part of the hash key too, - * because we have to generate different plans (with different Param - * collations) for different collation settings. - */ - Oid inputCollation; - - /* - * We include actual argument types in the hash key to support polymorphic - * PLpgSQL functions. Be careful that extra positions are zeroed! - */ - Oid argtypes[FUNC_MAX_ARGS]; -} PLpgSQL_func_hashkey; - /* * Trigger type */ @@ -990,13 +957,12 @@ typedef enum PLpgSQL_trigtype */ typedef struct PLpgSQL_function { + CachedFunction cfunc; /* fields managed by funccache.c */ + char *fn_signature; Oid fn_oid; - TransactionId fn_xmin; - ItemPointerData fn_tid; PLpgSQL_trigtype fn_is_trigger; Oid fn_input_collation; - PLpgSQL_func_hashkey *fn_hashkey; /* back-link to hashtable key */ MemoryContext fn_cxt; Oid fn_rettype; @@ -1036,9 +1002,8 @@ typedef struct PLpgSQL_function bool requires_procedure_resowner; /* contains CALL or DO? */ bool has_exception_block; /* contains BEGIN...EXCEPTION? 
*/ - /* these fields change when the function is used */ + /* this field changes when the function is used */ struct PLpgSQL_execstate *cur_estate; - unsigned long use_count; } PLpgSQL_function; /* @@ -1287,7 +1252,6 @@ extern PGDLLEXPORT int plpgsql_recognize_err_condition(const char *condname, extern PLpgSQL_condition *plpgsql_parse_err_condition(char *condname); extern void plpgsql_adddatum(PLpgSQL_datum *newdatum); extern int plpgsql_add_initdatums(int **varnos); -extern void plpgsql_HashTableInit(void); /* * Functions in pl_exec.c @@ -1335,6 +1299,7 @@ extern PGDLLEXPORT const char *plpgsql_stmt_typename(PLpgSQL_stmt *stmt); extern const char *plpgsql_getdiag_kindname(PLpgSQL_getdiag_kind kind); extern void plpgsql_mark_local_assignment_targets(PLpgSQL_function *func); extern void plpgsql_free_function_memory(PLpgSQL_function *func); +extern void plpgsql_delete_callback(CachedFunction *cfunc); extern void plpgsql_dumptree(PLpgSQL_function *func); /* diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index ff75a508876..144c4e9662c 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -381,6 +381,11 @@ CURLM CURLoption CV CachedExpression +CachedFunction +CachedFunctionCompileCallback +CachedFunctionDeleteCallback +CachedFunctionHashEntry +CachedFunctionHashKey CachedPlan CachedPlanSource CallContext -- 2.43.5 From f0b52fe1844bf1be0a3fcc718069c7c1a41393e1 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat, 29 Mar 2025 16:56:57 -0400 Subject: [PATCH v10 4/6] Restructure check_sql_fn_retval(). To support using the plan cache for SQL functions, we'll need to be able to redo the work of check_sql_fn_retval() on just one query's list-of-rewritten-queries at a time, since the plan cache will treat each query independently. 
This would be simple enough, except for a bizarre historical behavior: the existing code will take the last canSetTag query in the function as determining the result, even if it came from not-the-last original query. (The case is only possible when the last original query(s) are deleted by a DO INSTEAD NOTHING rule.) This behavior is undocumented except in source code comments, and it seems hard to believe that anyone's relying on it. It would be a mess to support with the plan cache, because a change in the rules applicable to some table could change which CachedPlanSource is supposed to produce the function result, even if the function itself has not changed. Let's just get rid of that silliness and insist that the last source query in the function is the one that must produce the result. Having mandated that, we can refactor check_sql_fn_retval() into an outer and an inner function where the inner one considers only a single list-of-rewritten-queries; the inner one will be usable in a post-rewrite callback hook as contemplated by the previous commit. Likewise refactor check_sql_fn_statements() so that we have a version that can be applied to just one list of Queries. (As things stand, it's not really necessary to recheck that during a replan, but maybe future changes in the rule system would create cases where it matters.) Also remove check_sql_fn_retval()'s targetlist output argument, putting the equivalent functionality into a separate function. This is needed because the plan cache would be in the way of passing that data directly. No outside caller needed that anyway. 
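To make the behavioral change concrete, here is a hedged sketch (the table, rule, and function names are invented for illustration, and it needs a live PostgreSQL server to run) of the only situation where the old last-canSetTag rule could differ from the new last-statement rule: a trailing statement that a DO INSTEAD NOTHING rule deletes at rewrite time.

```sql
-- Assumed names, for illustration only.
create table audit_log (msg text);
create rule audit_ignore as on insert to audit_log do instead nothing;

-- The INSERT is the last original statement, but it rewrites to nothing.
create function get_answer() returns text as $$
  select 'forty-two';
  insert into audit_log values ('called get_answer');
$$ language sql;
```

Under the pre-v18 behavior, the SELECT (the last canSetTag query surviving rewriting) silently became the result query, so a function like this was accepted and returned the SELECT's value; after this patch, the final INSERT's empty rewrite product is the candidate, so the definition should instead draw a return-type mismatch error at validation time.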
Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop --- src/backend/catalog/pg_proc.c | 2 +- src/backend/executor/functions.c | 176 ++++++++++++++++++--------- src/backend/optimizer/util/clauses.c | 4 +- src/include/executor/functions.h | 3 +- 4 files changed, 121 insertions(+), 64 deletions(-) diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c index fe0490259e9..880b597fb3a 100644 --- a/src/backend/catalog/pg_proc.c +++ b/src/backend/catalog/pg_proc.c @@ -960,7 +960,7 @@ fmgr_sql_validator(PG_FUNCTION_ARGS) (void) check_sql_fn_retval(querytree_list, rettype, rettupdesc, proc->prokind, - false, NULL); + false); } error_context_stack = sqlerrcontext.previous; diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c index 6aa8e9c4d8a..5b06df84335 100644 --- a/src/backend/executor/functions.c +++ b/src/backend/executor/functions.c @@ -153,11 +153,16 @@ static Datum postquel_get_single_result(TupleTableSlot *slot, MemoryContext resultcontext); static void sql_exec_error_callback(void *arg); static void ShutdownSQLFunction(Datum arg); +static void check_sql_fn_statement(List *queryTreeList); +static bool check_sql_stmt_retval(List *queryTreeList, + Oid rettype, TupleDesc rettupdesc, + char prokind, bool insertDroppedCols); static bool coerce_fn_result_column(TargetEntry *src_tle, Oid res_type, int32 res_typmod, bool tlist_is_modifiable, List **upper_tlist, bool *upper_tlist_nontrivial); +static List *get_sql_fn_result_tlist(List *queryTreeList); static void sqlfunction_startup(DestReceiver *self, int operation, TupleDesc typeinfo); static bool sqlfunction_receive(TupleTableSlot *slot, DestReceiver *self); static void sqlfunction_shutdown(DestReceiver *self); @@ -592,7 +597,6 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) Form_pg_proc procedureStruct; SQLFunctionCachePtr fcache; List *queryTree_list; - List *resulttlist; ListCell *lc; Datum tmp; 
bool isNull; @@ -748,8 +752,7 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) rettype, rettupdesc, procedureStruct->prokind, - false, - &resulttlist); + false); /* * Construct a JunkFilter we can use to coerce the returned rowtype to the @@ -762,6 +765,14 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) { TupleTableSlot *slot = MakeSingleTupleTableSlot(NULL, &TTSOpsMinimalTuple); + List *resulttlist; + + /* + * Re-fetch the (possibly modified) output tlist of the final + * statement. By this point, we should have thrown an error if there + * is not one. + */ + resulttlist = get_sql_fn_result_tlist(llast_node(List, queryTree_list)); /* * If the result is composite, *and* we are returning the whole tuple @@ -1541,29 +1552,39 @@ check_sql_fn_statements(List *queryTreeLists) foreach(lc, queryTreeLists) { List *sublist = lfirst_node(List, lc); - ListCell *lc2; - foreach(lc2, sublist) - { - Query *query = lfirst_node(Query, lc2); + check_sql_fn_statement(sublist); + } +} - /* - * Disallow calling procedures with output arguments. The current - * implementation would just throw the output values away, unless - * the statement is the last one. Per SQL standard, we should - * assign the output values by name. By disallowing this here, we - * preserve an opportunity for future improvement. - */ - if (query->commandType == CMD_UTILITY && - IsA(query->utilityStmt, CallStmt)) - { - CallStmt *stmt = (CallStmt *) query->utilityStmt; +/* + * As above, for a single sublist of Queries. + */ +static void +check_sql_fn_statement(List *queryTreeList) +{ + ListCell *lc; - if (stmt->outargs != NIL) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("calling procedures with output arguments is not supported in SQL functions"))); - } + foreach(lc, queryTreeList) + { + Query *query = lfirst_node(Query, lc); + + /* + * Disallow calling procedures with output arguments. 
The current + * implementation would just throw the output values away, unless the + * statement is the last one. Per SQL standard, we should assign the + * output values by name. By disallowing this here, we preserve an + * opportunity for future improvement. + */ + if (query->commandType == CMD_UTILITY && + IsA(query->utilityStmt, CallStmt)) + { + CallStmt *stmt = (CallStmt *) query->utilityStmt; + + if (stmt->outargs != NIL) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("calling procedures with output arguments is not supported in SQL functions"))); } } } @@ -1602,17 +1623,45 @@ check_sql_fn_statements(List *queryTreeLists) * In addition to coercing individual output columns, we can modify the * output to include dummy NULL columns for any dropped columns appearing * in rettupdesc. This is done only if the caller asks for it. - * - * If resultTargetList isn't NULL, then *resultTargetList is set to the - * targetlist that defines the final statement's result. Exception: if the - * function is defined to return VOID then *resultTargetList is set to NIL. */ bool check_sql_fn_retval(List *queryTreeLists, Oid rettype, TupleDesc rettupdesc, char prokind, - bool insertDroppedCols, - List **resultTargetList) + bool insertDroppedCols) +{ + List *queryTreeList; + + /* + * We consider only the last sublist of Query nodes, so that only the last + * original statement is a candidate to produce the result. This is a + * change from pre-v18 versions, which would back up to the last statement + * that includes a canSetTag query, thus ignoring any ending statement(s) + * that rewrite to DO INSTEAD NOTHING. That behavior was undocumented and + * there seems no good reason for it, except that it was an artifact of + * the original coding. + * + * If the function body is completely empty, handle that the same as if + * the last query had rewritten to nothing. 
+ */ + if (queryTreeLists != NIL) + queryTreeList = llast_node(List, queryTreeLists); + else + queryTreeList = NIL; + + return check_sql_stmt_retval(queryTreeList, + rettype, rettupdesc, + prokind, insertDroppedCols); +} + +/* + * As for check_sql_fn_retval, but we are given just the last query's + * rewritten-queries list. + */ +static bool +check_sql_stmt_retval(List *queryTreeList, + Oid rettype, TupleDesc rettupdesc, + char prokind, bool insertDroppedCols) { bool is_tuple_result = false; Query *parse; @@ -1625,9 +1674,6 @@ check_sql_fn_retval(List *queryTreeLists, bool upper_tlist_nontrivial = false; ListCell *lc; - if (resultTargetList) - *resultTargetList = NIL; /* initialize in case of VOID result */ - /* * If it's declared to return VOID, we don't care what's in the function. * (This takes care of procedures with no output parameters, as well.) @@ -1636,30 +1682,20 @@ check_sql_fn_retval(List *queryTreeLists, return false; /* - * Find the last canSetTag query in the function body (which is presented - * to us as a list of sublists of Query nodes). This isn't necessarily - * the last parsetree, because rule rewriting can insert queries after - * what the user wrote. Note that it might not even be in the last - * sublist, for example if the last query rewrites to DO INSTEAD NOTHING. - * (It might not be unreasonable to throw an error in such a case, but - * this is the historical behavior and it doesn't seem worth changing.) + * Find the last canSetTag query in the list of Query nodes. This isn't + * necessarily the last parsetree, because rule rewriting can insert + * queries after what the user wrote. 
*/ parse = NULL; parse_cell = NULL; - foreach(lc, queryTreeLists) + foreach(lc, queryTreeList) { - List *sublist = lfirst_node(List, lc); - ListCell *lc2; + Query *q = lfirst_node(Query, lc); - foreach(lc2, sublist) + if (q->canSetTag) { - Query *q = lfirst_node(Query, lc2); - - if (q->canSetTag) - { - parse = q; - parse_cell = lc2; - } + parse = q; + parse_cell = lc; } } @@ -1812,12 +1848,7 @@ check_sql_fn_retval(List *queryTreeLists, * further checking. Assume we're returning the whole tuple. */ if (rettupdesc == NULL) - { - /* Return tlist if requested */ - if (resultTargetList) - *resultTargetList = tlist; return true; - } /* * Verify that the targetlist matches the return tuple type. We scan @@ -1984,10 +2015,6 @@ tlist_coercion_finished: lfirst(parse_cell) = newquery; } - /* Return tlist (possibly modified) if requested */ - if (resultTargetList) - *resultTargetList = upper_tlist; - return is_tuple_result; } @@ -2063,6 +2090,37 @@ coerce_fn_result_column(TargetEntry *src_tle, return true; } +/* + * Extract the targetlist of the last canSetTag query in the given list + * of parsed-and-rewritten Queries. Returns NIL if there is none. 
+ */ +static List * +get_sql_fn_result_tlist(List *queryTreeList) +{ + Query *parse = NULL; + ListCell *lc; + + foreach(lc, queryTreeList) + { + Query *q = lfirst_node(Query, lc); + + if (q->canSetTag) + parse = q; + } + if (parse && + parse->commandType == CMD_SELECT) + return parse->targetList; + else if (parse && + (parse->commandType == CMD_INSERT || + parse->commandType == CMD_UPDATE || + parse->commandType == CMD_DELETE || + parse->commandType == CMD_MERGE) && + parse->returningList) + return parse->returningList; + else + return NIL; +} + /* * CreateSQLFunctionDestReceiver -- create a suitable DestReceiver object diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index 43dfecfb47f..816536ab865 100644 --- a/src/backend/optimizer/util/clauses.c +++ b/src/backend/optimizer/util/clauses.c @@ -4742,7 +4742,7 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid, if (check_sql_fn_retval(list_make1(querytree_list), result_type, rettupdesc, funcform->prokind, - false, NULL)) + false)) goto fail; /* reject whole-tuple-result cases */ /* @@ -5288,7 +5288,7 @@ inline_set_returning_function(PlannerInfo *root, RangeTblEntry *rte) if (!check_sql_fn_retval(list_make1(querytree_list), fexpr->funcresulttype, rettupdesc, funcform->prokind, - true, NULL) && + true) && (functypclass == TYPEFUNC_COMPOSITE || functypclass == TYPEFUNC_COMPOSITE_DOMAIN || functypclass == TYPEFUNC_RECORD)) diff --git a/src/include/executor/functions.h b/src/include/executor/functions.h index a6ae2e72d79..58bdff9b039 100644 --- a/src/include/executor/functions.h +++ b/src/include/executor/functions.h @@ -48,8 +48,7 @@ extern void check_sql_fn_statements(List *queryTreeLists); extern bool check_sql_fn_retval(List *queryTreeLists, Oid rettype, TupleDesc rettupdesc, char prokind, - bool insertDroppedCols, - List **resultTargetList); + bool insertDroppedCols); extern DestReceiver *CreateSQLFunctionDestReceiver(void); -- 2.43.5 From 
d85adbeae4f83ce0355c8f626f7c8b0aa961735f Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat, 29 Mar 2025 20:31:50 -0400 Subject: [PATCH v10 5/6] Add a test case showing undesirable RLS behavior in SQL functions. In the historical implementation of SQL functions, once we have built a set of plans for a SQL function we'll continue to use them during subsequent function invocations in the same query. This isn't ideal, and this somewhat-contrived test case shows one reason why not: we don't notice changes in RLS-relevant state. I'm putting this as a separate patch in the series so that the change in behavior will be apparent. Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop --- src/test/regress/expected/rowsecurity.out | 59 +++++++++++++++++++++++ src/test/regress/sql/rowsecurity.sql | 44 +++++++++++++++++ 2 files changed, 103 insertions(+) diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out index 87929191d06..8f2c8319172 100644 --- a/src/test/regress/expected/rowsecurity.out +++ b/src/test/regress/expected/rowsecurity.out @@ -4695,6 +4695,65 @@ RESET ROLE; DROP FUNCTION rls_f(); DROP VIEW rls_v; DROP TABLE rls_t; +-- Check that RLS changes invalidate SQL function plans +create table rls_t (c text); +create table test_t (c text); +insert into rls_t values ('a'), ('b'), ('c'), ('d'); +insert into test_t values ('a'), ('b'); +alter table rls_t enable row level security; +grant select on rls_t to regress_rls_alice; +grant select on test_t to regress_rls_alice; +create policy p1 on rls_t for select to regress_rls_alice + using (c = current_setting('rls_test.blah')); +-- Function changes row_security setting and so invalidates plan +create function rls_f(text) returns text +begin atomic + select set_config('rls_test.blah', $1, true) || set_config('row_security', 'false', true) || string_agg(c, ',' 
order by c) from rls_t; +end; +set plan_cache_mode to force_custom_plan; +-- Table owner bypasses RLS +select rls_f(c) from test_t order by rls_f; + rls_f +------------- + aoffa,b,c,d + boffa,b,c,d +(2 rows) + +-- For other users, changes in row_security setting +-- should lead to RLS error during query rewrite +set role regress_rls_alice; +select rls_f(c) from test_t order by rls_f; + rls_f +------- + boffa + +(2 rows) + +reset role; +set plan_cache_mode to force_generic_plan; +-- Table owner bypasses RLS, although cached plan will be invalidated +select rls_f(c) from test_t order by rls_f; + rls_f +------------- + aoffa,b,c,d + boffa,b,c,d +(2 rows) + +-- For other users, changes in row_security setting +-- should lead to plan invalidation and RLS error during query rewrite +set role regress_rls_alice; +select rls_f(c) from test_t order by rls_f; + rls_f +------- + boffa + +(2 rows) + +reset role; +reset plan_cache_mode; +reset rls_test.blah; +drop function rls_f(text); +drop table rls_t, test_t; -- -- Clean up objects -- diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql index f61dbbf9581..9da967a9ef2 100644 --- a/src/test/regress/sql/rowsecurity.sql +++ b/src/test/regress/sql/rowsecurity.sql @@ -2307,6 +2307,50 @@ DROP FUNCTION rls_f(); DROP VIEW rls_v; DROP TABLE rls_t; +-- Check that RLS changes invalidate SQL function plans +create table rls_t (c text); +create table test_t (c text); +insert into rls_t values ('a'), ('b'), ('c'), ('d'); +insert into test_t values ('a'), ('b'); +alter table rls_t enable row level security; +grant select on rls_t to regress_rls_alice; +grant select on test_t to regress_rls_alice; +create policy p1 on rls_t for select to regress_rls_alice + using (c = current_setting('rls_test.blah')); + +-- Function changes row_security setting and so invalidates plan +create function rls_f(text) returns text +begin atomic + select set_config('rls_test.blah', $1, true) || set_config('row_security', 'false',
true) || string_agg(c, ',' order by c) from rls_t; +end; + +set plan_cache_mode to force_custom_plan; + +-- Table owner bypasses RLS +select rls_f(c) from test_t order by rls_f; + +-- For other users, changes in row_security setting +-- should lead to RLS error during query rewrite +set role regress_rls_alice; +select rls_f(c) from test_t order by rls_f; +reset role; + +set plan_cache_mode to force_generic_plan; + +-- Table owner bypasses RLS, although cached plan will be invalidated +select rls_f(c) from test_t order by rls_f; + +-- For other users, changes in row_security setting +-- should lead to plan invalidation and RLS error during query rewrite +set role regress_rls_alice; +select rls_f(c) from test_t order by rls_f; +reset role; + +reset plan_cache_mode; +reset rls_test.blah; +drop function rls_f(text); +drop table rls_t, test_t; + -- -- Clean up objects -- -- 2.43.5 From 7427db23bb4de73dddfdbbf9d6ed902079894aa1 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat, 29 Mar 2025 21:57:50 -0400 Subject: [PATCH v10 6/6] Change SQL-language functions to use the plan cache. In the historical implementation of SQL functions (when they don't get inlined), we built plans for the contained queries at first call within an outer query, and then re-used those plans for the duration of the outer query, and then forgot everything. This was not ideal, not least because the plans could not be customized to specific values of the function's parameters. Our plancache infrastructure seems mature enough to be used here. That will solve both the problem with not being able to build custom plans and the problem with not being able to share work across successive outer queries. Moreover, this reimplementation will react to events that should cause a replan at the next entry to the SQL function. This is illustrated in the change in the rowsecurity test, where we now detect an RLS context change that was previously ignored.
(I also added a test in create_function_sql that exercises ShutdownSQLFunction(), after noting from coverage results that that wasn't getting reached.) Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop --- src/backend/executor/functions.c | 1082 +++++++++++------ .../expected/test_extensions.out | 2 +- .../regress/expected/create_function_sql.out | 17 +- src/test/regress/expected/rowsecurity.out | 16 +- src/test/regress/sql/create_function_sql.sql | 9 + src/tools/pgindent/typedefs.list | 2 + 6 files changed, 738 insertions(+), 390 deletions(-) diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c index 5b06df84335..b5a9ecea637 100644 --- a/src/backend/executor/functions.c +++ b/src/backend/executor/functions.c @@ -31,6 +31,7 @@ #include "tcop/utility.h" #include "utils/builtins.h" #include "utils/datum.h" +#include "utils/funccache.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/snapmgr.h" @@ -50,7 +51,7 @@ typedef struct /* * We have an execution_state record for each query in a function. Each - * record contains a plantree for its query. If the query is currently in + * record references a plantree for its query. If the query is currently in * F_EXEC_RUN state then there's a QueryDesc too. * * The "next" fields chain together all the execution_state records generated @@ -74,24 +75,43 @@ typedef struct execution_state /* - * An SQLFunctionCache record is built during the first call, - * and linked to from the fn_extra field of the FmgrInfo struct. + * Data associated with a SQL-language function is kept in three main + * data structures: * - * Note that currently this has only the lifespan of the calling query. - * Someday we should rewrite this code to use plancache.c to save parse/plan - * results for longer than that. + * 1. 
SQLFunctionHashEntry is a long-lived (potentially session-lifespan) + * struct that holds all the info we need out of the function's pg_proc row. + * In addition it holds pointers to CachedPlanSource(s) that manage creation + * of plans for the query(s) within the function. A SQLFunctionHashEntry is + * potentially shared across multiple concurrent executions of the function, + * so it must contain no execution-specific state; but its use_count must + * reflect the number of SQLFunctionLink structs pointing at it. + * If the function's pg_proc row is updated, we throw away and regenerate + * the SQLFunctionHashEntry and subsidiary data. (Also note that if the + * function is polymorphic or used as a trigger, there is a separate + * SQLFunctionHashEntry for each usage, so that we need consider only one + * set of relevant data types.) The struct itself is in memory managed by + * funccache.c, and its subsidiary data is kept in hcontext ("hash context"). * - * Physically, though, the data has the lifespan of the FmgrInfo that's used - * to call the function, and there are cases (particularly with indexes) - * where the FmgrInfo might survive across transactions. We cannot assume - * that the parse/plan trees are good for longer than the (sub)transaction in - * which parsing was done, so we must mark the record with the LXID/subxid of - * its creation time, and regenerate everything if that's obsolete. To avoid - * memory leakage when we do have to regenerate things, all the data is kept - * in a sub-context of the FmgrInfo's fn_mcxt. + * 2. SQLFunctionCache lasts for the duration of a single execution of + * the SQL function. (In "lazyEval" mode, this might span multiple calls of + * fmgr_sql.) It holds a reference to the CachedPlan for the current query, + * and other data that is execution-specific. The SQLFunctionCache itself + * as well as its subsidiary data are kept in fcontext ("function context"), + * which we free at completion. 
In non-returnsSet mode, this is just a child + * of the call-time context. In returnsSet mode, it is made a child of the + * FmgrInfo's fn_mcxt so that it will survive between fmgr_sql calls. + * + * 3. SQLFunctionLink is a tiny struct that just holds pointers to + * the SQLFunctionHashEntry and the current SQLFunctionCache (if any). + * It is pointed to by the fn_extra field of the FmgrInfo struct, and is + * always allocated in the FmgrInfo's fn_mcxt. Its purpose is to reduce + * the cost of repeat lookups of the SQLFunctionHashEntry. */ -typedef struct + +typedef struct SQLFunctionHashEntry { + CachedFunction cfunc; /* fields managed by funccache.c */ + char *fname; /* function name (for error msgs) */ char *src; /* function body text (for error msgs) */ @@ -102,8 +122,25 @@ typedef struct bool typbyval; /* true if return type is pass by value */ bool returnsSet; /* true if returning multiple rows */ bool returnsTuple; /* true if returning whole tuple result */ - bool shutdown_reg; /* true if registered shutdown callback */ bool readonly_func; /* true to run in "read only" mode */ + char prokind; /* prokind from pg_proc row */ + + TupleDesc rettupdesc; /* result tuple descriptor */ + + List *plansource_list; /* CachedPlanSources for fn's queries */ + + /* if positive, this is the index of the query we're parsing */ + int error_query_index; + + MemoryContext hcontext; /* memory context holding all above */ +} SQLFunctionHashEntry; + +typedef struct SQLFunctionCache +{ + SQLFunctionHashEntry *func; /* associated SQLFunctionHashEntry */ + + bool lazyEvalOK; /* true if lazyEval is safe */ + bool shutdown_reg; /* true if registered shutdown callback */ bool lazyEval; /* true if using lazyEval for result query */ ParamListInfo paramLI; /* Param list representing current args */ @@ -112,23 +149,40 @@ typedef struct JunkFilter *junkFilter; /* will be NULL if function returns VOID */ + /* if positive, this is the index of the query we're executing */ + int 
error_query_index; + /* - * func_state is a List of execution_state records, each of which is the - * first for its original parsetree, with any additional records chained - * to it via the "next" fields. This sublist structure is needed to keep - * track of where the original query boundaries are. + * While executing a particular query within the function, cplan is the + * CachedPlan we've obtained for that query, and eslist is a list of + * execution_state records for the individual plans within the CachedPlan. + * next_query_index is the 0-based index of the next CachedPlanSource to + * get a CachedPlan from. */ - List *func_state; + CachedPlan *cplan; /* Plan for current query, if any */ + ResourceOwner cowner; /* CachedPlan is registered with this owner */ + execution_state *eslist; /* execution_state records */ + int next_query_index; /* index of next CachedPlanSource to run */ MemoryContext fcontext; /* memory context holding this struct and all * subsidiary data */ - - LocalTransactionId lxid; /* lxid in which cache was made */ - SubTransactionId subxid; /* subxid in which cache was made */ } SQLFunctionCache; typedef SQLFunctionCache *SQLFunctionCachePtr; +/* Struct pointed to by FmgrInfo.fn_extra for a SQL function */ +typedef struct SQLFunctionLink +{ + /* Permanent pointer to associated SQLFunctionHashEntry */ + SQLFunctionHashEntry *func; + + /* Transient pointer to SQLFunctionCache, used only if returnsSet */ + SQLFunctionCache *fcache; + + /* Callback to release our use-count on the SQLFunctionHashEntry */ + MemoryContextCallback mcb; +} SQLFunctionLink; + /* non-export function prototypes */ static Node *sql_fn_param_ref(ParseState *pstate, ParamRef *pref); @@ -138,10 +192,10 @@ static Node *sql_fn_make_param(SQLFunctionParseInfoPtr pinfo, int paramno, int location); static Node *sql_fn_resolve_param_name(SQLFunctionParseInfoPtr pinfo, const char *paramname, int location); -static List *init_execution_state(List *queryTree_list, - 
SQLFunctionCachePtr fcache, - bool lazyEvalOK); -static void init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK); +static bool init_execution_state(SQLFunctionCachePtr fcache); +static void sql_postrewrite_callback(List *querytree_list, void *arg); +static SQLFunctionCache *init_sql_fcache(FunctionCallInfo fcinfo, + bool lazyEvalOK); static void postquel_start(execution_state *es, SQLFunctionCachePtr fcache); static bool postquel_getnext(execution_state *es, SQLFunctionCachePtr fcache); static void postquel_end(execution_state *es); @@ -151,8 +205,10 @@ static Datum postquel_get_single_result(TupleTableSlot *slot, FunctionCallInfo fcinfo, SQLFunctionCachePtr fcache, MemoryContext resultcontext); +static void sql_compile_error_callback(void *arg); static void sql_exec_error_callback(void *arg); static void ShutdownSQLFunction(Datum arg); +static void RemoveSQLFunctionLink(void *arg); static void check_sql_fn_statement(List *queryTreeList); static bool check_sql_stmt_retval(List *queryTreeList, Oid rettype, TupleDesc rettupdesc, @@ -460,99 +516,172 @@ sql_fn_resolve_param_name(SQLFunctionParseInfoPtr pinfo, } /* - * Set up the per-query execution_state records for a SQL function. + * Set up the per-query execution_state records for the next query within + * the SQL function. * - * The input is a List of Lists of parsed and rewritten, but not planned, - * querytrees. The sublist structure denotes the original query boundaries. + * Returns true if successful, false if there are no more queries. */ -static List * -init_execution_state(List *queryTree_list, - SQLFunctionCachePtr fcache, - bool lazyEvalOK) +static bool +init_execution_state(SQLFunctionCachePtr fcache) { - List *eslist = NIL; + CachedPlanSource *plansource; + execution_state *preves = NULL; execution_state *lasttages = NULL; - ListCell *lc1; + ListCell *lc; - foreach(lc1, queryTree_list) + /* + * Clean up after previous query, if there was one. 
Note that we just + * leak the old execution_state records until end of function execution; + * there aren't likely to be enough of them to matter. + */ + if (fcache->cplan) { - List *qtlist = lfirst_node(List, lc1); - execution_state *firstes = NULL; - execution_state *preves = NULL; - ListCell *lc2; + ReleaseCachedPlan(fcache->cplan, fcache->cowner); + fcache->cplan = NULL; + } + fcache->eslist = NULL; - foreach(lc2, qtlist) - { - Query *queryTree = lfirst_node(Query, lc2); - PlannedStmt *stmt; - execution_state *newes; + /* + * Get the next CachedPlanSource, or stop if there are no more. + */ + if (fcache->next_query_index >= list_length(fcache->func->plansource_list)) + return false; + plansource = (CachedPlanSource *) list_nth(fcache->func->plansource_list, + fcache->next_query_index); + fcache->next_query_index++; - /* Plan the query if needed */ - if (queryTree->commandType == CMD_UTILITY) - { - /* Utility commands require no planning. */ - stmt = makeNode(PlannedStmt); - stmt->commandType = CMD_UTILITY; - stmt->canSetTag = queryTree->canSetTag; - stmt->utilityStmt = queryTree->utilityStmt; - stmt->stmt_location = queryTree->stmt_location; - stmt->stmt_len = queryTree->stmt_len; - stmt->queryId = queryTree->queryId; - } - else - stmt = pg_plan_query(queryTree, - fcache->src, - CURSOR_OPT_PARALLEL_OK, - NULL); + /* Count source queries for sql_exec_error_callback */ + fcache->error_query_index++; - /* - * Precheck all commands for validity in a function. This should - * generally match the restrictions spi.c applies. - */ - if (stmt->commandType == CMD_UTILITY) - { - if (IsA(stmt->utilityStmt, CopyStmt) && - ((CopyStmt *) stmt->utilityStmt)->filename == NULL) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("cannot COPY to/from client in an SQL function"))); + /* + * Generate plans for the query or queries within this CachedPlanSource. + * Register the CachedPlan with the current resource owner. 
(Saving + * cowner here is mostly paranoia, but this way we needn't assume that + * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.) + */ + fcache->cowner = CurrentResourceOwner; + fcache->cplan = GetCachedPlan(plansource, + fcache->paramLI, + fcache->cowner, + NULL); - if (IsA(stmt->utilityStmt, TransactionStmt)) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - /* translator: %s is a SQL statement name */ - errmsg("%s is not allowed in an SQL function", - CreateCommandName(stmt->utilityStmt)))); - } + /* + * Build execution_state list to match the number of contained plans. + */ + foreach(lc, fcache->cplan->stmt_list) + { + PlannedStmt *stmt = lfirst_node(PlannedStmt, lc); + execution_state *newes; + + /* + * Precheck all commands for validity in a function. This should + * generally match the restrictions spi.c applies. + */ + if (stmt->commandType == CMD_UTILITY) + { + if (IsA(stmt->utilityStmt, CopyStmt) && + ((CopyStmt *) stmt->utilityStmt)->filename == NULL) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot COPY to/from client in an SQL function"))); - if (fcache->readonly_func && !CommandIsReadOnly(stmt)) + if (IsA(stmt->utilityStmt, TransactionStmt)) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), /* translator: %s is a SQL statement name */ - errmsg("%s is not allowed in a non-volatile function", - CreateCommandName((Node *) stmt)))); + errmsg("%s is not allowed in an SQL function", + CreateCommandName(stmt->utilityStmt)))); + } - /* OK, build the execution_state for this query */ - newes = (execution_state *) palloc(sizeof(execution_state)); - if (preves) - preves->next = newes; - else - firstes = newes; + if (fcache->func->readonly_func && !CommandIsReadOnly(stmt)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + /* translator: %s is a SQL statement name */ + errmsg("%s is not allowed in a non-volatile function", + CreateCommandName((Node *) stmt)))); + + /* OK, build the 
execution_state for this query */ + newes = (execution_state *) palloc(sizeof(execution_state)); + if (preves) + preves->next = newes; + else + fcache->eslist = newes; - newes->next = NULL; - newes->status = F_EXEC_START; - newes->setsResult = false; /* might change below */ - newes->lazyEval = false; /* might change below */ - newes->stmt = stmt; - newes->qd = NULL; + newes->next = NULL; + newes->status = F_EXEC_START; + newes->setsResult = false; /* might change below */ + newes->lazyEval = false; /* might change below */ + newes->stmt = stmt; + newes->qd = NULL; - if (queryTree->canSetTag) - lasttages = newes; + if (stmt->canSetTag) + lasttages = newes; - preves = newes; - } + preves = newes; + } + + /* + * If this isn't the last CachedPlanSource, we're done here. Otherwise, + * we need to prepare information about how to return the results. + */ + if (fcache->next_query_index < list_length(fcache->func->plansource_list)) + return true; + + /* + * Construct a JunkFilter we can use to coerce the returned rowtype to the + * desired form, unless the result type is VOID, in which case there's + * nothing to coerce to. (XXX Frequently, the JunkFilter isn't doing + * anything very interesting, but much of this module expects it to be + * there anyway.) + */ + if (fcache->func->rettype != VOIDOID) + { + TupleTableSlot *slot = MakeSingleTupleTableSlot(NULL, + &TTSOpsMinimalTuple); + List *resulttlist; + + /* + * Re-fetch the (possibly modified) output tlist of the final + * statement. By this point, we should have thrown an error if there + * is not one. + */ + resulttlist = get_sql_fn_result_tlist(plansource->query_list); - eslist = lappend(eslist, firstes); + /* + * We need to make a copy to ensure that it doesn't disappear + * underneath us due to plancache invalidation. + */ + resulttlist = copyObject(resulttlist); + + /* + * If the result is composite, *and* we are returning the whole tuple + * result, we need to insert nulls for any dropped columns. 
In the + * single-column-result case, there might be dropped columns within + * the composite column value, but it's not our problem here. There + * should be no resjunk entries in resulttlist, so in the second case + * the JunkFilter is certainly a no-op. + */ + if (fcache->func->rettupdesc && fcache->func->returnsTuple) + fcache->junkFilter = ExecInitJunkFilterConversion(resulttlist, + fcache->func->rettupdesc, + slot); + else + fcache->junkFilter = ExecInitJunkFilter(resulttlist, slot); + } + + if (fcache->func->returnsTuple) + { + /* Make sure output rowtype is properly blessed */ + BlessTupleDesc(fcache->junkFilter->jf_resultSlot->tts_tupleDescriptor); + } + else if (fcache->func->returnsSet && type_is_rowtype(fcache->func->rettype)) + { + /* + * Returning rowtype as if it were scalar --- materialize won't work. + * Right now it's sufficient to override any caller preference for + * materialize mode, but this might need more work in future. + */ + fcache->lazyEvalOK = true; } /* @@ -572,68 +701,69 @@ init_execution_state(List *queryTree_list, if (lasttages && fcache->junkFilter) { lasttages->setsResult = true; - if (lazyEvalOK && + if (fcache->lazyEvalOK && lasttages->stmt->commandType == CMD_SELECT && !lasttages->stmt->hasModifyingCTE) fcache->lazyEval = lasttages->lazyEval = true; } - return eslist; + return true; } /* - * Initialize the SQLFunctionCache for a SQL function + * Fill a new SQLFunctionHashEntry. + * + * The passed-in "cfunc" struct is expected to be zeroes, except + * for the CachedFunction fields, which we don't touch here. + * + * We expect to be called in a short-lived memory context (typically a + * query's per-tuple context). Data that is to be part of the hash entry + * must be copied into the hcontext, or put into a CachedPlanSource. 
*/ static void -init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) +sql_compile_callback(FunctionCallInfo fcinfo, + HeapTuple procedureTuple, + const CachedFunctionHashKey *hashkey, + CachedFunction *cfunc, + bool forValidator) { - FmgrInfo *finfo = fcinfo->flinfo; - Oid foid = finfo->fn_oid; - MemoryContext fcontext; - MemoryContext oldcontext; + SQLFunctionHashEntry *func = (SQLFunctionHashEntry *) cfunc; + Form_pg_proc procedureStruct = (Form_pg_proc) GETSTRUCT(procedureTuple); + ErrorContextCallback comperrcontext; + MemoryContext hcontext; + MemoryContext oldcontext = CurrentMemoryContext; Oid rettype; TupleDesc rettupdesc; - HeapTuple procedureTuple; - Form_pg_proc procedureStruct; - SQLFunctionCachePtr fcache; - List *queryTree_list; - ListCell *lc; Datum tmp; bool isNull; + List *queryTree_list; + List *plansource_list; + ListCell *qlc; + ListCell *plc; /* - * Create memory context that holds all the SQLFunctionCache data. It - * must be a child of whatever context holds the FmgrInfo. - */ - fcontext = AllocSetContextCreate(finfo->fn_mcxt, - "SQL function", - ALLOCSET_DEFAULT_SIZES); - - oldcontext = MemoryContextSwitchTo(fcontext); - - /* - * Create the struct proper, link it to fcontext and fn_extra. Once this - * is done, we'll be able to recover the memory after failure, even if the - * FmgrInfo is long-lived. + * Setup error traceback support for ereport() during compile */ - fcache = (SQLFunctionCachePtr) palloc0(sizeof(SQLFunctionCache)); - fcache->fcontext = fcontext; - finfo->fn_extra = fcache; + comperrcontext.callback = sql_compile_error_callback; + comperrcontext.arg = func; + comperrcontext.previous = error_context_stack; + error_context_stack = &comperrcontext; /* - * get the procedure tuple corresponding to the given function Oid + * Create the hash entry's memory context. For now it's a child of the + * caller's context, so that it will go away if we fail partway through. 
*/ - procedureTuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(foid)); - if (!HeapTupleIsValid(procedureTuple)) - elog(ERROR, "cache lookup failed for function %u", foid); - procedureStruct = (Form_pg_proc) GETSTRUCT(procedureTuple); + hcontext = AllocSetContextCreate(CurrentMemoryContext, + "SQL function", + ALLOCSET_SMALL_SIZES); /* * copy function name immediately for use by error reporting callback, and * for use as memory context identifier */ - fcache->fname = pstrdup(NameStr(procedureStruct->proname)); - MemoryContextSetIdentifier(fcontext, fcache->fname); + func->fname = MemoryContextStrdup(hcontext, + NameStr(procedureStruct->proname)); + MemoryContextSetIdentifier(hcontext, func->fname); /* * Resolve any polymorphism, obtaining the actual result type, and the @@ -641,32 +771,44 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) */ (void) get_call_result_type(fcinfo, &rettype, &rettupdesc); - fcache->rettype = rettype; + func->rettype = rettype; + if (rettupdesc) + { + MemoryContextSwitchTo(hcontext); + func->rettupdesc = CreateTupleDescCopy(rettupdesc); + MemoryContextSwitchTo(oldcontext); + } /* Fetch the typlen and byval info for the result type */ - get_typlenbyval(rettype, &fcache->typlen, &fcache->typbyval); + get_typlenbyval(rettype, &func->typlen, &func->typbyval); /* Remember whether we're returning setof something */ - fcache->returnsSet = procedureStruct->proretset; + func->returnsSet = procedureStruct->proretset; /* Remember if function is STABLE/IMMUTABLE */ - fcache->readonly_func = + func->readonly_func = (procedureStruct->provolatile != PROVOLATILE_VOLATILE); + /* Remember routine kind */ + func->prokind = procedureStruct->prokind; + /* * We need the actual argument types to pass to the parser. Also make * sure that parameter symbols are considered to have the function's * resolved input collation. 
*/ - fcache->pinfo = prepare_sql_fn_parse_info(procedureTuple, - finfo->fn_expr, - collation); + MemoryContextSwitchTo(hcontext); + func->pinfo = prepare_sql_fn_parse_info(procedureTuple, + fcinfo->flinfo->fn_expr, + PG_GET_COLLATION()); + MemoryContextSwitchTo(oldcontext); /* * And of course we need the function body text. */ tmp = SysCacheGetAttrNotNull(PROCOID, procedureTuple, Anum_pg_proc_prosrc); - fcache->src = TextDatumGetCString(tmp); + func->src = MemoryContextStrdup(hcontext, + TextDatumGetCString(tmp)); /* If we have prosqlbody, pay attention to that not prosrc. */ tmp = SysCacheGetAttr(PROCOID, @@ -675,19 +817,20 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) &isNull); /* - * Parse and rewrite the queries in the function text. Use sublists to - * keep track of the original query boundaries. - * - * Note: since parsing and planning is done in fcontext, we will generate - * a lot of cruft that lives as long as the fcache does. This is annoying - * but we'll not worry about it until the module is rewritten to use - * plancache.c. + * Now we must parse and rewrite the function's queries, and create + * CachedPlanSources. Note that we apply CreateCachedPlan[ForQuery] + * immediately so that it captures the original state of the parsetrees, + * but we don't do CompleteCachedPlan until after fixing up the final + * query's targetlist. 
*/ queryTree_list = NIL; + plansource_list = NIL; if (!isNull) { + /* Source queries are already parse-analyzed */ Node *n; List *stored_query_list; + ListCell *lc; n = stringToNode(TextDatumGetCString(tmp)); if (IsA(n, List)) @@ -698,8 +841,17 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) foreach(lc, stored_query_list) { Query *parsetree = lfirst_node(Query, lc); + CachedPlanSource *plansource; List *queryTree_sublist; + /* Count source queries for sql_compile_error_callback */ + func->error_query_index++; + + plansource = CreateCachedPlanForQuery(parsetree, + func->src, + CreateCommandTag((Node *) parsetree)); + plansource_list = lappend(plansource_list, plansource); + AcquireRewriteLocks(parsetree, true, false); queryTree_sublist = pg_rewrite_query(parsetree); queryTree_list = lappend(queryTree_list, queryTree_sublist); @@ -707,24 +859,38 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) } else { + /* Source queries are raw parsetrees */ List *raw_parsetree_list; + ListCell *lc; - raw_parsetree_list = pg_parse_query(fcache->src); + raw_parsetree_list = pg_parse_query(func->src); foreach(lc, raw_parsetree_list) { RawStmt *parsetree = lfirst_node(RawStmt, lc); + CachedPlanSource *plansource; List *queryTree_sublist; + /* Count source queries for sql_compile_error_callback */ + func->error_query_index++; + + plansource = CreateCachedPlan(parsetree, + func->src, + CreateCommandTag(parsetree->stmt)); + plansource_list = lappend(plansource_list, plansource); + queryTree_sublist = pg_analyze_and_rewrite_withcb(parsetree, - fcache->src, + func->src, (ParserSetupHook) sql_fn_parser_setup, - fcache->pinfo, + func->pinfo, NULL); queryTree_list = lappend(queryTree_list, queryTree_sublist); } } + /* Failures below here are reported as "during startup" */ + func->error_query_index = 0; + /* * Check that there are no statements we don't want to allow. 
*/ @@ -740,7 +906,7 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) * ask it to insert nulls for dropped columns; the junkfilter handles * that.) * - * Note: we set fcache->returnsTuple according to whether we are returning + * Note: we set func->returnsTuple according to whether we are returning * the whole tuple result or just a single column. In the latter case we * clear returnsTuple because we need not act different from the scalar * result case, even if it's a rowtype column. (However, we have to force @@ -748,76 +914,244 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) * the rowtype column into multiple columns, since we have no way to * notify the caller that it should do that.) */ - fcache->returnsTuple = check_sql_fn_retval(queryTree_list, - rettype, - rettupdesc, - procedureStruct->prokind, - false); + func->returnsTuple = check_sql_fn_retval(queryTree_list, + rettype, + rettupdesc, + procedureStruct->prokind, + false); /* - * Construct a JunkFilter we can use to coerce the returned rowtype to the - * desired form, unless the result type is VOID, in which case there's - * nothing to coerce to. (XXX Frequently, the JunkFilter isn't doing - * anything very interesting, but much of this module expects it to be - * there anyway.) + * Now that check_sql_fn_retval has done its thing, we can complete plan + * cache entry creation. 
*/ - if (rettype != VOIDOID) + forboth(qlc, queryTree_list, plc, plansource_list) { - TupleTableSlot *slot = MakeSingleTupleTableSlot(NULL, - &TTSOpsMinimalTuple); - List *resulttlist; + List *queryTree_sublist = lfirst(qlc); + CachedPlanSource *plansource = lfirst(plc); + bool islast; + + /* Finish filling in the CachedPlanSource */ + CompleteCachedPlan(plansource, + queryTree_sublist, + NULL, + NULL, + 0, + (ParserSetupHook) sql_fn_parser_setup, + func->pinfo, + CURSOR_OPT_PARALLEL_OK | CURSOR_OPT_NO_SCROLL, + false); /* - * Re-fetch the (possibly modified) output tlist of the final - * statement. By this point, we should have thrown an error if there - * is not one. + * Install post-rewrite hook. Its arg is the hash entry if this is + * the last statement, else NULL. */ - resulttlist = get_sql_fn_result_tlist(llast_node(List, queryTree_list)); + islast = (lnext(queryTree_list, qlc) == NULL); + SetPostRewriteHook(plansource, + sql_postrewrite_callback, + islast ? func : NULL); + } - /* - * If the result is composite, *and* we are returning the whole tuple - * result, we need to insert nulls for any dropped columns. In the - * single-column-result case, there might be dropped columns within - * the composite column value, but it's not our problem here. There - * should be no resjunk entries in resulttlist, so in the second case - * the JunkFilter is certainly a no-op. - */ - if (rettupdesc && fcache->returnsTuple) - fcache->junkFilter = ExecInitJunkFilterConversion(resulttlist, - rettupdesc, - slot); - else - fcache->junkFilter = ExecInitJunkFilter(resulttlist, slot); + /* + * While the CachedPlanSources can take care of themselves, our List + * pointing to them had better be in the hcontext. + */ + MemoryContextSwitchTo(hcontext); + plansource_list = list_copy(plansource_list); + MemoryContextSwitchTo(oldcontext); + + /* + * We have now completed building the hash entry, so reparent stuff under + * CacheMemoryContext to make all the subsidiary data long-lived. 
+ * Importantly, this part can't fail partway through. + */ + foreach(plc, plansource_list) + { + CachedPlanSource *plansource = lfirst(plc); + + SaveCachedPlan(plansource); + } + MemoryContextSetParent(hcontext, CacheMemoryContext); + + /* And finally, arm sql_delete_callback to delete the stuff again */ + func->plansource_list = plansource_list; + func->hcontext = hcontext; + + error_context_stack = comperrcontext.previous; +} + +/* Deletion callback used by funccache.c */ +static void +sql_delete_callback(CachedFunction *cfunc) +{ + SQLFunctionHashEntry *func = (SQLFunctionHashEntry *) cfunc; + ListCell *lc; + + /* Release the CachedPlanSources */ + foreach(lc, func->plansource_list) + { + CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc); + + DropCachedPlan(plansource); } + func->plansource_list = NIL; + + /* + * If we have an hcontext, free it, thereby getting rid of all subsidiary + * data. + */ + if (func->hcontext) + MemoryContextDelete(func->hcontext); + func->hcontext = NULL; +} + +/* Post-rewrite callback used by plancache.c */ +static void +sql_postrewrite_callback(List *querytree_list, void *arg) +{ + /* + * Check that there are no statements we don't want to allow. (Presently, + * there's no real point in this because the result can't change from what + * we saw originally. But it's cheap and maybe someday it will matter.) + */ + check_sql_fn_statement(querytree_list); - if (fcache->returnsTuple) + /* + * If this is the last query, we must re-do what check_sql_fn_retval did + * to its targetlist. Also check that returnsTuple didn't change (it + * probably cannot, but be cautious). 
+ */ + if (arg != NULL) { - /* Make sure output rowtype is properly blessed */ - BlessTupleDesc(fcache->junkFilter->jf_resultSlot->tts_tupleDescriptor); + SQLFunctionHashEntry *func = (SQLFunctionHashEntry *) arg; + bool returnsTuple; + + returnsTuple = check_sql_stmt_retval(querytree_list, + func->rettype, + func->rettupdesc, + func->prokind, + false); + if (returnsTuple != func->returnsTuple) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cached plan must not change result type"))); + } +} + +/* + * Initialize the SQLFunctionCache for a SQL function + */ +static SQLFunctionCache * +init_sql_fcache(FunctionCallInfo fcinfo, bool lazyEvalOK) +{ + FmgrInfo *finfo = fcinfo->flinfo; + SQLFunctionHashEntry *func; + SQLFunctionCache *fcache; + SQLFunctionLink *flink; + MemoryContext pcontext; + MemoryContext fcontext; + MemoryContext oldcontext; + + /* + * If this is the first execution for this FmgrInfo, set up a link struct + * (initially containing null pointers). The link must live as long as + * the FmgrInfo, so it goes in fn_mcxt. Also set up a memory context + * callback that will be invoked when fn_mcxt is deleted. + */ + flink = finfo->fn_extra; + if (flink == NULL) + { + flink = (SQLFunctionLink *) + MemoryContextAllocZero(finfo->fn_mcxt, sizeof(SQLFunctionLink)); + flink->mcb.func = RemoveSQLFunctionLink; + flink->mcb.arg = flink; + MemoryContextRegisterResetCallback(finfo->fn_mcxt, &flink->mcb); + finfo->fn_extra = flink; } - else if (fcache->returnsSet && type_is_rowtype(fcache->rettype)) + + /* + * If we are resuming execution of a set-returning function, just keep + * using the same cache. We do not ask funccache.c to re-validate the + * SQLFunctionHashEntry: we want to run to completion using the function's + * initial definition. + */ + if (flink->fcache != NULL) { - /* - * Returning rowtype as if it were scalar --- materialize won't work. 
- * Right now it's sufficient to override any caller preference for - * materialize mode, but to add more smarts in init_execution_state - * about this, we'd probably need a three-way flag instead of bool. - */ - lazyEvalOK = true; + Assert(flink->fcache->func == flink->func); + return flink->fcache; + } + + /* + * Look up, or re-validate, the long-lived hash entry. Make the hash key + * depend on the result of get_call_result_type() when that's composite, + * so that we can safely assume that we'll build a new hash entry if the + * composite rowtype changes. + */ + func = (SQLFunctionHashEntry *) + cached_function_compile(fcinfo, + (CachedFunction *) flink->func, + sql_compile_callback, + sql_delete_callback, + sizeof(SQLFunctionHashEntry), + true, + false); + + /* + * Install the hash pointer in the SQLFunctionLink, and increment its use + * count to reflect that. If cached_function_compile gave us back a + * different hash entry than we were using before, we must decrement that + * one's use count. + */ + if (func != flink->func) + { + if (flink->func != NULL) + { + Assert(flink->func->cfunc.use_count > 0); + flink->func->cfunc.use_count--; + } + flink->func = func; + func->cfunc.use_count++; } - /* Finally, plan the queries */ - fcache->func_state = init_execution_state(queryTree_list, - fcache, - lazyEvalOK); + /* + * Create memory context that holds all the SQLFunctionCache data. If we + * return a set, we must keep this in whatever context holds the FmgrInfo + * (anything shorter-lived risks leaving a dangling pointer in flink). But + * in a non-SRF we'll delete it before returning, and there's no need for + * it to outlive the caller's context. + */ + pcontext = func->returnsSet ? 
finfo->fn_mcxt : CurrentMemoryContext; + fcontext = AllocSetContextCreate(pcontext, + "SQL function execution", + ALLOCSET_DEFAULT_SIZES); + + oldcontext = MemoryContextSwitchTo(fcontext); - /* Mark fcache with time of creation to show it's valid */ - fcache->lxid = MyProc->vxid.lxid; - fcache->subxid = GetCurrentSubTransactionId(); + /* + * Create the struct proper, link it to func and fcontext. + */ + fcache = (SQLFunctionCache *) palloc0(sizeof(SQLFunctionCache)); + fcache->func = func; + fcache->fcontext = fcontext; + fcache->lazyEvalOK = lazyEvalOK; - ReleaseSysCache(procedureTuple); + /* + * If we return a set, we must link the fcache into fn_extra so that we + * can find it again during future calls. But in a non-SRF there is no + * need to link it into fn_extra at all. Not doing so removes the risk of + * having a dangling pointer in a long-lived FmgrInfo. + */ + if (func->returnsSet) + flink->fcache = fcache; + + /* + * We're beginning a new execution of the function, so convert params to + * appropriate format. 
+ */ + postquel_sub_params(fcache, fcinfo); MemoryContextSwitchTo(oldcontext); + + return fcache; } /* Start up execution of one execution_state node */ @@ -852,7 +1186,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache) es->qd = CreateQueryDesc(es->stmt, NULL, - fcache->src, + fcache->func->src, GetActiveSnapshot(), InvalidSnapshot, dest, @@ -893,7 +1227,7 @@ postquel_getnext(execution_state *es, SQLFunctionCachePtr fcache) if (es->qd->operation == CMD_UTILITY) { ProcessUtility(es->qd->plannedstmt, - fcache->src, + fcache->func->src, true, /* protect function cache's parsetree */ PROCESS_UTILITY_QUERY, es->qd->params, @@ -949,7 +1283,7 @@ postquel_sub_params(SQLFunctionCachePtr fcache, if (nargs > 0) { ParamListInfo paramLI; - Oid *argtypes = fcache->pinfo->argtypes; + Oid *argtypes = fcache->func->pinfo->argtypes; if (fcache->paramLI == NULL) { @@ -982,7 +1316,8 @@ postquel_sub_params(SQLFunctionCachePtr fcache, prm->value = MakeExpandedObjectReadOnly(fcinfo->args[i].value, prm->isnull, get_typlen(argtypes[i])); - prm->pflags = 0; + /* Allow the value to be substituted into custom plans */ + prm->pflags = PARAM_FLAG_CONST; prm->ptype = argtypes[i]; } } @@ -1012,7 +1347,7 @@ postquel_get_single_result(TupleTableSlot *slot, */ oldcontext = MemoryContextSwitchTo(resultcontext); - if (fcache->returnsTuple) + if (fcache->func->returnsTuple) { /* We must return the whole tuple as a Datum. 
*/ fcinfo->isnull = false; @@ -1027,7 +1362,7 @@ postquel_get_single_result(TupleTableSlot *slot, value = slot_getattr(slot, 1, &(fcinfo->isnull)); if (!fcinfo->isnull) - value = datumCopy(value, fcache->typbyval, fcache->typlen); + value = datumCopy(value, fcache->func->typbyval, fcache->func->typlen); } MemoryContextSwitchTo(oldcontext); @@ -1042,25 +1377,16 @@ Datum fmgr_sql(PG_FUNCTION_ARGS) { SQLFunctionCachePtr fcache; + SQLFunctionLink *flink; ErrorContextCallback sqlerrcontext; + MemoryContext tscontext; MemoryContext oldcontext; bool randomAccess; bool lazyEvalOK; - bool is_first; bool pushed_snapshot; execution_state *es; TupleTableSlot *slot; Datum result; - List *eslist; - ListCell *eslc; - - /* - * Setup error traceback support for ereport() - */ - sqlerrcontext.callback = sql_exec_error_callback; - sqlerrcontext.arg = fcinfo->flinfo; - sqlerrcontext.previous = error_context_stack; - error_context_stack = &sqlerrcontext; /* Check call context */ if (fcinfo->flinfo->fn_retset) @@ -1081,80 +1407,63 @@ fmgr_sql(PG_FUNCTION_ARGS) errmsg("set-valued function called in context that cannot accept a set"))); randomAccess = rsi->allowedModes & SFRM_Materialize_Random; lazyEvalOK = !(rsi->allowedModes & SFRM_Materialize_Preferred); + /* tuplestore must have query lifespan */ + tscontext = rsi->econtext->ecxt_per_query_memory; } else { randomAccess = false; lazyEvalOK = true; + /* tuplestore needn't outlive caller context */ + tscontext = CurrentMemoryContext; } /* - * Initialize fcache (build plans) if first time through; or re-initialize - * if the cache is stale. + * Initialize fcache if starting a fresh execution. 
*/ - fcache = (SQLFunctionCachePtr) fcinfo->flinfo->fn_extra; + fcache = init_sql_fcache(fcinfo, lazyEvalOK); + /* init_sql_fcache also ensures we have a SQLFunctionLink */ + flink = fcinfo->flinfo->fn_extra; - if (fcache != NULL) - { - if (fcache->lxid != MyProc->vxid.lxid || - !SubTransactionIsActive(fcache->subxid)) - { - /* It's stale; unlink and delete */ - fcinfo->flinfo->fn_extra = NULL; - MemoryContextDelete(fcache->fcontext); - fcache = NULL; - } - } + /* + * Now we can set up error traceback support for ereport() + */ + sqlerrcontext.callback = sql_exec_error_callback; + sqlerrcontext.arg = fcache; + sqlerrcontext.previous = error_context_stack; + error_context_stack = &sqlerrcontext; - if (fcache == NULL) - { - init_sql_fcache(fcinfo, PG_GET_COLLATION(), lazyEvalOK); - fcache = (SQLFunctionCachePtr) fcinfo->flinfo->fn_extra; - } + /* + * Build tuplestore to hold results, if we don't have one already. Make + * sure it's in a suitable context. + */ + oldcontext = MemoryContextSwitchTo(tscontext); + + if (!fcache->tstore) + fcache->tstore = tuplestore_begin_heap(randomAccess, false, work_mem); /* - * Switch to context in which the fcache lives. This ensures that our - * tuplestore etc will have sufficient lifetime. The sub-executor is + * Switch to context in which the fcache lives. The sub-executor is * responsible for deleting per-tuple information. (XXX in the case of a - * long-lived FmgrInfo, this policy represents more memory leakage, but - * it's not entirely clear where to keep stuff instead.) + * long-lived FmgrInfo, this policy potentially causes memory leakage, but + * it's not very clear where we could keep stuff instead. Fortunately, + * there are few if any cases where set-returning functions are invoked + * via FmgrInfos that would outlive the calling query.) 
*/ - oldcontext = MemoryContextSwitchTo(fcache->fcontext); + MemoryContextSwitchTo(fcache->fcontext); /* - * Find first unfinished query in function, and note whether it's the - * first query. + * Find first unfinished execution_state. If none, advance to the next + * query in function. */ - eslist = fcache->func_state; - es = NULL; - is_first = true; - foreach(eslc, eslist) + do { - es = (execution_state *) lfirst(eslc); - + es = fcache->eslist; while (es && es->status == F_EXEC_DONE) - { - is_first = false; es = es->next; - } - if (es) break; - } - - /* - * Convert params to appropriate format if starting a fresh execution. (If - * continuing execution, we can re-use prior params.) - */ - if (is_first && es && es->status == F_EXEC_START) - postquel_sub_params(fcache, fcinfo); - - /* - * Build tuplestore to hold results, if we don't have one already. Note - * it's in the query-lifespan context. - */ - if (!fcache->tstore) - fcache->tstore = tuplestore_begin_heap(randomAccess, false, work_mem); + } while (init_execution_state(fcache)); /* * Execute each command in the function one after another until we either @@ -1187,7 +1496,7 @@ fmgr_sql(PG_FUNCTION_ARGS) * visible. Take a new snapshot if we don't have one yet, * otherwise just bump the command ID in the existing snapshot. */ - if (!fcache->readonly_func) + if (!fcache->func->readonly_func) { CommandCounterIncrement(); if (!pushed_snapshot) @@ -1201,7 +1510,7 @@ fmgr_sql(PG_FUNCTION_ARGS) postquel_start(es, fcache); } - else if (!fcache->readonly_func && !pushed_snapshot) + else if (!fcache->func->readonly_func && !pushed_snapshot) { /* Re-establish active snapshot when re-entering function */ PushActiveSnapshot(es->qd->snapshot); @@ -1218,7 +1527,7 @@ fmgr_sql(PG_FUNCTION_ARGS) * set, we can shut it down anyway because it must be a SELECT and we * don't care about fetching any more result rows. 
*/ - if (completed || !fcache->returnsSet) + if (completed || !fcache->func->returnsSet) postquel_end(es); /* @@ -1234,17 +1543,11 @@ fmgr_sql(PG_FUNCTION_ARGS) break; /* - * Advance to next execution_state, which might be in the next list. + * Advance to next execution_state, and perhaps next query. */ es = es->next; while (!es) { - eslc = lnext(eslist, eslc); - if (!eslc) - break; /* end of function */ - - es = (execution_state *) lfirst(eslc); - /* * Flush the current snapshot so that we will take a new one for * the new query list. This ensures that new snaps are taken at @@ -1256,13 +1559,18 @@ fmgr_sql(PG_FUNCTION_ARGS) PopActiveSnapshot(); pushed_snapshot = false; } + + if (!init_execution_state(fcache)) + break; /* end of function */ + + es = fcache->eslist; } } /* * The tuplestore now contains whatever row(s) we are supposed to return. */ - if (fcache->returnsSet) + if (fcache->func->returnsSet) { ReturnSetInfo *rsi = (ReturnSetInfo *) fcinfo->resultinfo; @@ -1298,7 +1606,7 @@ fmgr_sql(PG_FUNCTION_ARGS) { RegisterExprContextCallback(rsi->econtext, ShutdownSQLFunction, - PointerGetDatum(fcache)); + PointerGetDatum(flink)); fcache->shutdown_reg = true; } } @@ -1322,7 +1630,7 @@ fmgr_sql(PG_FUNCTION_ARGS) { UnregisterExprContextCallback(rsi->econtext, ShutdownSQLFunction, - PointerGetDatum(fcache)); + PointerGetDatum(flink)); fcache->shutdown_reg = false; } } @@ -1338,7 +1646,12 @@ fmgr_sql(PG_FUNCTION_ARGS) fcache->tstore = NULL; /* must copy desc because execSRF.c will free it */ if (fcache->junkFilter) + { + /* setDesc must be allocated in suitable context */ + MemoryContextSwitchTo(tscontext); rsi->setDesc = CreateTupleDescCopy(fcache->junkFilter->jf_cleanTupType); + MemoryContextSwitchTo(fcache->fcontext); + } fcinfo->isnull = true; result = (Datum) 0; @@ -1348,7 +1661,7 @@ fmgr_sql(PG_FUNCTION_ARGS) { UnregisterExprContextCallback(rsi->econtext, ShutdownSQLFunction, - PointerGetDatum(fcache)); + PointerGetDatum(flink)); fcache->shutdown_reg = false; } } 
@@ -1374,7 +1687,7 @@ fmgr_sql(PG_FUNCTION_ARGS) else { /* Should only get here for VOID functions and procedures */ - Assert(fcache->rettype == VOIDOID); + Assert(fcache->func->rettype == VOIDOID); fcinfo->isnull = true; result = (Datum) 0; } @@ -1387,154 +1700,171 @@ fmgr_sql(PG_FUNCTION_ARGS) if (pushed_snapshot) PopActiveSnapshot(); + MemoryContextSwitchTo(oldcontext); + /* - * If we've gone through every command in the function, we are done. Reset - * the execution states to start over again on next call. + * If we've gone through every command in the function, we are done. + * Release the cache to start over again on next call. */ if (es == NULL) { - foreach(eslc, fcache->func_state) - { - es = (execution_state *) lfirst(eslc); - while (es) - { - es->status = F_EXEC_START; - es = es->next; - } - } + if (fcache->tstore) + tuplestore_end(fcache->tstore); + Assert(fcache->cplan == NULL); + flink->fcache = NULL; + MemoryContextDelete(fcache->fcontext); } error_context_stack = sqlerrcontext.previous; - MemoryContextSwitchTo(oldcontext); - return result; } /* - * error context callback to let us supply a call-stack traceback + * error context callback to let us supply a traceback during compile */ static void -sql_exec_error_callback(void *arg) +sql_compile_error_callback(void *arg) { - FmgrInfo *flinfo = (FmgrInfo *) arg; - SQLFunctionCachePtr fcache = (SQLFunctionCachePtr) flinfo->fn_extra; + SQLFunctionHashEntry *func = (SQLFunctionHashEntry *) arg; int syntaxerrposition; /* - * We can do nothing useful if init_sql_fcache() didn't get as far as - * saving the function name + * We can do nothing useful if sql_compile_callback() didn't get as far as + * copying the function name */ - if (fcache == NULL || fcache->fname == NULL) + if (func->fname == NULL) return; /* * If there is a syntax error position, convert to internal syntax error */ syntaxerrposition = geterrposition(); - if (syntaxerrposition > 0 && fcache->src != NULL) + if (syntaxerrposition > 0 && 
func->src != NULL) { errposition(0); internalerrposition(syntaxerrposition); - internalerrquery(fcache->src); + internalerrquery(func->src); } /* - * Try to determine where in the function we failed. If there is a query - * with non-null QueryDesc, finger it. (We check this rather than looking - * for F_EXEC_RUN state, so that errors during ExecutorStart or - * ExecutorEnd are blamed on the appropriate query; see postquel_start and - * postquel_end.) + * If we failed while parsing an identifiable query within the function, + * report that. Otherwise say it was "during startup". */ - if (fcache->func_state) - { - execution_state *es; - int query_num; - ListCell *lc; - - es = NULL; - query_num = 1; - foreach(lc, fcache->func_state) - { - es = (execution_state *) lfirst(lc); - while (es) - { - if (es->qd) - { - errcontext("SQL function \"%s\" statement %d", - fcache->fname, query_num); - break; - } - es = es->next; - } - if (es) - break; - query_num++; - } - if (es == NULL) - { - /* - * couldn't identify a running query; might be function entry, - * function exit, or between queries. - */ - errcontext("SQL function \"%s\"", fcache->fname); - } - } + if (func->error_query_index > 0) + errcontext("SQL function \"%s\" statement %d", + func->fname, func->error_query_index); else + errcontext("SQL function \"%s\" during startup", func->fname); +} + +/* + * error context callback to let us supply a call-stack traceback at runtime + */ +static void +sql_exec_error_callback(void *arg) +{ + SQLFunctionCachePtr fcache = (SQLFunctionCachePtr) arg; + int syntaxerrposition; + + /* + * If there is a syntax error position, convert to internal syntax error + */ + syntaxerrposition = geterrposition(); + if (syntaxerrposition > 0 && fcache->func->src != NULL) { - /* - * Assume we failed during init_sql_fcache(). (It's possible that the - * function actually has an empty body, but in that case we may as - * well report all errors as being "during startup".) 
- */ - errcontext("SQL function \"%s\" during startup", fcache->fname); + errposition(0); + internalerrposition(syntaxerrposition); + internalerrquery(fcache->func->src); } + + /* + * If we failed while executing an identifiable query within the function, + * report that. Otherwise say it was "during startup". + */ + if (fcache->error_query_index > 0) + errcontext("SQL function \"%s\" statement %d", + fcache->func->fname, fcache->error_query_index); + else + errcontext("SQL function \"%s\" during startup", fcache->func->fname); } /* - * callback function in case a function-returning-set needs to be shut down - * before it has been run to completion + * ExprContext callback function + * + * We register this in the active ExprContext while a set-returning SQL + * function is running, in case the function needs to be shut down before it + * has been run to completion. Note that this will not be called during an + * error abort, but we don't need it because transaction abort will take care + * of releasing executor resources. 
*/ static void ShutdownSQLFunction(Datum arg) { - SQLFunctionCachePtr fcache = (SQLFunctionCachePtr) DatumGetPointer(arg); - execution_state *es; - ListCell *lc; + SQLFunctionLink *flink = (SQLFunctionLink *) DatumGetPointer(arg); + SQLFunctionCachePtr fcache = flink->fcache; - foreach(lc, fcache->func_state) + if (fcache != NULL) { - es = (execution_state *) lfirst(lc); + execution_state *es; + + /* Make sure we don't somehow try to do this twice */ + flink->fcache = NULL; + + es = fcache->eslist; while (es) { /* Shut down anything still running */ if (es->status == F_EXEC_RUN) { /* Re-establish active snapshot for any called functions */ - if (!fcache->readonly_func) + if (!fcache->func->readonly_func) PushActiveSnapshot(es->qd->snapshot); postquel_end(es); - if (!fcache->readonly_func) + if (!fcache->func->readonly_func) PopActiveSnapshot(); } - - /* Reset states to START in case we're called again */ - es->status = F_EXEC_START; es = es->next; } - } - /* Release tuplestore if we have one */ - if (fcache->tstore) - tuplestore_end(fcache->tstore); - fcache->tstore = NULL; + /* Release tuplestore if we have one */ + if (fcache->tstore) + tuplestore_end(fcache->tstore); + /* Release CachedPlan if we have one */ + if (fcache->cplan) + ReleaseCachedPlan(fcache->cplan, fcache->cowner); + + /* Release the cache */ + MemoryContextDelete(fcache->fcontext); + } /* execUtils will deregister the callback... */ - fcache->shutdown_reg = false; +} + +/* + * MemoryContext callback function + * + * We register this in the memory context that contains a SQLFunctionLink + * struct. When the memory context is reset or deleted, we release the + * reference count (if any) that the link holds on the long-lived hash entry. + * Note that this will happen even during error aborts. 
+ */ +static void +RemoveSQLFunctionLink(void *arg) +{ + SQLFunctionLink *flink = (SQLFunctionLink *) arg; + + if (flink->func != NULL) + { + Assert(flink->func->cfunc.use_count > 0); + flink->func->cfunc.use_count--; + /* This should be unnecessary, but let's just be sure: */ + flink->func = NULL; + } } /* diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out index d5388a1fecf..72bae1bf254 100644 --- a/src/test/modules/test_extensions/expected/test_extensions.out +++ b/src/test/modules/test_extensions/expected/test_extensions.out @@ -651,7 +651,7 @@ LINE 1: SELECT public.dep_req2() || ' req3b' ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. QUERY: SELECT public.dep_req2() || ' req3b' -CONTEXT: SQL function "dep_req3b" during startup +CONTEXT: SQL function "dep_req3b" statement 1 DROP EXTENSION test_ext_req_schema3; ALTER EXTENSION test_ext_req_schema1 SET SCHEMA test_s_dep2; -- now ok SELECT test_s_dep2.dep_req1(); diff --git a/src/test/regress/expected/create_function_sql.out b/src/test/regress/expected/create_function_sql.out index 50aca5940ff..70ed5742b65 100644 --- a/src/test/regress/expected/create_function_sql.out +++ b/src/test/regress/expected/create_function_sql.out @@ -563,6 +563,20 @@ CREATE OR REPLACE PROCEDURE functest1(a int) LANGUAGE SQL AS 'SELECT $1'; ERROR: cannot change routine kind DETAIL: "functest1" is a function. 
DROP FUNCTION functest1(a int); +-- early shutdown of set-returning functions +CREATE FUNCTION functest_srf0() RETURNS SETOF int +LANGUAGE SQL +AS $$ SELECT i FROM generate_series(1, 100) i $$; +SELECT functest_srf0() LIMIT 5; + functest_srf0 +--------------- + 1 + 2 + 3 + 4 + 5 +(5 rows) + -- inlining of set-returning functions CREATE TABLE functest3 (a int); INSERT INTO functest3 VALUES (1), (2), (3); @@ -708,7 +722,7 @@ CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL ERROR: only one AS item needed for language "sql" -- Cleanup DROP SCHEMA temp_func_test CASCADE; -NOTICE: drop cascades to 30 other objects +NOTICE: drop cascades to 31 other objects DETAIL: drop cascades to function functest_a_1(text,date) drop cascades to function functest_a_2(text[]) drop cascades to function functest_a_3() @@ -732,6 +746,7 @@ drop cascades to function functest_s_10(text,date) drop cascades to function functest_s_13() drop cascades to function functest_s_15(integer) drop cascades to function functest_b_2(bigint) +drop cascades to function functest_srf0() drop cascades to function functest_sri1() drop cascades to function voidtest1(integer) drop cascades to function voidtest2(integer,integer) diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out index 8f2c8319172..1c4e37d2249 100644 --- a/src/test/regress/expected/rowsecurity.out +++ b/src/test/regress/expected/rowsecurity.out @@ -4723,12 +4723,8 @@ select rls_f(c) from test_t order by rls_f; -- should lead to RLS error during query rewrite set role regress_rls_alice; select rls_f(c) from test_t order by rls_f; - rls_f -------- - boffa - -(2 rows) - +ERROR: query would be affected by row-level security policy for table "rls_t" +CONTEXT: SQL function "rls_f" statement 1 reset role; set plan_cache_mode to force_generic_plan; -- Table owner bypasses RLS, although cached plan will be invalidated @@ -4743,12 +4739,8 @@ select rls_f(c) from test_t order by rls_f; -- should lead to plan 
invalidation and RLS error during query rewrite set role regress_rls_alice; select rls_f(c) from test_t order by rls_f; - rls_f -------- - boffa - -(2 rows) - +ERROR: query would be affected by row-level security policy for table "rls_t" +CONTEXT: SQL function "rls_f" statement 1 reset role; reset plan_cache_mode; reset rls_test.blah; diff --git a/src/test/regress/sql/create_function_sql.sql b/src/test/regress/sql/create_function_sql.sql index 89e9af3a499..1dd3c4a4e5f 100644 --- a/src/test/regress/sql/create_function_sql.sql +++ b/src/test/regress/sql/create_function_sql.sql @@ -328,6 +328,15 @@ CREATE OR REPLACE PROCEDURE functest1(a int) LANGUAGE SQL AS 'SELECT $1'; DROP FUNCTION functest1(a int); +-- early shutdown of set-returning functions + +CREATE FUNCTION functest_srf0() RETURNS SETOF int +LANGUAGE SQL +AS $$ SELECT i FROM generate_series(1, 100) i $$; + +SELECT functest_srf0() LIMIT 5; + + -- inlining of set-returning functions CREATE TABLE functest3 (a int); diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index 144c4e9662c..2bbcb43055e 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -2613,6 +2613,8 @@ SPPageDesc SQLDropObject SQLFunctionCache SQLFunctionCachePtr +SQLFunctionHashEntry +SQLFunctionLink SQLFunctionParseInfo SQLFunctionParseInfoPtr SQLValueFunction -- 2.43.5
Hi
We get substantial wins on all of fx, fx3, fx4. fx2 is the
case that gets inlined and never reaches functions.c, so the
lack of change there is expected. What I found odd is that
I saw a small speedup (~6%) on fx5 and fx6; those functions
are in plpgsql so they really shouldn't change either.
The only thing I can think of is that I made the hash key
computation a tiny bit faster by omitting unused argtypes[]
entries. That does avoid hashing several hundred bytes
typically, but it's hard to believe it'd amount to any
visible savings overall.
Anyway, PFA v10.
I can confirm that all tests passed without problems.
Regards
Pavel
regards, tom lane
[1] https://www.postgresql.org/message-id/CAFj8pRDWDeF2cC%2BpCjLHJno7KnK5kdtjYN-f933RHS7UneArFw%40mail.gmail.com
Hi.

Tom Lane wrote on 2025-03-30 19:10:
> I spent some time reading and reworking this code, and have
> arrived at a patch set that I'm pretty happy with.  I'm not
> sure it's quite committable but it's close:
>
> 0005: This extracts the RLS test case you had and commits it
> with the old non-failing behavior, just so that we can see that
> the new code does it differently.  (I didn't adopt your test
> from rules.sql, because AFAICS it works the same with or without
> this patch set.  What was the point of that one again?)

The test was introduced after my failure to handle the case when execution_state is NULL in the fcache->func_state list because the statement was completely removed by an INSTEAD rule. After finding this issue, I added a test to cover it. Not sure whether it should be preserved.

> 0006: The guts of the patch.  I couldn't break this down any
> further.
>
> One big difference from what you had is that there is only one path
> of control: we always use the plan cache.  The hack you had to not
> use it for triggers was only needed because you didn't include the
> right cache key items to distinguish different trigger usages, but
> the code coming from plpgsql has that right.

Yes, now it looks much more consistent. I'm still going through it and will do some additional testing here. So far I have checked all known corner cases and found no issues.

> Also, the memory management is done a bit differently.  The
> "fcontext" memory context holding the SQLFunctionCache struct is
> now discarded at the end of each execution of the SQL function,
> which considerably alleviates worries about leaking memory there.
> I invented a small "SQLFunctionLink" struct that is what fn_extra
> points at, and it survives as long as the FmgrInfo does, so that's
> what saves us from redoing hash key computations in most cases.

I've looked through it and made some tests, including ones which caused me to create a separate context for planning.
I was a bit worried that it is gone, but now, as fcache->fcontext is deleted at the end of function execution, I don't see the leaks that were the initial reason for introducing it.

--
Best regards,
Alexander Pyhalov,
Postgres Professional
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes:
> I've looked through it and made some tests, including ones which
> caused me to create separate context for planing. Was a bit worried
> that it has gone, but now, as fcache->fcontext is deleted in the end
> of function execution, I don't see leaks, which were the initial
> reason for introducing it.

Yeah.  As it's set up in v10, we do parsing work in the caller's context (which is expected to be short-lived) when creating or recreating the long-lived cache entry.  However, planning work (if needed) is done in the fcontext, since that will happen within init_execution_state which is called after fmgr_sql has switched into the fcontext.

I thought about switching back to the caller's context but decided that it wouldn't really be worth the trouble.  For a non-SRF there's no meaningful difference anyway.  For a SRF, it'd mean that planning cruft survives till end of execution of the SRF rather than possibly going away when the first row is returned.  But we aren't looping: any given query within the SRF is planned only once during an execution of the function.  So there's no possibility of indefinite accumulation of leakage.

If we wanted to improve that, my inclination would be to try to not switch into the fcontext for the whole of fmgr_sql, but only use it explicitly for allocations that need to survive.  But I don't think that'll save much, so I think any such change would be best left for later.  The patch is big enough already.

			regards, tom lane
I wrote:
> There's more stuff that could be done, but I feel that all of this
> could be left for later:
> * I really wanted to do what I mentioned upthread and change things
> so we don't even parse the later queries until we've executed the
> ones before that.  However that seems to be a bit of a mess to
> make happen, and the patch is large/complex enough already.

I felt more regret about that decision after reading the discussion at [1] and realizing that we already moved those goalposts part way.  For example, historically this fails:

set check_function_bodies to off;

create or replace function create_and_insert() returns void
language sql as $$
    create table t1 (f1 int);
    insert into t1 values (1.2);
$$;

select create_and_insert();

You get "ERROR: relation "t1" does not exist" because we try to parse-analyze the INSERT before the CREATE has been executed.  That's still true with patch v10.  However, consider this:

create table t1 (f1 int);

create or replace function alter_and_insert() returns void
language sql as $$
    alter table t1 alter column f1 type numeric;
    insert into t1 values (1.2);
$$;

select alter_and_insert();

Historically that fails with

ERROR:  table row type and query-specified row type do not match
DETAIL:  Table has type numeric at ordinal position 1, but query expects integer.
CONTEXT:  SQL function "alter_and_insert" statement 2

because we built a plan for the INSERT before executing the ALTER.  However, with v10 it works!  The INSERT is parse-analyzed against the old table definition, but that's close enough that we can successfully make a CachedPlanSource.  Then when we come to execute the INSERT, the plancache notices that what it has is already invalidated, so it re-does everything and the correct data type is used.

So that left me quite unsatisfied.  It's not great to modify edge-case semantics a little bit and then come back and modify them some more in the next release; better for it to all happen at once.
Besides which, it'd be quite difficult to document v10's behavior in a fashion that makes any sense to users.

I'd abandoned the idea of delaying parse analysis over the weekend because time was running out and I had enough things to worry about, but I resolved to take another look.  And it turns out not to be very hard at all to get there from where we were in v10: we already did 90% of the restructuring needed to do the processing incrementally.  The main missing thing is we need someplace to stash the raw parse trees or Queries we got from pg_proc until we are ready to do the analyze-rewrite-and-make-a-cached-plan business for them.  I had already had the idea of making a second context inside the SQLFunctionHashEntry to keep those trees in until we don't need them anymore (which we don't once the last CachedPlanSource is made).  So that part wasn't hard.  I then discovered that the error reporting needed some tweaking, which wasn't hard either.

So attached is v11.  0001-0006 are identical to v10, and then 0007 is the delta.  (I'd plan to squash that in final commit, but I thought it'd be easier to review this way.)

Compared to v10, this means that parse analysis work as well as planning work is done in fcontext, so there's a little bit more temporary leakage during the first run of a lazy-evaluation SRF.  I still think that this is not going to amount to anything worth worrying about, but maybe it slightly raises the interest level of not doing everything in fcontext.

Another point is that the order of the subroutines in functions.c is starting to feel rather random.  I left them like this to minimize the amount of pure code motion involved in 0007, but I'm tempted to rearrange them into something closer to order-of-execution before final commit.

Anyway, I feel pretty good about this patch now and am quite content to stop here for PG 18.
			regards, tom lane

[1] https://www.postgresql.org/message-id/flat/4115257.1743441667%40sss.pgh.pa.us#ae47bbce3b234245c530ad06eb819d8c

From 5d2f5e092ef326f72100d6d47ba1b5cb207e62ba Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat, 29 Mar 2025 14:50:45 -0400
Subject: [PATCH v11 1/7] Support cached plans that work from a parse-analyzed
 Query.

Up to now, plancache.c dealt only with raw parse trees as the
starting point for a cached plan.  However, we'd like to use this
infrastructure for SQL functions, and in the case of a new-style
SQL function we'll only have the stored querytree, which
corresponds to an analyzed-but-not-rewritten Query.

Fortunately, we can make plancache.c handle that scenario with only
minor modifications; the biggest change is in
RevalidateCachedQuery() where we will need to apply only
pg_rewrite_query not pg_analyze_and_rewrite.

This patch just installs the infrastructure; there's no caller
as yet.

Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop
---
 src/backend/parser/analyze.c        |  39 +++++++
 src/backend/utils/cache/plancache.c | 158 +++++++++++++++++++++-------
 src/include/parser/analyze.h        |   1 +
 src/include/utils/plancache.h       |  23 +++-
 4 files changed, 179 insertions(+), 42 deletions(-)

diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 76f58b3aca3..1f4d6adda52 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -591,6 +591,45 @@ analyze_requires_snapshot(RawStmt *parseTree)
 	return stmt_requires_parse_analysis(parseTree);
 }
 
+/*
+ * query_requires_rewrite_plan()
+ *		Returns true if rewriting or planning is non-trivial for this Query.
+ *
+ * This is much like stmt_requires_parse_analysis(), but applies one step
+ * further down the pipeline.
+ * + * We do not provide an equivalent of analyze_requires_snapshot(): callers + * can assume that any rewriting or planning activity needs a snapshot. + */ +bool +query_requires_rewrite_plan(Query *query) +{ + bool result; + + if (query->commandType != CMD_UTILITY) + { + /* All optimizable statements require rewriting/planning */ + result = true; + } + else + { + /* This list should match stmt_requires_parse_analysis() */ + switch (nodeTag(query->utilityStmt)) + { + case T_DeclareCursorStmt: + case T_ExplainStmt: + case T_CreateTableAsStmt: + case T_CallStmt: + result = true; + break; + default: + result = false; + break; + } + } + return result; +} + /* * transformDeleteStmt - * transforms a Delete Statement diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c index 6c2979d5c82..5983927a4c2 100644 --- a/src/backend/utils/cache/plancache.c +++ b/src/backend/utils/cache/plancache.c @@ -14,7 +14,7 @@ * Cache invalidation is driven off sinval events. Any CachedPlanSource * that matches the event is marked invalid, as is its generic CachedPlan * if it has one. When (and if) the next demand for a cached plan occurs, - * parse analysis and rewrite is repeated to build a new valid query tree, + * parse analysis and/or rewrite is repeated to build a new valid query tree, * and then planning is performed as normal. We also force re-analysis and * re-planning if the active search_path is different from the previous time * or, if RLS is involved, if the user changes or the RLS environment changes. 
@@ -63,6 +63,7 @@ #include "nodes/nodeFuncs.h" #include "optimizer/optimizer.h" #include "parser/analyze.h" +#include "rewrite/rewriteHandler.h" #include "storage/lmgr.h" #include "tcop/pquery.h" #include "tcop/utility.h" @@ -74,18 +75,6 @@ #include "utils/syscache.h" -/* - * We must skip "overhead" operations that involve database access when the - * cached plan's subject statement is a transaction control command or one - * that requires a snapshot not to be set yet (such as SET or LOCK). More - * generally, statements that do not require parse analysis/rewrite/plan - * activity never need to be revalidated, so we can treat them all like that. - * For the convenience of postgres.c, treat empty statements that way too. - */ -#define StmtPlanRequiresRevalidation(plansource) \ - ((plansource)->raw_parse_tree != NULL && \ - stmt_requires_parse_analysis((plansource)->raw_parse_tree)) - /* * This is the head of the backend's list of "saved" CachedPlanSources (i.e., * those that are in long-lived storage and are examined for sinval events). @@ -100,6 +89,8 @@ static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list); static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_list); static void ReleaseGenericPlan(CachedPlanSource *plansource); +static bool StmtPlanRequiresRevalidation(CachedPlanSource *plansource); +static bool BuildingPlanRequiresSnapshot(CachedPlanSource *plansource); static List *RevalidateCachedQuery(CachedPlanSource *plansource, QueryEnvironment *queryEnv, bool release_generic); @@ -166,7 +157,7 @@ InitPlanCache(void) } /* - * CreateCachedPlan: initially create a plan cache entry. + * CreateCachedPlan: initially create a plan cache entry for a raw parse tree. * * Creation of a cached plan is divided into two steps, CreateCachedPlan and * CompleteCachedPlan. 
CreateCachedPlan should be called after running the @@ -220,6 +211,7 @@ CreateCachedPlan(RawStmt *raw_parse_tree, plansource = (CachedPlanSource *) palloc0(sizeof(CachedPlanSource)); plansource->magic = CACHEDPLANSOURCE_MAGIC; plansource->raw_parse_tree = copyObject(raw_parse_tree); + plansource->analyzed_parse_tree = NULL; plansource->query_string = pstrdup(query_string); MemoryContextSetIdentifier(source_context, plansource->query_string); plansource->commandTag = commandTag; @@ -255,6 +247,34 @@ CreateCachedPlan(RawStmt *raw_parse_tree, return plansource; } +/* + * CreateCachedPlanForQuery: initially create a plan cache entry for a Query. + * + * This is used in the same way as CreateCachedPlan, except that the source + * query has already been through parse analysis, and the plancache will never + * try to re-do that step. + * + * Currently this is used only for new-style SQL functions, where we have a + * Query from the function's prosqlbody, but no source text. The query_string + * is typically empty, but is required anyway. + */ +CachedPlanSource * +CreateCachedPlanForQuery(Query *analyzed_parse_tree, + const char *query_string, + CommandTag commandTag) +{ + CachedPlanSource *plansource; + MemoryContext oldcxt; + + /* Rather than duplicating CreateCachedPlan, just do this: */ + plansource = CreateCachedPlan(NULL, query_string, commandTag); + oldcxt = MemoryContextSwitchTo(plansource->context); + plansource->analyzed_parse_tree = copyObject(analyzed_parse_tree); + MemoryContextSwitchTo(oldcxt); + + return plansource; +} + /* * CreateOneShotCachedPlan: initially create a one-shot plan cache entry. 
* @@ -289,6 +309,7 @@ CreateOneShotCachedPlan(RawStmt *raw_parse_tree, plansource = (CachedPlanSource *) palloc0(sizeof(CachedPlanSource)); plansource->magic = CACHEDPLANSOURCE_MAGIC; plansource->raw_parse_tree = raw_parse_tree; + plansource->analyzed_parse_tree = NULL; plansource->query_string = query_string; plansource->commandTag = commandTag; plansource->param_types = NULL; @@ -566,6 +587,42 @@ ReleaseGenericPlan(CachedPlanSource *plansource) } } +/* + * We must skip "overhead" operations that involve database access when the + * cached plan's subject statement is a transaction control command or one + * that requires a snapshot not to be set yet (such as SET or LOCK). More + * generally, statements that do not require parse analysis/rewrite/plan + * activity never need to be revalidated, so we can treat them all like that. + * For the convenience of postgres.c, treat empty statements that way too. + */ +static bool +StmtPlanRequiresRevalidation(CachedPlanSource *plansource) +{ + if (plansource->raw_parse_tree != NULL) + return stmt_requires_parse_analysis(plansource->raw_parse_tree); + else if (plansource->analyzed_parse_tree != NULL) + return query_requires_rewrite_plan(plansource->analyzed_parse_tree); + /* empty query never needs revalidation */ + return false; +} + +/* + * Determine if creating a plan for this CachedPlanSource requires a snapshot. + * In fact this function matches StmtPlanRequiresRevalidation(), but we want + * to preserve the distinction between stmt_requires_parse_analysis() and + * analyze_requires_snapshot(). 
+ */ +static bool +BuildingPlanRequiresSnapshot(CachedPlanSource *plansource) +{ + if (plansource->raw_parse_tree != NULL) + return analyze_requires_snapshot(plansource->raw_parse_tree); + else if (plansource->analyzed_parse_tree != NULL) + return query_requires_rewrite_plan(plansource->analyzed_parse_tree); + /* empty query never needs a snapshot */ + return false; +} + /* * RevalidateCachedQuery: ensure validity of analyzed-and-rewritten query tree. * @@ -592,7 +649,6 @@ RevalidateCachedQuery(CachedPlanSource *plansource, bool release_generic) { bool snapshot_set; - RawStmt *rawtree; List *tlist; /* transient query-tree list */ List *qlist; /* permanent query-tree list */ TupleDesc resultDesc; @@ -615,7 +671,10 @@ RevalidateCachedQuery(CachedPlanSource *plansource, /* * If the query is currently valid, we should have a saved search_path --- * check to see if that matches the current environment. If not, we want - * to force replan. + * to force replan. (We could almost ignore this consideration when + * working from an analyzed parse tree; but there are scenarios where + * planning can have search_path-dependent results, for example if it + * inlines an old-style SQL function.) */ if (plansource->is_valid) { @@ -662,9 +721,9 @@ RevalidateCachedQuery(CachedPlanSource *plansource, } /* - * Discard the no-longer-useful query tree. (Note: we don't want to do - * this any earlier, else we'd not have been able to release locks - * correctly in the race condition case.) + * Discard the no-longer-useful rewritten query tree. (Note: we don't + * want to do this any earlier, else we'd not have been able to release + * locks correctly in the race condition case.) */ plansource->is_valid = false; plansource->query_list = NIL; @@ -711,25 +770,48 @@ RevalidateCachedQuery(CachedPlanSource *plansource, } /* - * Run parse analysis and rule rewriting. The parser tends to scribble on - * its input, so we must copy the raw parse tree to prevent corruption of - * the cache. 
+ * Run parse analysis (if needed) and rule rewriting. */ - rawtree = copyObject(plansource->raw_parse_tree); - if (rawtree == NULL) - tlist = NIL; - else if (plansource->parserSetup != NULL) - tlist = pg_analyze_and_rewrite_withcb(rawtree, - plansource->query_string, - plansource->parserSetup, - plansource->parserSetupArg, - queryEnv); + if (plansource->raw_parse_tree != NULL) + { + /* Source is raw parse tree */ + RawStmt *rawtree; + + /* + * The parser tends to scribble on its input, so we must copy the raw + * parse tree to prevent corruption of the cache. + */ + rawtree = copyObject(plansource->raw_parse_tree); + if (plansource->parserSetup != NULL) + tlist = pg_analyze_and_rewrite_withcb(rawtree, + plansource->query_string, + plansource->parserSetup, + plansource->parserSetupArg, + queryEnv); + else + tlist = pg_analyze_and_rewrite_fixedparams(rawtree, + plansource->query_string, + plansource->param_types, + plansource->num_params, + queryEnv); + } + else if (plansource->analyzed_parse_tree != NULL) + { + /* Source is pre-analyzed query, so we only need to rewrite */ + Query *analyzed_tree; + + /* The rewriter scribbles on its input, too, so copy */ + analyzed_tree = copyObject(plansource->analyzed_parse_tree); + /* Acquire locks needed before rewriting ... */ + AcquireRewriteLocks(analyzed_tree, true, false); + /* ... 
and do it */ + tlist = pg_rewrite_query(analyzed_tree); + } else - tlist = pg_analyze_and_rewrite_fixedparams(rawtree, - plansource->query_string, - plansource->param_types, - plansource->num_params, - queryEnv); + { + /* Empty query, nothing to do */ + tlist = NIL; + } /* Release snapshot if we got one */ if (snapshot_set) @@ -963,8 +1045,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist, */ snapshot_set = false; if (!ActiveSnapshotSet() && - plansource->raw_parse_tree && - analyze_requires_snapshot(plansource->raw_parse_tree)) + BuildingPlanRequiresSnapshot(plansource)) { PushActiveSnapshot(GetTransactionSnapshot()); snapshot_set = true; @@ -1703,6 +1784,7 @@ CopyCachedPlan(CachedPlanSource *plansource) newsource = (CachedPlanSource *) palloc0(sizeof(CachedPlanSource)); newsource->magic = CACHEDPLANSOURCE_MAGIC; newsource->raw_parse_tree = copyObject(plansource->raw_parse_tree); + newsource->analyzed_parse_tree = copyObject(plansource->analyzed_parse_tree); newsource->query_string = pstrdup(plansource->query_string); MemoryContextSetIdentifier(source_context, newsource->query_string); newsource->commandTag = plansource->commandTag; diff --git a/src/include/parser/analyze.h b/src/include/parser/analyze.h index f1bd18c49f2..f29ed03b476 100644 --- a/src/include/parser/analyze.h +++ b/src/include/parser/analyze.h @@ -52,6 +52,7 @@ extern Query *transformStmt(ParseState *pstate, Node *parseTree); extern bool stmt_requires_parse_analysis(RawStmt *parseTree); extern bool analyze_requires_snapshot(RawStmt *parseTree); +extern bool query_requires_rewrite_plan(Query *query); extern const char *LCS_asString(LockClauseStrength strength); extern void CheckSelectLocking(Query *qry, LockClauseStrength strength); diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h index 199cc323a28..5930fcb50f0 100644 --- a/src/include/utils/plancache.h +++ b/src/include/utils/plancache.h @@ -25,7 +25,8 @@ #include "utils/resowner.h" -/* Forward declaration, 
to avoid including parsenodes.h here */ +/* Forward declarations, to avoid including parsenodes.h here */ +struct Query; struct RawStmt; /* possible values for plan_cache_mode */ @@ -45,12 +46,22 @@ extern PGDLLIMPORT int plan_cache_mode; /* * CachedPlanSource (which might better have been called CachedQuery) - * represents a SQL query that we expect to use multiple times. It stores - * the query source text, the raw parse tree, and the analyzed-and-rewritten + * represents a SQL query that we expect to use multiple times. It stores the + * query source text, the source parse tree, and the analyzed-and-rewritten * query tree, as well as adjunct data. Cache invalidation can happen as a * result of DDL affecting objects used by the query. In that case we discard * the analyzed-and-rewritten query tree, and rebuild it when next needed. * + * There are two ways in which the source query can be represented: either + * as a raw parse tree, or as an analyzed-but-not-rewritten parse tree. + * In the latter case we expect that cache invalidation need not affect + * the parse-analysis results, only the rewriting and planning steps. + * Only one of raw_parse_tree and analyzed_parse_tree can be non-NULL. + * (If both are NULL, the CachedPlanSource represents an empty query.) + * Note that query_string is typically just an empty string when the + * source query is an analyzed parse tree; also, param_types, num_params, + * parserSetup, and parserSetupArg will not be used. + * * An actual execution plan, represented by CachedPlan, is derived from the * CachedPlanSource when we need to execute the query. The plan could be * either generic (usable with any set of plan parameters) or custom (for a @@ -78,7 +89,7 @@ extern PGDLLIMPORT int plan_cache_mode; * though it may be useful if the CachedPlan can be discarded early.) 
* * A CachedPlanSource has two associated memory contexts: one that holds the - * struct itself, the query source text and the raw parse tree, and another + * struct itself, the query source text and the source parse tree, and another * context that holds the rewritten query tree and associated data. This * allows the query tree to be discarded easily when it is invalidated. * @@ -94,6 +105,7 @@ typedef struct CachedPlanSource { int magic; /* should equal CACHEDPLANSOURCE_MAGIC */ struct RawStmt *raw_parse_tree; /* output of raw_parser(), or NULL */ + struct Query *analyzed_parse_tree; /* analyzed parse tree, or NULL */ const char *query_string; /* source text of query */ CommandTag commandTag; /* command tag for query */ Oid *param_types; /* array of parameter type OIDs, or NULL */ @@ -196,6 +208,9 @@ extern void ReleaseAllPlanCacheRefsInOwner(ResourceOwner owner); extern CachedPlanSource *CreateCachedPlan(struct RawStmt *raw_parse_tree, const char *query_string, CommandTag commandTag); +extern CachedPlanSource *CreateCachedPlanForQuery(struct Query *analyzed_parse_tree, + const char *query_string, + CommandTag commandTag); extern CachedPlanSource *CreateOneShotCachedPlan(struct RawStmt *raw_parse_tree, const char *query_string, CommandTag commandTag); -- 2.43.5 From 0394a86c27cf5e82ce0574f951392428f80fe4b7 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat, 29 Mar 2025 14:51:37 -0400 Subject: [PATCH v11 2/7] Provide a post-rewrite callback hook in plancache.c. SQL-language functions sometimes want to modify the targetlist of the query that returns their result. If they're to use the plan cache, it needs to be possible to do that over again when a replan occurs. Invent a callback hook to make that happen. I chose to provide a separate function SetPostRewriteHook to install such hooks. An alternative API could be to add two more arguments to CompleteCachedPlan. 
I didn't do so because I felt that few callers will want this, but there's an argument that that way would be cleaner. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop --- src/backend/utils/cache/plancache.c | 33 +++++++++++++++++++++++++++++ src/include/utils/plancache.h | 8 +++++++ src/tools/pgindent/typedefs.list | 1 + 3 files changed, 42 insertions(+) diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c index 5983927a4c2..3b681647060 100644 --- a/src/backend/utils/cache/plancache.c +++ b/src/backend/utils/cache/plancache.c @@ -219,6 +219,8 @@ CreateCachedPlan(RawStmt *raw_parse_tree, plansource->num_params = 0; plansource->parserSetup = NULL; plansource->parserSetupArg = NULL; + plansource->postRewrite = NULL; + plansource->postRewriteArg = NULL; plansource->cursor_options = 0; plansource->fixed_result = false; plansource->resultDesc = NULL; @@ -316,6 +318,8 @@ CreateOneShotCachedPlan(RawStmt *raw_parse_tree, plansource->num_params = 0; plansource->parserSetup = NULL; plansource->parserSetupArg = NULL; + plansource->postRewrite = NULL; + plansource->postRewriteArg = NULL; plansource->cursor_options = 0; plansource->fixed_result = false; plansource->resultDesc = NULL; @@ -485,6 +489,29 @@ CompleteCachedPlan(CachedPlanSource *plansource, plansource->is_valid = true; } +/* + * SetPostRewriteHook: set a hook to modify post-rewrite query trees + * + * Some callers have a need to modify the query trees between rewriting and + * planning. In the initial call to CompleteCachedPlan, it's assumed such + * work was already done on the querytree_list. However, if we're forced + * to replan, it will need to be done over. The caller can set this hook + * to provide code to make that happen. + * + * postRewriteArg is just passed verbatim to the hook. As with parserSetupArg, + * it is caller's responsibility that the referenced data remains + * valid for as long as the CachedPlanSource exists.
+ */ +void +SetPostRewriteHook(CachedPlanSource *plansource, + PostRewriteHook postRewrite, + void *postRewriteArg) +{ + Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC); + plansource->postRewrite = postRewrite; + plansource->postRewriteArg = postRewriteArg; +} + /* * SaveCachedPlan: save a cached plan permanently * @@ -813,6 +840,10 @@ RevalidateCachedQuery(CachedPlanSource *plansource, tlist = NIL; } + /* Apply post-rewrite callback if there is one */ + if (plansource->postRewrite != NULL) + plansource->postRewrite(tlist, plansource->postRewriteArg); + /* Release snapshot if we got one */ if (snapshot_set) PopActiveSnapshot(); @@ -1800,6 +1831,8 @@ CopyCachedPlan(CachedPlanSource *plansource) newsource->num_params = plansource->num_params; newsource->parserSetup = plansource->parserSetup; newsource->parserSetupArg = plansource->parserSetupArg; + newsource->postRewrite = plansource->postRewrite; + newsource->postRewriteArg = plansource->postRewriteArg; newsource->cursor_options = plansource->cursor_options; newsource->fixed_result = plansource->fixed_result; if (plansource->resultDesc) diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h index 5930fcb50f0..07ec5318db7 100644 --- a/src/include/utils/plancache.h +++ b/src/include/utils/plancache.h @@ -40,6 +40,9 @@ typedef enum /* GUC parameter */ extern PGDLLIMPORT int plan_cache_mode; +/* Optional callback to editorialize on rewritten parse trees */ +typedef void (*PostRewriteHook) (List *querytree_list, void *arg); + #define CACHEDPLANSOURCE_MAGIC 195726186 #define CACHEDPLAN_MAGIC 953717834 #define CACHEDEXPR_MAGIC 838275847 @@ -112,6 +115,8 @@ typedef struct CachedPlanSource int num_params; /* length of param_types array */ ParserSetupHook parserSetup; /* alternative parameter spec method */ void *parserSetupArg; + PostRewriteHook postRewrite; /* see SetPostRewriteHook */ + void *postRewriteArg; int cursor_options; /* cursor options used for planning */ bool fixed_result; /* disallow 
change in result tupdesc? */ TupleDesc resultDesc; /* result type; NULL = doesn't return tuples */ @@ -223,6 +228,9 @@ extern void CompleteCachedPlan(CachedPlanSource *plansource, void *parserSetupArg, int cursor_options, bool fixed_result); +extern void SetPostRewriteHook(CachedPlanSource *plansource, + PostRewriteHook postRewrite, + void *postRewriteArg); extern void SaveCachedPlan(CachedPlanSource *plansource); extern void DropCachedPlan(CachedPlanSource *plansource); diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index b66cecd8799..ff75a508876 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -2266,6 +2266,7 @@ PortalHashEnt PortalStatus PortalStrategy PostParseColumnRefHook +PostRewriteHook PostgresPollingStatusType PostingItem PreParseColumnRefHook -- 2.43.5 From 8911e13329809ce2c44aecd82a460e54ad164d25 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat, 29 Mar 2025 16:56:23 -0400 Subject: [PATCH v11 3/7] Factor out plpgsql's management of its function cache. SQL-language functions need precisely this same functionality to manage a long-lived cache of functions. Rather than duplicating or reinventing that code, let's split it out into a new module funccache.c so that it is available for any language that wants to use it. This is mostly an exercise in moving and renaming code, and should not change any behavior. I have added one feature that plpgsql doesn't use but SQL functions will need: the cache lookup key can include the output tuple descriptor when the function returns composite. 
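[Editorial aside, not part of the patch: the hash/match split that funccache.c's cfunc_hash()/cfunc_match() use — hashing the fixed key fields up to callResultType via offsetof(), then folding in only the *used* portion of the argtypes array — can be sketched as a standalone toy. Everything below (DemoKey, fnv1a, demo_hash, demo_match) is an invented stand-in for CachedFunctionHashKey, hash_any(), and the real functions; it is plain C, not PostgreSQL code.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Simplified stand-in for CachedFunctionHashKey: fixed fields first,
 * then a variable-length argument-type array.  Only the first "nargs"
 * entries of argtypes are significant, so hashing and matching must
 * stop there rather than covering the whole array.
 */
typedef struct DemoKey
{
    uint32_t    funcOid;
    int32_t     nargs;
    uint32_t    argtypes[8];
} DemoKey;

/* Toy byte-range hash (FNV-1a), standing in for hash_any(). */
static uint32_t
fnv1a(const unsigned char *p, size_t len, uint32_t h)
{
    while (len-- > 0)
        h = (h ^ *p++) * 16777619u;
    return h;
}

static uint32_t
demo_hash(const DemoKey *k)
{
    /* Hash the fixed fields, mirroring cfunc_hash()'s offsetof() split */
    uint32_t    h = fnv1a((const unsigned char *) k,
                          offsetof(DemoKey, argtypes), 2166136261u);

    /* ... then incorporate only the used argtypes entries */
    return fnv1a((const unsigned char *) k->argtypes,
                 k->nargs * sizeof(uint32_t), h);
}

static int
demo_match(const DemoKey *a, const DemoKey *b)
{
    /* Compare fixed fields as one memcmp over the struct prefix */
    if (memcmp(a, b, offsetof(DemoKey, argtypes)) != 0)
        return 0;
    /* nargs already known equal, so compare just the used entries */
    return memcmp(a->argtypes, b->argtypes,
                  a->nargs * sizeof(uint32_t)) == 0;
}
```

The payoff is the same as in the patch: two keys that agree on the fixed fields and the first nargs argument types hash and compare as equal even if garbage sits in the unused tail of the array (the real code memset()s pad bytes for the same reason, in compute_function_hashkey()).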
Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop --- src/backend/utils/cache/Makefile | 1 + src/backend/utils/cache/funccache.c | 612 ++++++++++++++++++++++++++++ src/backend/utils/cache/meson.build | 1 + src/include/utils/funccache.h | 134 ++++++ src/pl/plpgsql/src/pl_comp.c | 433 ++------------------ src/pl/plpgsql/src/pl_funcs.c | 9 +- src/pl/plpgsql/src/pl_handler.c | 15 +- src/pl/plpgsql/src/plpgsql.h | 45 +- src/tools/pgindent/typedefs.list | 5 + 9 files changed, 811 insertions(+), 444 deletions(-) create mode 100644 src/backend/utils/cache/funccache.c create mode 100644 src/include/utils/funccache.h diff --git a/src/backend/utils/cache/Makefile b/src/backend/utils/cache/Makefile index 5105018cb79..77b3e1a037b 100644 --- a/src/backend/utils/cache/Makefile +++ b/src/backend/utils/cache/Makefile @@ -16,6 +16,7 @@ OBJS = \ attoptcache.o \ catcache.o \ evtcache.o \ + funccache.o \ inval.o \ lsyscache.o \ partcache.o \ diff --git a/src/backend/utils/cache/funccache.c b/src/backend/utils/cache/funccache.c new file mode 100644 index 00000000000..203d17f2459 --- /dev/null +++ b/src/backend/utils/cache/funccache.c @@ -0,0 +1,612 @@ +/*------------------------------------------------------------------------- + * + * funccache.c + * Function cache management. + * + * funccache.c manages a cache of function execution data. The cache + * is used by SQL-language and PL/pgSQL functions, and could be used by + * other function languages. Each cache entry is specific to the execution + * of a particular function (identified by OID) with specific input data + * types; so a polymorphic function could have many associated cache entries. + * Trigger functions similarly have a cache entry per trigger. These rules + * allow the cached data to be specific to the particular data types the + * function call will be dealing with. 
+ * + * + * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * IDENTIFICATION + * src/backend/utils/cache/funccache.c + * + *------------------------------------------------------------------------- + */ +#include "postgres.h" + +#include "commands/event_trigger.h" +#include "commands/trigger.h" +#include "common/hashfn.h" +#include "funcapi.h" +#include "catalog/pg_proc.h" +#include "utils/funccache.h" +#include "utils/hsearch.h" +#include "utils/syscache.h" + + +/* + * Hash table for cached functions + */ +static HTAB *cfunc_hashtable = NULL; + +typedef struct CachedFunctionHashEntry +{ + CachedFunctionHashKey key; /* hash key, must be first */ + CachedFunction *function; /* points to data of language-specific size */ +} CachedFunctionHashEntry; + +#define FUNCS_PER_USER 128 /* initial table size */ + +static uint32 cfunc_hash(const void *key, Size keysize); +static int cfunc_match(const void *key1, const void *key2, Size keysize); + + +/* + * Initialize the hash table on first use. + * + * The hash table will be in TopMemoryContext regardless of caller's context. + */ +static void +cfunc_hashtable_init(void) +{ + HASHCTL ctl; + + /* don't allow double-initialization */ + Assert(cfunc_hashtable == NULL); + + ctl.keysize = sizeof(CachedFunctionHashKey); + ctl.entrysize = sizeof(CachedFunctionHashEntry); + ctl.hash = cfunc_hash; + ctl.match = cfunc_match; + cfunc_hashtable = hash_create("Cached function hash", + FUNCS_PER_USER, + &ctl, + HASH_ELEM | HASH_FUNCTION | HASH_COMPARE); +} + +/* + * cfunc_hash: hash function for cfunc hash table + * + * We need special hash and match functions to deal with the optional + * presence of a TupleDesc in the hash keys. As long as we have to do + * that, we might as well also be smart about not comparing unused + * elements of the argtypes arrays. 
+ */ +static uint32 +cfunc_hash(const void *key, Size keysize) +{ + const CachedFunctionHashKey *k = (const CachedFunctionHashKey *) key; + uint32 h; + + Assert(keysize == sizeof(CachedFunctionHashKey)); + /* Hash all the fixed fields except callResultType */ + h = DatumGetUInt32(hash_any((const unsigned char *) k, + offsetof(CachedFunctionHashKey, callResultType))); + /* Incorporate input argument types */ + if (k->nargs > 0) + h = hash_combine(h, + DatumGetUInt32(hash_any((const unsigned char *) k->argtypes, + k->nargs * sizeof(Oid)))); + /* Incorporate callResultType if present */ + if (k->callResultType) + h = hash_combine(h, hashRowType(k->callResultType)); + return h; +} + +/* + * cfunc_match: match function to use with cfunc_hash + */ +static int +cfunc_match(const void *key1, const void *key2, Size keysize) +{ + const CachedFunctionHashKey *k1 = (const CachedFunctionHashKey *) key1; + const CachedFunctionHashKey *k2 = (const CachedFunctionHashKey *) key2; + + Assert(keysize == sizeof(CachedFunctionHashKey)); + /* Compare all the fixed fields except callResultType */ + if (memcmp(k1, k2, offsetof(CachedFunctionHashKey, callResultType)) != 0) + return 1; /* not equal */ + /* Compare input argument types (we just verified that nargs matches) */ + if (k1->nargs > 0 && + memcmp(k1->argtypes, k2->argtypes, k1->nargs * sizeof(Oid)) != 0) + return 1; /* not equal */ + /* Compare callResultType */ + if (k1->callResultType) + { + if (k2->callResultType) + { + if (!equalRowTypes(k1->callResultType, k2->callResultType)) + return 1; /* not equal */ + } + else + return 1; /* not equal */ + } + else + { + if (k2->callResultType) + return 1; /* not equal */ + } + return 0; /* equal */ +} + +/* + * Look up the CachedFunction for the given hash key. + * Returns NULL if not present. 
+ */ +static CachedFunction * +cfunc_hashtable_lookup(CachedFunctionHashKey *func_key) +{ + CachedFunctionHashEntry *hentry; + + if (cfunc_hashtable == NULL) + return NULL; + + hentry = (CachedFunctionHashEntry *) hash_search(cfunc_hashtable, + func_key, + HASH_FIND, + NULL); + if (hentry) + return hentry->function; + else + return NULL; +} + +/* + * Insert a hash table entry. + */ +static void +cfunc_hashtable_insert(CachedFunction *function, + CachedFunctionHashKey *func_key) +{ + CachedFunctionHashEntry *hentry; + bool found; + + if (cfunc_hashtable == NULL) + cfunc_hashtable_init(); + + hentry = (CachedFunctionHashEntry *) hash_search(cfunc_hashtable, + func_key, + HASH_ENTER, + &found); + if (found) + elog(WARNING, "trying to insert a function that already exists"); + + /* + * If there's a callResultType, copy it into TopMemoryContext. If we're + * unlucky enough for that to fail, leave the entry with null + * callResultType, which will probably never match anything. + */ + if (func_key->callResultType) + { + MemoryContext oldcontext = MemoryContextSwitchTo(TopMemoryContext); + + hentry->key.callResultType = NULL; + hentry->key.callResultType = CreateTupleDescCopy(func_key->callResultType); + MemoryContextSwitchTo(oldcontext); + } + + hentry->function = function; + + /* Set back-link from function to hashtable key */ + function->fn_hashkey = &hentry->key; +} + +/* + * Delete a hash table entry. + */ +static void +cfunc_hashtable_delete(CachedFunction *function) +{ + CachedFunctionHashEntry *hentry; + TupleDesc tupdesc; + + /* do nothing if not in table */ + if (function->fn_hashkey == NULL) + return; + + /* + * We need to free the callResultType if present, which is slightly tricky + * because it has to be valid during the hashtable search. Fortunately, + * because we have the hashkey back-link, we can grab that pointer before + * deleting the hashtable entry. 
+ */ + tupdesc = function->fn_hashkey->callResultType; + + hentry = (CachedFunctionHashEntry *) hash_search(cfunc_hashtable, + function->fn_hashkey, + HASH_REMOVE, + NULL); + if (hentry == NULL) + elog(WARNING, "trying to delete function that does not exist"); + + /* Remove back link, which no longer points to allocated storage */ + function->fn_hashkey = NULL; + + /* Release the callResultType if present */ + if (tupdesc) + FreeTupleDesc(tupdesc); +} + +/* + * Compute the hashkey for a given function invocation + * + * The hashkey is returned into the caller-provided storage at *hashkey. + * Note however that if a callResultType is incorporated, we've not done + * anything about copying that. + */ +static void +compute_function_hashkey(FunctionCallInfo fcinfo, + Form_pg_proc procStruct, + CachedFunctionHashKey *hashkey, + Size cacheEntrySize, + bool includeResultType, + bool forValidator) +{ + /* Make sure pad bytes within fixed part of the struct are zero */ + memset(hashkey, 0, offsetof(CachedFunctionHashKey, argtypes)); + + /* get function OID */ + hashkey->funcOid = fcinfo->flinfo->fn_oid; + + /* get call context */ + hashkey->isTrigger = CALLED_AS_TRIGGER(fcinfo); + hashkey->isEventTrigger = CALLED_AS_EVENT_TRIGGER(fcinfo); + + /* record cacheEntrySize so multiple languages can share hash table */ + hashkey->cacheEntrySize = cacheEntrySize; + + /* + * If DML trigger, include trigger's OID in the hash, so that each trigger + * usage gets a different hash entry, allowing for e.g. different relation + * rowtypes or transition table names. In validation mode we do not know + * what relation or transition table names are intended to be used, so we + * leave trigOid zero; the hash entry built in this case will never be + * used for any actual calls. + * + * We don't currently need to distinguish different event trigger usages + * in the same way, since the special parameter variables don't vary in + * type in that case. 
+ */ + if (hashkey->isTrigger && !forValidator) + { + TriggerData *trigdata = (TriggerData *) fcinfo->context; + + hashkey->trigOid = trigdata->tg_trigger->tgoid; + } + + /* get input collation, if known */ + hashkey->inputCollation = fcinfo->fncollation; + + /* + * We include only input arguments in the hash key, since output argument + * types can be deduced from those, and it would require extra cycles to + * include the output arguments. But we have to resolve any polymorphic + * argument types to the real types for the call. + */ + if (procStruct->pronargs > 0) + { + hashkey->nargs = procStruct->pronargs; + memcpy(hashkey->argtypes, procStruct->proargtypes.values, + procStruct->pronargs * sizeof(Oid)); + cfunc_resolve_polymorphic_argtypes(procStruct->pronargs, + hashkey->argtypes, + NULL, /* all args are inputs */ + fcinfo->flinfo->fn_expr, + forValidator, + NameStr(procStruct->proname)); + } + + /* + * While regular OUT arguments are sufficiently represented by the + * resolved input arguments, a function returning composite has additional + * variability: ALTER TABLE/ALTER TYPE could affect what it returns. Also, + * a function returning RECORD may depend on a column definition list to + * determine its output rowtype. If the caller needs the exact result + * type to be part of the hash lookup key, we must run + * get_call_result_type() to find that out. + */ + if (includeResultType) + { + Oid resultTypeId; + TupleDesc tupdesc; + + switch (get_call_result_type(fcinfo, &resultTypeId, &tupdesc)) + { + case TYPEFUNC_COMPOSITE: + case TYPEFUNC_COMPOSITE_DOMAIN: + hashkey->callResultType = tupdesc; + break; + default: + /* scalar result, or indeterminate rowtype */ + break; + } + } +} + +/* + * This is the same as the standard resolve_polymorphic_argtypes() function, + * except that: + * 1. We go ahead and report the error if we can't resolve the types. + * 2. 
We treat RECORD-type input arguments (not output arguments) as if + * they were polymorphic, replacing their types with the actual input + * types if we can determine those. This allows us to create a separate + * function cache entry for each named composite type passed to such an + * argument. + * 3. In validation mode, we have no inputs to look at, so assume that + * polymorphic arguments are integer, integer-array or integer-range. + */ +void +cfunc_resolve_polymorphic_argtypes(int numargs, + Oid *argtypes, char *argmodes, + Node *call_expr, bool forValidator, + const char *proname) +{ + int i; + + if (!forValidator) + { + int inargno; + + /* normal case, pass to standard routine */ + if (!resolve_polymorphic_argtypes(numargs, argtypes, argmodes, + call_expr)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("could not determine actual argument " + "type for polymorphic function \"%s\"", + proname))); + /* also, treat RECORD inputs (but not outputs) as polymorphic */ + inargno = 0; + for (i = 0; i < numargs; i++) + { + char argmode = argmodes ? 
argmodes[i] : PROARGMODE_IN; + + if (argmode == PROARGMODE_OUT || argmode == PROARGMODE_TABLE) + continue; + if (argtypes[i] == RECORDOID || argtypes[i] == RECORDARRAYOID) + { + Oid resolvedtype = get_call_expr_argtype(call_expr, + inargno); + + if (OidIsValid(resolvedtype)) + argtypes[i] = resolvedtype; + } + inargno++; + } + } + else + { + /* special validation case (no need to do anything for RECORD) */ + for (i = 0; i < numargs; i++) + { + switch (argtypes[i]) + { + case ANYELEMENTOID: + case ANYNONARRAYOID: + case ANYENUMOID: /* XXX dubious */ + case ANYCOMPATIBLEOID: + case ANYCOMPATIBLENONARRAYOID: + argtypes[i] = INT4OID; + break; + case ANYARRAYOID: + case ANYCOMPATIBLEARRAYOID: + argtypes[i] = INT4ARRAYOID; + break; + case ANYRANGEOID: + case ANYCOMPATIBLERANGEOID: + argtypes[i] = INT4RANGEOID; + break; + case ANYMULTIRANGEOID: + argtypes[i] = INT4MULTIRANGEOID; + break; + default: + break; + } + } + } +} + +/* + * delete_function - clean up as much as possible of a stale function cache + * + * We can't release the CachedFunction struct itself, because of the + * possibility that there are fn_extra pointers to it. We can release + * the subsidiary storage, but only if there are no active evaluations + * in progress. Otherwise we'll just leak that storage. Since the + * case would only occur if a pg_proc update is detected during a nested + * recursive call on the function, a leak seems acceptable. + * + * Note that this can be called more than once if there are multiple fn_extra + * pointers to the same function cache. Hence be careful not to do things + * twice. 
+ */ +static void +delete_function(CachedFunction *func) +{ + /* remove function from hash table (might be done already) */ + cfunc_hashtable_delete(func); + + /* release the function's storage if safe and not done already */ + if (func->use_count == 0 && + func->dcallback != NULL) + { + func->dcallback(func); + func->dcallback = NULL; + } +} + +/* + * Compile a cached function, if no existing cache entry is suitable. + * + * fcinfo is the current call information. + * + * function should be NULL or the result of a previous call of + * cached_function_compile() for the same fcinfo. The caller will + * typically save the result in fcinfo->flinfo->fn_extra, or in a + * field of a struct pointed to by fn_extra, to re-use in later + * calls within the same query. + * + * ccallback and dcallback are function-language-specific callbacks to + * compile and delete a cached function entry. dcallback can be NULL + * if there's nothing for it to do. + * + * cacheEntrySize is the function-language-specific size of the cache entry + * (which embeds a CachedFunction struct and typically has many more fields + * after that). + * + * If includeResultType is true and the function returns composite, + * include the actual result descriptor in the cache lookup key. + * + * If forValidator is true, we're only compiling for validation purposes, + * and so some checks are skipped. + * + * Note: it's important for this to fall through quickly if the function + * has already been compiled. + * + * Note: this function leaves the "use_count" field as zero. The caller + * is expected to increment the use_count and decrement it when done with + * the cache entry. 
+ */ +CachedFunction * +cached_function_compile(FunctionCallInfo fcinfo, + CachedFunction *function, + CachedFunctionCompileCallback ccallback, + CachedFunctionDeleteCallback dcallback, + Size cacheEntrySize, + bool includeResultType, + bool forValidator) +{ + Oid funcOid = fcinfo->flinfo->fn_oid; + HeapTuple procTup; + Form_pg_proc procStruct; + CachedFunctionHashKey hashkey; + bool function_valid = false; + bool hashkey_valid = false; + + /* + * Lookup the pg_proc tuple by Oid; we'll need it in any case + */ + procTup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcOid)); + if (!HeapTupleIsValid(procTup)) + elog(ERROR, "cache lookup failed for function %u", funcOid); + procStruct = (Form_pg_proc) GETSTRUCT(procTup); + + /* + * Do we already have a cache entry for the current FmgrInfo? If not, try + * to find one in the hash table. + */ +recheck: + if (!function) + { + /* Compute hashkey using function signature and actual arg types */ + compute_function_hashkey(fcinfo, procStruct, &hashkey, + cacheEntrySize, includeResultType, + forValidator); + hashkey_valid = true; + + /* And do the lookup */ + function = cfunc_hashtable_lookup(&hashkey); + } + + if (function) + { + /* We have a compiled function, but is it still valid? */ + if (function->fn_xmin == HeapTupleHeaderGetRawXmin(procTup->t_data) && + ItemPointerEquals(&function->fn_tid, &procTup->t_self)) + function_valid = true; + else + { + /* + * Nope, so remove it from hashtable and try to drop associated + * storage (if not done already). + */ + delete_function(function); + + /* + * If the function isn't in active use then we can overwrite the + * func struct with new data, allowing any other existing fn_extra + * pointers to make use of the new definition on their next use. + * If it is in use then just leave it alone and make a new one. 
+             * (The active invocations will run to completion using the
+             * previous definition, and then the cache entry will just be
+             * leaked; doesn't seem worth adding code to clean it up, given
+             * what a corner case this is.)
+             *
+             * If we found the function struct via fn_extra then it's possible
+             * a replacement has already been made, so go back and recheck the
+             * hashtable.
+             */
+            if (function->use_count != 0)
+            {
+                function = NULL;
+                if (!hashkey_valid)
+                    goto recheck;
+            }
+        }
+    }
+
+    /*
+     * If the function wasn't found or was out-of-date, we have to compile it.
+     */
+    if (!function_valid)
+    {
+        /*
+         * Calculate hashkey if we didn't already; we'll need it to store the
+         * completed function.
+         */
+        if (!hashkey_valid)
+            compute_function_hashkey(fcinfo, procStruct, &hashkey,
+                                     cacheEntrySize, includeResultType,
+                                     forValidator);
+
+        /*
+         * Create the new function struct, if not done already.  The function
+         * structs are never thrown away, so keep them in TopMemoryContext.
+         */
+        Assert(cacheEntrySize >= sizeof(CachedFunction));
+        if (function == NULL)
+        {
+            function = (CachedFunction *)
+                MemoryContextAllocZero(TopMemoryContext, cacheEntrySize);
+        }
+        else
+        {
+            /* re-using a previously existing struct, so clear it out */
+            memset(function, 0, cacheEntrySize);
+        }
+
+        /*
+         * Fill in the CachedFunction part.  fn_hashkey and use_count remain
+         * zeroes for now.
+         */
+        function->fn_xmin = HeapTupleHeaderGetRawXmin(procTup->t_data);
+        function->fn_tid = procTup->t_self;
+        function->dcallback = dcallback;
+
+        /*
+         * Do the hard, language-specific part.
+         */
+        ccallback(fcinfo, procTup, &hashkey, function, forValidator);
+
+        /*
+         * Add the completed struct to the hash table.
+         */
+        cfunc_hashtable_insert(function, &hashkey);
+    }
+
+    ReleaseSysCache(procTup);
+
+    /*
+     * Finally return the compiled function
+     */
+    return function;
+}
diff --git a/src/backend/utils/cache/meson.build b/src/backend/utils/cache/meson.build
index 104b28737d7..a1784dce585 100644
--- a/src/backend/utils/cache/meson.build
+++ b/src/backend/utils/cache/meson.build
@@ -4,6 +4,7 @@ backend_sources += files(
   'attoptcache.c',
   'catcache.c',
   'evtcache.c',
+  'funccache.c',
   'inval.c',
   'lsyscache.c',
   'partcache.c',
diff --git a/src/include/utils/funccache.h b/src/include/utils/funccache.h
new file mode 100644
index 00000000000..e0112ebfa11
--- /dev/null
+++ b/src/include/utils/funccache.h
@@ -0,0 +1,134 @@
+/*-------------------------------------------------------------------------
+ *
+ * funccache.h
+ *      Function cache definitions.
+ *
+ * See funccache.c for comments.
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/funccache.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef FUNCCACHE_H
+#define FUNCCACHE_H
+
+#include "access/htup_details.h"
+#include "fmgr.h"
+#include "storage/itemptr.h"
+
+struct CachedFunctionHashKey;   /* forward references */
+struct CachedFunction;
+
+/*
+ * Callback that cached_function_compile() invokes when it's necessary to
+ * compile a cached function.  The callback must fill in *function (except
+ * for the fields of struct CachedFunction), or throw an error if trouble.
+ * fcinfo: current call information
+ * procTup: function's pg_proc row from catcache
+ * hashkey: hash key that will be used for the function
+ * function: pre-zeroed workspace, of size passed to cached_function_compile()
+ * forValidator: passed through from cached_function_compile()
+ */
+typedef void (*CachedFunctionCompileCallback) (FunctionCallInfo fcinfo,
+                                               HeapTuple procTup,
+                                               const struct CachedFunctionHashKey *hashkey,
+                                               struct CachedFunction *function,
+                                               bool forValidator);
+
+/*
+ * Callback called when discarding a cache entry.  Free any free-able
+ * subsidiary data of cfunc, but not the struct CachedFunction itself.
+ */
+typedef void (*CachedFunctionDeleteCallback) (struct CachedFunction *cfunc);
+
+/*
+ * Hash lookup key for functions.  This must account for all aspects
+ * of a specific call that might lead to different data types or
+ * collations being used within the function.
+ */
+typedef struct CachedFunctionHashKey
+{
+    Oid         funcOid;
+
+    bool        isTrigger;      /* true if called as a DML trigger */
+    bool        isEventTrigger; /* true if called as an event trigger */
+
+    /* be careful that pad bytes in this struct get zeroed! */
+
+    /*
+     * We include the language-specific size of the function's cache entry in
+     * the cache key.  This covers the case where CREATE OR REPLACE FUNCTION
+     * is used to change the implementation language, and the new language
+     * also uses funccache.c but needs a different-sized cache entry.
+     */
+    Size        cacheEntrySize;
+
+    /*
+     * For a trigger function, the OID of the trigger is part of the hash key
+     * --- we want to compile the trigger function separately for each trigger
+     * it is used with, in case the rowtype or transition table names are
+     * different.  Zero if not called as a DML trigger.
+     */
+    Oid         trigOid;
+
+    /*
+     * We must include the input collation as part of the hash key too,
+     * because we have to generate different plans (with different Param
+     * collations) for different collation settings.
+     */
+    Oid         inputCollation;
+
+    /* Number of arguments (counting input arguments only, ie pronargs) */
+    int         nargs;
+
+    /* If you change anything below here, fix hashing code in funccache.c! */
+
+    /*
+     * If relevant, the result descriptor for a function returning composite.
+     */
+    TupleDesc   callResultType;
+
+    /*
+     * Input argument types, with any polymorphic types resolved to actual
+     * types.  Only the first nargs entries are valid.
+     */
+    Oid         argtypes[FUNC_MAX_ARGS];
+} CachedFunctionHashKey;
+
+/*
+ * Representation of a compiled function.  This struct contains just the
+ * fields that funccache.c needs to deal with.  It will typically be
+ * embedded in a larger struct containing function-language-specific data.
+ */
+typedef struct CachedFunction
+{
+    /* back-link to hashtable entry, or NULL if not in hash table */
+    CachedFunctionHashKey *fn_hashkey;
+    /* xmin and ctid of function's pg_proc row; used to detect invalidation */
+    TransactionId fn_xmin;
+    ItemPointerData fn_tid;
+    /* deletion callback */
+    CachedFunctionDeleteCallback dcallback;
+
+    /* this field changes when the function is used: */
+    uint64      use_count;
+} CachedFunction;
+
+extern CachedFunction *cached_function_compile(FunctionCallInfo fcinfo,
+                                               CachedFunction *function,
+                                               CachedFunctionCompileCallback ccallback,
+                                               CachedFunctionDeleteCallback dcallback,
+                                               Size cacheEntrySize,
+                                               bool includeResultType,
+                                               bool forValidator);
+extern void cfunc_resolve_polymorphic_argtypes(int numargs,
+                                               Oid *argtypes,
+                                               char *argmodes,
+                                               Node *call_expr,
+                                               bool forValidator,
+                                               const char *proname);
+
+#endif                          /* FUNCCACHE_H */
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 6fdba95962d..1a091d0c55f 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -52,20 +52,6 @@ PLpgSQL_function *plpgsql_curr_compile;
 /* A context appropriate for short-term allocs during compilation */
 MemoryContext plpgsql_compile_tmp_cxt;
 
-/* ----------
- * Hash table for compiled functions
- * ----------
- */
-static HTAB *plpgsql_HashTable = NULL;
-
-typedef struct plpgsql_hashent
-{
-    PLpgSQL_func_hashkey key;
-    PLpgSQL_function *function;
-} plpgsql_HashEnt;
-
-#define FUNCS_PER_USER      128 /* initial table size */
-
 /* ----------
  * Lookup table for EXCEPTION condition names
  * ----------
@@ -86,11 +72,11 @@ static const ExceptionLabelMap exception_label_map[] = {
  * static prototypes
  * ----------
  */
-static PLpgSQL_function *do_compile(FunctionCallInfo fcinfo,
-                                    HeapTuple procTup,
-                                    PLpgSQL_function *function,
-                                    PLpgSQL_func_hashkey *hashkey,
-                                    bool forValidator);
+static void plpgsql_compile_callback(FunctionCallInfo fcinfo,
+                                     HeapTuple procTup,
+                                     const CachedFunctionHashKey *hashkey,
+                                     CachedFunction *cfunc,
+                                     bool forValidator);
 static void plpgsql_compile_error_callback(void *arg);
 static void add_parameter_name(PLpgSQL_nsitem_type itemtype, int itemno,
                                const char *name);
 static void add_dummy_return(PLpgSQL_function *function);
@@ -105,19 +91,6 @@ static PLpgSQL_type *build_datatype(HeapTuple typeTup, int32 typmod,
                                     Oid collation, TypeName *origtypname);
 static void plpgsql_start_datums(void);
 static void plpgsql_finish_datums(PLpgSQL_function *function);
-static void compute_function_hashkey(FunctionCallInfo fcinfo,
-                                     Form_pg_proc procStruct,
-                                     PLpgSQL_func_hashkey *hashkey,
-                                     bool forValidator);
-static void plpgsql_resolve_polymorphic_argtypes(int numargs,
-                                                 Oid *argtypes, char *argmodes,
-                                                 Node *call_expr, bool forValidator,
-                                                 const char *proname);
-static PLpgSQL_function *plpgsql_HashTableLookup(PLpgSQL_func_hashkey *func_key);
-static void plpgsql_HashTableInsert(PLpgSQL_function *function,
-                                    PLpgSQL_func_hashkey *func_key);
-static void plpgsql_HashTableDelete(PLpgSQL_function *function);
-static void delete_function(PLpgSQL_function *func);
 
 /* ----------
  * plpgsql_compile        Make an execution tree for a PL/pgSQL function.
@@ -132,97 +105,24 @@ static void delete_function(PLpgSQL_function *func);
 PLpgSQL_function *
 plpgsql_compile(FunctionCallInfo fcinfo, bool forValidator)
 {
-    Oid         funcOid = fcinfo->flinfo->fn_oid;
-    HeapTuple   procTup;
-    Form_pg_proc procStruct;
     PLpgSQL_function *function;
-    PLpgSQL_func_hashkey hashkey;
-    bool        function_valid = false;
-    bool        hashkey_valid = false;
-
-    /*
-     * Lookup the pg_proc tuple by Oid; we'll need it in any case
-     */
-    procTup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcOid));
-    if (!HeapTupleIsValid(procTup))
-        elog(ERROR, "cache lookup failed for function %u", funcOid);
-    procStruct = (Form_pg_proc) GETSTRUCT(procTup);
-
-    /*
-     * See if there's already a cache entry for the current FmgrInfo.  If not,
-     * try to find one in the hash table.
-     */
-    function = (PLpgSQL_function *) fcinfo->flinfo->fn_extra;
-
-recheck:
-    if (!function)
-    {
-        /* Compute hashkey using function signature and actual arg types */
-        compute_function_hashkey(fcinfo, procStruct, &hashkey, forValidator);
-        hashkey_valid = true;
-
-        /* And do the lookup */
-        function = plpgsql_HashTableLookup(&hashkey);
-    }
-
-    if (function)
-    {
-        /* We have a compiled function, but is it still valid? */
-        if (function->fn_xmin == HeapTupleHeaderGetRawXmin(procTup->t_data) &&
-            ItemPointerEquals(&function->fn_tid, &procTup->t_self))
-            function_valid = true;
-        else
-        {
-            /*
-             * Nope, so remove it from hashtable and try to drop associated
-             * storage (if not done already).
-             */
-            delete_function(function);
-
-            /*
-             * If the function isn't in active use then we can overwrite the
-             * func struct with new data, allowing any other existing fn_extra
-             * pointers to make use of the new definition on their next use.
-             * If it is in use then just leave it alone and make a new one.
-             * (The active invocations will run to completion using the
-             * previous definition, and then the cache entry will just be
-             * leaked; doesn't seem worth adding code to clean it up, given
-             * what a corner case this is.)
-             *
-             * If we found the function struct via fn_extra then it's possible
-             * a replacement has already been made, so go back and recheck the
-             * hashtable.
-             */
-            if (function->use_count != 0)
-            {
-                function = NULL;
-                if (!hashkey_valid)
-                    goto recheck;
-            }
-        }
-    }
 
     /*
-     * If the function wasn't found or was out-of-date, we have to compile it
+     * funccache.c manages re-use of existing PLpgSQL_function caches.
+     *
+     * In PL/pgSQL we use fn_extra directly as the pointer to the long-lived
+     * function cache entry; we have no need for any query-lifespan cache.
+     * Also, we don't need to make the cache key depend on composite result
+     * type (at least for now).
      */
-    if (!function_valid)
-    {
-        /*
-         * Calculate hashkey if we didn't already; we'll need it to store the
-         * completed function.
-         */
-        if (!hashkey_valid)
-            compute_function_hashkey(fcinfo, procStruct, &hashkey,
-                                     forValidator);
-
-        /*
-         * Do the hard part.
-         */
-        function = do_compile(fcinfo, procTup, function,
-                              &hashkey, forValidator);
-    }
-
-    ReleaseSysCache(procTup);
+    function = (PLpgSQL_function *)
+        cached_function_compile(fcinfo,
+                                fcinfo->flinfo->fn_extra,
+                                plpgsql_compile_callback,
+                                plpgsql_delete_callback,
+                                sizeof(PLpgSQL_function),
+                                false,
+                                forValidator);
 
     /*
      * Save pointer in FmgrInfo to avoid search on subsequent calls
@@ -244,8 +144,8 @@ struct compile_error_callback_arg
 /*
  * This is the slow part of plpgsql_compile().
  *
- * The passed-in "function" pointer is either NULL or an already-allocated
- * function struct to overwrite.
+ * The passed-in "cfunc" struct is expected to be zeroes, except
+ * for the CachedFunction fields, which we don't touch here.
  *
  * While compiling a function, the CurrentMemoryContext is the
  * per-function memory context of the function we are compiling.  That
@@ -263,13 +163,14 @@ struct compile_error_callback_arg
  * NB: this code is not re-entrant.  We assume that nothing we do here could
  * result in the invocation of another plpgsql function.
  */
-static PLpgSQL_function *
-do_compile(FunctionCallInfo fcinfo,
-           HeapTuple procTup,
-           PLpgSQL_function *function,
-           PLpgSQL_func_hashkey *hashkey,
-           bool forValidator)
+static void
+plpgsql_compile_callback(FunctionCallInfo fcinfo,
+                         HeapTuple procTup,
+                         const CachedFunctionHashKey *hashkey,
+                         CachedFunction *cfunc,
+                         bool forValidator)
 {
+    PLpgSQL_function *function = (PLpgSQL_function *) cfunc;
     Form_pg_proc procStruct = (Form_pg_proc) GETSTRUCT(procTup);
     bool        is_dml_trigger = CALLED_AS_TRIGGER(fcinfo);
     bool        is_event_trigger = CALLED_AS_EVENT_TRIGGER(fcinfo);
@@ -320,21 +221,6 @@ do_compile(FunctionCallInfo fcinfo,
      * reasons.
      */
     plpgsql_check_syntax = forValidator;
-
-    /*
-     * Create the new function struct, if not done already.  The function
-     * structs are never thrown away, so keep them in TopMemoryContext.
-     */
-    if (function == NULL)
-    {
-        function = (PLpgSQL_function *)
-            MemoryContextAllocZero(TopMemoryContext, sizeof(PLpgSQL_function));
-    }
-    else
-    {
-        /* re-using a previously existing struct, so clear it out */
-        memset(function, 0, sizeof(PLpgSQL_function));
-    }
 
     plpgsql_curr_compile = function;
 
     /*
@@ -349,8 +235,6 @@ do_compile(FunctionCallInfo fcinfo,
     function->fn_signature = format_procedure(fcinfo->flinfo->fn_oid);
     MemoryContextSetIdentifier(func_cxt, function->fn_signature);
     function->fn_oid = fcinfo->flinfo->fn_oid;
-    function->fn_xmin = HeapTupleHeaderGetRawXmin(procTup->t_data);
-    function->fn_tid = procTup->t_self;
     function->fn_input_collation = fcinfo->fncollation;
     function->fn_cxt = func_cxt;
     function->out_param_varno = -1;     /* set up for no OUT param */
@@ -400,10 +284,14 @@ do_compile(FunctionCallInfo fcinfo,
     numargs = get_func_arg_info(procTup,
                                 &argtypes, &argnames, &argmodes);
 
-    plpgsql_resolve_polymorphic_argtypes(numargs, argtypes, argmodes,
-                                         fcinfo->flinfo->fn_expr,
-                                         forValidator,
-                                         plpgsql_error_funcname);
+    /*
+     * XXX can't we get rid of this in favor of using funccache.c's
+     * results?  But why are we considering argmodes here not there??
+     */
+    cfunc_resolve_polymorphic_argtypes(numargs, argtypes, argmodes,
+                                       fcinfo->flinfo->fn_expr,
+                                       forValidator,
+                                       plpgsql_error_funcname);
 
     in_arg_varnos = (int *) palloc(numargs * sizeof(int));
     out_arg_variables = (PLpgSQL_variable **)
         palloc(numargs * sizeof(PLpgSQL_variable *));
@@ -819,11 +707,6 @@ do_compile(FunctionCallInfo fcinfo,
     if (plpgsql_DumpExecTree)
         plpgsql_dumptree(function);
 
-    /*
-     * add it to the hash table
-     */
-    plpgsql_HashTableInsert(function, hashkey);
-
     /*
      * Pop the error context stack
      */
@@ -834,14 +717,13 @@ do_compile(FunctionCallInfo fcinfo,
 
     MemoryContextSwitchTo(plpgsql_compile_tmp_cxt);
     plpgsql_compile_tmp_cxt = NULL;
-    return function;
 }
 
 /* ----------
  * plpgsql_compile_inline    Make an execution tree for an anonymous code block.
  *
- * Note: this is generally parallel to do_compile(); is it worth trying to
- * merge the two?
+ * Note: this is generally parallel to plpgsql_compile_callback(); is it worth
+ * trying to merge the two?
  *
  * Note: we assume the block will be thrown away so there is no need to build
  * persistent data structures.
@@ -2437,242 +2319,3 @@ plpgsql_add_initdatums(int **varnos)
     datums_last = plpgsql_nDatums;
     return n;
 }
-
-
-/*
- * Compute the hashkey for a given function invocation
- *
- * The hashkey is returned into the caller-provided storage at *hashkey.
- */
-static void
-compute_function_hashkey(FunctionCallInfo fcinfo,
-                         Form_pg_proc procStruct,
-                         PLpgSQL_func_hashkey *hashkey,
-                         bool forValidator)
-{
-    /* Make sure any unused bytes of the struct are zero */
-    MemSet(hashkey, 0, sizeof(PLpgSQL_func_hashkey));
-
-    /* get function OID */
-    hashkey->funcOid = fcinfo->flinfo->fn_oid;
-
-    /* get call context */
-    hashkey->isTrigger = CALLED_AS_TRIGGER(fcinfo);
-    hashkey->isEventTrigger = CALLED_AS_EVENT_TRIGGER(fcinfo);
-
-    /*
-     * If DML trigger, include trigger's OID in the hash, so that each trigger
-     * usage gets a different hash entry, allowing for e.g. different relation
-     * rowtypes or transition table names.  In validation mode we do not know
-     * what relation or transition table names are intended to be used, so we
-     * leave trigOid zero; the hash entry built in this case will never be
-     * used for any actual calls.
-     *
-     * We don't currently need to distinguish different event trigger usages
-     * in the same way, since the special parameter variables don't vary in
-     * type in that case.
-     */
-    if (hashkey->isTrigger && !forValidator)
-    {
-        TriggerData *trigdata = (TriggerData *) fcinfo->context;
-
-        hashkey->trigOid = trigdata->tg_trigger->tgoid;
-    }
-
-    /* get input collation, if known */
-    hashkey->inputCollation = fcinfo->fncollation;
-
-    if (procStruct->pronargs > 0)
-    {
-        /* get the argument types */
-        memcpy(hashkey->argtypes, procStruct->proargtypes.values,
-               procStruct->pronargs * sizeof(Oid));
-
-        /* resolve any polymorphic argument types */
-        plpgsql_resolve_polymorphic_argtypes(procStruct->pronargs,
-                                             hashkey->argtypes,
-                                             NULL,
-                                             fcinfo->flinfo->fn_expr,
-                                             forValidator,
-                                             NameStr(procStruct->proname));
-    }
-}
-
-/*
- * This is the same as the standard resolve_polymorphic_argtypes() function,
- * except that:
- * 1. We go ahead and report the error if we can't resolve the types.
- * 2. We treat RECORD-type input arguments (not output arguments) as if
- *    they were polymorphic, replacing their types with the actual input
- *    types if we can determine those.  This allows us to create a separate
- *    function cache entry for each named composite type passed to such an
- *    argument.
- * 3. In validation mode, we have no inputs to look at, so assume that
- *    polymorphic arguments are integer, integer-array or integer-range.
- */
-static void
-plpgsql_resolve_polymorphic_argtypes(int numargs,
-                                     Oid *argtypes, char *argmodes,
-                                     Node *call_expr, bool forValidator,
-                                     const char *proname)
-{
-    int         i;
-
-    if (!forValidator)
-    {
-        int         inargno;
-
-        /* normal case, pass to standard routine */
-        if (!resolve_polymorphic_argtypes(numargs, argtypes, argmodes,
-                                          call_expr))
-            ereport(ERROR,
-                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-                     errmsg("could not determine actual argument "
-                            "type for polymorphic function \"%s\"",
-                            proname)));
-        /* also, treat RECORD inputs (but not outputs) as polymorphic */
-        inargno = 0;
-        for (i = 0; i < numargs; i++)
-        {
-            char        argmode = argmodes ? argmodes[i] : PROARGMODE_IN;
-
-            if (argmode == PROARGMODE_OUT || argmode == PROARGMODE_TABLE)
-                continue;
-            if (argtypes[i] == RECORDOID || argtypes[i] == RECORDARRAYOID)
-            {
-                Oid         resolvedtype = get_call_expr_argtype(call_expr,
-                                                                 inargno);
-
-                if (OidIsValid(resolvedtype))
-                    argtypes[i] = resolvedtype;
-            }
-            inargno++;
-        }
-    }
-    else
-    {
-        /* special validation case (no need to do anything for RECORD) */
-        for (i = 0; i < numargs; i++)
-        {
-            switch (argtypes[i])
-            {
-                case ANYELEMENTOID:
-                case ANYNONARRAYOID:
-                case ANYENUMOID:    /* XXX dubious */
-                case ANYCOMPATIBLEOID:
-                case ANYCOMPATIBLENONARRAYOID:
-                    argtypes[i] = INT4OID;
-                    break;
-                case ANYARRAYOID:
-                case ANYCOMPATIBLEARRAYOID:
-                    argtypes[i] = INT4ARRAYOID;
-                    break;
-                case ANYRANGEOID:
-                case ANYCOMPATIBLERANGEOID:
-                    argtypes[i] = INT4RANGEOID;
-                    break;
-                case ANYMULTIRANGEOID:
-                    argtypes[i] = INT4MULTIRANGEOID;
-                    break;
-                default:
-                    break;
-            }
-        }
-    }
-}
-
-/*
- * delete_function - clean up as much as possible of a stale function cache
- *
- * We can't release the PLpgSQL_function struct itself, because of the
- * possibility that there are fn_extra pointers to it.  We can release
- * the subsidiary storage, but only if there are no active evaluations
- * in progress.  Otherwise we'll just leak that storage.  Since the
- * case would only occur if a pg_proc update is detected during a nested
- * recursive call on the function, a leak seems acceptable.
- *
- * Note that this can be called more than once if there are multiple fn_extra
- * pointers to the same function cache.  Hence be careful not to do things
- * twice.
- */
-static void
-delete_function(PLpgSQL_function *func)
-{
-    /* remove function from hash table (might be done already) */
-    plpgsql_HashTableDelete(func);
-
-    /* release the function's storage if safe and not done already */
-    if (func->use_count == 0)
-        plpgsql_free_function_memory(func);
-}
-
-/* exported so we can call it from _PG_init() */
-void
-plpgsql_HashTableInit(void)
-{
-    HASHCTL     ctl;
-
-    /* don't allow double-initialization */
-    Assert(plpgsql_HashTable == NULL);
-
-    ctl.keysize = sizeof(PLpgSQL_func_hashkey);
-    ctl.entrysize = sizeof(plpgsql_HashEnt);
-    plpgsql_HashTable = hash_create("PLpgSQL function hash",
-                                    FUNCS_PER_USER,
-                                    &ctl,
-                                    HASH_ELEM | HASH_BLOBS);
-}
-
-static PLpgSQL_function *
-plpgsql_HashTableLookup(PLpgSQL_func_hashkey *func_key)
-{
-    plpgsql_HashEnt *hentry;
-
-    hentry = (plpgsql_HashEnt *) hash_search(plpgsql_HashTable,
-                                             func_key,
-                                             HASH_FIND,
-                                             NULL);
-    if (hentry)
-        return hentry->function;
-    else
-        return NULL;
-}
-
-static void
-plpgsql_HashTableInsert(PLpgSQL_function *function,
-                        PLpgSQL_func_hashkey *func_key)
-{
-    plpgsql_HashEnt *hentry;
-    bool        found;
-
-    hentry = (plpgsql_HashEnt *) hash_search(plpgsql_HashTable,
-                                             func_key,
-                                             HASH_ENTER,
-                                             &found);
-    if (found)
-        elog(WARNING, "trying to insert a function that already exists");
-
-    hentry->function = function;
-    /* prepare back link from function to hashtable key */
-    function->fn_hashkey = &hentry->key;
-}
-
-static void
-plpgsql_HashTableDelete(PLpgSQL_function *function)
-{
-    plpgsql_HashEnt *hentry;
-
-    /* do nothing if not in table */
-    if (function->fn_hashkey == NULL)
-        return;
-
-    hentry = (plpgsql_HashEnt *) hash_search(plpgsql_HashTable,
-                                             function->fn_hashkey,
-                                             HASH_REMOVE,
-                                             NULL);
-    if (hentry == NULL)
-        elog(WARNING, "trying to delete function that does not exist");
-
-    /* remove back link, which no longer points to allocated storage */
-    function->fn_hashkey = NULL;
-}
diff --git a/src/pl/plpgsql/src/pl_funcs.c b/src/pl/plpgsql/src/pl_funcs.c
index 6b5394fc5fa..bc7a61feb4d 100644
--- a/src/pl/plpgsql/src/pl_funcs.c
+++ b/src/pl/plpgsql/src/pl_funcs.c
@@ -718,7 +718,7 @@ plpgsql_free_function_memory(PLpgSQL_function *func)
     int         i;
 
     /* Better not call this on an in-use function */
-    Assert(func->use_count == 0);
+    Assert(func->cfunc.use_count == 0);
 
     /* Release plans associated with variable declarations */
     for (i = 0; i < func->ndatums; i++)
@@ -767,6 +767,13 @@ plpgsql_free_function_memory(PLpgSQL_function *func)
     func->fn_cxt = NULL;
 }
 
+/* Deletion callback used by funccache.c */
+void
+plpgsql_delete_callback(CachedFunction *cfunc)
+{
+    plpgsql_free_function_memory((PLpgSQL_function *) cfunc);
+}
+
 
 /**********************************************************************
  * Debug functions for analyzing the compiled code
diff --git a/src/pl/plpgsql/src/pl_handler.c b/src/pl/plpgsql/src/pl_handler.c
index 1bf12232862..e9a72929947 100644
--- a/src/pl/plpgsql/src/pl_handler.c
+++ b/src/pl/plpgsql/src/pl_handler.c
@@ -202,7 +202,6 @@ _PG_init(void)
 
     MarkGUCPrefixReserved("plpgsql");
 
-    plpgsql_HashTableInit();
     RegisterXactCallback(plpgsql_xact_cb, NULL);
     RegisterSubXactCallback(plpgsql_subxact_cb, NULL);
 
@@ -247,7 +246,7 @@ plpgsql_call_handler(PG_FUNCTION_ARGS)
     save_cur_estate = func->cur_estate;
 
     /* Mark the function as busy, so it can't be deleted from under us */
-    func->use_count++;
+    func->cfunc.use_count++;
 
     /*
      * If we'll need a procedure-lifespan resowner to execute any CALL or DO
@@ -284,7 +283,7 @@ plpgsql_call_handler(PG_FUNCTION_ARGS)
     PG_FINALLY();
     {
         /* Decrement use-count, restore cur_estate */
-        func->use_count--;
+        func->cfunc.use_count--;
         func->cur_estate = save_cur_estate;
 
         /* Be sure to release the procedure resowner if any */
@@ -334,7 +333,7 @@ plpgsql_inline_handler(PG_FUNCTION_ARGS)
     func = plpgsql_compile_inline(codeblock->source_text);
 
     /* Mark the function as busy, just pro forma */
-    func->use_count++;
+    func->cfunc.use_count++;
 
     /*
      * Set up a fake fcinfo with just enough info to satisfy
@@ -398,8 +397,8 @@ plpgsql_inline_handler(PG_FUNCTION_ARGS)
         ResourceOwnerDelete(simple_eval_resowner);
 
     /* Function should now have no remaining use-counts ... */
-    func->use_count--;
-    Assert(func->use_count == 0);
+    func->cfunc.use_count--;
+    Assert(func->cfunc.use_count == 0);
 
     /* ... so we can free subsidiary storage */
     plpgsql_free_function_memory(func);
@@ -415,8 +414,8 @@ plpgsql_inline_handler(PG_FUNCTION_ARGS)
         ResourceOwnerDelete(simple_eval_resowner);
 
     /* Function should now have no remaining use-counts ... */
-    func->use_count--;
-    Assert(func->use_count == 0);
+    func->cfunc.use_count--;
+    Assert(func->cfunc.use_count == 0);
 
     /* ... so we can free subsidiary storage */
     plpgsql_free_function_memory(func);
diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index b67847b5111..41e52b8ce71 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -21,6 +21,7 @@
 #include "commands/trigger.h"
 #include "executor/spi.h"
 #include "utils/expandedrecord.h"
+#include "utils/funccache.h"
 #include "utils/typcache.h"
 
 
@@ -941,40 +942,6 @@ typedef struct PLpgSQL_stmt_dynexecute
     List       *params;         /* USING expressions */
 } PLpgSQL_stmt_dynexecute;
 
-/*
- * Hash lookup key for functions
- */
-typedef struct PLpgSQL_func_hashkey
-{
-    Oid         funcOid;
-
-    bool        isTrigger;      /* true if called as a DML trigger */
-    bool        isEventTrigger; /* true if called as an event trigger */
-
-    /* be careful that pad bytes in this struct get zeroed! */
-
-    /*
-     * For a trigger function, the OID of the trigger is part of the hash key
-     * --- we want to compile the trigger function separately for each trigger
-     * it is used with, in case the rowtype or transition table names are
-     * different.  Zero if not called as a DML trigger.
-     */
-    Oid         trigOid;
-
-    /*
-     * We must include the input collation as part of the hash key too,
-     * because we have to generate different plans (with different Param
-     * collations) for different collation settings.
-     */
-    Oid         inputCollation;
-
-    /*
-     * We include actual argument types in the hash key to support polymorphic
-     * PLpgSQL functions.  Be careful that extra positions are zeroed!
-     */
-    Oid         argtypes[FUNC_MAX_ARGS];
-} PLpgSQL_func_hashkey;
-
 /*
  * Trigger type
  */
@@ -990,13 +957,12 @@ typedef enum PLpgSQL_trigtype
  */
 typedef struct PLpgSQL_function
 {
+    CachedFunction cfunc;       /* fields managed by funccache.c */
+
     char       *fn_signature;
     Oid         fn_oid;
-    TransactionId fn_xmin;
-    ItemPointerData fn_tid;
     PLpgSQL_trigtype fn_is_trigger;
     Oid         fn_input_collation;
-    PLpgSQL_func_hashkey *fn_hashkey;   /* back-link to hashtable key */
     MemoryContext fn_cxt;
 
     Oid         fn_rettype;
@@ -1036,9 +1002,8 @@ typedef struct PLpgSQL_function
     bool        requires_procedure_resowner;    /* contains CALL or DO? */
     bool        has_exception_block;    /* contains BEGIN...EXCEPTION? */
 
-    /* these fields change when the function is used */
+    /* this field changes when the function is used */
     struct PLpgSQL_execstate *cur_estate;
-    unsigned long use_count;
 } PLpgSQL_function;
 
 /*
@@ -1287,7 +1252,6 @@ extern PGDLLEXPORT int plpgsql_recognize_err_condition(const char *condname,
 extern PLpgSQL_condition *plpgsql_parse_err_condition(char *condname);
 extern void plpgsql_adddatum(PLpgSQL_datum *newdatum);
 extern int  plpgsql_add_initdatums(int **varnos);
-extern void plpgsql_HashTableInit(void);
 
 /*
  * Functions in pl_exec.c
@@ -1335,6 +1299,7 @@ extern PGDLLEXPORT const char *plpgsql_stmt_typename(PLpgSQL_stmt *stmt);
 extern const char *plpgsql_getdiag_kindname(PLpgSQL_getdiag_kind kind);
 extern void plpgsql_mark_local_assignment_targets(PLpgSQL_function *func);
 extern void plpgsql_free_function_memory(PLpgSQL_function *func);
+extern void plpgsql_delete_callback(CachedFunction *cfunc);
 extern void plpgsql_dumptree(PLpgSQL_function *func);
 
 /*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index ff75a508876..144c4e9662c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -381,6 +381,11 @@ CURLM
 CURLoption
 CV
 CachedExpression
+CachedFunction
+CachedFunctionCompileCallback
+CachedFunctionDeleteCallback
+CachedFunctionHashEntry
+CachedFunctionHashKey
 CachedPlan
 CachedPlanSource
 CallContext
-- 
2.43.5

From f0b52fe1844bf1be0a3fcc718069c7c1a41393e1 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat, 29 Mar 2025 16:56:57 -0400
Subject: [PATCH v11 4/7] Restructure check_sql_fn_retval().

To support using the plan cache for SQL functions, we'll need to be
able to redo the work of check_sql_fn_retval() on just one query's
list-of-rewritten-queries at a time, since the plan cache will treat
each query independently.
This would be simple enough, except for a bizarre historical behavior:
the existing code will take the last canSetTag query in the function
as determining the result, even if it came from not-the-last original
query.  (The case is only possible when the last original query(s) are
deleted by a DO INSTEAD NOTHING rule.)  This behavior is undocumented
except in source code comments, and it seems hard to believe that
anyone's relying on it.

It would be a mess to support with the plan cache, because a change in
the rules applicable to some table could change which CachedPlanSource
is supposed to produce the function result, even if the function
itself has not changed.  Let's just get rid of that silliness and
insist that the last source query in the function is the one that must
produce the result.

Having mandated that, we can refactor check_sql_fn_retval() into an
outer and an inner function where the inner one considers only a
single list-of-rewritten-queries; the inner one will be usable in a
post-rewrite callback hook as contemplated by the previous commit.

Likewise refactor check_sql_fn_statements() so that we have a version
that can be applied to just one list of Queries.  (As things stand,
it's not really necessary to recheck that during a replan, but maybe
future changes in the rule system would create cases where it
matters.)

Also remove check_sql_fn_retval()'s targetlist output argument,
putting the equivalent functionality into a separate function.
This is needed because the plan cache would be in the way of passing
that data directly.  No outside caller needed that anyway.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop
---
 src/backend/catalog/pg_proc.c        |   2 +-
 src/backend/executor/functions.c     | 176 ++++++++++++++++++---------
 src/backend/optimizer/util/clauses.c |   4 +-
 src/include/executor/functions.h     |   3 +-
 4 files changed, 121 insertions(+), 64 deletions(-)

diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index fe0490259e9..880b597fb3a 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -960,7 +960,7 @@ fmgr_sql_validator(PG_FUNCTION_ARGS)
                 (void) check_sql_fn_retval(querytree_list,
                                            rettype, rettupdesc,
                                            proc->prokind,
-                                           false, NULL);
+                                           false);
             }
 
             error_context_stack = sqlerrcontext.previous;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 6aa8e9c4d8a..5b06df84335 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -153,11 +153,16 @@ static Datum postquel_get_single_result(TupleTableSlot *slot,
                                         MemoryContext resultcontext);
 static void sql_exec_error_callback(void *arg);
 static void ShutdownSQLFunction(Datum arg);
+static void check_sql_fn_statement(List *queryTreeList);
+static bool check_sql_stmt_retval(List *queryTreeList,
+                                  Oid rettype, TupleDesc rettupdesc,
+                                  char prokind, bool insertDroppedCols);
 static bool coerce_fn_result_column(TargetEntry *src_tle,
                                     Oid res_type, int32 res_typmod,
                                     bool tlist_is_modifiable,
                                     List **upper_tlist,
                                     bool *upper_tlist_nontrivial);
+static List *get_sql_fn_result_tlist(List *queryTreeList);
 static void sqlfunction_startup(DestReceiver *self, int operation, TupleDesc typeinfo);
 static bool sqlfunction_receive(TupleTableSlot *slot, DestReceiver *self);
 static void sqlfunction_shutdown(DestReceiver *self);
@@ -592,7 +597,6 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
     Form_pg_proc procedureStruct;
     SQLFunctionCachePtr fcache;
     List       *queryTree_list;
-    List       *resulttlist;
     ListCell   *lc;
     Datum       tmp;
bool isNull; @@ -748,8 +752,7 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) rettype, rettupdesc, procedureStruct->prokind, - false, - &resulttlist); + false); /* * Construct a JunkFilter we can use to coerce the returned rowtype to the @@ -762,6 +765,14 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK) { TupleTableSlot *slot = MakeSingleTupleTableSlot(NULL, &TTSOpsMinimalTuple); + List *resulttlist; + + /* + * Re-fetch the (possibly modified) output tlist of the final + * statement. By this point, we should have thrown an error if there + * is not one. + */ + resulttlist = get_sql_fn_result_tlist(llast_node(List, queryTree_list)); /* * If the result is composite, *and* we are returning the whole tuple @@ -1541,29 +1552,39 @@ check_sql_fn_statements(List *queryTreeLists) foreach(lc, queryTreeLists) { List *sublist = lfirst_node(List, lc); - ListCell *lc2; - foreach(lc2, sublist) - { - Query *query = lfirst_node(Query, lc2); + check_sql_fn_statement(sublist); + } +} - /* - * Disallow calling procedures with output arguments. The current - * implementation would just throw the output values away, unless - * the statement is the last one. Per SQL standard, we should - * assign the output values by name. By disallowing this here, we - * preserve an opportunity for future improvement. - */ - if (query->commandType == CMD_UTILITY && - IsA(query->utilityStmt, CallStmt)) - { - CallStmt *stmt = (CallStmt *) query->utilityStmt; +/* + * As above, for a single sublist of Queries. + */ +static void +check_sql_fn_statement(List *queryTreeList) +{ + ListCell *lc; - if (stmt->outargs != NIL) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("calling procedures with output arguments is not supported in SQL functions"))); - } + foreach(lc, queryTreeList) + { + Query *query = lfirst_node(Query, lc); + + /* + * Disallow calling procedures with output arguments. 
The current + * implementation would just throw the output values away, unless the + * statement is the last one. Per SQL standard, we should assign the + * output values by name. By disallowing this here, we preserve an + * opportunity for future improvement. + */ + if (query->commandType == CMD_UTILITY && + IsA(query->utilityStmt, CallStmt)) + { + CallStmt *stmt = (CallStmt *) query->utilityStmt; + + if (stmt->outargs != NIL) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("calling procedures with output arguments is not supported in SQL functions"))); } } } @@ -1602,17 +1623,45 @@ check_sql_fn_statements(List *queryTreeLists) * In addition to coercing individual output columns, we can modify the * output to include dummy NULL columns for any dropped columns appearing * in rettupdesc. This is done only if the caller asks for it. - * - * If resultTargetList isn't NULL, then *resultTargetList is set to the - * targetlist that defines the final statement's result. Exception: if the - * function is defined to return VOID then *resultTargetList is set to NIL. */ bool check_sql_fn_retval(List *queryTreeLists, Oid rettype, TupleDesc rettupdesc, char prokind, - bool insertDroppedCols, - List **resultTargetList) + bool insertDroppedCols) +{ + List *queryTreeList; + + /* + * We consider only the last sublist of Query nodes, so that only the last + * original statement is a candidate to produce the result. This is a + * change from pre-v18 versions, which would back up to the last statement + * that includes a canSetTag query, thus ignoring any ending statement(s) + * that rewrite to DO INSTEAD NOTHING. That behavior was undocumented and + * there seems no good reason for it, except that it was an artifact of + * the original coding. + * + * If the function body is completely empty, handle that the same as if + * the last query had rewritten to nothing. 
+ */ + if (queryTreeLists != NIL) + queryTreeList = llast_node(List, queryTreeLists); + else + queryTreeList = NIL; + + return check_sql_stmt_retval(queryTreeList, + rettype, rettupdesc, + prokind, insertDroppedCols); +} + +/* + * As for check_sql_fn_retval, but we are given just the last query's + * rewritten-queries list. + */ +static bool +check_sql_stmt_retval(List *queryTreeList, + Oid rettype, TupleDesc rettupdesc, + char prokind, bool insertDroppedCols) { bool is_tuple_result = false; Query *parse; @@ -1625,9 +1674,6 @@ check_sql_fn_retval(List *queryTreeLists, bool upper_tlist_nontrivial = false; ListCell *lc; - if (resultTargetList) - *resultTargetList = NIL; /* initialize in case of VOID result */ - /* * If it's declared to return VOID, we don't care what's in the function. * (This takes care of procedures with no output parameters, as well.) @@ -1636,30 +1682,20 @@ check_sql_fn_retval(List *queryTreeLists, return false; /* - * Find the last canSetTag query in the function body (which is presented - * to us as a list of sublists of Query nodes). This isn't necessarily - * the last parsetree, because rule rewriting can insert queries after - * what the user wrote. Note that it might not even be in the last - * sublist, for example if the last query rewrites to DO INSTEAD NOTHING. - * (It might not be unreasonable to throw an error in such a case, but - * this is the historical behavior and it doesn't seem worth changing.) + * Find the last canSetTag query in the list of Query nodes. This isn't + * necessarily the last parsetree, because rule rewriting can insert + * queries after what the user wrote. 
*/ parse = NULL; parse_cell = NULL; - foreach(lc, queryTreeLists) + foreach(lc, queryTreeList) { - List *sublist = lfirst_node(List, lc); - ListCell *lc2; + Query *q = lfirst_node(Query, lc); - foreach(lc2, sublist) + if (q->canSetTag) { - Query *q = lfirst_node(Query, lc2); - - if (q->canSetTag) - { - parse = q; - parse_cell = lc2; - } + parse = q; + parse_cell = lc; } } @@ -1812,12 +1848,7 @@ check_sql_fn_retval(List *queryTreeLists, * further checking. Assume we're returning the whole tuple. */ if (rettupdesc == NULL) - { - /* Return tlist if requested */ - if (resultTargetList) - *resultTargetList = tlist; return true; - } /* * Verify that the targetlist matches the return tuple type. We scan @@ -1984,10 +2015,6 @@ tlist_coercion_finished: lfirst(parse_cell) = newquery; } - /* Return tlist (possibly modified) if requested */ - if (resultTargetList) - *resultTargetList = upper_tlist; - return is_tuple_result; } @@ -2063,6 +2090,37 @@ coerce_fn_result_column(TargetEntry *src_tle, return true; } +/* + * Extract the targetlist of the last canSetTag query in the given list + * of parsed-and-rewritten Queries. Returns NIL if there is none. 
+ */ +static List * +get_sql_fn_result_tlist(List *queryTreeList) +{ + Query *parse = NULL; + ListCell *lc; + + foreach(lc, queryTreeList) + { + Query *q = lfirst_node(Query, lc); + + if (q->canSetTag) + parse = q; + } + if (parse && + parse->commandType == CMD_SELECT) + return parse->targetList; + else if (parse && + (parse->commandType == CMD_INSERT || + parse->commandType == CMD_UPDATE || + parse->commandType == CMD_DELETE || + parse->commandType == CMD_MERGE) && + parse->returningList) + return parse->returningList; + else + return NIL; +} + /* * CreateSQLFunctionDestReceiver -- create a suitable DestReceiver object diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index 43dfecfb47f..816536ab865 100644 --- a/src/backend/optimizer/util/clauses.c +++ b/src/backend/optimizer/util/clauses.c @@ -4742,7 +4742,7 @@ inline_function(Oid funcid, Oid result_type, Oid result_collid, if (check_sql_fn_retval(list_make1(querytree_list), result_type, rettupdesc, funcform->prokind, - false, NULL)) + false)) goto fail; /* reject whole-tuple-result cases */ /* @@ -5288,7 +5288,7 @@ inline_set_returning_function(PlannerInfo *root, RangeTblEntry *rte) if (!check_sql_fn_retval(list_make1(querytree_list), fexpr->funcresulttype, rettupdesc, funcform->prokind, - true, NULL) && + true) && (functypclass == TYPEFUNC_COMPOSITE || functypclass == TYPEFUNC_COMPOSITE_DOMAIN || functypclass == TYPEFUNC_RECORD)) diff --git a/src/include/executor/functions.h b/src/include/executor/functions.h index a6ae2e72d79..58bdff9b039 100644 --- a/src/include/executor/functions.h +++ b/src/include/executor/functions.h @@ -48,8 +48,7 @@ extern void check_sql_fn_statements(List *queryTreeLists); extern bool check_sql_fn_retval(List *queryTreeLists, Oid rettype, TupleDesc rettupdesc, char prokind, - bool insertDroppedCols, - List **resultTargetList); + bool insertDroppedCols); extern DestReceiver *CreateSQLFunctionDestReceiver(void); -- 2.43.5 From 
d85adbeae4f83ce0355c8f626f7c8b0aa961735f Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat, 29 Mar 2025 20:31:50 -0400 Subject: [PATCH v11 5/7] Add a test case showing undesirable RLS behavior in SQL functions. In the historical implementation of SQL functions, once we have built a set of plans for a SQL function we'll continue to use them during subsequent function invocations in the same query. This isn't ideal, and this somewhat-contrived test case shows one reason why not: we don't notice changes in RLS-relevant state. I'm putting this as a separate patch in the series so that the change in behavior will be apparent. Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop --- src/test/regress/expected/rowsecurity.out | 59 +++++++++++++++++++++++ src/test/regress/sql/rowsecurity.sql | 44 +++++++++++++++++ 2 files changed, 103 insertions(+) diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out index 87929191d06..8f2c8319172 100644 --- a/src/test/regress/expected/rowsecurity.out +++ b/src/test/regress/expected/rowsecurity.out @@ -4695,6 +4695,65 @@ RESET ROLE; DROP FUNCTION rls_f(); DROP VIEW rls_v; DROP TABLE rls_t; +-- Check that RLS changes invalidate SQL function plans +create table rls_t (c text); +create table test_t (c text); +insert into rls_t values ('a'), ('b'), ('c'), ('d'); +insert into test_t values ('a'), ('b'); +alter table rls_t enable row level security; +grant select on rls_t to regress_rls_alice; +grant select on test_t to regress_rls_alice; +create policy p1 on rls_t for select to regress_rls_alice + using (c = current_setting('rls_test.blah')); +-- Function changes row_security setting and so invalidates plan +create function rls_f(text) returns text +begin atomic + select set_config('rls_test.blah', $1, true) || set_config('row_security', 'false', true) || string_agg(c, ',' 
order by c) from rls_t; +end; +set plan_cache_mode to force_custom_plan; +-- Table owner bypasses RLS +select rls_f(c) from test_t order by rls_f; + rls_f +------------- + aoffa,b,c,d + boffa,b,c,d +(2 rows) + +-- For other users, changes in row_security setting +-- should lead to RLS error during query rewrite +set role regress_rls_alice; +select rls_f(c) from test_t order by rls_f; + rls_f +------- + boffa + +(2 rows) + +reset role; +set plan_cache_mode to force_generic_plan; +-- Table owner bypasses RLS, although cached plan will be invalidated +select rls_f(c) from test_t order by rls_f; + rls_f +------------- + aoffa,b,c,d + boffa,b,c,d +(2 rows) + +-- For other users, changes in row_security setting +-- should lead to plan invalidation and RLS error during query rewrite +set role regress_rls_alice; +select rls_f(c) from test_t order by rls_f; + rls_f +------- + boffa + +(2 rows) + +reset role; +reset plan_cache_mode; +reset rls_test.blah; +drop function rls_f(text); +drop table rls_t, test_t; -- -- Clean up objects -- diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql index f61dbbf9581..9da967a9ef2 100644 --- a/src/test/regress/sql/rowsecurity.sql +++ b/src/test/regress/sql/rowsecurity.sql @@ -2307,6 +2307,50 @@ DROP FUNCTION rls_f(); DROP VIEW rls_v; DROP TABLE rls_t; +-- Check that RLS changes invalidate SQL function plans +create table rls_t (c text); +create table test_t (c text); +insert into rls_t values ('a'), ('b'), ('c'), ('d'); +insert into test_t values ('a'), ('b'); +alter table rls_t enable row level security; +grant select on rls_t to regress_rls_alice; +grant select on test_t to regress_rls_alice; +create policy p1 on rls_t for select to regress_rls_alice + using (c = current_setting('rls_test.blah')); + +-- Function changes row_security setting and so invalidates plan +create function rls_f(text) returns text +begin atomic + select set_config('rls_test.blah', $1, true) || set_config('row_security', 'false', 
true) || string_agg(c, ',' order by c) from rls_t; +end; + +set plan_cache_mode to force_custom_plan; + +-- Table owner bypasses RLS +select rls_f(c) from test_t order by rls_f; + +-- For other users, changes in row_security setting +-- should lead to RLS error during query rewrite +set role regress_rls_alice; +select rls_f(c) from test_t order by rls_f; +reset role; + +set plan_cache_mode to force_generic_plan; + +-- Table owner bypasses RLS, although cached plan will be invalidated +select rls_f(c) from test_t order by rls_f; + +-- For other users, changes in row_security setting +-- should lead to plan invalidation and RLS error during query rewrite +set role regress_rls_alice; +select rls_f(c) from test_t order by rls_f; +reset role; + +reset plan_cache_mode; +reset rls_test.blah; +drop function rls_f(text); +drop table rls_t, test_t; + -- -- Clean up objects -- -- 2.43.5 From 7427db23bb4de73dddfdbbf9d6ed902079894aa1 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat, 29 Mar 2025 21:57:50 -0400 Subject: [PATCH v11 6/7] Change SQL-language functions to use the plan cache. In the historical implementation of SQL functions (when they don't get inlined), we built plans for the contained queries at first call within an outer query, and then re-used those plans for the duration of the outer query, and then forgot everything. This was not ideal, not least because the plans could not be customized to specific values of the function's parameters. Our plancache infrastructure seems mature enough to be used here. That will solve both the problem with not being able to build custom plans and the problem with not being able to share work across successive outer queries. Moreover, this reimplementation will react to events that should cause a replan at the next entry to the SQL function. This is illustrated in the change in the rowsecurity test, where we now detect an RLS context change that was previously ignored. 
(I also added a test in create_function_sql that exercises ShutdownSQLFunction(), after noting from coverage results that that wasn't getting reached.) Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop --- src/backend/executor/functions.c | 1082 +++++++++++------ .../expected/test_extensions.out | 2 +- .../regress/expected/create_function_sql.out | 17 +- src/test/regress/expected/rowsecurity.out | 16 +- src/test/regress/sql/create_function_sql.sql | 9 + src/tools/pgindent/typedefs.list | 2 + 6 files changed, 738 insertions(+), 390 deletions(-) diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c index 5b06df84335..b5a9ecea637 100644 --- a/src/backend/executor/functions.c +++ b/src/backend/executor/functions.c @@ -31,6 +31,7 @@ #include "tcop/utility.h" #include "utils/builtins.h" #include "utils/datum.h" +#include "utils/funccache.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/snapmgr.h" @@ -50,7 +51,7 @@ typedef struct /* * We have an execution_state record for each query in a function. Each - * record contains a plantree for its query. If the query is currently in + * record references a plantree for its query. If the query is currently in * F_EXEC_RUN state then there's a QueryDesc too. * * The "next" fields chain together all the execution_state records generated @@ -74,24 +75,43 @@ typedef struct execution_state /* - * An SQLFunctionCache record is built during the first call, - * and linked to from the fn_extra field of the FmgrInfo struct. + * Data associated with a SQL-language function is kept in three main + * data structures: * - * Note that currently this has only the lifespan of the calling query. - * Someday we should rewrite this code to use plancache.c to save parse/plan - * results for longer than that. + * 1. 
SQLFunctionHashEntry is a long-lived (potentially session-lifespan) + * struct that holds all the info we need out of the function's pg_proc row. + * In addition it holds pointers to CachedPlanSource(s) that manage creation + * of plans for the query(s) within the function. A SQLFunctionHashEntry is + * potentially shared across multiple concurrent executions of the function, + * so it must contain no execution-specific state; but its use_count must + * reflect the number of SQLFunctionLink structs pointing at it. + * If the function's pg_proc row is updated, we throw away and regenerate + * the SQLFunctionHashEntry and subsidiary data. (Also note that if the + * function is polymorphic or used as a trigger, there is a separate + * SQLFunctionHashEntry for each usage, so that we need consider only one + * set of relevant data types.) The struct itself is in memory managed by + * funccache.c, and its subsidiary data is kept in hcontext ("hash context"). * - * Physically, though, the data has the lifespan of the FmgrInfo that's used - * to call the function, and there are cases (particularly with indexes) - * where the FmgrInfo might survive across transactions. We cannot assume - * that the parse/plan trees are good for longer than the (sub)transaction in - * which parsing was done, so we must mark the record with the LXID/subxid of - * its creation time, and regenerate everything if that's obsolete. To avoid - * memory leakage when we do have to regenerate things, all the data is kept - * in a sub-context of the FmgrInfo's fn_mcxt. + * 2. SQLFunctionCache lasts for the duration of a single execution of + * the SQL function. (In "lazyEval" mode, this might span multiple calls of + * fmgr_sql.) It holds a reference to the CachedPlan for the current query, + * and other data that is execution-specific. The SQLFunctionCache itself + * as well as its subsidiary data are kept in fcontext ("function context"), + * which we free at completion. 
In non-returnsSet mode, this is just a child + * of the call-time context. In returnsSet mode, it is made a child of the + * FmgrInfo's fn_mcxt so that it will survive between fmgr_sql calls. + * + * 3. SQLFunctionLink is a tiny struct that just holds pointers to + * the SQLFunctionHashEntry and the current SQLFunctionCache (if any). + * It is pointed to by the fn_extra field of the FmgrInfo struct, and is + * always allocated in the FmgrInfo's fn_mcxt. Its purpose is to reduce + * the cost of repeat lookups of the SQLFunctionHashEntry. */ -typedef struct + +typedef struct SQLFunctionHashEntry { + CachedFunction cfunc; /* fields managed by funccache.c */ + char *fname; /* function name (for error msgs) */ char *src; /* function body text (for error msgs) */ @@ -102,8 +122,25 @@ typedef struct bool typbyval; /* true if return type is pass by value */ bool returnsSet; /* true if returning multiple rows */ bool returnsTuple; /* true if returning whole tuple result */ - bool shutdown_reg; /* true if registered shutdown callback */ bool readonly_func; /* true to run in "read only" mode */ + char prokind; /* prokind from pg_proc row */ + + TupleDesc rettupdesc; /* result tuple descriptor */ + + List *plansource_list; /* CachedPlanSources for fn's queries */ + + /* if positive, this is the index of the query we're parsing */ + int error_query_index; + + MemoryContext hcontext; /* memory context holding all above */ +} SQLFunctionHashEntry; + +typedef struct SQLFunctionCache +{ + SQLFunctionHashEntry *func; /* associated SQLFunctionHashEntry */ + + bool lazyEvalOK; /* true if lazyEval is safe */ + bool shutdown_reg; /* true if registered shutdown callback */ bool lazyEval; /* true if using lazyEval for result query */ ParamListInfo paramLI; /* Param list representing current args */ @@ -112,23 +149,40 @@ typedef struct JunkFilter *junkFilter; /* will be NULL if function returns VOID */ + /* if positive, this is the index of the query we're executing */ + int 
error_query_index; + /* - * func_state is a List of execution_state records, each of which is the - * first for its original parsetree, with any additional records chained - * to it via the "next" fields. This sublist structure is needed to keep - * track of where the original query boundaries are. + * While executing a particular query within the function, cplan is the + * CachedPlan we've obtained for that query, and eslist is a list of + * execution_state records for the individual plans within the CachedPlan. + * next_query_index is the 0-based index of the next CachedPlanSource to + * get a CachedPlan from. */ - List *func_state; + CachedPlan *cplan; /* Plan for current query, if any */ + ResourceOwner cowner; /* CachedPlan is registered with this owner */ + execution_state *eslist; /* execution_state records */ + int next_query_index; /* index of next CachedPlanSource to run */ MemoryContext fcontext; /* memory context holding this struct and all * subsidiary data */ - - LocalTransactionId lxid; /* lxid in which cache was made */ - SubTransactionId subxid; /* subxid in which cache was made */ } SQLFunctionCache; typedef SQLFunctionCache *SQLFunctionCachePtr; +/* Struct pointed to by FmgrInfo.fn_extra for a SQL function */ +typedef struct SQLFunctionLink +{ + /* Permanent pointer to associated SQLFunctionHashEntry */ + SQLFunctionHashEntry *func; + + /* Transient pointer to SQLFunctionCache, used only if returnsSet */ + SQLFunctionCache *fcache; + + /* Callback to release our use-count on the SQLFunctionHashEntry */ + MemoryContextCallback mcb; +} SQLFunctionLink; + /* non-export function prototypes */ static Node *sql_fn_param_ref(ParseState *pstate, ParamRef *pref); @@ -138,10 +192,10 @@ static Node *sql_fn_make_param(SQLFunctionParseInfoPtr pinfo, int paramno, int location); static Node *sql_fn_resolve_param_name(SQLFunctionParseInfoPtr pinfo, const char *paramname, int location); -static List *init_execution_state(List *queryTree_list, - 
SQLFunctionCachePtr fcache, - bool lazyEvalOK); -static void init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK); +static bool init_execution_state(SQLFunctionCachePtr fcache); +static void sql_postrewrite_callback(List *querytree_list, void *arg); +static SQLFunctionCache *init_sql_fcache(FunctionCallInfo fcinfo, + bool lazyEvalOK); static void postquel_start(execution_state *es, SQLFunctionCachePtr fcache); static bool postquel_getnext(execution_state *es, SQLFunctionCachePtr fcache); static void postquel_end(execution_state *es); @@ -151,8 +205,10 @@ static Datum postquel_get_single_result(TupleTableSlot *slot, FunctionCallInfo fcinfo, SQLFunctionCachePtr fcache, MemoryContext resultcontext); +static void sql_compile_error_callback(void *arg); static void sql_exec_error_callback(void *arg); static void ShutdownSQLFunction(Datum arg); +static void RemoveSQLFunctionLink(void *arg); static void check_sql_fn_statement(List *queryTreeList); static bool check_sql_stmt_retval(List *queryTreeList, Oid rettype, TupleDesc rettupdesc, @@ -460,99 +516,172 @@ sql_fn_resolve_param_name(SQLFunctionParseInfoPtr pinfo, } /* - * Set up the per-query execution_state records for a SQL function. + * Set up the per-query execution_state records for the next query within + * the SQL function. * - * The input is a List of Lists of parsed and rewritten, but not planned, - * querytrees. The sublist structure denotes the original query boundaries. + * Returns true if successful, false if there are no more queries. */ -static List * -init_execution_state(List *queryTree_list, - SQLFunctionCachePtr fcache, - bool lazyEvalOK) +static bool +init_execution_state(SQLFunctionCachePtr fcache) { - List *eslist = NIL; + CachedPlanSource *plansource; + execution_state *preves = NULL; execution_state *lasttages = NULL; - ListCell *lc1; + ListCell *lc; - foreach(lc1, queryTree_list) + /* + * Clean up after previous query, if there was one. 
Note that we just + * leak the old execution_state records until end of function execution; + * there aren't likely to be enough of them to matter. + */ + if (fcache->cplan) { - List *qtlist = lfirst_node(List, lc1); - execution_state *firstes = NULL; - execution_state *preves = NULL; - ListCell *lc2; + ReleaseCachedPlan(fcache->cplan, fcache->cowner); + fcache->cplan = NULL; + } + fcache->eslist = NULL; - foreach(lc2, qtlist) - { - Query *queryTree = lfirst_node(Query, lc2); - PlannedStmt *stmt; - execution_state *newes; + /* + * Get the next CachedPlanSource, or stop if there are no more. + */ + if (fcache->next_query_index >= list_length(fcache->func->plansource_list)) + return false; + plansource = (CachedPlanSource *) list_nth(fcache->func->plansource_list, + fcache->next_query_index); + fcache->next_query_index++; - /* Plan the query if needed */ - if (queryTree->commandType == CMD_UTILITY) - { - /* Utility commands require no planning. */ - stmt = makeNode(PlannedStmt); - stmt->commandType = CMD_UTILITY; - stmt->canSetTag = queryTree->canSetTag; - stmt->utilityStmt = queryTree->utilityStmt; - stmt->stmt_location = queryTree->stmt_location; - stmt->stmt_len = queryTree->stmt_len; - stmt->queryId = queryTree->queryId; - } - else - stmt = pg_plan_query(queryTree, - fcache->src, - CURSOR_OPT_PARALLEL_OK, - NULL); + /* Count source queries for sql_exec_error_callback */ + fcache->error_query_index++; - /* - * Precheck all commands for validity in a function. This should - * generally match the restrictions spi.c applies. - */ - if (stmt->commandType == CMD_UTILITY) - { - if (IsA(stmt->utilityStmt, CopyStmt) && - ((CopyStmt *) stmt->utilityStmt)->filename == NULL) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("cannot COPY to/from client in an SQL function"))); + /* + * Generate plans for the query or queries within this CachedPlanSource. + * Register the CachedPlan with the current resource owner. 
(Saving + * cowner here is mostly paranoia, but this way we needn't assume that + * CurrentResourceOwner will be the same when ShutdownSQLFunction runs.) + */ + fcache->cowner = CurrentResourceOwner; + fcache->cplan = GetCachedPlan(plansource, + fcache->paramLI, + fcache->cowner, + NULL); - if (IsA(stmt->utilityStmt, TransactionStmt)) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - /* translator: %s is a SQL statement name */ - errmsg("%s is not allowed in an SQL function", - CreateCommandName(stmt->utilityStmt)))); - } + /* + * Build execution_state list to match the number of contained plans. + */ + foreach(lc, fcache->cplan->stmt_list) + { + PlannedStmt *stmt = lfirst_node(PlannedStmt, lc); + execution_state *newes; + + /* + * Precheck all commands for validity in a function. This should + * generally match the restrictions spi.c applies. + */ + if (stmt->commandType == CMD_UTILITY) + { + if (IsA(stmt->utilityStmt, CopyStmt) && + ((CopyStmt *) stmt->utilityStmt)->filename == NULL) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot COPY to/from client in an SQL function"))); - if (fcache->readonly_func && !CommandIsReadOnly(stmt)) + if (IsA(stmt->utilityStmt, TransactionStmt)) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), /* translator: %s is a SQL statement name */ - errmsg("%s is not allowed in a non-volatile function", - CreateCommandName((Node *) stmt)))); + errmsg("%s is not allowed in an SQL function", + CreateCommandName(stmt->utilityStmt)))); + } - /* OK, build the execution_state for this query */ - newes = (execution_state *) palloc(sizeof(execution_state)); - if (preves) - preves->next = newes; - else - firstes = newes; + if (fcache->func->readonly_func && !CommandIsReadOnly(stmt)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + /* translator: %s is a SQL statement name */ + errmsg("%s is not allowed in a non-volatile function", + CreateCommandName((Node *) stmt)))); + + /* OK, build the 
execution_state for this query */ + newes = (execution_state *) palloc(sizeof(execution_state)); + if (preves) + preves->next = newes; + else + fcache->eslist = newes; - newes->next = NULL; - newes->status = F_EXEC_START; - newes->setsResult = false; /* might change below */ - newes->lazyEval = false; /* might change below */ - newes->stmt = stmt; - newes->qd = NULL; + newes->next = NULL; + newes->status = F_EXEC_START; + newes->setsResult = false; /* might change below */ + newes->lazyEval = false; /* might change below */ + newes->stmt = stmt; + newes->qd = NULL; - if (queryTree->canSetTag) - lasttages = newes; + if (stmt->canSetTag) + lasttages = newes; - preves = newes; - } + preves = newes; + } + + /* + * If this isn't the last CachedPlanSource, we're done here. Otherwise, + * we need to prepare information about how to return the results. + */ + if (fcache->next_query_index < list_length(fcache->func->plansource_list)) + return true; + + /* + * Construct a JunkFilter we can use to coerce the returned rowtype to the + * desired form, unless the result type is VOID, in which case there's + * nothing to coerce to. (XXX Frequently, the JunkFilter isn't doing + * anything very interesting, but much of this module expects it to be + * there anyway.) + */ + if (fcache->func->rettype != VOIDOID) + { + TupleTableSlot *slot = MakeSingleTupleTableSlot(NULL, + &TTSOpsMinimalTuple); + List *resulttlist; + + /* + * Re-fetch the (possibly modified) output tlist of the final + * statement. By this point, we should have thrown an error if there + * is not one. + */ + resulttlist = get_sql_fn_result_tlist(plansource->query_list); - eslist = lappend(eslist, firstes); + /* + * We need to make a copy to ensure that it doesn't disappear + * underneath us due to plancache invalidation. + */ + resulttlist = copyObject(resulttlist); + + /* + * If the result is composite, *and* we are returning the whole tuple + * result, we need to insert nulls for any dropped columns. 
In the + * single-column-result case, there might be dropped columns within + * the composite column value, but it's not our problem here. There + * should be no resjunk entries in resulttlist, so in the second case + * the JunkFilter is certainly a no-op. + */ + if (fcache->func->rettupdesc && fcache->func->returnsTuple) + fcache->junkFilter = ExecInitJunkFilterConversion(resulttlist, + fcache->func->rettupdesc, + slot); + else + fcache->junkFilter = ExecInitJunkFilter(resulttlist, slot); + } + + if (fcache->func->returnsTuple) + { + /* Make sure output rowtype is properly blessed */ + BlessTupleDesc(fcache->junkFilter->jf_resultSlot->tts_tupleDescriptor); + } + else if (fcache->func->returnsSet && type_is_rowtype(fcache->func->rettype)) + { + /* + * Returning rowtype as if it were scalar --- materialize won't work. + * Right now it's sufficient to override any caller preference for + * materialize mode, but this might need more work in future. + */ + fcache->lazyEvalOK = true; } /* @@ -572,68 +701,69 @@ init_execution_state(List *queryTree_list, if (lasttages && fcache->junkFilter) { lasttages->setsResult = true; - if (lazyEvalOK && + if (fcache->lazyEvalOK && lasttages->stmt->commandType == CMD_SELECT && !lasttages->stmt->hasModifyingCTE) fcache->lazyEval = lasttages->lazyEval = true; } - return eslist; + return true; } /* - * Initialize the SQLFunctionCache for a SQL function + * Fill a new SQLFunctionHashEntry. + * + * The passed-in "cfunc" struct is expected to be zeroes, except + * for the CachedFunction fields, which we don't touch here. + * + * We expect to be called in a short-lived memory context (typically a + * query's per-tuple context). Data that is to be part of the hash entry + * must be copied into the hcontext, or put into a CachedPlanSource. 
  */
 static void
-init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
+sql_compile_callback(FunctionCallInfo fcinfo,
+					 HeapTuple procedureTuple,
+					 const CachedFunctionHashKey *hashkey,
+					 CachedFunction *cfunc,
+					 bool forValidator)
 {
-	FmgrInfo   *finfo = fcinfo->flinfo;
-	Oid			foid = finfo->fn_oid;
-	MemoryContext fcontext;
-	MemoryContext oldcontext;
+	SQLFunctionHashEntry *func = (SQLFunctionHashEntry *) cfunc;
+	Form_pg_proc procedureStruct = (Form_pg_proc) GETSTRUCT(procedureTuple);
+	ErrorContextCallback comperrcontext;
+	MemoryContext hcontext;
+	MemoryContext oldcontext = CurrentMemoryContext;
 	Oid			rettype;
 	TupleDesc	rettupdesc;
-	HeapTuple	procedureTuple;
-	Form_pg_proc procedureStruct;
-	SQLFunctionCachePtr fcache;
-	List	   *queryTree_list;
-	ListCell   *lc;
 	Datum		tmp;
 	bool		isNull;
+	List	   *queryTree_list;
+	List	   *plansource_list;
+	ListCell   *qlc;
+	ListCell   *plc;

 	/*
-	 * Create memory context that holds all the SQLFunctionCache data.  It
-	 * must be a child of whatever context holds the FmgrInfo.
-	 */
-	fcontext = AllocSetContextCreate(finfo->fn_mcxt,
-									 "SQL function",
-									 ALLOCSET_DEFAULT_SIZES);
-
-	oldcontext = MemoryContextSwitchTo(fcontext);
-
-	/*
-	 * Create the struct proper, link it to fcontext and fn_extra.  Once this
-	 * is done, we'll be able to recover the memory after failure, even if the
-	 * FmgrInfo is long-lived.
+	 * Setup error traceback support for ereport() during compile
 	 */
-	fcache = (SQLFunctionCachePtr) palloc0(sizeof(SQLFunctionCache));
-	fcache->fcontext = fcontext;
-	finfo->fn_extra = fcache;
+	comperrcontext.callback = sql_compile_error_callback;
+	comperrcontext.arg = func;
+	comperrcontext.previous = error_context_stack;
+	error_context_stack = &comperrcontext;

 	/*
-	 * get the procedure tuple corresponding to the given function Oid
+	 * Create the hash entry's memory context.  For now it's a child of the
+	 * caller's context, so that it will go away if we fail partway through.
 	 */
-	procedureTuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(foid));
-	if (!HeapTupleIsValid(procedureTuple))
-		elog(ERROR, "cache lookup failed for function %u", foid);
-	procedureStruct = (Form_pg_proc) GETSTRUCT(procedureTuple);
+	hcontext = AllocSetContextCreate(CurrentMemoryContext,
+									 "SQL function",
+									 ALLOCSET_SMALL_SIZES);

 	/*
 	 * copy function name immediately for use by error reporting callback, and
 	 * for use as memory context identifier
 	 */
-	fcache->fname = pstrdup(NameStr(procedureStruct->proname));
-	MemoryContextSetIdentifier(fcontext, fcache->fname);
+	func->fname = MemoryContextStrdup(hcontext,
+									  NameStr(procedureStruct->proname));
+	MemoryContextSetIdentifier(hcontext, func->fname);

 	/*
 	 * Resolve any polymorphism, obtaining the actual result type, and the
@@ -641,32 +771,44 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
 	 */
 	(void) get_call_result_type(fcinfo, &rettype, &rettupdesc);

-	fcache->rettype = rettype;
+	func->rettype = rettype;
+	if (rettupdesc)
+	{
+		MemoryContextSwitchTo(hcontext);
+		func->rettupdesc = CreateTupleDescCopy(rettupdesc);
+		MemoryContextSwitchTo(oldcontext);
+	}

 	/* Fetch the typlen and byval info for the result type */
-	get_typlenbyval(rettype, &fcache->typlen, &fcache->typbyval);
+	get_typlenbyval(rettype, &func->typlen, &func->typbyval);

 	/* Remember whether we're returning setof something */
-	fcache->returnsSet = procedureStruct->proretset;
+	func->returnsSet = procedureStruct->proretset;

 	/* Remember if function is STABLE/IMMUTABLE */
-	fcache->readonly_func =
+	func->readonly_func =
 		(procedureStruct->provolatile != PROVOLATILE_VOLATILE);

+	/* Remember routine kind */
+	func->prokind = procedureStruct->prokind;
+
 	/*
 	 * We need the actual argument types to pass to the parser.  Also make
 	 * sure that parameter symbols are considered to have the function's
 	 * resolved input collation.
 	 */
-	fcache->pinfo = prepare_sql_fn_parse_info(procedureTuple,
-											  finfo->fn_expr,
-											  collation);
+	MemoryContextSwitchTo(hcontext);
+	func->pinfo = prepare_sql_fn_parse_info(procedureTuple,
+											fcinfo->flinfo->fn_expr,
+											PG_GET_COLLATION());
+	MemoryContextSwitchTo(oldcontext);

 	/*
 	 * And of course we need the function body text.
 	 */
 	tmp = SysCacheGetAttrNotNull(PROCOID, procedureTuple, Anum_pg_proc_prosrc);
-	fcache->src = TextDatumGetCString(tmp);
+	func->src = MemoryContextStrdup(hcontext,
+									TextDatumGetCString(tmp));

 	/* If we have prosqlbody, pay attention to that not prosrc. */
 	tmp = SysCacheGetAttr(PROCOID,
@@ -675,19 +817,20 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
 						  &isNull);

 	/*
-	 * Parse and rewrite the queries in the function text.  Use sublists to
-	 * keep track of the original query boundaries.
-	 *
-	 * Note: since parsing and planning is done in fcontext, we will generate
-	 * a lot of cruft that lives as long as the fcache does.  This is annoying
-	 * but we'll not worry about it until the module is rewritten to use
-	 * plancache.c.
+	 * Now we must parse and rewrite the function's queries, and create
+	 * CachedPlanSources.  Note that we apply CreateCachedPlan[ForQuery]
+	 * immediately so that it captures the original state of the parsetrees,
+	 * but we don't do CompleteCachedPlan until after fixing up the final
+	 * query's targetlist.
 	 */
 	queryTree_list = NIL;
+	plansource_list = NIL;
 	if (!isNull)
 	{
+		/* Source queries are already parse-analyzed */
 		Node	   *n;
 		List	   *stored_query_list;
+		ListCell   *lc;

 		n = stringToNode(TextDatumGetCString(tmp));
 		if (IsA(n, List))
@@ -698,8 +841,17 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
 		foreach(lc, stored_query_list)
 		{
 			Query	   *parsetree = lfirst_node(Query, lc);
+			CachedPlanSource *plansource;
 			List	   *queryTree_sublist;

+			/* Count source queries for sql_compile_error_callback */
+			func->error_query_index++;
+
+			plansource = CreateCachedPlanForQuery(parsetree,
+												  func->src,
+												  CreateCommandTag((Node *) parsetree));
+			plansource_list = lappend(plansource_list, plansource);
+
 			AcquireRewriteLocks(parsetree, true, false);
 			queryTree_sublist = pg_rewrite_query(parsetree);
 			queryTree_list = lappend(queryTree_list, queryTree_sublist);
@@ -707,24 +859,38 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
 	}
 	else
 	{
+		/* Source queries are raw parsetrees */
 		List	   *raw_parsetree_list;
+		ListCell   *lc;

-		raw_parsetree_list = pg_parse_query(fcache->src);
+		raw_parsetree_list = pg_parse_query(func->src);

 		foreach(lc, raw_parsetree_list)
 		{
 			RawStmt    *parsetree = lfirst_node(RawStmt, lc);
+			CachedPlanSource *plansource;
 			List	   *queryTree_sublist;

+			/* Count source queries for sql_compile_error_callback */
+			func->error_query_index++;
+
+			plansource = CreateCachedPlan(parsetree,
+										  func->src,
+										  CreateCommandTag(parsetree->stmt));
+			plansource_list = lappend(plansource_list, plansource);
+
 			queryTree_sublist = pg_analyze_and_rewrite_withcb(parsetree,
-															  fcache->src,
+															  func->src,
 															  (ParserSetupHook) sql_fn_parser_setup,
-															  fcache->pinfo,
+															  func->pinfo,
 															  NULL);
 			queryTree_list = lappend(queryTree_list, queryTree_sublist);
 		}
 	}

+	/* Failures below here are reported as "during startup" */
+	func->error_query_index = 0;
+
 	/*
 	 * Check that there are no statements we don't want to allow.
 	 */
@@ -740,7 +906,7 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
 	 * ask it to insert nulls for dropped columns; the junkfilter handles
 	 * that.)
 	 *
-	 * Note: we set fcache->returnsTuple according to whether we are returning
+	 * Note: we set func->returnsTuple according to whether we are returning
 	 * the whole tuple result or just a single column.  In the latter case we
 	 * clear returnsTuple because we need not act different from the scalar
 	 * result case, even if it's a rowtype column.  (However, we have to force
@@ -748,76 +914,244 @@ init_sql_fcache(FunctionCallInfo fcinfo, Oid collation, bool lazyEvalOK)
 	 * the rowtype column into multiple columns, since we have no way to
 	 * notify the caller that it should do that.)
 	 */
-	fcache->returnsTuple = check_sql_fn_retval(queryTree_list,
-											   rettype,
-											   rettupdesc,
-											   procedureStruct->prokind,
-											   false);
+	func->returnsTuple = check_sql_fn_retval(queryTree_list,
+											 rettype,
+											 rettupdesc,
+											 procedureStruct->prokind,
+											 false);

 	/*
-	 * Construct a JunkFilter we can use to coerce the returned rowtype to the
-	 * desired form, unless the result type is VOID, in which case there's
-	 * nothing to coerce to.  (XXX Frequently, the JunkFilter isn't doing
-	 * anything very interesting, but much of this module expects it to be
-	 * there anyway.)
+	 * Now that check_sql_fn_retval has done its thing, we can complete plan
+	 * cache entry creation.
 	 */
-	if (rettype != VOIDOID)
+	forboth(qlc, queryTree_list, plc, plansource_list)
 	{
-		TupleTableSlot *slot = MakeSingleTupleTableSlot(NULL,
-														&TTSOpsMinimalTuple);
-		List	   *resulttlist;
+		List	   *queryTree_sublist = lfirst(qlc);
+		CachedPlanSource *plansource = lfirst(plc);
+		bool		islast;
+
+		/* Finish filling in the CachedPlanSource */
+		CompleteCachedPlan(plansource,
+						   queryTree_sublist,
+						   NULL,
+						   NULL,
+						   0,
+						   (ParserSetupHook) sql_fn_parser_setup,
+						   func->pinfo,
+						   CURSOR_OPT_PARALLEL_OK | CURSOR_OPT_NO_SCROLL,
+						   false);

 		/*
-		 * Re-fetch the (possibly modified) output tlist of the final
-		 * statement.  By this point, we should have thrown an error if there
-		 * is not one.
+		 * Install post-rewrite hook.  Its arg is the hash entry if this is
+		 * the last statement, else NULL.
 		 */
-		resulttlist = get_sql_fn_result_tlist(llast_node(List, queryTree_list));
+		islast = (lnext(queryTree_list, qlc) == NULL);
+		SetPostRewriteHook(plansource,
+						   sql_postrewrite_callback,
+						   islast ? func : NULL);
+	}

-		/*
-		 * If the result is composite, *and* we are returning the whole tuple
-		 * result, we need to insert nulls for any dropped columns.  In the
-		 * single-column-result case, there might be dropped columns within
-		 * the composite column value, but it's not our problem here.  There
-		 * should be no resjunk entries in resulttlist, so in the second case
-		 * the JunkFilter is certainly a no-op.
-		 */
-		if (rettupdesc && fcache->returnsTuple)
-			fcache->junkFilter = ExecInitJunkFilterConversion(resulttlist,
-															  rettupdesc,
-															  slot);
-		else
-			fcache->junkFilter = ExecInitJunkFilter(resulttlist, slot);
+	/*
+	 * While the CachedPlanSources can take care of themselves, our List
+	 * pointing to them had better be in the hcontext.
+	 */
+	MemoryContextSwitchTo(hcontext);
+	plansource_list = list_copy(plansource_list);
+	MemoryContextSwitchTo(oldcontext);
+
+	/*
+	 * We have now completed building the hash entry, so reparent stuff under
+	 * CacheMemoryContext to make all the subsidiary data long-lived.
+	 * Importantly, this part can't fail partway through.
+	 */
+	foreach(plc, plansource_list)
+	{
+		CachedPlanSource *plansource = lfirst(plc);
+
+		SaveCachedPlan(plansource);
+	}
+	MemoryContextSetParent(hcontext, CacheMemoryContext);
+
+	/* And finally, arm sql_delete_callback to delete the stuff again */
+	func->plansource_list = plansource_list;
+	func->hcontext = hcontext;
+
+	error_context_stack = comperrcontext.previous;
+}
+
+/* Deletion callback used by funccache.c */
+static void
+sql_delete_callback(CachedFunction *cfunc)
+{
+	SQLFunctionHashEntry *func = (SQLFunctionHashEntry *) cfunc;
+	ListCell   *lc;
+
+	/* Release the CachedPlanSources */
+	foreach(lc, func->plansource_list)
+	{
+		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc);
+
+		DropCachedPlan(plansource);
 	}
+	func->plansource_list = NIL;
+
+	/*
+	 * If we have an hcontext, free it, thereby getting rid of all subsidiary
+	 * data.
+	 */
+	if (func->hcontext)
+		MemoryContextDelete(func->hcontext);
+	func->hcontext = NULL;
+}
+
+/* Post-rewrite callback used by plancache.c */
+static void
+sql_postrewrite_callback(List *querytree_list, void *arg)
+{
+	/*
+	 * Check that there are no statements we don't want to allow.  (Presently,
+	 * there's no real point in this because the result can't change from what
+	 * we saw originally.  But it's cheap and maybe someday it will matter.)
+	 */
+	check_sql_fn_statement(querytree_list);

-	if (fcache->returnsTuple)
+	/*
+	 * If this is the last query, we must re-do what check_sql_fn_retval did
+	 * to its targetlist.  Also check that returnsTuple didn't change (it
+	 * probably cannot, but be cautious).
+	 */
+	if (arg != NULL)
 	{
-		/* Make sure output rowtype is properly blessed */
-		BlessTupleDesc(fcache->junkFilter->jf_resultSlot->tts_tupleDescriptor);
+		SQLFunctionHashEntry *func = (SQLFunctionHashEntry *) arg;
+		bool		returnsTuple;
+
+		returnsTuple = check_sql_stmt_retval(querytree_list,
+											 func->rettype,
+											 func->rettupdesc,
+											 func->prokind,
+											 false);
+		if (returnsTuple != func->returnsTuple)
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("cached plan must not change result type")));
+	}
+}
+
+/*
+ * Initialize the SQLFunctionCache for a SQL function
+ */
+static SQLFunctionCache *
+init_sql_fcache(FunctionCallInfo fcinfo, bool lazyEvalOK)
+{
+	FmgrInfo   *finfo = fcinfo->flinfo;
+	SQLFunctionHashEntry *func;
+	SQLFunctionCache *fcache;
+	SQLFunctionLink *flink;
+	MemoryContext pcontext;
+	MemoryContext fcontext;
+	MemoryContext oldcontext;
+
+	/*
+	 * If this is the first execution for this FmgrInfo, set up a link struct
+	 * (initially containing null pointers).  The link must live as long as
+	 * the FmgrInfo, so it goes in fn_mcxt.  Also set up a memory context
+	 * callback that will be invoked when fn_mcxt is deleted.
+	 */
+	flink = finfo->fn_extra;
+	if (flink == NULL)
+	{
+		flink = (SQLFunctionLink *)
+			MemoryContextAllocZero(finfo->fn_mcxt, sizeof(SQLFunctionLink));
+		flink->mcb.func = RemoveSQLFunctionLink;
+		flink->mcb.arg = flink;
+		MemoryContextRegisterResetCallback(finfo->fn_mcxt, &flink->mcb);
+		finfo->fn_extra = flink;
 	}
-	else if (fcache->returnsSet && type_is_rowtype(fcache->rettype))
+
+	/*
+	 * If we are resuming execution of a set-returning function, just keep
+	 * using the same cache.  We do not ask funccache.c to re-validate the
+	 * SQLFunctionHashEntry: we want to run to completion using the function's
+	 * initial definition.
+	 */
+	if (flink->fcache != NULL)
 	{
-		/*
-		 * Returning rowtype as if it were scalar --- materialize won't work.
-		 * Right now it's sufficient to override any caller preference for
-		 * materialize mode, but to add more smarts in init_execution_state
-		 * about this, we'd probably need a three-way flag instead of bool.
-		 */
-		lazyEvalOK = true;
+		Assert(flink->fcache->func == flink->func);
+		return flink->fcache;
+	}
+
+	/*
+	 * Look up, or re-validate, the long-lived hash entry.  Make the hash key
+	 * depend on the result of get_call_result_type() when that's composite,
+	 * so that we can safely assume that we'll build a new hash entry if the
+	 * composite rowtype changes.
+	 */
+	func = (SQLFunctionHashEntry *)
+		cached_function_compile(fcinfo,
+								(CachedFunction *) flink->func,
+								sql_compile_callback,
+								sql_delete_callback,
+								sizeof(SQLFunctionHashEntry),
+								true,
+								false);
+
+	/*
+	 * Install the hash pointer in the SQLFunctionLink, and increment its use
+	 * count to reflect that.  If cached_function_compile gave us back a
+	 * different hash entry than we were using before, we must decrement that
+	 * one's use count.
+	 */
+	if (func != flink->func)
+	{
+		if (flink->func != NULL)
+		{
+			Assert(flink->func->cfunc.use_count > 0);
+			flink->func->cfunc.use_count--;
+		}
+		flink->func = func;
+		func->cfunc.use_count++;
 	}

-	/* Finally, plan the queries */
-	fcache->func_state = init_execution_state(queryTree_list,
-											  fcache,
-											  lazyEvalOK);
+	/*
+	 * Create memory context that holds all the SQLFunctionCache data.  If we
+	 * return a set, we must keep this in whatever context holds the FmgrInfo
+	 * (anything shorter-lived risks leaving a dangling pointer in flink).  But
+	 * in a non-SRF we'll delete it before returning, and there's no need for
+	 * it to outlive the caller's context.
+	 */
+	pcontext = func->returnsSet ? finfo->fn_mcxt : CurrentMemoryContext;
+	fcontext = AllocSetContextCreate(pcontext,
+									 "SQL function execution",
+									 ALLOCSET_DEFAULT_SIZES);
+
+	oldcontext = MemoryContextSwitchTo(fcontext);

-	/* Mark fcache with time of creation to show it's valid */
-	fcache->lxid = MyProc->vxid.lxid;
-	fcache->subxid = GetCurrentSubTransactionId();
+	/*
+	 * Create the struct proper, link it to func and fcontext.
+	 */
+	fcache = (SQLFunctionCache *) palloc0(sizeof(SQLFunctionCache));
+	fcache->func = func;
+	fcache->fcontext = fcontext;
+	fcache->lazyEvalOK = lazyEvalOK;

-	ReleaseSysCache(procedureTuple);
+	/*
+	 * If we return a set, we must link the fcache into fn_extra so that we
+	 * can find it again during future calls.  But in a non-SRF there is no
+	 * need to link it into fn_extra at all.  Not doing so removes the risk of
+	 * having a dangling pointer in a long-lived FmgrInfo.
+	 */
+	if (func->returnsSet)
+		flink->fcache = fcache;
+
+	/*
+	 * We're beginning a new execution of the function, so convert params to
+	 * appropriate format.
+	 */
+	postquel_sub_params(fcache, fcinfo);

 	MemoryContextSwitchTo(oldcontext);
+
+	return fcache;
 }

 /* Start up execution of one execution_state node */
@@ -852,7 +1186,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)

 	es->qd = CreateQueryDesc(es->stmt,
 							 NULL,
-							 fcache->src,
+							 fcache->func->src,
 							 GetActiveSnapshot(),
 							 InvalidSnapshot,
 							 dest,
@@ -893,7 +1227,7 @@ postquel_getnext(execution_state *es, SQLFunctionCachePtr fcache)
 	if (es->qd->operation == CMD_UTILITY)
 	{
 		ProcessUtility(es->qd->plannedstmt,
-					   fcache->src,
+					   fcache->func->src,
 					   true,	/* protect function cache's parsetree */
 					   PROCESS_UTILITY_QUERY,
 					   es->qd->params,
@@ -949,7 +1283,7 @@ postquel_sub_params(SQLFunctionCachePtr fcache,
 	if (nargs > 0)
 	{
 		ParamListInfo paramLI;
-		Oid		   *argtypes = fcache->pinfo->argtypes;
+		Oid		   *argtypes = fcache->func->pinfo->argtypes;

 		if (fcache->paramLI == NULL)
 		{
@@ -982,7 +1316,8 @@ postquel_sub_params(SQLFunctionCachePtr fcache,
 			prm->value = MakeExpandedObjectReadOnly(fcinfo->args[i].value,
 													prm->isnull,
 													get_typlen(argtypes[i]));
-			prm->pflags = 0;
+			/* Allow the value to be substituted into custom plans */
+			prm->pflags = PARAM_FLAG_CONST;
 			prm->ptype = argtypes[i];
 		}
 	}
@@ -1012,7 +1347,7 @@ postquel_get_single_result(TupleTableSlot *slot,
 	 */
 	oldcontext = MemoryContextSwitchTo(resultcontext);

-	if (fcache->returnsTuple)
+	if (fcache->func->returnsTuple)
 	{
 		/* We must return the whole tuple as a Datum. */
 		fcinfo->isnull = false;
@@ -1027,7 +1362,7 @@ postquel_get_single_result(TupleTableSlot *slot,
 		value = slot_getattr(slot, 1, &(fcinfo->isnull));

 		if (!fcinfo->isnull)
-			value = datumCopy(value, fcache->typbyval, fcache->typlen);
+			value = datumCopy(value, fcache->func->typbyval, fcache->func->typlen);
 	}

 	MemoryContextSwitchTo(oldcontext);
@@ -1042,25 +1377,16 @@
 Datum
 fmgr_sql(PG_FUNCTION_ARGS)
 {
 	SQLFunctionCachePtr fcache;
+	SQLFunctionLink *flink;
 	ErrorContextCallback sqlerrcontext;
+	MemoryContext tscontext;
 	MemoryContext oldcontext;
 	bool		randomAccess;
 	bool		lazyEvalOK;
-	bool		is_first;
 	bool		pushed_snapshot;
 	execution_state *es;
 	TupleTableSlot *slot;
 	Datum		result;
-	List	   *eslist;
-	ListCell   *eslc;
-
-	/*
-	 * Setup error traceback support for ereport()
-	 */
-	sqlerrcontext.callback = sql_exec_error_callback;
-	sqlerrcontext.arg = fcinfo->flinfo;
-	sqlerrcontext.previous = error_context_stack;
-	error_context_stack = &sqlerrcontext;

 	/* Check call context */
 	if (fcinfo->flinfo->fn_retset)
@@ -1081,80 +1407,63 @@ fmgr_sql(PG_FUNCTION_ARGS)
 				 errmsg("set-valued function called in context that cannot accept a set")));
 		randomAccess = rsi->allowedModes & SFRM_Materialize_Random;
 		lazyEvalOK = !(rsi->allowedModes & SFRM_Materialize_Preferred);
+		/* tuplestore must have query lifespan */
+		tscontext = rsi->econtext->ecxt_per_query_memory;
 	}
 	else
 	{
 		randomAccess = false;
 		lazyEvalOK = true;
+		/* tuplestore needn't outlive caller context */
+		tscontext = CurrentMemoryContext;
 	}

 	/*
-	 * Initialize fcache (build plans) if first time through; or re-initialize
-	 * if the cache is stale.
+	 * Initialize fcache if starting a fresh execution.
 	 */
-	fcache = (SQLFunctionCachePtr) fcinfo->flinfo->fn_extra;
+	fcache = init_sql_fcache(fcinfo, lazyEvalOK);
+	/* init_sql_fcache also ensures we have a SQLFunctionLink */
+	flink = fcinfo->flinfo->fn_extra;

-	if (fcache != NULL)
-	{
-		if (fcache->lxid != MyProc->vxid.lxid ||
-			!SubTransactionIsActive(fcache->subxid))
-		{
-			/* It's stale; unlink and delete */
-			fcinfo->flinfo->fn_extra = NULL;
-			MemoryContextDelete(fcache->fcontext);
-			fcache = NULL;
-		}
-	}
+	/*
+	 * Now we can set up error traceback support for ereport()
+	 */
+	sqlerrcontext.callback = sql_exec_error_callback;
+	sqlerrcontext.arg = fcache;
+	sqlerrcontext.previous = error_context_stack;
+	error_context_stack = &sqlerrcontext;

-	if (fcache == NULL)
-	{
-		init_sql_fcache(fcinfo, PG_GET_COLLATION(), lazyEvalOK);
-		fcache = (SQLFunctionCachePtr) fcinfo->flinfo->fn_extra;
-	}
+	/*
+	 * Build tuplestore to hold results, if we don't have one already.  Make
+	 * sure it's in a suitable context.
+	 */
+	oldcontext = MemoryContextSwitchTo(tscontext);
+
+	if (!fcache->tstore)
+		fcache->tstore = tuplestore_begin_heap(randomAccess, false, work_mem);

 	/*
-	 * Switch to context in which the fcache lives.  This ensures that our
-	 * tuplestore etc will have sufficient lifetime.  The sub-executor is
+	 * Switch to context in which the fcache lives.  The sub-executor is
 	 * responsible for deleting per-tuple information.  (XXX in the case of a
-	 * long-lived FmgrInfo, this policy represents more memory leakage, but
-	 * it's not entirely clear where to keep stuff instead.)
+	 * long-lived FmgrInfo, this policy potentially causes memory leakage, but
+	 * it's not very clear where we could keep stuff instead.  Fortunately,
+	 * there are few if any cases where set-returning functions are invoked
+	 * via FmgrInfos that would outlive the calling query.)
 	 */
-	oldcontext = MemoryContextSwitchTo(fcache->fcontext);
+	MemoryContextSwitchTo(fcache->fcontext);

 	/*
-	 * Find first unfinished query in function, and note whether it's the
-	 * first query.
+	 * Find first unfinished execution_state.  If none, advance to the next
+	 * query in function.
 	 */
-	eslist = fcache->func_state;
-	es = NULL;
-	is_first = true;
-	foreach(eslc, eslist)
+	do
 	{
-		es = (execution_state *) lfirst(eslc);
-
+		es = fcache->eslist;
 		while (es && es->status == F_EXEC_DONE)
-		{
-			is_first = false;
 			es = es->next;
-		}
 		if (es)
 			break;
-	}
-
-	/*
-	 * Convert params to appropriate format if starting a fresh execution. (If
-	 * continuing execution, we can re-use prior params.)
-	 */
-	if (is_first && es && es->status == F_EXEC_START)
-		postquel_sub_params(fcache, fcinfo);
-
-	/*
-	 * Build tuplestore to hold results, if we don't have one already. Note
-	 * it's in the query-lifespan context.
-	 */
-	if (!fcache->tstore)
-		fcache->tstore = tuplestore_begin_heap(randomAccess, false, work_mem);
+	} while (init_execution_state(fcache));

 	/*
 	 * Execute each command in the function one after another until we either
@@ -1187,7 +1496,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
 			 * visible.  Take a new snapshot if we don't have one yet,
 			 * otherwise just bump the command ID in the existing snapshot.
 			 */
-			if (!fcache->readonly_func)
+			if (!fcache->func->readonly_func)
 			{
 				CommandCounterIncrement();
 				if (!pushed_snapshot)
@@ -1201,7 +1510,7 @@ fmgr_sql(PG_FUNCTION_ARGS)

 			postquel_start(es, fcache);
 		}
-		else if (!fcache->readonly_func && !pushed_snapshot)
+		else if (!fcache->func->readonly_func && !pushed_snapshot)
 		{
 			/* Re-establish active snapshot when re-entering function */
 			PushActiveSnapshot(es->qd->snapshot);
@@ -1218,7 +1527,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
 		 * set, we can shut it down anyway because it must be a SELECT and we
 		 * don't care about fetching any more result rows.
 		 */
-		if (completed || !fcache->returnsSet)
+		if (completed || !fcache->func->returnsSet)
 			postquel_end(es);

 		/*
@@ -1234,17 +1543,11 @@ fmgr_sql(PG_FUNCTION_ARGS)
 			break;

 		/*
-		 * Advance to next execution_state, which might be in the next list.
+		 * Advance to next execution_state, and perhaps next query.
 		 */
 		es = es->next;
 		while (!es)
 		{
-			eslc = lnext(eslist, eslc);
-			if (!eslc)
-				break;			/* end of function */
-
-			es = (execution_state *) lfirst(eslc);
-
 			/*
 			 * Flush the current snapshot so that we will take a new one for
 			 * the new query list.  This ensures that new snaps are taken at
@@ -1256,13 +1559,18 @@ fmgr_sql(PG_FUNCTION_ARGS)
 				PopActiveSnapshot();
 				pushed_snapshot = false;
 			}
+
+			if (!init_execution_state(fcache))
+				break;			/* end of function */
+
+			es = fcache->eslist;
 		}
 	}

 	/*
 	 * The tuplestore now contains whatever row(s) we are supposed to return.
 	 */
-	if (fcache->returnsSet)
+	if (fcache->func->returnsSet)
 	{
 		ReturnSetInfo *rsi = (ReturnSetInfo *) fcinfo->resultinfo;

@@ -1298,7 +1606,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
 		{
 			RegisterExprContextCallback(rsi->econtext,
 										ShutdownSQLFunction,
-										PointerGetDatum(fcache));
+										PointerGetDatum(flink));
 			fcache->shutdown_reg = true;
 		}
 	}
@@ -1322,7 +1630,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
 		{
 			UnregisterExprContextCallback(rsi->econtext,
 										  ShutdownSQLFunction,
-										  PointerGetDatum(fcache));
+										  PointerGetDatum(flink));
 			fcache->shutdown_reg = false;
 		}
 	}
@@ -1338,7 +1646,12 @@ fmgr_sql(PG_FUNCTION_ARGS)
 			fcache->tstore = NULL;
 			/* must copy desc because execSRF.c will free it */
 			if (fcache->junkFilter)
+			{
+				/* setDesc must be allocated in suitable context */
+				MemoryContextSwitchTo(tscontext);
 				rsi->setDesc = CreateTupleDescCopy(fcache->junkFilter->jf_cleanTupType);
+				MemoryContextSwitchTo(fcache->fcontext);
+			}

 			fcinfo->isnull = true;
 			result = (Datum) 0;
@@ -1348,7 +1661,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
 			{
 				UnregisterExprContextCallback(rsi->econtext,
 											  ShutdownSQLFunction,
-											  PointerGetDatum(fcache));
+											  PointerGetDatum(flink));
 				fcache->shutdown_reg = false;
 			}
 		}
@@ -1374,7 +1687,7 @@ fmgr_sql(PG_FUNCTION_ARGS)
 	else
 	{
 		/* Should only get here for VOID functions and procedures */
-		Assert(fcache->rettype == VOIDOID);
+		Assert(fcache->func->rettype == VOIDOID);
 		fcinfo->isnull = true;
 		result = (Datum) 0;
 	}
@@ -1387,154 +1700,171 @@ fmgr_sql(PG_FUNCTION_ARGS)
 	if (pushed_snapshot)
 		PopActiveSnapshot();

+	MemoryContextSwitchTo(oldcontext);
+
 	/*
-	 * If we've gone through every command in the function, we are done. Reset
-	 * the execution states to start over again on next call.
+	 * If we've gone through every command in the function, we are done.
+	 * Release the cache to start over again on next call.
 	 */
 	if (es == NULL)
 	{
-		foreach(eslc, fcache->func_state)
-		{
-			es = (execution_state *) lfirst(eslc);
-			while (es)
-			{
-				es->status = F_EXEC_START;
-				es = es->next;
-			}
-		}
+		if (fcache->tstore)
+			tuplestore_end(fcache->tstore);
+		Assert(fcache->cplan == NULL);
+		flink->fcache = NULL;
+		MemoryContextDelete(fcache->fcontext);
 	}

 	error_context_stack = sqlerrcontext.previous;

-	MemoryContextSwitchTo(oldcontext);
-
 	return result;
 }

 /*
- * error context callback to let us supply a call-stack traceback
+ * error context callback to let us supply a traceback during compile
 */
 static void
-sql_exec_error_callback(void *arg)
+sql_compile_error_callback(void *arg)
 {
-	FmgrInfo   *flinfo = (FmgrInfo *) arg;
-	SQLFunctionCachePtr fcache = (SQLFunctionCachePtr) flinfo->fn_extra;
+	SQLFunctionHashEntry *func = (SQLFunctionHashEntry *) arg;
 	int			syntaxerrposition;

 	/*
-	 * We can do nothing useful if init_sql_fcache() didn't get as far as
-	 * saving the function name
+	 * We can do nothing useful if sql_compile_callback() didn't get as far as
+	 * copying the function name
 	 */
-	if (fcache == NULL || fcache->fname == NULL)
+	if (func->fname == NULL)
 		return;

 	/*
 	 * If there is a syntax error position, convert to internal syntax error
 	 */
 	syntaxerrposition = geterrposition();
-	if (syntaxerrposition > 0 && fcache->src != NULL)
+	if (syntaxerrposition > 0 && func->src != NULL)
 	{
 		errposition(0);
 		internalerrposition(syntaxerrposition);
-		internalerrquery(fcache->src);
+		internalerrquery(func->src);
 	}

 	/*
-	 * Try to determine where in the function we failed.  If there is a query
-	 * with non-null QueryDesc, finger it.  (We check this rather than looking
-	 * for F_EXEC_RUN state, so that errors during ExecutorStart or
-	 * ExecutorEnd are blamed on the appropriate query; see postquel_start and
-	 * postquel_end.)
+	 * If we failed while parsing an identifiable query within the function,
+	 * report that.  Otherwise say it was "during startup".
 	 */
-	if (fcache->func_state)
-	{
-		execution_state *es;
-		int			query_num;
-		ListCell   *lc;
-
-		es = NULL;
-		query_num = 1;
-		foreach(lc, fcache->func_state)
-		{
-			es = (execution_state *) lfirst(lc);
-			while (es)
-			{
-				if (es->qd)
-				{
-					errcontext("SQL function \"%s\" statement %d",
-							   fcache->fname, query_num);
-					break;
-				}
-				es = es->next;
-			}
-			if (es)
-				break;
-			query_num++;
-		}
-		if (es == NULL)
-		{
-			/*
-			 * couldn't identify a running query; might be function entry,
-			 * function exit, or between queries.
-			 */
-			errcontext("SQL function \"%s\"", fcache->fname);
-		}
-	}
+	if (func->error_query_index > 0)
+		errcontext("SQL function \"%s\" statement %d",
+				   func->fname, func->error_query_index);
 	else
+		errcontext("SQL function \"%s\" during startup", func->fname);
+}
+
+/*
+ * error context callback to let us supply a call-stack traceback at runtime
+ */
+static void
+sql_exec_error_callback(void *arg)
+{
+	SQLFunctionCachePtr fcache = (SQLFunctionCachePtr) arg;
+	int			syntaxerrposition;
+
+	/*
+	 * If there is a syntax error position, convert to internal syntax error
+	 */
+	syntaxerrposition = geterrposition();
+	if (syntaxerrposition > 0 && fcache->func->src != NULL)
 	{
-		/*
-		 * Assume we failed during init_sql_fcache().  (It's possible that the
-		 * function actually has an empty body, but in that case we may as
-		 * well report all errors as being "during startup".)
-		 */
-		errcontext("SQL function \"%s\" during startup", fcache->fname);
+		errposition(0);
+		internalerrposition(syntaxerrposition);
+		internalerrquery(fcache->func->src);
 	}
+
+	/*
+	 * If we failed while executing an identifiable query within the function,
+	 * report that.  Otherwise say it was "during startup".
+	 */
+	if (fcache->error_query_index > 0)
+		errcontext("SQL function \"%s\" statement %d",
+				   fcache->func->fname, fcache->error_query_index);
+	else
+		errcontext("SQL function \"%s\" during startup", fcache->func->fname);
 }

 /*
- * callback function in case a function-returning-set needs to be shut down
- * before it has been run to completion
+ * ExprContext callback function
+ *
+ * We register this in the active ExprContext while a set-returning SQL
+ * function is running, in case the function needs to be shut down before it
+ * has been run to completion.  Note that this will not be called during an
+ * error abort, but we don't need it because transaction abort will take care
+ * of releasing executor resources.
  */
 static void
 ShutdownSQLFunction(Datum arg)
 {
-	SQLFunctionCachePtr fcache = (SQLFunctionCachePtr) DatumGetPointer(arg);
-	execution_state *es;
-	ListCell   *lc;
+	SQLFunctionLink *flink = (SQLFunctionLink *) DatumGetPointer(arg);
+	SQLFunctionCachePtr fcache = flink->fcache;

-	foreach(lc, fcache->func_state)
+	if (fcache != NULL)
 	{
-		es = (execution_state *) lfirst(lc);
+		execution_state *es;
+
+		/* Make sure we don't somehow try to do this twice */
+		flink->fcache = NULL;
+
+		es = fcache->eslist;
 		while (es)
 		{
 			/* Shut down anything still running */
 			if (es->status == F_EXEC_RUN)
 			{
 				/* Re-establish active snapshot for any called functions */
-				if (!fcache->readonly_func)
+				if (!fcache->func->readonly_func)
 					PushActiveSnapshot(es->qd->snapshot);

 				postquel_end(es);

-				if (!fcache->readonly_func)
+				if (!fcache->func->readonly_func)
 					PopActiveSnapshot();
 			}
-
-			/* Reset states to START in case we're called again */
-			es->status = F_EXEC_START;
 			es = es->next;
 		}
-	}

-	/* Release tuplestore if we have one */
-	if (fcache->tstore)
-		tuplestore_end(fcache->tstore);
-	fcache->tstore = NULL;
+		/* Release tuplestore if we have one */
+		if (fcache->tstore)
+			tuplestore_end(fcache->tstore);
+		/* Release CachedPlan if we have one */
+		if (fcache->cplan)
+			ReleaseCachedPlan(fcache->cplan, fcache->cowner);
+
+		/* Release the cache */
+		MemoryContextDelete(fcache->fcontext);
+	}
 	/* execUtils will deregister the callback... */
-	fcache->shutdown_reg = false;
+}
+
+/*
+ * MemoryContext callback function
+ *
+ * We register this in the memory context that contains a SQLFunctionLink
+ * struct.  When the memory context is reset or deleted, we release the
+ * reference count (if any) that the link holds on the long-lived hash entry.
+ * Note that this will happen even during error aborts.
+ */ +static void +RemoveSQLFunctionLink(void *arg) +{ + SQLFunctionLink *flink = (SQLFunctionLink *) arg; + + if (flink->func != NULL) + { + Assert(flink->func->cfunc.use_count > 0); + flink->func->cfunc.use_count--; + /* This should be unnecessary, but let's just be sure: */ + flink->func = NULL; + } } /* diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out index d5388a1fecf..72bae1bf254 100644 --- a/src/test/modules/test_extensions/expected/test_extensions.out +++ b/src/test/modules/test_extensions/expected/test_extensions.out @@ -651,7 +651,7 @@ LINE 1: SELECT public.dep_req2() || ' req3b' ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. QUERY: SELECT public.dep_req2() || ' req3b' -CONTEXT: SQL function "dep_req3b" during startup +CONTEXT: SQL function "dep_req3b" statement 1 DROP EXTENSION test_ext_req_schema3; ALTER EXTENSION test_ext_req_schema1 SET SCHEMA test_s_dep2; -- now ok SELECT test_s_dep2.dep_req1(); diff --git a/src/test/regress/expected/create_function_sql.out b/src/test/regress/expected/create_function_sql.out index 50aca5940ff..70ed5742b65 100644 --- a/src/test/regress/expected/create_function_sql.out +++ b/src/test/regress/expected/create_function_sql.out @@ -563,6 +563,20 @@ CREATE OR REPLACE PROCEDURE functest1(a int) LANGUAGE SQL AS 'SELECT $1'; ERROR: cannot change routine kind DETAIL: "functest1" is a function. 
DROP FUNCTION functest1(a int); +-- early shutdown of set-returning functions +CREATE FUNCTION functest_srf0() RETURNS SETOF int +LANGUAGE SQL +AS $$ SELECT i FROM generate_series(1, 100) i $$; +SELECT functest_srf0() LIMIT 5; + functest_srf0 +--------------- + 1 + 2 + 3 + 4 + 5 +(5 rows) + -- inlining of set-returning functions CREATE TABLE functest3 (a int); INSERT INTO functest3 VALUES (1), (2), (3); @@ -708,7 +722,7 @@ CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL ERROR: only one AS item needed for language "sql" -- Cleanup DROP SCHEMA temp_func_test CASCADE; -NOTICE: drop cascades to 30 other objects +NOTICE: drop cascades to 31 other objects DETAIL: drop cascades to function functest_a_1(text,date) drop cascades to function functest_a_2(text[]) drop cascades to function functest_a_3() @@ -732,6 +746,7 @@ drop cascades to function functest_s_10(text,date) drop cascades to function functest_s_13() drop cascades to function functest_s_15(integer) drop cascades to function functest_b_2(bigint) +drop cascades to function functest_srf0() drop cascades to function functest_sri1() drop cascades to function voidtest1(integer) drop cascades to function voidtest2(integer,integer) diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out index 8f2c8319172..1c4e37d2249 100644 --- a/src/test/regress/expected/rowsecurity.out +++ b/src/test/regress/expected/rowsecurity.out @@ -4723,12 +4723,8 @@ select rls_f(c) from test_t order by rls_f; -- should lead to RLS error during query rewrite set role regress_rls_alice; select rls_f(c) from test_t order by rls_f; - rls_f -------- - boffa - -(2 rows) - +ERROR: query would be affected by row-level security policy for table "rls_t" +CONTEXT: SQL function "rls_f" statement 1 reset role; set plan_cache_mode to force_generic_plan; -- Table owner bypasses RLS, although cached plan will be invalidated @@ -4743,12 +4739,8 @@ select rls_f(c) from test_t order by rls_f; -- should lead to plan 
invalidation and RLS error during query rewrite set role regress_rls_alice; select rls_f(c) from test_t order by rls_f; - rls_f -------- - boffa - -(2 rows) - +ERROR: query would be affected by row-level security policy for table "rls_t" +CONTEXT: SQL function "rls_f" statement 1 reset role; reset plan_cache_mode; reset rls_test.blah; diff --git a/src/test/regress/sql/create_function_sql.sql b/src/test/regress/sql/create_function_sql.sql index 89e9af3a499..1dd3c4a4e5f 100644 --- a/src/test/regress/sql/create_function_sql.sql +++ b/src/test/regress/sql/create_function_sql.sql @@ -328,6 +328,15 @@ CREATE OR REPLACE PROCEDURE functest1(a int) LANGUAGE SQL AS 'SELECT $1'; DROP FUNCTION functest1(a int); +-- early shutdown of set-returning functions + +CREATE FUNCTION functest_srf0() RETURNS SETOF int +LANGUAGE SQL +AS $$ SELECT i FROM generate_series(1, 100) i $$; + +SELECT functest_srf0() LIMIT 5; + + -- inlining of set-returning functions CREATE TABLE functest3 (a int); diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index 144c4e9662c..2bbcb43055e 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -2613,6 +2613,8 @@ SPPageDesc SQLDropObject SQLFunctionCache SQLFunctionCachePtr +SQLFunctionHashEntry +SQLFunctionLink SQLFunctionParseInfo SQLFunctionParseInfoPtr SQLValueFunction -- 2.43.5 From bf3feaa70765f78a988cf14bd09eace362253fd3 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Mon, 31 Mar 2025 18:10:29 -0400 Subject: [PATCH v11 7/7] Delay parse analysis and rewrite until we're ready to execute the query. This change fixes a longstanding bugaboo with SQL functions: you could not write DDL that would affect later statements in the same function. That's mostly still true with new-style SQL functions, since the results of parse analysis are baked into the stored query trees (and protected by dependency records). 
But for old-style SQL functions, it will now work much as it does with plpgsql functions. The key changes required are to (1) stash the parsetrees read from pg_proc somewhere safe until we're ready to process them, and (2) adjust the error context reporting. sql_compile_error_callback is now only useful for giving context for errors detected by raw parsing. Errors detected in either parse analysis or planning are handled by sql_exec_error_callback, as they were before this patch series. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop --- doc/src/sgml/xfunc.sgml | 15 - src/backend/executor/functions.c | 315 ++++++++++-------- .../regress/expected/create_function_sql.out | 42 ++- src/test/regress/expected/rangefuncs.out | 2 +- src/test/regress/sql/create_function_sql.sql | 25 ++ 5 files changed, 241 insertions(+), 158 deletions(-) diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml index 35d34f224ef..8074f66417d 100644 --- a/doc/src/sgml/xfunc.sgml +++ b/doc/src/sgml/xfunc.sgml @@ -234,21 +234,6 @@ CALL clean_emp(); whereas returning <type>void</type> is a PostgreSQL extension. </para> - <note> - <para> - The entire body of an SQL function is parsed before any of it is - executed. While an SQL function can contain commands that alter - the system catalogs (e.g., <command>CREATE TABLE</command>), the effects - of such commands will not be visible during parse analysis of - later commands in the function. Thus, for example, - <literal>CREATE TABLE foo (...); INSERT INTO foo VALUES(...);</literal> - will not work as desired if packaged up into a single SQL function, - since <structname>foo</structname> won't exist yet when the <command>INSERT</command> - command is parsed. It's recommended to use <application>PL/pgSQL</application> - instead of an SQL function in this type of situation. 
- </para> - </note> - <para> The syntax of the <command>CREATE FUNCTION</command> command requires the function body to be written as a string constant. It is usually diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c index b5a9ecea637..83dbcad78ad 100644 --- a/src/backend/executor/functions.c +++ b/src/backend/executor/functions.c @@ -90,7 +90,13 @@ typedef struct execution_state * function is polymorphic or used as a trigger, there is a separate * SQLFunctionHashEntry for each usage, so that we need consider only one * set of relevant data types.) The struct itself is in memory managed by - * funccache.c, and its subsidiary data is kept in hcontext ("hash context"). + * funccache.c, and its subsidiary data is kept in one of two contexts: + * * pcontext ("parse context") holds the raw parse trees or Query trees + * that we read from the pg_proc row. These will be converted to + * CachedPlanSources as they are needed. Once the last one is converted, + * pcontext can be freed. + * * hcontext ("hash context") holds everything else belonging to the + * SQLFunctionHashEntry. * * 2. SQLFunctionCache lasts for the duration of a single execution of * the SQL function. 
(In "lazyEval" mode, this might span multiple calls of @@ -127,12 +133,14 @@ typedef struct SQLFunctionHashEntry TupleDesc rettupdesc; /* result tuple descriptor */ - List *plansource_list; /* CachedPlanSources for fn's queries */ + List *source_list; /* RawStmts or Queries read from pg_proc */ + int num_queries; /* original length of source_list */ + bool raw_source; /* true if source_list contains RawStmts */ - /* if positive, this is the index of the query we're parsing */ - int error_query_index; + List *plansource_list; /* CachedPlanSources for fn's queries */ - MemoryContext hcontext; /* memory context holding all above */ + MemoryContext pcontext; /* memory context holding source_list */ + MemoryContext hcontext; /* memory context holding all else */ } SQLFunctionHashEntry; typedef struct SQLFunctionCache @@ -149,7 +157,7 @@ typedef struct SQLFunctionCache JunkFilter *junkFilter; /* will be NULL if function returns VOID */ - /* if positive, this is the index of the query we're executing */ + /* if positive, this is the index of the query we're processing */ int error_query_index; /* @@ -193,6 +201,13 @@ static Node *sql_fn_make_param(SQLFunctionParseInfoPtr pinfo, static Node *sql_fn_resolve_param_name(SQLFunctionParseInfoPtr pinfo, const char *paramname, int location); static bool init_execution_state(SQLFunctionCachePtr fcache); +static void sql_compile_callback(FunctionCallInfo fcinfo, + HeapTuple procedureTuple, + const CachedFunctionHashKey *hashkey, + CachedFunction *cfunc, + bool forValidator); +static void prepare_next_query(SQLFunctionHashEntry *func); +static void sql_delete_callback(CachedFunction *cfunc); static void sql_postrewrite_callback(List *querytree_list, void *arg); static SQLFunctionCache *init_sql_fcache(FunctionCallInfo fcinfo, bool lazyEvalOK); @@ -542,17 +557,25 @@ init_execution_state(SQLFunctionCachePtr fcache) fcache->eslist = NULL; /* - * Get the next CachedPlanSource, or stop if there are no more. 
+ * Get the next CachedPlanSource, or stop if there are no more. We might + * need to create the next CachedPlanSource; if so, advance + * error_query_index first, so that errors detected in prepare_next_query + * are blamed on the right statement. */ if (fcache->next_query_index >= list_length(fcache->func->plansource_list)) - return false; + { + if (fcache->next_query_index >= fcache->func->num_queries) + return false; + fcache->error_query_index++; + prepare_next_query(fcache->func); + } + else + fcache->error_query_index++; + plansource = (CachedPlanSource *) list_nth(fcache->func->plansource_list, fcache->next_query_index); fcache->next_query_index++; - /* Count source queries for sql_exec_error_callback */ - fcache->error_query_index++; - /* * Generate plans for the query or queries within this CachedPlanSource. * Register the CachedPlan with the current resource owner. (Saving @@ -624,7 +647,7 @@ init_execution_state(SQLFunctionCachePtr fcache) * If this isn't the last CachedPlanSource, we're done here. Otherwise, * we need to prepare information about how to return the results. */ - if (fcache->next_query_index < list_length(fcache->func->plansource_list)) + if (fcache->next_query_index < fcache->func->num_queries) return true; /* @@ -718,7 +741,7 @@ init_execution_state(SQLFunctionCachePtr fcache) * * We expect to be called in a short-lived memory context (typically a * query's per-tuple context). Data that is to be part of the hash entry - * must be copied into the hcontext, or put into a CachedPlanSource. + * must be copied into the hcontext or pcontext as appropriate. 
*/ static void sql_compile_callback(FunctionCallInfo fcinfo, @@ -731,18 +754,17 @@ sql_compile_callback(FunctionCallInfo fcinfo, Form_pg_proc procedureStruct = (Form_pg_proc) GETSTRUCT(procedureTuple); ErrorContextCallback comperrcontext; MemoryContext hcontext; + MemoryContext pcontext; MemoryContext oldcontext = CurrentMemoryContext; Oid rettype; TupleDesc rettupdesc; Datum tmp; bool isNull; - List *queryTree_list; - List *plansource_list; - ListCell *qlc; - ListCell *plc; + List *source_list; /* - * Setup error traceback support for ereport() during compile + * Setup error traceback support for ereport() during compile. (This is + * mainly useful for reporting parse errors from pg_parse_query.) */ comperrcontext.callback = sql_compile_error_callback; comperrcontext.arg = func; @@ -757,6 +779,15 @@ sql_compile_callback(FunctionCallInfo fcinfo, "SQL function", ALLOCSET_SMALL_SIZES); + /* + * Create the not-as-long-lived pcontext. We make this a child of + * hcontext so that it doesn't require separate deletion. + */ + pcontext = AllocSetContextCreate(hcontext, + "SQL function parse trees", + ALLOCSET_SMALL_SIZES); + func->pcontext = pcontext; + /* * copy function name immediately for use by error reporting callback, and * for use as memory context identifier @@ -815,96 +846,111 @@ sql_compile_callback(FunctionCallInfo fcinfo, procedureTuple, Anum_pg_proc_prosqlbody, &isNull); - - /* - * Now we must parse and rewrite the function's queries, and create - * CachedPlanSources. Note that we apply CreateCachedPlan[ForQuery] - * immediately so that it captures the original state of the parsetrees, - * but we don't do CompleteCachedPlan until after fixing up the final - * query's targetlist. 
- */ - queryTree_list = NIL; - plansource_list = NIL; if (!isNull) { /* Source queries are already parse-analyzed */ Node *n; - List *stored_query_list; - ListCell *lc; n = stringToNode(TextDatumGetCString(tmp)); if (IsA(n, List)) - stored_query_list = linitial_node(List, castNode(List, n)); + source_list = linitial_node(List, castNode(List, n)); else - stored_query_list = list_make1(n); + source_list = list_make1(n); + func->raw_source = false; + } + else + { + /* Source queries are raw parsetrees */ + source_list = pg_parse_query(func->src); + func->raw_source = true; + } - foreach(lc, stored_query_list) - { - Query *parsetree = lfirst_node(Query, lc); - CachedPlanSource *plansource; - List *queryTree_sublist; + /* + * Note: we must save the number of queries so that we'll still remember + * how many there are after we discard source_list. + */ + func->num_queries = list_length(source_list); + + /* Save the source trees in pcontext for now. */ + MemoryContextSwitchTo(pcontext); + func->source_list = copyObject(source_list); + MemoryContextSwitchTo(oldcontext); - /* Count source queries for sql_compile_error_callback */ - func->error_query_index++; + /* + * We now have a fully valid hash entry, so reparent hcontext under + * CacheMemoryContext to make all the subsidiary data long-lived, and only + * then install the hcontext link so that sql_delete_callback will know to + * delete it. + */ + MemoryContextSetParent(hcontext, CacheMemoryContext); + func->hcontext = hcontext; - plansource = CreateCachedPlanForQuery(parsetree, - func->src, - CreateCommandTag((Node *) parsetree)); - plansource_list = lappend(plansource_list, plansource); + error_context_stack = comperrcontext.previous; +} - AcquireRewriteLocks(parsetree, true, false); - queryTree_sublist = pg_rewrite_query(parsetree); - queryTree_list = lappend(queryTree_list, queryTree_sublist); - } +/* + * Convert the SQL function's next query from source form (RawStmt or Query) + * into a CachedPlanSource. 
If it's the last query, also determine whether + * the function returnsTuple. + */ +static void +prepare_next_query(SQLFunctionHashEntry *func) +{ + int qindex; + bool islast; + CachedPlanSource *plansource; + List *queryTree_list; + MemoryContext oldcontext; + + /* Which query should we process? */ + qindex = list_length(func->plansource_list); + Assert(qindex < func->num_queries); /* else caller error */ + islast = (qindex + 1 >= func->num_queries); + + /* + * Parse and/or rewrite the query, creating a CachedPlanSource that holds + * a copy of the original parsetree. + */ + if (!func->raw_source) + { + /* Source queries are already parse-analyzed */ + Query *parsetree = list_nth_node(Query, func->source_list, qindex); + + plansource = CreateCachedPlanForQuery(parsetree, + func->src, + CreateCommandTag((Node *) parsetree)); + AcquireRewriteLocks(parsetree, true, false); + queryTree_list = pg_rewrite_query(parsetree); } else { /* Source queries are raw parsetrees */ - List *raw_parsetree_list; - ListCell *lc; - - raw_parsetree_list = pg_parse_query(func->src); - - foreach(lc, raw_parsetree_list) - { - RawStmt *parsetree = lfirst_node(RawStmt, lc); - CachedPlanSource *plansource; - List *queryTree_sublist; - - /* Count source queries for sql_compile_error_callback */ - func->error_query_index++; - - plansource = CreateCachedPlan(parsetree, - func->src, - CreateCommandTag(parsetree->stmt)); - plansource_list = lappend(plansource_list, plansource); - - queryTree_sublist = pg_analyze_and_rewrite_withcb(parsetree, - func->src, - (ParserSetupHook) sql_fn_parser_setup, - func->pinfo, - NULL); - queryTree_list = lappend(queryTree_list, queryTree_sublist); - } + RawStmt *parsetree = list_nth_node(RawStmt, func->source_list, qindex); + + plansource = CreateCachedPlan(parsetree, + func->src, + CreateCommandTag(parsetree->stmt)); + queryTree_list = pg_analyze_and_rewrite_withcb(parsetree, + func->src, + (ParserSetupHook) sql_fn_parser_setup, + func->pinfo, + NULL); } - /* 
Failures below here are reported as "during startup" */ - func->error_query_index = 0; - /* * Check that there are no statements we don't want to allow. */ - check_sql_fn_statements(queryTree_list); + check_sql_fn_statement(queryTree_list); /* - * Check that the function returns the type it claims to. Although in - * simple cases this was already done when the function was defined, we - * have to recheck because database objects used in the function's queries - * might have changed type. We'd have to recheck anyway if the function - * had any polymorphic arguments. Moreover, check_sql_fn_retval takes - * care of injecting any required column type coercions. (But we don't - * ask it to insert nulls for dropped columns; the junkfilter handles - * that.) + * If this is the last query, check that the function returns the type it + * claims to. Although in simple cases this was already done when the + * function was defined, we have to recheck because database objects used + * in the function's queries might have changed type. We'd have to + * recheck anyway if the function had any polymorphic arguments. Moreover, + * check_sql_stmt_retval takes care of injecting any required column type + * coercions. (But we don't ask it to insert nulls for dropped columns; + * the junkfilter handles that.) * * Note: we set func->returnsTuple according to whether we are returning * the whole tuple result or just a single column. In the latter case we @@ -914,69 +960,60 @@ sql_compile_callback(FunctionCallInfo fcinfo, * the rowtype column into multiple columns, since we have no way to * notify the caller that it should do that.) 
*/ - func->returnsTuple = check_sql_fn_retval(queryTree_list, - rettype, - rettupdesc, - procedureStruct->prokind, - false); + if (islast) + func->returnsTuple = check_sql_stmt_retval(queryTree_list, + func->rettype, + func->rettupdesc, + func->prokind, + false); /* - * Now that check_sql_fn_retval has done its thing, we can complete plan + * Now that check_sql_stmt_retval has done its thing, we can complete plan * cache entry creation. */ - forboth(qlc, queryTree_list, plc, plansource_list) - { - List *queryTree_sublist = lfirst(qlc); - CachedPlanSource *plansource = lfirst(plc); - bool islast; - - /* Finish filling in the CachedPlanSource */ - CompleteCachedPlan(plansource, - queryTree_sublist, - NULL, - NULL, - 0, - (ParserSetupHook) sql_fn_parser_setup, - func->pinfo, - CURSOR_OPT_PARALLEL_OK | CURSOR_OPT_NO_SCROLL, - false); + CompleteCachedPlan(plansource, + queryTree_list, + NULL, + NULL, + 0, + (ParserSetupHook) sql_fn_parser_setup, + func->pinfo, + CURSOR_OPT_PARALLEL_OK | CURSOR_OPT_NO_SCROLL, + false); - /* - * Install post-rewrite hook. Its arg is the hash entry if this is - * the last statement, else NULL. - */ - islast = (lnext(queryTree_list, qlc) == NULL); - SetPostRewriteHook(plansource, - sql_postrewrite_callback, - islast ? func : NULL); - } + /* + * Install post-rewrite hook. Its arg is the hash entry if this is the + * last statement, else NULL. + */ + SetPostRewriteHook(plansource, + sql_postrewrite_callback, + islast ? func : NULL); /* * While the CachedPlanSources can take care of themselves, our List * pointing to them had better be in the hcontext. */ - MemoryContextSwitchTo(hcontext); - plansource_list = list_copy(plansource_list); + oldcontext = MemoryContextSwitchTo(func->hcontext); + func->plansource_list = lappend(func->plansource_list, plansource); MemoryContextSwitchTo(oldcontext); /* - * We have now completed building the hash entry, so reparent stuff under - * CacheMemoryContext to make all the subsidiary data long-lived. 
- * Importantly, this part can't fail partway through. + * As soon as we've linked the CachedPlanSource into the list, mark it as + * "saved". */ - foreach(plc, plansource_list) - { - CachedPlanSource *plansource = lfirst(plc); + SaveCachedPlan(plansource); - SaveCachedPlan(plansource); + /* + * Finally, if this was the last statement, we can flush the pcontext with + * the original query trees; they're all safely copied into + * CachedPlanSources now. + */ + if (islast) + { + func->source_list = NIL; /* avoid dangling pointer */ + MemoryContextDelete(func->pcontext); + func->pcontext = NULL; } - MemoryContextSetParent(hcontext, CacheMemoryContext); - - /* And finally, arm sql_delete_callback to delete the stuff again */ - func->plansource_list = plansource_list; - func->hcontext = hcontext; - - error_context_stack = comperrcontext.previous; } /* Deletion callback used by funccache.c */ @@ -997,7 +1034,7 @@ sql_delete_callback(CachedFunction *cfunc) /* * If we have an hcontext, free it, thereby getting rid of all subsidiary - * data. + * data. (If we still have a pcontext, this gets rid of that too.) */ if (func->hcontext) MemoryContextDelete(func->hcontext); @@ -1016,7 +1053,7 @@ sql_postrewrite_callback(List *querytree_list, void *arg) check_sql_fn_statement(querytree_list); /* - * If this is the last query, we must re-do what check_sql_fn_retval did + * If this is the last query, we must re-do what check_sql_stmt_retval did * to its targetlist. Also check that returnsTuple didn't change (it * probably cannot, but be cautious). */ @@ -1749,14 +1786,10 @@ sql_compile_error_callback(void *arg) } /* - * If we failed while parsing an identifiable query within the function, - * report that. Otherwise say it was "during startup". + * sql_compile_callback() doesn't do any per-query processing, so just + * report the context as "during startup". 
*/ - if (func->error_query_index > 0) - errcontext("SQL function \"%s\" statement %d", - func->fname, func->error_query_index); - else - errcontext("SQL function \"%s\" during startup", func->fname); + errcontext("SQL function \"%s\" during startup", func->fname); } /* diff --git a/src/test/regress/expected/create_function_sql.out b/src/test/regress/expected/create_function_sql.out index 70ed5742b65..2ee7631044e 100644 --- a/src/test/regress/expected/create_function_sql.out +++ b/src/test/regress/expected/create_function_sql.out @@ -680,6 +680,43 @@ SELECT * FROM voidtest5(3); ----------- (0 rows) +-- DDL within a SQL function can now affect later statements in the function; +-- though that doesn't work if check_function_bodies is on. +SET check_function_bodies TO off; +CREATE FUNCTION create_and_insert() RETURNS VOID LANGUAGE sql AS $$ + create table ddl_test (f1 int); + insert into ddl_test values (1.2); +$$; +SELECT create_and_insert(); + create_and_insert +------------------- + +(1 row) + +TABLE ddl_test; + f1 +---- + 1 +(1 row) + +CREATE FUNCTION alter_and_insert() RETURNS VOID LANGUAGE sql AS $$ + alter table ddl_test alter column f1 type numeric; + insert into ddl_test values (1.2); +$$; +SELECT alter_and_insert(); + alter_and_insert +------------------ + +(1 row) + +TABLE ddl_test; + f1 +----- + 1 + 1.2 +(2 rows) + +RESET check_function_bodies; -- Regression tests for bugs: -- Check that arguments that are R/W expanded datums aren't corrupted by -- multiple uses. 
This test knows that array_append() returns a R/W datum @@ -722,7 +759,7 @@ CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL ERROR: only one AS item needed for language "sql" -- Cleanup DROP SCHEMA temp_func_test CASCADE; -NOTICE: drop cascades to 31 other objects +NOTICE: drop cascades to 34 other objects DETAIL: drop cascades to function functest_a_1(text,date) drop cascades to function functest_a_2(text[]) drop cascades to function functest_a_3() @@ -753,6 +790,9 @@ drop cascades to function voidtest2(integer,integer) drop cascades to function voidtest3(integer) drop cascades to function voidtest4(integer) drop cascades to function voidtest5(integer) +drop cascades to function create_and_insert() +drop cascades to table ddl_test +drop cascades to function alter_and_insert() drop cascades to function double_append(anyarray,anyelement) DROP USER regress_unpriv_user; RESET search_path; diff --git a/src/test/regress/expected/rangefuncs.out b/src/test/regress/expected/rangefuncs.out index 397a8b35d6d..c21be83aa4a 100644 --- a/src/test/regress/expected/rangefuncs.out +++ b/src/test/regress/expected/rangefuncs.out @@ -1885,7 +1885,7 @@ select * from array_to_set(array['one', 'two']) as t(f1 numeric(4,2),f2 text); select * from array_to_set(array['one', 'two']) as t(f1 point,f2 text); ERROR: return type mismatch in function declared to return record DETAIL: Final statement returns integer instead of point at column 1. 
-CONTEXT: SQL function "array_to_set" during startup +CONTEXT: SQL function "array_to_set" statement 1 -- with "strict", this function can't be inlined in FROM explain (verbose, costs off) select * from array_to_set(array['one', 'two']) as t(f1 numeric(4,2),f2 text); diff --git a/src/test/regress/sql/create_function_sql.sql b/src/test/regress/sql/create_function_sql.sql index 1dd3c4a4e5f..68776be4c8d 100644 --- a/src/test/regress/sql/create_function_sql.sql +++ b/src/test/regress/sql/create_function_sql.sql @@ -394,6 +394,31 @@ CREATE FUNCTION voidtest5(a int) RETURNS SETOF VOID LANGUAGE SQL AS $$ SELECT generate_series(1, a) $$ STABLE; SELECT * FROM voidtest5(3); +-- DDL within a SQL function can now affect later statements in the function; +-- though that doesn't work if check_function_bodies is on. + +SET check_function_bodies TO off; + +CREATE FUNCTION create_and_insert() RETURNS VOID LANGUAGE sql AS $$ + create table ddl_test (f1 int); + insert into ddl_test values (1.2); +$$; + +SELECT create_and_insert(); + +TABLE ddl_test; + +CREATE FUNCTION alter_and_insert() RETURNS VOID LANGUAGE sql AS $$ + alter table ddl_test alter column f1 type numeric; + insert into ddl_test values (1.2); +$$; + +SELECT alter_and_insert(); + +TABLE ddl_test; + +RESET check_function_bodies; + -- Regression tests for bugs: -- Check that arguments that are R/W expanded datums aren't corrupted by -- 2.43.5
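In user-visible terms, the change this last patch makes (and that its new create_and_insert regression test exercises) is that DDL in an old-style SQL function can now affect later statements in the same function, because each statement is parse-analyzed just before it is first executed. A minimal sketch of that behavior, adapted from the regression test above (it requires check_function_bodies = off, since validation at CREATE FUNCTION time still processes the whole body up front):

```sql
SET check_function_bodies TO off;

-- Before this patch, the INSERT's parse analysis would fail, because
-- ddl_test does not exist yet when the whole body is analyzed at once.
CREATE FUNCTION create_and_insert() RETURNS void LANGUAGE sql AS $$
  CREATE TABLE ddl_test (f1 int);
  INSERT INTO ddl_test VALUES (1);
$$;

SELECT create_and_insert();   -- now succeeds
TABLE ddl_test;               -- contains the inserted row
```

Note this applies only to old-style (string-body) SQL functions; new-style functions with prosqlbody still have parse analysis baked into the stored query trees.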
I wrote:
> Anyway, I feel pretty good about this patch now and am quite content
> to stop here for PG 18.

Since feature freeze is fast approaching, I did a tiny bit more
cosmetic work on this patchset and then pushed it.  (There's still
plenty of time for adjustments if you have further comments.)

Thanks for working on this!  This is something I've wanted to see
done ever since we invented plancache.

			regards, tom lane
Tom Lane wrote on 2025-04-02 21:09:
> I wrote:
>> Anyway, I feel pretty good about this patch now and am quite content
>> to stop here for PG 18.
>
> Since feature freeze is fast approaching, I did a tiny bit more
> cosmetic work on this patchset and then pushed it.  (There's still
> plenty of time for adjustments if you have further comments.)
>
> Thanks for working on this!  This is something I've wanted to see
> done ever since we invented plancache.
>
> 			regards, tom lane

Hi.

I've looked through the latest patch series, but didn't have much time
to examine them thoroughly.  Haven't found any issue so far.  From
cosmetic things I've noticed: I would set func->pcontext to NULL in
sql_delete_callback() to avoid a dangling pointer if we error out early
(but it seems only to be a matter of taste).

--
Best regards,
Alexander Pyhalov,
Postgres Professional
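Concretely, that cosmetic suggestion would amount to something like the following in src/backend/executor/functions.c — an abridged, untested sketch of sql_delete_callback() from the patch above (the real function may do more cleanup than shown here):

```c
/* Deletion callback used by funccache.c (abridged sketch) */
static void
sql_delete_callback(CachedFunction *cfunc)
{
	SQLFunctionHashEntry *func = (SQLFunctionHashEntry *) cfunc;

	/*
	 * Freeing hcontext releases all subsidiary data; pcontext is a child
	 * of hcontext, so it is deleted along with it.
	 */
	if (func->hcontext)
		MemoryContextDelete(func->hcontext);
	func->hcontext = NULL;
	func->pcontext = NULL;	/* suggested addition: avoid dangling pointer */
}
```

Since hcontext's deletion already frees pcontext, clearing the pointer is purely defensive, which is presumably why it's described as a matter of taste.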
Hello Tom,
02.04.2025 21:09, Tom Lane wrote:
Since feature freeze is fast approaching, I did a tiny bit more cosmetic work on this patchset and then pushed it. (There's still plenty of time for adjustments if you have further comments.)
I've discovered that starting from 0dca5d68d, the following query:
CREATE FUNCTION f(x anyelement) RETURNS anyarray AS '' LANGUAGE SQL;
SELECT f(0);
triggers:
TRAP: failed Assert("fcache->func->rettype == VOIDOID"), File: "functions.c", Line: 1737, PID: 3784779
On 0dca5d68d~1, it raises:
ERROR: return type mismatch in function declared to return integer[]
DETAIL: Function's final statement must be SELECT or INSERT/UPDATE/DELETE/MERGE RETURNING.
CONTEXT: SQL function "f" during startup
Best regards,
Alexander Lakhin
Neon (https://neon.tech)
Alexander Lakhin <exclusion@gmail.com> writes:
> I've discovered that starting from 0dca5d68d, the following query:
> CREATE FUNCTION f(x anyelement) RETURNS anyarray AS '' LANGUAGE SQL;
> SELECT f(0);
> triggers:
> TRAP: failed Assert("fcache->func->rettype == VOIDOID"), File: "functions.c", Line: 1737, PID: 3784779

Drat.  I thought I'd tested the empty-function-body case, but
evidently that was a few changes too far back.  Will fix, thanks
for catching it.

			regards, tom lane
Hello Tom,
03.04.2025 22:13, Tom Lane wrote:
Drat. I thought I'd tested the empty-function-body case, but evidently that was a few changes too far back. Will fix, thanks for catching it.
I've stumbled upon another defect introduced with 0dca5d68d:
CREATE FUNCTION f(VARIADIC ANYARRAY) RETURNS ANYELEMENT AS $$ SELECT x FROM generate_series(1,1) g(i) $$ LANGUAGE SQL IMMUTABLE;
SELECT f(1);
SELECT f(1);
fails under Valgrind with:
2025-04-04 18:31:13.771 UTC [242811] LOG: statement: SELECT f(1);
==00:00:00:19.324 242811== Invalid read of size 4
==00:00:00:19.324 242811== at 0x48D610: copyObjectImpl (copyfuncs.c:187)
==00:00:00:19.324 242811== by 0x490194: _copyAlias (copyfuncs.funcs.c:48)
==00:00:00:19.324 242811== by 0x48D636: copyObjectImpl (copyfuncs.switch.c:19)
==00:00:00:19.324 242811== by 0x491CBF: _copyRangeFunction (copyfuncs.funcs.c:1279)
==00:00:00:19.324 242811== by 0x48DC9C: copyObjectImpl (copyfuncs.switch.c:271)
==00:00:00:19.324 242811== by 0x4A295C: list_copy_deep (list.c:1652)
==00:00:00:19.324 242811== by 0x4900FA: copyObjectImpl (copyfuncs.c:192)
==00:00:00:19.324 242811== by 0x4931E4: _copySelectStmt (copyfuncs.funcs.c:2109)
==00:00:00:19.324 242811== by 0x48E00C: copyObjectImpl (copyfuncs.switch.c:436)
==00:00:00:19.324 242811== by 0x492F85: _copyRawStmt (copyfuncs.funcs.c:2026)
==00:00:00:19.324 242811== by 0x48DFBC: copyObjectImpl (copyfuncs.switch.c:421)
==00:00:00:19.324 242811== by 0x76DCCE: CreateCachedPlan (plancache.c:213)
==00:00:00:19.324 242811== Address 0x11b478d0 is 4,032 bytes inside a block of size 8,192 alloc'd
==00:00:00:19.324 242811== at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==00:00:00:19.324 242811== by 0x7AA15D: AllocSetContextCreateInternal (aset.c:444)
==00:00:00:19.324 242811== by 0x43CF56: init_sql_fcache (functions.c:616)
==00:00:00:19.324 242811== by 0x43CF56: fmgr_sql (functions.c:1484)
==00:00:00:19.324 242811== by 0x423E13: ExecInterpExpr (execExprInterp.c:926)
==00:00:00:19.324 242811== by 0x41EB54: ExecInterpExprStillValid (execExprInterp.c:2299)
...
Best regards,
Alexander Lakhin
Neon (https://neon.tech)
Alexander Lakhin <exclusion@gmail.com> writes:
> I've stumbled upon another defect introduced with 0dca5d68d:
> CREATE FUNCTION f(VARIADIC ANYARRAY) RETURNS ANYELEMENT AS $$ SELECT x FROM generate_series(1,1) g(i) $$ LANGUAGE SQL
> IMMUTABLE;
> SELECT f(1);
> SELECT f(1);

Hmm, I see

regression=# CREATE FUNCTION f(VARIADIC ANYARRAY) RETURNS ANYELEMENT AS $$ SELECT x FROM generate_series(1,1) g(i) $$ LANGUAGE SQL IMMUTABLE;
CREATE FUNCTION
regression=# SELECT f(1);
ERROR:  column "x" does not exist
LINE 1: SELECT x FROM generate_series(1,1) g(i)
               ^
QUERY:  SELECT x FROM generate_series(1,1) g(i)
CONTEXT:  SQL function "f" statement 1
regression=# SELECT f(1);
ERROR:  unrecognized node type: 2139062143
CONTEXT:  SQL function "f" statement 1

Did you intend the typo?  The "unrecognized node type" does indicate
a problem, but your message doesn't seem to indicate that you're
expecting a syntax error.

			regards, tom lane
05.04.2025 00:47, Tom Lane wrote:
> Alexander Lakhin <exclusion@gmail.com> writes:
>> I've stumbled upon another defect introduced with 0dca5d68d:
>> CREATE FUNCTION f(VARIADIC ANYARRAY) RETURNS ANYELEMENT AS $$ SELECT x FROM generate_series(1,1) g(i) $$ LANGUAGE SQL IMMUTABLE;
>> SELECT f(1);
>> SELECT f(1);
>
> Hmm, I see
>
> regression=# CREATE FUNCTION f(VARIADIC ANYARRAY) RETURNS ANYELEMENT AS $$ SELECT x FROM generate_series(1,1) g(i) $$ LANGUAGE SQL IMMUTABLE;
> CREATE FUNCTION
> regression=# SELECT f(1);
> ERROR:  column "x" does not exist
> LINE 1: SELECT x FROM generate_series(1,1) g(i)
>                ^
> QUERY:  SELECT x FROM generate_series(1,1) g(i)
> CONTEXT:  SQL function "f" statement 1
> regression=# SELECT f(1);
> ERROR:  unrecognized node type: 2139062143
> CONTEXT:  SQL function "f" statement 1
>
> Did you intend the typo?  The "unrecognized node type" does indicate
> a problem, but your message doesn't seem to indicate that you're
> expecting a syntax error.
Yes, the typo is intended. With Valgrind, I get the "column does not exist"
error on the first call and the Valgrind complaint on the second one.
Best regards,
Alexander Lakhin
Neon (https://neon.tech)