Thread: Unnecessary locks for partitioned tables
Hello!
Recently I've been putting a new project into production and immediately ran into an "out of shared memory" error while querying partitioned tables.
Imagine a huge, busy table that you want to split into hundreds of partitions by list. The partition key is a kind of surrogate key that can be calculated outside the database and provided as a parameter.
create table t (a int) partition by list (a);
-- generate the partition DDL and run it (e.g. via \gexec in psql)
select format('create table %I partition of t for values in (%s)', 'p'||x, x)
from generate_series(0,20) x
\gexec
The query is executed inside a stored procedure or as a prepared statement by, say, Spring JPA.
prepare s (int) as select * from t where a = $1;
explain (analyze) execute s(0);
At first the database (versions 12, 13, 14, or 15) creates a custom execution plan like this one:
Query Text: prepare s (int) as select * from t where a = $1;
Seq Scan on p0 t (cost=0.00..41.88 rows=13 width=4) (actual time=0.011..0.011 rows=0 loops=1)
Filter: (a = 0)
Only one partition is examined. But after a few executions the database switches to a more generic query plan that has to cover the whole range of partitions.
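One way to watch the switch happen is pg_prepared_statements (the generic_plans and custom_plans counters are there in version 14 and later, as far as I remember):

select name, generic_plans, custom_plans
from pg_prepared_statements;

To demonstrate the generic case directly, I force it: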
begin;
set local plan_cache_mode = force_generic_plan;
Now, executing the prepared statement inside that transaction, all the queries look like this:
Query Text: prepare s (int) as select * from t where a = $1;
Append (cost=0.00..880.74 rows=273 width=4) (actual time=0.014..0.015 rows=0 loops=1)
Subplans Removed: 20
-> Seq Scan on p0 t_1 (cost=0.00..41.88 rows=13 width=4) (actual time=0.013..0.013 rows=0 loops=1)
Filter: (a = $1)
The most important part here is the count of removed subplans: the database realizes that every partition except the one of interest can be skipped. But under load we start receiving errors:
ERROR: out of shared memory
Hint: You might need to increase max_locks_per_transaction.
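For completeness, this is how I bumped it (the value is only an example; the lock table has room for roughly max_locks_per_transaction * (max_connections + max_prepared_transactions) locks, and changing the setting requires a server restart):

alter system set max_locks_per_transaction = 256;  -- default is 64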
Ok, increasing max_locks_per_transaction works around the error, but why does this type of query produce so many locks in the first place? It looks like the DB takes locks on all the partitioned objects involved in the query and ONLY AFTER THAT does partition pruning.
Here are the locks:
select relation::regclass, mode
from pg_locks
where pid = pg_backend_pid()
  and locktype = 'relation'
  and relation::regclass <> 'pg_locks'::regclass;
"p14" "AccessShareLock"
"p20" "AccessShareLock"
"p17" "AccessShareLock"
"pg_type_oid_index" "AccessShareLock"
"p19" "AccessShareLock"
....... etc
In real life there are also indexes and keys, so in my case a single select produced up to 5000 locks.
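The total is easy to check in the same session:

select count(*)
from pg_locks
where pid = pg_backend_pid() and locktype = 'relation';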
Setting plan_cache_mode = force_custom_plan resolves the issue and also makes the DB noticeably faster (roughly 5x in my case, literally from 140 to 650 executions per second on my development PC).
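For reference, this is how I apply it without touching application code (the role name is only an example):

alter role app_user set plan_cache_mode = force_custom_plan;
-- or per session: set plan_cache_mode = force_custom_plan;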
A question for the PostgreSQL developers: is there any way to optimize prepared queries and minimize unnecessary locks in a generic query plan? For instance: lock the parent table, eliminate unnecessary partitions, then lock the remaining objects. This could help a lot anywhere partition pruning is possible.
Meanwhile I'm using force_custom_plan: no more huge latency peaks, and all queries run at the lowest execution time I have seen. The drawback is the overhead of planning the query on every execution.
Thanks,
Nikolay
On Wed, 2022-11-09 at 14:11 +0300, n.kobzarev@aeronavigator.ru wrote:
> Recently I've been putting a new project into production and immediately ran into an
> "out of shared memory" error while querying partitioned tables.
>
> ERROR: out of shared memory
> Hint: You might need to increase max_locks_per_transaction.
>
> Ok, increasing max_locks_per_transaction works around the error, but why does this type of query
> produce so many locks in the first place? It looks like the DB takes locks on all the partitioned
> objects involved in the query and ONLY AFTER THAT does partition pruning.

Yes, of course. It needs an ACCESS SHARE lock when it looks at metadata like the partition constraint,
and locks are held until the end of the transaction.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
> On Wed, 2022-11-09 at 14:11 +0300, n.kobzarev@aeronavigator.ru wrote:
> > Recently I've been putting a new project into production and immediately ran into an
> > "out of shared memory" error while querying partitioned tables.
> >
> > ERROR: out of shared memory
> > Hint: You might need to increase max_locks_per_transaction.
> >
> > Ok, increasing max_locks_per_transaction works around the error, but why does this type of query
> > produce so many locks in the first place? It looks like the DB takes locks on all the partitioned
> > objects involved in the query and ONLY AFTER THAT does partition pruning.
>
> Yes, of course. It needs an ACCESS SHARE lock when it looks at metadata like the partition constraint,
> and locks are held until the end of the transaction.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com

Thank you for the reply! But why is that not necessary in the case of a custom plan?

Oh, I did not write this explicitly: with a custom plan (the first attempts, or with force_custom_plan) the database holds only a couple of locks! Why is it sufficient in that case to lock only one partition and the parent table?

From my perspective, if there is a case where a custom plan produces locks for only one partition and the parent, the same optimization could be made for the generic plan too, especially when the database has already identified that partition pruning is possible.

Also, to query the partitioned table's metadata it should not be necessary to lock all partitions, only the parent. Isn't it?

Nikolay
<n.kobzarev@aeronavigator.ru> writes:
> Oh, I did not write this explicitly: with a custom plan (the first attempts, or with force_custom_plan)
> the database holds only a couple of locks! Why is it sufficient in that case to lock only one partition
> and the parent table?

Because partition routing is done at planning time in that case, based on the actual values of the plan's
parameters. A generic plan doesn't have the parameter values available, so it has to build plan nodes for
every partition that could conceivably be accessed.

So for queries of this kind (ie point queries against heavily partitioned tables) the generic plan is
pretty much always going to lose. That doesn't bother me enormously --- there are other query patterns
with similar behavior. If you know that your queries always need custom plans, I question the value of
using PREPARE at all.

			regards, tom lane
> <n.kobzarev@aeronavigator.ru> writes:
> > Oh, I did not write this explicitly: with a custom plan (the first attempts, or with force_custom_plan)
> > the database holds only a couple of locks! Why is it sufficient in that case to lock only one partition
> > and the parent table?
>
> Because partition routing is done at planning time in that case, based on the actual values of the plan's
> parameters. A generic plan doesn't have the parameter values available, so it has to build plan nodes for
> every partition that could conceivably be accessed.
>
> So for queries of this kind (ie point queries against heavily partitioned tables) the generic plan is
> pretty much always going to lose. That doesn't bother me enormously --- there are other query patterns
> with similar behavior. If you know that your queries always need custom plans, I question the value of
> using PREPARE at all.
>
> 			regards, tom lane

Thank you for your time, Tom.

PREPARE is not mandatory here; it is mostly for reproducing the problem. Queries inside stored procedures behave like prepared statements too, which is expected.

If someone implemented delayed locking for generic plans, taken after the parameters are known and partition pruning has happened, I believe the generic plan would be on par with the custom one. So I'm sticking with the plan cache parameter for further development; that much is clear.

Thanks,
Nikolay
On Thu, 10 Nov 2022 at 04:11, <n.kobzarev@aeronavigator.ru> wrote:
> If someone implemented delayed locking for generic plans, taken after the parameters are known and
> partition pruning has happened, I believe the generic plan would be on par with the custom one. So I'm
> sticking with the plan cache parameter for further development; that much is clear.

The current problem is that locks must be obtained on the objects mentioned in the plan so that we can
check if anything has been modified that might invalidate the prepared plan. For example, an index has
been dropped, a partition has been dropped, etc.

The partition pruning in your prepared plan is currently done during executor startup, which is after the
locks are obtained (which is why we must lock everything in the plan).

There is a patch around at the moment that moves the run-time partition pruning away from executor startup
to before we obtain the locks, so that we can forego locking the partitions which have been pruned. If
that patch makes it in, the problem will be solved, at least starting with the version the patch makes it
into.

David