Re: Unexpected result count from update statement on partitioned table - Mailing list pgsql-general
From | Tom Lane |
---|---|
Subject | Re: Unexpected result count from update statement on partitioned table |
Date | |
Msg-id | 167832.1608318979@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Unexpected result count from update statement on partitioned table (Laurenz Albe <laurenz.albe@cybertec.at>) |
Responses |
Re: Unexpected result count from update statement on partitioned table
|
List | pgsql-general |
Laurenz Albe <laurenz.albe@cybertec.at> writes: > The subquery is executed twice, and the two executions obviously don't > return the same results. I am at a loss for an explanation ... Yeah, this is a fairly fundamental shortcoming in inheritance_planner(): it supposes that it can duplicate the whole query for each target table. If you have a sub-SELECT that generates unstable results, then the duplicated copies don't necessarily generate the same results. And multiple executions of a sub-SELECT with "for update skip locked" are guaranteed to not give the same results, because the second one will skip the row(s) already locked by the first one. It seems to work as desired if you stick the unstable result into a CTE: =# explain with sub as (select id from task_parent where reserved = false and task_type = 1 or task_type = 2 order by task_timestamp limit 50 for update skip locked) update task_parent set reserved = true from sub where sub.id = task_parent.id returning task_parent.id; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------ Update on task_parent (cost=6.30..10069.93 rows=100 width=57) Update on task_child_1 task_parent_1 Update on task_child_2 task_parent_2 CTE sub -> Limit (cost=0.85..4.68 rows=50 width=26) -> LockRows (cost=0.85..38252.82 rows=500000 width=26) -> Merge Append (cost=0.85..33252.82 rows=500000 width=26) Sort Key: task_parent_3.task_timestamp -> Index Scan using task_child_1_task_timestamp_idx on task_child_1 task_parent_4 (cost=0.42..14123.60rows=249960 width=26) Filter: (((NOT reserved) AND (task_type = 1)) OR (task_type = 2)) -> Index Scan using task_child_2_task_timestamp_idx on task_child_2 task_parent_5 (cost=0.42..14129.20rows=250040 width=26) Filter: (((NOT reserved) AND (task_type = 1)) OR (task_type = 2)) -> Hash Join (cost=1.62..5032.07 rows=50 width=57) Hash Cond: (task_parent_1.id = sub.id) -> Seq Scan on task_child_1 task_parent_1 (cost=0.00..4092.60 rows=249960 width=24) -> Hash (cost=1.00..1.00 rows=50 width=40) -> CTE Scan on sub (cost=0.00..1.00 rows=50 width=40) -> Hash Join (cost=1.62..5033.18 rows=50 width=57) Hash Cond: (task_parent_2.id = sub.id) -> Seq Scan on task_child_2 task_parent_2 (cost=0.00..4093.40 rows=250040 width=24) -> Hash (cost=1.00..1.00 rows=50 width=40) -> CTE Scan on sub (cost=0.00..1.00 rows=50 width=40) (22 rows) It's been obvious for some time that inheritance_planner() needs to be nuked from orbit, because aside from this fundamental semantic issue it's got horrible performance problems with large inheritance trees (ie many partitions). We might finally get that done for v14 --- at least, there's a patch in the queue about it. In existing releases, I recommend the CTE solution. regards, tom lane
pgsql-general by date: