Re: Unexpected result count from update statement on partitioned table - Mailing list pgsql-general

From Tom Lane
Subject Re: Unexpected result count from update statement on partitioned table
Date
Msg-id 167832.1608318979@sss.pgh.pa.us
Whole thread Raw
In response to Re: Unexpected result count from update statement on partitioned table  (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses Re: Unexpected result count from update statement on partitioned table
List pgsql-general
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> The subquery is executed twice, and the two executions obviously don't
> return the same results.  I am at a loss for an explanation ...

Yeah, this is a fairly fundamental shortcoming in inheritance_planner():
it supposes that it can duplicate the whole query for each target table.
If you have a sub-SELECT that generates unstable results, then the
duplicated copies don't necessarily generate the same results.
And multiple executions of a sub-SELECT with "for update skip locked"
are guaranteed to not give the same results, because the second one
will skip the row(s) already locked by the first one.

It seems to work as desired if you stick the unstable result into a CTE:

=# explain
with sub as (select id
  from task_parent
  where reserved = false
    and task_type = 1 or task_type = 2
  order by task_timestamp
  limit 50
  for update skip locked)
update task_parent
set reserved = true
from sub
where sub.id = task_parent.id
returning task_parent.id;
                                                                      QUERY PLAN
                              

------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on task_parent  (cost=6.30..10069.93 rows=100 width=57)
   Update on task_child_1 task_parent_1
   Update on task_child_2 task_parent_2
   CTE sub
     ->  Limit  (cost=0.85..4.68 rows=50 width=26)
           ->  LockRows  (cost=0.85..38252.82 rows=500000 width=26)
                 ->  Merge Append  (cost=0.85..33252.82 rows=500000 width=26)
                       Sort Key: task_parent_3.task_timestamp
                       ->  Index Scan using task_child_1_task_timestamp_idx on task_child_1 task_parent_4
(cost=0.42..14123.60rows=249960 width=26) 
                             Filter: (((NOT reserved) AND (task_type = 1)) OR (task_type = 2))
                       ->  Index Scan using task_child_2_task_timestamp_idx on task_child_2 task_parent_5
(cost=0.42..14129.20rows=250040 width=26) 
                             Filter: (((NOT reserved) AND (task_type = 1)) OR (task_type = 2))
   ->  Hash Join  (cost=1.62..5032.07 rows=50 width=57)
         Hash Cond: (task_parent_1.id = sub.id)
         ->  Seq Scan on task_child_1 task_parent_1  (cost=0.00..4092.60 rows=249960 width=24)
         ->  Hash  (cost=1.00..1.00 rows=50 width=40)
               ->  CTE Scan on sub  (cost=0.00..1.00 rows=50 width=40)
   ->  Hash Join  (cost=1.62..5033.18 rows=50 width=57)
         Hash Cond: (task_parent_2.id = sub.id)
         ->  Seq Scan on task_child_2 task_parent_2  (cost=0.00..4093.40 rows=250040 width=24)
         ->  Hash  (cost=1.00..1.00 rows=50 width=40)
               ->  CTE Scan on sub  (cost=0.00..1.00 rows=50 width=40)
(22 rows)

It's been obvious for some time that inheritance_planner() needs to
be nuked from orbit, because aside from this fundamental semantic
issue it's got horrible performance problems with large inheritance
trees (ie many partitions).  We might finally get that done for v14
--- at least, there's a patch in the queue about it.  In existing
releases, I recommend the CTE solution.

            regards, tom lane



pgsql-general by date:

Previous
From: "Joel Jacobson"
Date:
Subject: Re: Avoid undesired flattening of jsonb arrays?
Next
From: Tom Lane
Date:
Subject: Re: SV: SV: SV: Problem with ssl and psql in Postgresql 13