Re: delete statement returning too many results - Mailing list pgsql-general

From Tom Lane
Subject Re: delete statement returning too many results
Date
Msg-id 632644.1669655513@sss.pgh.pa.us
In response to Re: delete statement returning too many results  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: delete statement returning too many results  (Harmen <harmen@lijzij.de>)
List pgsql-general
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> There is a nice big caution regarding the default read committed isolation
> mode, order by, and for update, in the documentation, but I cannot work out
> exactly why this example seems to be triggering it.

The <caution> is talking about a rather different scenario.

I managed to reproduce this locally.  I find that initially, with an
empty queue table, you get a query plan like

 Delete on queue  (cost=0.38..8.42 rows=1 width=38)
   ->  Nested Loop  (cost=0.38..8.42 rows=1 width=38)
         ->  HashAggregate  (cost=0.23..0.24 rows=1 width=40)
               Group Key: "ANY_subquery".id
               ->  Subquery Scan on "ANY_subquery"  (cost=0.15..0.22 rows=1 width=40)
                     ->  Limit  (cost=0.15..0.21 rows=1 width=14)
                           ->  LockRows  (cost=0.15..74.15 rows=1200 width=14)
                                  ->  Index Scan using queue_pkey on queue queue_1  (cost=0.15..62.15 rows=1200 width=14)
         ->  Index Scan using queue_pkey on queue  (cost=0.15..8.17 rows=1 width=14)
               Index Cond: (id = "ANY_subquery".id)

which is fine because the LockRows bit will be run only once.

However, after the table's been stomped on for a while (and probably
not till after autovacuum runs), that switches to

 Delete on queue  (cost=0.25..16.31 rows=1 width=38)
   ->  Nested Loop Semi Join  (cost=0.25..16.31 rows=1 width=38)
         Join Filter: (queue.id = "ANY_subquery".id)
         ->  Index Scan using queue_pkey on queue  (cost=0.12..8.14 rows=1 width=14)
         ->  Subquery Scan on "ANY_subquery"  (cost=0.12..8.16 rows=1 width=40)
               ->  Limit  (cost=0.12..8.15 rows=1 width=14)
                     ->  LockRows  (cost=0.12..8.15 rows=1 width=14)
                           ->  Index Scan using queue_pkey on queue queue_1  (cost=0.12..8.14 rows=1 width=14)

and then you start to get failures, because each re-execution of
the subquery produces a fresh row thanks to the silent SKIP LOCKED.
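
(For reference, the query under discussion has roughly this shape,
with the locking sub-select inlined where the planner is free to
re-execute it:

DELETE FROM queue
  WHERE id IN (SELECT id
               FROM queue
               ORDER BY id
               LIMIT 1
               FOR UPDATE
               SKIP LOCKED)
RETURNING *;

Under the semi-join plan, each re-execution of that sub-select skips
the rows it locked previously and locks-and-returns a different id,
so the DELETE can match more than one row.)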

So basically it's unsafe to run the sub-select more than once,
but the query as written leaves it up to the planner whether
to do that.  I'd suggest rephrasing as

WITH target_rows AS MATERIALIZED (
     SELECT id
     FROM queue
     ORDER BY id
     LIMIT 1
     FOR UPDATE
     SKIP LOCKED
)
DELETE FROM queue
  WHERE id IN (SELECT * FROM target_rows)
RETURNING *;
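
The MATERIALIZED keyword forces the CTE to be evaluated exactly once,
so the locking happens once no matter what join strategy the planner
picks for the DELETE.  You can check that with EXPLAIN; the plan
should show the LockRows step inside the CTE, with the DELETE merely
scanning the CTE's result:

EXPLAIN (COSTS OFF)
WITH target_rows AS MATERIALIZED (
     SELECT id
     FROM queue
     ORDER BY id
     LIMIT 1
     FOR UPDATE
     SKIP LOCKED
)
DELETE FROM queue
  WHERE id IN (SELECT * FROM target_rows)
RETURNING *;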

            regards, tom lane


