Thread: Why does delete from table where not exists (select 1 from ... LIMIT 1) perform badly?

Why does delete from table where not exists (select 1 from ... LIMIT 1) perform badly?

From
Palle Girgensohn
Date:
Hi,

I've read about the reason for this before, but cannot find a reference
to it now.

How come the planner treats the
delete from table where not extists(select 1 from table2 where ... LIMIT 1)

so differently, and usually badly, when the LIMIT 1 is there. In older
version of postgresql, I remember that the effect was the opposite, a
limit 1 would actually perform substantially better. Hence we have old
code (and old habits), where the LIMIT 1 is still used.

Shouldn't the planner really understand that the intention is the same
in these two queries?



-- bad:
DELETE FROM iup_locked_gradings ilg
WHERE NOT EXISTS ( SELECT 1 FROM iup_locked_subject ils   WHERE ils.locked_gradings_id = ilg.locked_gradings_id LIMIT 1
)
;

-- good:
DELETE FROM iup_locked_gradings ilg
WHERE NOT EXISTS ( SELECT 1 FROM iup_locked_subject ils   WHERE ils.locked_gradings_id = ilg.locked_gradings_id )
;


pp=# begin; explain DELETE FROM iup_locked_gradings ilg WHERE NOT EXISTS
(SELECT 1 FROM iup_locked_subject ils WHERE ils.locked_gradings_id =
ilg.locked_gradings_id LIMIT 1);
BEGIN                                         QUERY PLAN

-----------------------------------------------------------------------------------------------Delete
(cost=0.00..523542963.48rows=291737 width=6)  ->  Seq Scan on iup_locked_gradings ilg  (cost=0.00..523542963.48 
rows=291737 width=6)        Filter: (NOT (SubPlan 1))        SubPlan 1          ->  Limit  (cost=0.00..897.27 rows=1
width=0)               ->  Seq Scan on iup_locked_subject ils 
(cost=0.00..18842.76 rows=21 width=0)                      Filter: (locked_gradings_id = $0)
(7 rows)


pp=# begin; explain DELETE FROM iup_locked_gradings ilg WHERE NOT EXISTS
(SELECT 1 FROM iup_locked_subject ils WHERE ils.locked_gradings_id =
ilg.locked_gradings_id );
BEGIN                                           QUERY PLAN

---------------------------------------------------------------------------------------------------Delete
(cost=31705.39..47934.47rows=553737 width=12)  ->  Hash Anti Join  (cost=31705.39..47934.47 rows=553737 width=12)
Hash Cond: (ilg.locked_gradings_id = ils.locked_gradings_id)        ->  Seq Scan on iup_locked_gradings ilg
(cost=0.00..6677.44
rows=583474 width=10)        ->  Hash  (cost=15776.83..15776.83 rows=1226373 width=10)              ->  Seq Scan on
iup_locked_subjectils 
(cost=0.00..15776.83 rows=1226373 width=10)
(6 rows)

pp=#










chalmers=# \d iup_locked_gradings                                       Table "public.iup_locked_gradings"       Column
      |  Type   | 
Modifiers

----------------------+---------+----------------------------------------------------------------------------------locked_gradings_id
 | integer | not null default 
nextval('iup_locked_gradings_locked_gradings_id_seq'::regclass)type                 | integer |description          |
text   |name                 | text    |original_gradings_id | integer | 
Indexes:   "iup_locked_gradings_pkey" PRIMARY KEY, btree (locked_gradings_id),
tablespace "opt"
Referenced by:   TABLE "iup_locked_subject" CONSTRAINT
"iup_locked_subject_locked_gradings_id_fkey" FOREIGN KEY
(locked_gradings_id) REFERENCES iup_locked_gradings(locked_gradings_id)
ON UPDATE CASCADE ON DELETE SET NULL
Tablespace: "opt"




chalmers=# \d iup_locked_subject                                      Table "public.iup_locked_subject"      Column
  |  Type   | 
Modifiers

---------------------+---------+--------------------------------------------------------------------------------locked_subject_id
 | integer | not null default 
nextval('iup_locked_subject_locked_subject_id_seq'::regclass)name                | text    | not nulllink_url
| text    |description         | text    |use_measures        | boolean | not null default truelocked_gradings_id  |
integer|original_subject_id | integer |use_fail_warning    | boolean | not null default false 
Indexes:   "iup_locked_subject_pkey" PRIMARY KEY, btree (locked_subject_id),
tablespace "opt"
Foreign-key constraints:   "iup_locked_subject_locked_gradings_id_fkey" FOREIGN KEY
(locked_gradings_id) REFERENCES iup_locked_gradings(locked_gradings_id)
ON UPDATE CASCADE ON DELETE SET NULL
Referenced by:
Tablespace: "opt"






Palle Girgensohn <girgen@pingpong.net> writes:
> How come the planner treats the
> delete from table where not extists(select 1 from table2 where ... LIMIT 1)

> so differently, and usually badly, when the LIMIT 1 is there.

Because it can't optimize it into an antijoin.

> In older
> version of postgresql, I remember that the effect was the opposite, a
> limit 1 would actually perform substantially better. Hence we have old
> code (and old habits), where the LIMIT 1 is still used.

Well, you're basically forcing it into the same type of plan you would
have gotten before antijoins were implemented (circa 8.4), so I don't
see that this is a regression.  But I'd get rid of the LIMIT 1 if I were
you.  It's been a *very* long time since that was a net benefit in an
EXISTS subquery, if indeed it ever was --- AFAIR, even the earliest PG
versions that understood about optimizing for fast-start plans would do
so in an EXISTS subquery, with or without any LIMIT.
        regards, tom lane