Thread: Why does delete from table where not exists (select 1 from ... LIMIT 1) perform badly?
Why does delete from table where not exists (select 1 from ... LIMIT 1) perform badly?
From
Palle Girgensohn
Date:
Hi, I've read about the reason for this before, but cannot find a reference to it now. How come the planner treats the delete from table where not extists(select 1 from table2 where ... LIMIT 1) so differently, and usually badly, when the LIMIT 1 is there. In older version of postgresql, I remember that the effect was the opposite, a limit 1 would actually perform substantially better. Hence we have old code (and old habits), where the LIMIT 1 is still used. Shouldn't the planner really understand that the intention is the same in these two queries? -- bad: DELETE FROM iup_locked_gradings ilg WHERE NOT EXISTS ( SELECT 1 FROM iup_locked_subject ils WHERE ils.locked_gradings_id = ilg.locked_gradings_id LIMIT 1 ) ; -- good: DELETE FROM iup_locked_gradings ilg WHERE NOT EXISTS ( SELECT 1 FROM iup_locked_subject ils WHERE ils.locked_gradings_id = ilg.locked_gradings_id ) ; pp=# begin; explain DELETE FROM iup_locked_gradings ilg WHERE NOT EXISTS (SELECT 1 FROM iup_locked_subject ils WHERE ils.locked_gradings_id = ilg.locked_gradings_id LIMIT 1); BEGIN QUERY PLAN -----------------------------------------------------------------------------------------------Delete (cost=0.00..523542963.48rows=291737 width=6) -> Seq Scan on iup_locked_gradings ilg (cost=0.00..523542963.48 rows=291737 width=6) Filter: (NOT (SubPlan 1)) SubPlan 1 -> Limit (cost=0.00..897.27 rows=1 width=0) -> Seq Scan on iup_locked_subject ils (cost=0.00..18842.76 rows=21 width=0) Filter: (locked_gradings_id = $0) (7 rows) pp=# begin; explain DELETE FROM iup_locked_gradings ilg WHERE NOT EXISTS (SELECT 1 FROM iup_locked_subject ils WHERE ils.locked_gradings_id = ilg.locked_gradings_id ); BEGIN QUERY PLAN ---------------------------------------------------------------------------------------------------Delete (cost=31705.39..47934.47rows=553737 width=12) -> Hash Anti Join (cost=31705.39..47934.47 rows=553737 width=12) Hash Cond: (ilg.locked_gradings_id = ils.locked_gradings_id) -> Seq Scan on iup_locked_gradings ilg (cost=0.00..6677.44 rows=583474 width=10) -> Hash (cost=15776.83..15776.83 rows=1226373 width=10) -> Seq Scan on iup_locked_subjectils (cost=0.00..15776.83 rows=1226373 width=10) (6 rows) pp=# chalmers=# \d iup_locked_gradings Table "public.iup_locked_gradings" Column | Type | Modifiers ----------------------+---------+----------------------------------------------------------------------------------locked_gradings_id | integer | not null default nextval('iup_locked_gradings_locked_gradings_id_seq'::regclass)type | integer |description | text |name | text |original_gradings_id | integer | Indexes: "iup_locked_gradings_pkey" PRIMARY KEY, btree (locked_gradings_id), tablespace "opt" Referenced by: TABLE "iup_locked_subject" CONSTRAINT "iup_locked_subject_locked_gradings_id_fkey" FOREIGN KEY (locked_gradings_id) REFERENCES iup_locked_gradings(locked_gradings_id) ON UPDATE CASCADE ON DELETE SET NULL Tablespace: "opt" chalmers=# \d iup_locked_subject Table "public.iup_locked_subject" Column | Type | Modifiers ---------------------+---------+--------------------------------------------------------------------------------locked_subject_id | integer | not null default nextval('iup_locked_subject_locked_subject_id_seq'::regclass)name | text | not nulllink_url | text |description | text |use_measures | boolean | not null default truelocked_gradings_id | integer|original_subject_id | integer |use_fail_warning | boolean | not null default false Indexes: "iup_locked_subject_pkey" PRIMARY KEY, btree (locked_subject_id), tablespace "opt" Foreign-key constraints: "iup_locked_subject_locked_gradings_id_fkey" FOREIGN KEY (locked_gradings_id) REFERENCES iup_locked_gradings(locked_gradings_id) ON UPDATE CASCADE ON DELETE SET NULL Referenced by: Tablespace: "opt"
Re: Why does delete from table where not exists (select 1 from ... LIMIT 1) perform badly?
From
Tom Lane
Date:
Palle Girgensohn <girgen@pingpong.net> writes: > How come the planner treats the > delete from table where not extists(select 1 from table2 where ... LIMIT 1) > so differently, and usually badly, when the LIMIT 1 is there. Because it can't optimize it into an antijoin. > In older > version of postgresql, I remember that the effect was the opposite, a > limit 1 would actually perform substantially better. Hence we have old > code (and old habits), where the LIMIT 1 is still used. Well, you're basically forcing it into the same type of plan you would have gotten before antijoins were implemented (circa 8.4), so I don't see that this is a regression. But I'd get rid of the LIMIT 1 if I were you. It's been a *very* long time since that was a net benefit in an EXISTS subquery, if indeed it ever was --- AFAIR, even the earliest PG versions that understood about optimizing for fast-start plans would do so in an EXISTS subquery, with or without any LIMIT. regards, tom lane