Re: Anti join confusion - Mailing list pgsql-hackers
From | wenhui qiu |
---|---|
Subject | Re: Anti join confusion |
Date | |
Msg-id | CAGjGUAKCgRzFgu-6bDATzAmeFDVjq_WbT9mbud36vXdUxJ+Jpg@mail.gmail.com Whole thread Raw |
In response to | Re: Anti join confusion (Tender Wang <tndrwang@gmail.com>) |
Responses |
Re: Anti join confusion
|
List | pgsql-hackers |
Hi Tender Wang
Thank you very much for the analysis.Do we have plans for NOT IN subquery pull up?
Thanks
On Mon, Feb 24, 2025 at 4:00 PM Tender Wang <tndrwang@gmail.com> wrote:
wenhui qiu <qiuwenhuifx@gmail.com> 于2025年2月24日周一 15:10写道:HI Tom , Tender WangBut I saw the path test case are support ,Hi wenhui,In the above picture, it is NOT EXISTS sub-query, which can be pulled up, but NOT IN sub-query, which can not now.As Tom said:"NOT IN is not convertible to an anti-join: the semantics are wrongbecause of the way it treats nulls in the sub-select's output."and my test case set the id is primary key also SubPlantest=# explain analyze SELECT T1.id,T1.K1 FROM join1 t1 WHERE T1.id NOT IN (SELECT T2.id FROM join2 t2 where T2.ID = 1000 );
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Seq Scan on join1 t1 (cost=8.45..37446.94 rows=610100 width=8) (actual time=0.071..209.875 rows=1220199 loops=1)
Filter: (NOT (ANY (id = (hashed SubPlan 1).col1)))
Rows Removed by Filter: 1
Buffers: shared hit=22190
SubPlan 1
-> Index Only Scan using join2_pkey on join2 t2 (cost=0.42..8.44 rows=1 width=4) (actual time=0.027..0.029 rows=1 loops=1)
Index Cond: (id = 1000)
Heap Fetches: 0
Buffers: shared hit=4
Planning Time: 0.245 ms
Execution Time: 247.906 ms
(11 rows)
test=# explain SELECT T1.id,T1.K1 FROM join1 t1 WHERE T1.id NOT IN (SELECT T2.id FROM join2 t2 where T2.ID > 10000 );
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Gather (cost=1000.42..8414945643.30 rows=610100 width=8)
Workers Planned: 2
-> Parallel Seq Scan on join1 t1 (cost=0.42..8414883633.30 rows=254208 width=8)
Filter: (NOT (ANY (id = (SubPlan 1).col1)))
SubPlan 1
-> Materialize (cost=0.42..31028.21 rows=829768 width=4)
-> Index Only Scan using join2_pkey on join2 t2 (cost=0.42..23637.37 rows=829768 width=4)
Index Cond: (id > 10000)
(8 rows)
test=# explain SELECT T1.id,T1.K1 FROM join1 t1 WHERE T1.id NOT IN (SELECT T2.id FROM join2 t2 where T2.ID < 10000 );
QUERY PLAN
----------------------------------------------------------------------------------------------
Seq Scan on join1 t1 (cost=327.05..37765.54 rows=610100 width=8)
Filter: (NOT (ANY (id = (hashed SubPlan 1).col1)))
SubPlan 1
-> Index Only Scan using join2_pkey on join2 t2 (cost=0.42..301.22 rows=10331 width=4)
Index Cond: (id < 10000)
(5 rows)
test=#--Thanks,Tender Wang
Attachment
pgsql-hackers by date: