NOT IN >2hrs vs EXCEPT < 2 sec. - Mailing list pgsql-performance

From Kevin Traster
Subject NOT IN >2hrs vs EXCEPT < 2 sec.
Date
Msg-id 72188cf00901282301x6abba690p38c3dff0df4bc338@mail.gmail.com
Whole thread Raw
Responses Re: NOT IN >2hrs vs EXCEPT < 2 sec.
Re: NOT IN >2hrs vs EXCEPT < 2 sec.
List pgsql-performance

2 questions:

1) Different costs for same actions. Doing an explain on 2 nearly identical queries both involving the same Index scan on same table has 2 widely different costs for same Index scan  303375872.86 vs. 12576.70

2) Simple query using NOT IN (subquery)was killed after 2 hrs, using the same query (query) except (query) ran in < 2 sec.

Summary:

On devel box (Unix PG version 8.3.5) with no other database activity or system activity after immediately completing a vacuum analyze.

The original query (below) was running for over 2 hrs and was killed.: 

select distinct ciknum into tmpnocikinowner from cik where ciknum not in (select cik from owner_cik_master);

est total cost: 303375872.86, for Index Scan: 303375616.75

Simple query broken down: explain select distinct ciknum into tmpnocikinowner from cik ;

est total cost: 12576.70, for Index Scan: 12064.49

and

select cik from owner_cik_master

est total cost: 2587.36, for Index Scan: N/A

Actual time, the query was killed after 2hrs,

However, we ran:

select distinct ciknum into tmpnocikinowner from cik ; - actual time 861.487 ms
 (select ciknum from tmpnocikinowner)  except (select cik from owner_cik_master);  - actual time 1328.094 ms
 
 
 
##### Console log below with details ######

devel=# explain select distinct ciknum into tmpnocikinowner from cik where ciknum not in (select cik from owner_cik_master);
                                           QUERY PLAN                                          
------------------------------------------------------------------------------------------------
 Unique  (cost=3506.21..303375872.86 rows=71946 width=8)
   ->  Index Scan using cik_ciknum_idx on cik  (cost=3506.21..303375616.75 rows=102444 width=8)
         Filter: (NOT (subplan))
         SubPlan
           ->  Materialize  (cost=3506.21..6002.40 rows=186019 width=4)
                 ->  Seq Scan on owner_cik_master  (cost=0.00..2684.19 rows=186019 width=4)
(6 rows)

Time: 0.723 ms
devel=# explain select ciknum into tmpnocikinowner from cik where ciknum not in (select cik from owner_cik_master);        
                                      QUERY PLAN                                     
--------------------------------------------------------------------------------------
 Seq Scan on cik  (cost=3506.21..303367660.13 rows=102444 width=8)
   Filter: (NOT (subplan))
   SubPlan
     ->  Materialize  (cost=3506.21..6002.40 rows=186019 width=4)
           ->  Seq Scan on owner_cik_master  (cost=0.00..2684.19 rows=186019 width=4)
(5 rows)

Time: 0.588 ms
devel=# explain select ciknum::int into tmpnocikinowner from cik where ciknum::int not in (select cik::int from owner_cik_master);
                                      QUERY PLAN                                     
--------------------------------------------------------------------------------------
 Seq Scan on cik  (cost=3506.21..303368428.46 rows=102444 width=8)
   Filter: (NOT (subplan))
   SubPlan
     ->  Materialize  (cost=3506.21..6002.40 rows=186019 width=4)
           ->  Seq Scan on owner_cik_master  (cost=0.00..2684.19 rows=186019 width=4)
(5 rows)

Time: 0.918 ms
devel=# explain select ciknum into tmpnocikinowner from cik ;                                                                                         QUERY PLAN                        
-----------------------------------------------------------
 Seq Scan on cik  (cost=0.00..4107.87 rows=204887 width=8)
(1 row)

Time: 0.438 ms
devel=# explain select distinct ciknum into tmpnocikinowner from cik ;
                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
 Unique  (cost=0.00..12576.70 rows=143891 width=8)
   ->  Index Scan using cik_ciknum_idx on cik  (cost=0.00..12064.49 rows=204887 width=8)
(2 rows)

Time: 0.468 ms
devel=#  select distinct ciknum into tmpnocikinowner from cik ;      
SELECT
Time: 861.487 ms

devel=# explain select ciknum from tmpnocikinowner  where ciknum not in (select cik from owner_cik_master);
                                      QUERY PLAN                                     
--------------------------------------------------------------------------------------
 Seq Scan on tmpnocikinowner  (cost=3506.21..261092922.31 rows=88168 width=8)
   Filter: (NOT (subplan))
   SubPlan
     ->  Materialize  (cost=3506.21..6002.40 rows=186019 width=4)
           ->  Seq Scan on owner_cik_master  (cost=0.00..2684.19 rows=186019 width=4)
(5 rows)

Time: 0.629 ms

devel=# explain select cik from owner_cik_master;
                               QUERY PLAN                              
------------------------------------------------------------------------
 Seq Scan on owner_cik_master  (cost=0.00..2684.19 rows=186019 width=4)
(1 row)

Time: 0.415 ms
devel=# explain select ciknum from tmpnocikinowner;;
                              QUERY PLAN                              
-----------------------------------------------------------------------
 Seq Scan on tmpnocikinowner  (cost=0.00..2587.36 rows=176336 width=8)
(1 row)

Time: 0.413 ms
devel=# explain (select ciknum from tmpnocikinowner)  except (select cik from owner_cik_master);                   
                                           QUERY PLAN                                          
------------------------------------------------------------------------------------------------
 SetOp Except  (cost=47309.23..49121.00 rows=36236 width=8)
   ->  Sort  (cost=47309.23..48215.12 rows=362355 width=8)
         Sort Key: "*SELECT* 1".ciknum
         ->  Append  (cost=0.00..8895.10 rows=362355 width=8)
               ->  Subquery Scan "*SELECT* 1"  (cost=0.00..4350.72 rows=176336 width=8)
                     ->  Seq Scan on tmpnocikinowner  (cost=0.00..2587.36 rows=176336 width=8)
               ->  Subquery Scan "*SELECT* 2"  (cost=0.00..4544.38 rows=186019 width=4)
                     ->  Seq Scan on owner_cik_master  (cost=0.00..2684.19 rows=186019 width=4)
(8 rows)

Time: 0.625 ms
devel=#  (select ciknum from tmpnocikinowner)  except (select cik from owner_cik_master);      
 ciknum
--------
(0 rows)

Time: 1328.094 ms

pgsql-performance by date:

Previous
From: Scott Marlowe
Date:
Subject: Re: [PERFORMANCE] Buying hardware
Next
From: Scott Marlowe
Date:
Subject: Re: NOT IN >2hrs vs EXCEPT < 2 sec.