Re: Recheck condition - Mailing list pgsql-general

From Josh Harrison
Subject Re: Recheck condition
Date
Msg-id 8d89ea1d0711300827o21c04b74se0804006baade785@mail.gmail.com
Whole thread Raw
In response to Re: Recheck condition  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: Recheck condition  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general

Or using an IN or EXISTS query:

SELECT person_id
 FROM person
 WHERE column1=1
  AND column2='62'
  AND person_id IN (
        SELECT person_id
          FROM person
         WHERE column1=1
           AND column2='189'
      )

or

SELECT person_id
 FROM person AS parent
 WHERE column1=1
  AND column2='62'
  AND EXISTS (
        SELECT 1
          FROM person
         WHERE parent.person_id = person_id
           AND column1=1
           AND column2='189'
      )

Thanks for your reply
The query with IN  gave this plan and took 1m19sec to give the result which is slightly more than the intersect query(40 sec). The other query with exists takes way long time for results. All these queries does a heap scan for intermediate results...right? Is there a way to get them not to use the heap for intermediate result and go to heap only for final data? This will drastically improve the performance but Im not sure if postgres can do that? Will creating the index in a different way and/or rewriting the query in a different way achieve this result?

Thanks
jo

QUERY PLAN                                                                                                                                                                   
 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=705823.44..1182434.64 rows=43631 width=4) (actual time=26443.675..52055.698 rows=140464 loops=1)                                                            
   Hash Cond: (public.person.patient_id = public.person.patient_id)                                                                                   
   ->  Bitmap Heap Scan on person  (cost=17886.42..492557.97 rows=381993 width=4) (actual time=442.934..25779.601 rows=327498 loops=1)                            
         Recheck Cond: ((column1 = 1) AND ((column2)::text = '62'::text))                                                                   
         ->  Bitmap Index Scan on person_idx  (cost= 0.00..17790.92 rows=381993 width=0) (actual time=403.869..403.869 rows=327498 loops=1)                 
               Index Cond: ((column1 = 1) AND ((column2)::text = '62'::text))                                                               
   ->  Hash  (cost=687933.35..687933.35 rows=294 width=4) (actual time=26000.635..26000.635 rows=6568 loops=1)                                                                
         ->  HashAggregate  (cost= 687930.41..687933.35 rows=294 width=4) (actual time=25992.971..25996.471 rows=6568 loops=1)                                                 
               ->  Bitmap Heap Scan on person  (cost=156754.24..679555.96 rows=3349781 width=4) (actual time=3202.251..23974.389 rows=3429228 loops=1)            
                     Recheck Cond: ((column1 = 1) AND ((column2)::text = '189'::text))                                                      
                     ->  Bitmap Index Scan on person_idx  (cost=0.00..155916.80 rows=3349781 width=0) (actual time=3145.912..3145.912 rows=3429228 loops=1)
                           Index Cond: ((column1 = 1) AND ((column2)::text = '189'::text))                                                  
 Total runtime: 52094.598 ms                                              

pgsql-general by date:

Previous
From: Martijn van Oosterhout
Date:
Subject: Re: Record variable not behaving as expected (bug?)
Next
From: Martijn van Oosterhout
Date:
Subject: Re: Recheck condition