Re: why? - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: why?
Date
Msg-id 1027616948.28648.2.camel@taru.tm.ee
Whole thread Raw
In response to why?  ("John Liu" <johnl@synthesys.com>)
List pgsql-hackers
On Thu, 2002-07-25 at 15:55, John Liu wrote:
> I've two queries -
> 
> 1. emrxdbs=# explain select * from patient A  where exists (select NULL from
> patient B where B.mrn=A.mrn and B.dob=A.dob and B.sex=A.sex and
> B.lastname=A.lastname and B.firstname=A.firstname group by B.mrn, B.dob,
> B.sex, B.lastname, B.firstname having A.patseq < max(B.patseq)) limit 10;
> NOTICE:  QUERY PLAN:
> 
> Limit  (cost=0.00..121.50 rows=10 width=141)
>   ->  Seq Scan on patient a  (cost=0.00..6955296.53 rows=572430 width=141)
>         SubPlan
>           ->  Aggregate  (cost=6.03..6.05 rows=1 width=42)
>                 ->  Group  (cost=6.03..6.05 rows=1 width=42)
>                       ->  Sort  (cost=6.03..6.03 rows=1 width=42)
>                             ->  Index Scan using patient_name_idx on patient
> b  (cost=0.00..6.02 rows=1 width=42)
> 
> 2. emrxdbs=# explain select * from patient A  where exists (select NULL from
> patient B where B.mrn=A.mrn and B.dob=A.dob and B.sex=A.sex and
> B.lastname=A.lastname and B.firstname=A.firstname and B.mrn='3471585'  group
> by B.mrn, B.dob, B.sex, B.lastname, B.firstname having A.patseq <
> max(B.patseq)) limit 10;
> NOTICE:  QUERY PLAN:
> 
> Limit  (cost=0.00..121.45 rows=10 width=141)
>   ->  Seq Scan on patient a  (cost=0.00..6951997.59 rows=572430 width=141)
>         SubPlan
>           ->  Aggregate  (cost=6.03..6.05 rows=1 width=42)
>                 ->  Group  (cost=6.03..6.04 rows=1 width=42)
>                       ->  Sort  (cost=6.03..6.03 rows=1 width=42)
>                             ->  Index Scan using patient_mrnfac_idx on
> patient b  (cost=0.00..6.02 rows=1 width=42)
> 
> The first query results come back fairly quick, the 2nd one just sits there
> forever.
> It looks similar in the two query plans.

It seems that using patient_mrnfac_idx instead of patient_name_idx is
not a good choice in your case ;(

try moving the B.mrn='3471585' from FROM to HAVING and hope that this
makes the DB use the same plan as for the first query

select * from patient A where exists (   select NULL     from patient B    where B.mrn=A.mrn      and B.dob=A.dob
andB.sex=A.sex      and B.lastname=A.lastname      and B.firstname=A.firstname    group by B.mrn, B.dob, B.sex,
B.lastname,B.firstname   having A.patseq < max(B.patseq)      and B.mrn='3471585') limit 10;
 

-----------
Hannu




pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Oracle Decode Function
Next
From: Hannu Krosing
Date:
Subject: creating aggregates that work on composite types (whole tuples)