Re: [SQL] Subselect performance - Mailing list pgsql-sql

From Bruce Momjian
Subject Re: [SQL] Subselect performance
Date
Msg-id 199909211918.PAA14232@candle.pha.pa.us
Whole thread Raw
In response to Re: [SQL] Subselect performance  (Stuart Rison <rison@biochemistry.ucl.ac.uk>)
Responses Re: [SQL] Subselect performance  (Daniel Lopez <ridruejo@atm9.com.dtu.dk>)
List pgsql-sql
> Oh, OK, that's very possible.  I was always under the impression (very
> possibly missguided) that the reason it took a long time to do a "in
> (select...)" was that the sub-select was actually executed for every row
> in 'a' so that you ended up doing:
> 
> 1x sequential scan of a
> ax select on c
> 
> whereas if you did the sub-select ide[endently and cut-and-pasted the
> obtained set into the "in (...)" you were in point of fact just doing:
> 
> 1X sequential scan of a (each of them with loads of OR statements).
> 
> therefore saving "ax select" time.
> 
> Bruce, I appologise if I've completely missunderstood what's going on and
> that your e-mail was all about correcting me.  I don't have a good grasp
> of seq-scan vs. (nested-)joins vs. hash joins vs. mergejoins etc.
> (although any pointers on where to get a crash course in these would be
> greatly appreciated).  

See src/backend/optimizer/README for more info on join types.

>  
> > This is a major performance problem, one that is known, and one that
> > should be fixed, but I am sounding like a broken record.
> 
> yeah, again appologise if this has been discussed to death in the past and
> I missed it all (or it went over my head ;) )
> 
> > The solution is to allow the subquery results to be mergejoined(sorted),
> > or hashjoined with the outer query.
> 

It is not your fault.  We frequently get reports of this type, and the
behavior of the subquery is very non-intuitive.  You would think that a
subquery and a join would have the same performance, but because of the
limitation of subqueries as being nested loop joined, this is not the
case, and subqueries are slower.  We tell people to rewrite their query
as EXISTS, but by the time we tell them that, they have already spent
much time trying to figure out why the query is so slow, and I am sure
many people don't even know about the EXISTS workaround.

I hate to have a feature that is not 100% optimized, but this is what we
have with subqueries at this time.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


pgsql-sql by date:

Previous
From: "Jackson, DeJuan"
Date:
Subject: RE: [SQL] Search Engine with PHP
Next
From: Bruce Momjian
Date:
Subject: Re: [SQL] Re:Create table doesn't always respect atomicity of transactions