Home > mailing lists

Re: need some help understanding sloq query - Mailing list pgsql-sql

From	Esger Abbink
Subject	Re: need some help understanding sloq query
Date	December 6, 2001 07:06:23
Msg-id	200112061334.OAA03446@fikkie.vesc.nl Whole thread Raw
In response to	need some help understanding sloq query (Esger Abbink <esger@vesc.nl>)
Responses	Re: need some help understanding sloq query
List	pgsql-sql

Tree view


> Esger,
> 
> > select some_other_fields from ents e, qtys q where e.set_id =
> > q.set_id and
> > e.e_id = q.e_id and e.set_id in (select set_id from datasets
> > where update_id in (select cur_update_id from current)) and
> > q.other_field = some_const ;
> > 
> > this query takes ages :(
> 
> I'm not surprised.  You're doing a nested subselect with the IN
> operator; frankly, you couldn't make a relatively simple query any
> slower than the above.  It looks like the query parser is doing its best
> to optimize, but you've forced it to compare every row in ents JOIN qtys
> against the whole datasets table.  What's wrong with:
> 
> SELECT some_fields 
> FROM ents JOIN qtys USING (set_id)
>     JOIN datasets USING (set_id)
>     JOIN current ON datasets.update_id = current.cur_update_id
> WHERE other_field = some_constant;
> 

i figured i wasnt doing things "right" ;)

unfortunately the query above takes down my postgres (7.0.3 tried on 2 servers). 1 join no problem, 2 joins results
in:

DEBUG:  Data Base System is in production state at Thu Dec  6 11:42:18 2001
Server process (pid 1095) exited with status 11 at Thu Dec  6 11:45:08 2001
Terminating any active server processes...
Server processes were terminated at Thu Dec  6 11:45:08 2001
Reinitializing shared memory and semaphores
The Data Base System is starting up

this is directly related to executing the query (or just the part with the first 2 joins) and is 100% reproducible
sofar.


secondly, from the costs explain gives for the query part i can get through (1st join: merge join costs 7800, index
scancosts 2000, index scan costs 4200) it seems that doing a simpler variant of my query (see the other post
-followup-)with IN replaced by '=' is a lot faster. the resulting query would be something like:
 

select some_other_fields from ents e, qtys q where 
e.set_id = q.set_id and
e.e_id = q.e_id and 
e.set_id = X and
q.other_field = some_const ;

this query has negligable costs (nested loop cost 200, index scan cost 50, index scan cost 4 on test db) but should be
envelopedin some for-loop construct to feed it the values for X after which the answers should be "glued" together. the
setof X's has no theoritical maximum that i can recall atm but in practice it currently is always size 1 and i dont
expectmuch if any growth. i cant however assume its size 1.
 


(the above does sound like a complex answer to a simple problem again though..)

> That allows Tom's magic query engine to do its work optimizing.
> Assuming, of course, that there *are* indexes on update_id and
> datasets.set_id ...
> 
> I heartily recommend that you read Celko's "SQL for Smarties" (book
> reviews: http://techdocs.postgresql.org/bookreviews.php ).  You seem to
> have fallen into the trap of using complex queries to answer simple
> questions, and your database performance is suffering because of it.

will investigate.

> 
> -Josh Berkus
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
> 
> 
> 

-- 
NeoMail - Webmail that doesn't suck... as much.
http://neomail.sourceforge.net

pgsql-sql by date:

From: Stephan Szabo
Date: 05 December 2001, 23:27:17
Subject: Re: Select into

From: James Thornton
Date: 06 December 2001, 10:17:29
Subject: update returns 1, but no changes have been made

Re: need some help understanding sloq query - Mailing list pgsql-sql

Previous

Next