[6.5.2] join problems ... - Mailing list pgsql-hackers

From The Hermit Hacker
Subject [6.5.2] join problems ...
Date
Msg-id Pine.BSF.4.10.9909190203390.27097-100000@thelab.hub.org
Whole thread Raw
List pgsql-hackers
Morning...
This weekend, up at a clients site working with them on improving
database performance.  They are currently running MySQL and I'm trying to
convince them to switch over to PostgreSQL, for various features that they
just don't have with MySQL...
One of the 'queries' that they are currently doing with MySQL
consists of two queries that I can reduce down to one using subqueries,
but its so slow that its ridiculous...so I figured I'd throw it out as a
problem to hopefully solve?
The query I'm starting out with is works out as:
               SELECT id, name, url \                 FROM aecCategory \                WHERE ppid='$indid' \
      AND pid='$divid'";
 
The results of this get fed into a while look that takes the id
returned and pushes them into:
        SELECT distinct b.indid, b.divid, b.catid, a.id, a.mid \         FROM aecEntMain a, aecWebEntry b \
WHERE(a.id=b.id AND a.mid=b.mid) \          AND (a.status like 'active%' and b.status like 'active%')          AND
(a.statuslike '%active:ALL%' and b.status like '%active:ALL%')          AND (a.representation like '%:ALL%')
AND(b.indid='$indid' and b.divid='$divid' and b.catid='$catid')";
 
Now, I can/have rewritten this as:

SELECT id, name, url  FROM aecCategory WHERE ppid='$indid'   AND pid='$divid'   AND id IN ( 
SELECT distinct c.id  FROM aecEntMain a, aecWebEntry b, aecCategory c WHERE (a.id=b.id AND a.mid=b.mid and
b.catid=c.id)  AND (a.status like 'active%' and b.status like 'active%')   AND (a.status like '%active:ALL%' and
b.statuslike '%active:ALL%')   AND (a.representation like '%:ALL%')   AND (b.indid='$indid' and b.divid='$divid' and
b.catidIN (        SELECT id FROM aecCategory WHERE ppid='$indid' AND pid='$divid' )   ));";
 
An explain of the above shows:

Index Scan using aeccategory_primary on aeccategory  (cost=8.28 rows=1 width=36)SubPlan ->  Unique  (cost=1283.70
rows=21width=72)   ->  Sort  (cost=1283.70 rows=21 width=72)     ->  Nested Loop  (cost=1283.70 rows=21 width=72)
-> Nested Loop  (cost=1280.70 rows=1 width=60)         ->  Index Scan using aecwebentry_primary on aecwebentry b
            (cost=1278.63 rows=1 width=36)               SubPlan                 ->  Index Scan using
aeccategory_primaryon aeccategory                             (cost=8.28 rows=1 width=12)         ->  Index Scan using
aecentmain_primaryon aecentmain a                     (cost=2.07 rows=348 width=24)       ->  Index Scan using
aeccategory_idon aeccategory c                   (cost=3.00 rows=1170 width=12)
 
Now, a few things bother me with the above explain output, based on me 
hopefully reading this right...
The innermost SubPlan reports an estimated rows returned of 1...the 
actual query returns 59 rows...slightly off?
The one that bothers me is the one that reports 1170 rows returned...if you
look at the query, the only thing that would/should use aeccategory_id is the
line that goes "SELECT distinct c.id"...if I run just that section of the 
query, it yields a result of 55 rows...way off??
All of my queries are currently on static data, after a vacuum analyze has 
been performed...everything is faster if I split things up and do a SELECT
on a per id basis on return values, but, as the list of 'ids' grow, the
number of iterations of the while loop required will slow down the query...
I'm not sure what else to look at towards optimizing the query further,
or is this something that we still are/need to look at in the server itself?
The machine we are working off of right now is an idle Dual-PIII 450Mhz with
512Meg of RAM, very fast SCSI hard drives on a UW controller...and that query
is the only thing running while we test things...so we aren't under-powered :)
ideas?  

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



pgsql-hackers by date:

Previous
From: "J. Michael Roberts"
Date:
Subject: Re: [HACKERS] money;money;money -- spam;spam;spam
Next
From: Thomas Lockhart
Date:
Subject: Re: [HACKERS] case bug?