Re: [HACKERS] [6.5.2] join problems ... - Mailing list pgsql-hackers
From | Mike Mascari |
---|---|
Subject | Re: [HACKERS] [6.5.2] join problems ... |
Date | |
Msg-id | 19990919071724.20897.rocketmail@web119.yahoomail.com Whole thread Raw |
Responses |
Re: [HACKERS] [6.5.2] join problems ...
Re: [HACKERS] [6.5.2] join problems ... Re: [HACKERS] [6.5.2] join problems ... |
List | pgsql-hackers |
The query you've presented is rather convoluted, but if I'm reading your query correctly, it should reduce to a simple, three-way join: SELECT c.id, c.name, c.url FROM aecEntMain a, aecWebEntry b, aecCategory c WHERE a.status LIKE 'active:ALL%' AND a.representation LIKE '%:ALL%' AND b.status LIKE 'active:ALL%' AND b.indid='$indid' AND b.divid='$divid' AND (a.id,a.mid = b.id,b.mid) AND (b.catid,b.indid,b.divid = c.id,c.ppid,c.pid); with the following indexes: aecEntMain: (status) and (id,mid) aecWebEntry: (status), (indid), (divid), and (catid,indid,divid) aecCategory: (id,ppid,pid) Now, there are some differences between the above and what you wrote. For example, the above requires that the status begins with 'active:ALL'. Your query requires the status begin with 'active' and must also contain the pattern 'active:ALL'. So for the above to be equivalent, you can't have a status such as 'active <some stuff> active:ALL'. With respect to subqueries and PostgreSQL, as you know, the IN clause requires a nested scan. If you are going to use subqueries, correlated subqueries using EXISTS clauses can use indexes: SELECT c.id, c.name, c.url FROM aecCategory c WHERE EXISTS ( SELECT a.status FROM aecEntMain a, aecWebEntry b WHERE a.status LIKE 'active:ALL%' AND a.representation LIKE '%:ALL%' AND b.status LIKE 'active:ALL%' AND b.indid='$indid' AND b.divid='$divid' AND (a.id,a.mid = b.id,b.mid) AND (b.catid,b.indid,b.divid = c.id,c.ppid,c.pid)); Unfortunately, the lack of index support in IN subqueries affects more than just the IN subquery clause, since INTERSECT/EXCEPT uses the rewriter to rewrite such queries as UNIONS of two queries with an IN/NOT IN subquery, respectively. This makes the INTERSECT/EXCEPT feature functionally useless except on very small tables. Hope that helps (and is equivalent), Mike Mascari (mascarim@yahoo.com) --- The Hermit Hacker <scrappy@hub.org> wrote: > > Morning... > > This weekend, up at a clients site working with > them on improving > database performance. They are currently running > MySQL and I'm trying to > convince them to switch over to PostgreSQL, for > various features that they > just don't have with MySQL... > > One of the 'queries' that they are currently doing > with MySQL > consists of two queries that I can reduce down to > one using subqueries, > but its so slow that its ridiculous...so I figured > I'd throw it out as a > problem to hopefully solve? > > The query I'm starting out with is works out as: > > SELECT id, name, url \ > FROM aecCategory \ > WHERE ppid='$indid' \ > AND pid='$divid'"; > > The results of this get fed into a while look that > takes the id > returned and pushes them into: > > SELECT distinct b.indid, b.divid, b.catid, > a.id, a.mid \ > FROM aecEntMain a, aecWebEntry b \ > WHERE (a.id=b.id AND a.mid=b.mid) \ > AND (a.status like 'active%' and b.status > like 'active%') > AND (a.status like '%active:ALL%' and > b.status like '%active:ALL%') > AND (a.representation like '%:ALL%') > AND (b.indid='$indid' and > b.divid='$divid' and b.catid='$catid')"; > > Now, I can/have rewritten this as: > > SELECT id, name, url > FROM aecCategory > WHERE ppid='$indid' > AND pid='$divid' > AND id IN ( > SELECT distinct c.id > FROM aecEntMain a, aecWebEntry b, aecCategory c > WHERE (a.id=b.id AND a.mid=b.mid and b.catid=c.id) > AND (a.status like 'active%' and b.status like > 'active%') > AND (a.status like '%active:ALL%' and b.status > like '%active:ALL%') > AND (a.representation like '%:ALL%') > AND (b.indid='$indid' and b.divid='$divid' and > b.catid IN ( > SELECT id FROM aecCategory WHERE > ppid='$indid' AND pid='$divid' ) > ));"; > > An explain of the above shows: > > Index Scan using aeccategory_primary on aeccategory > (cost=8.28 rows=1 width=36) > SubPlan > -> Unique (cost=1283.70 rows=21 width=72) > -> Sort (cost=1283.70 rows=21 width=72) > -> Nested Loop (cost=1283.70 rows=21 > width=72) > -> Nested Loop (cost=1280.70 rows=1 > width=60) > -> Index Scan using aecwebentry_primary > on aecwebentry b > (cost=1278.63 rows=1 width=36) > SubPlan > -> Index Scan using > aeccategory_primary on aeccategory > (cost=8.28 rows=1 > width=12) > -> Index Scan using aecentmain_primary on > aecentmain a > (cost=2.07 rows=348 width=24) > -> Index Scan using aeccategory_id on > aeccategory c > (cost=3.00 rows=1170 width=12) > > Now, a few things bother me with the above explain > output, based on me > hopefully reading this right... > > The innermost SubPlan reports an estimated rows > returned of 1...the > actual query returns 59 rows...slightly off? > > The one that bothers me is the one that reports > 1170 rows returned...if you > look at the query, the only thing that would/should > use aeccategory_id is the > line that goes "SELECT distinct c.id"...if I run > just that section of the > query, it yields a result of 55 rows...way off?? > > All of my queries are currently on static data, > after a vacuum analyze has > been performed...everything is faster if I split > things up and do a SELECT > on a per id basis on return values, but, as the list > of 'ids' grow, the > number of iterations of the while loop required will > slow down the query... > > I'm not sure what else to look at towards > optimizing the query further, > or is this something that we still are/need to look > at in the server itself? > > The machine we are working off of right now is an > idle Dual-PIII 450Mhz with > 512Meg of RAM, very fast SCSI hard drives on a UW > controller...and that query > is the only thing running while we test things...so > we aren't under-powered :) > > ideas? > > Marc G. Fournier ICQ#7615664 > IRC Nick: Scrappy > Systems Administrator @ hub.org > primary: scrappy@hub.org secondary: > scrappy@{freebsd|postgresql}.org __________________________________________________ Do You Yahoo!? Bid and sell for free at http://auctions.yahoo.com
pgsql-hackers by date: