Re: Finding uniques across a big join - Mailing list pgsql-general
From | Jim C. Nasby |
---|---|
Subject | Re: Finding uniques across a big join |
Date | |
Msg-id | 20051130181301.GG13642@nasby.net Whole thread Raw |
In response to | Finding uniques across a big join ("John D. Burger" <john@mitre.org>) |
Responses |
Re: Finding uniques across a big join
|
List | pgsql-general |
It will probably be a win to come up with a list of potential records from each table, instead of after doing the 3-way join. so something like: (SELECT gazPlaceID FROM gazPlaces GROUP BY featureType HAVING count(*)=1) JOIN (SELECT ...) If you post the output of explain (or explain analyze is even better) then people could probably make better suggestions. On Tue, Nov 29, 2005 at 09:58:49PM -0500, John D. Burger wrote: > I could use some help with the following: > > I have a database of geographic entities with attributes spread across > several tables. I need to determine which entities are unique with > respect to some of those attributes. I'm using the following query: > > select p2.gazPlaceID from > (select p1.featureType, n1.placeNameID, c1.containerID > from gazPlaces as p1 > join gazNamings as n1 using (gazPlaceID) > join gazContainers as c1 using (gazPlaceID) > group by p1.featureType, n1.placeNameID, c1.containerID > having count(*) = 1) as uniqs, > gazPlaces as p2 > join gazNamings as n2 using (gazPlaceID) > join gazContainers as c2 using (gazPlaceID) > where uniqs.featureType = p2.featureType > and uniqs.placeNameID = n2.placeNameID > and uniqs.containerID = c2.containerID; > > The basic idea is to compute featureType-placeNameID-containerID > combinations with a three-way join, determine which of those have a > count of 1, and then join that back to the same three-way join to get > the gazPlaceIDs corresponding to the unique combos (whew!). > > gazPlaces has about 6M entries, gazNamings and gazContainers each about > 10M. All of the fields above are integers, and I have indexes on > everything relevant, but the query still takes about eight hours. My > question is not (necessarily) how to improve the efficiency of this > query, but whether anyone can think of a faster way to compute the > uniques. Again, the goal is to find entries in gazPlaces that are the > only ones with their particular combination of feature type, name and > container. > > Any help is appreciated! > > - John Burger > MITRE > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org > -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
pgsql-general by date: