Home > mailing lists

Finding uniques across a big join - Mailing list pgsql-general

From	John D. Burger
Subject	Finding uniques across a big join
Date	November 30, 2005 01:58:38
Msg-id	5fd4cbcbd2fa723217de3a724c77affc@mitre.org Whole thread Raw
Responses	Re: Finding uniques across a big join Re: Finding uniques across a big join Re: Finding uniques across a big join
List	pgsql-general

Tree view

I could use some help with the following:

I have a database of geographic entities with attributes spread across
several tables.  I need to determine which entities are unique with
respect to some of those attributes.  I'm using the following query:

select p2.gazPlaceID from
   (select p1.featureType, n1.placeNameID, c1.containerID
     from gazPlaces as p1
       join gazNamings as n1 using (gazPlaceID)
       join gazContainers as c1 using (gazPlaceID)
     group by p1.featureType, n1.placeNameID, c1.containerID
     having count(*) = 1) as uniqs,
   gazPlaces as p2
     join gazNamings as n2 using (gazPlaceID)
     join gazContainers as c2 using (gazPlaceID)
   where uniqs.featureType = p2.featureType
   and uniqs.placeNameID = n2.placeNameID
   and uniqs.containerID = c2.containerID;

The basic idea is to compute featureType-placeNameID-containerID
combinations with a three-way join, determine which of those have a
count of 1, and then join that back to the same three-way join to get
the gazPlaceIDs corresponding to the unique combos (whew!).

gazPlaces has about 6M entries, gazNamings and gazContainers each about
10M.  All of the fields above are integers, and I have indexes on
everything relevant, but the query still takes about eight hours.  My
question is not (necessarily) how to improve the efficiency of this
query, but whether anyone can think of a faster way to compute the
uniques.  Again, the goal is to find entries in gazPlaces that are the
only ones with their particular combination of feature type, name and
container.

Any help is appreciated!

- John Burger
   MITRE

pgsql-general by date:

From: Chris Gordon
Date: 30 November 2005, 00:18:31
Subject: Writing output to a file

From: Michael Fuhr
Date: 30 November 2005, 02:41:52
Subject: Re: errors with 8.1 make on Solaris

Finding uniques across a big join - Mailing list pgsql-general

Previous

Next