Re: Query for filtering records - Mailing list pgsql-sql
From | Jean-Luc Lachance |
---|---|
Subject | Re: Query for filtering records |
Date | |
Msg-id | 3DED0BA2.17DF0B8C@nsd.ca Whole thread Raw |
In response to | Query for filtering records (eric soroos <eric-psql@soroos.net>) |
List | pgsql-sql |
Eric try: select num, p1,p2 ... from contacts inner join groups using (contacts.num=groups.contactNum) where groups.groupNum=a andcontact.p3=b and not exists ( select 1 from groups g2 where g2.contactNum = groups.contactNum and g2.groupNum!= a); or select num, p1,p2 ... from contacts inner join groups using (contacts.num=groups.contactNum) where groups.groupNum=a andcontact.p3=b and groups.groupNum in ( select contactNum from groups group by contactNum having count(*) = 1); The IN version may still be faster as the sub-select should be evaluated only once; JLL eric soroos wrote: > > I'm having trouble subtracting groups from other groups. > > I've got a data model that has the following essential features: > > create table contacts (num int, properties....); > create table groups (groupNum int, contactNum int); > > Where not all contacts will be in a group, some groups will contain most contacts, and there will be something like hundredsof groups and tens of thousands of contacts. I allow people to build groups using criteria, which I need to programaticallytranslate to sql. > > One somewhat common pattern is: > > Select all contacts in group a, who have property b, and who aren't in groups c,d,e,f... > > My first shot was subqueries: > > select num, p1,p2 ... from contacts > inner join groups using (contacts.num=groups.contactNum) > where groups.groupNum=a > and contact.p3=b > and not num in (select contactNum from groups where groupNum=c) > and not num in (select contactNum from groups where groupNum=d) > and not num in (select contactNum from groups where groupNum=e) > and not num in (select contactNum from groups where groupNum=f) > > This is .... slow. agonizingly so. > > With an inner join, I'm not convinced that the subtraction is actually correct., but it is much faster. Unfortunatley,faster incorrect answers are rarely helpful. > > Outer joins seem even worse than subselects for speed, but it does appear to give the correct answer. (example with a singlejoin.) > > select num from contacts > left outer join groups > on (contacts.num=groups.contactNum > and groups.groupNum=b) > where > dl_groupDonor._groupNum is null > and p3=c > > I've got to be missing something here, because this is much slower from the (slow) procedural system that I'm porting from. > > I've been avoiding using union / intersect since I don't really ever know what columns are going to be in the query. perhapsI should revisit that decision and try to work around it. > > eric > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)