Thread: SQL confusion
I'm trying to figure out how to do a particular query, and I'm beating my head against a wall. Here's my situation: I'm running postgres 7.3.2 on linux, and making my requests from Perl scripts using DBD::Pg. My table structure is as follows (irrelevant cols removed) CREATE TABLE name ( namecounter integer NOT NULL, firstmiddle character varying(64) NOT NULL, lastname character varying(64)NOT NULL, birthdate date, hh smallint, famnu integer, ); Each row represents a person with a unique namecounter. Families share a famnu, and usually one person in a family is marked as head of household (hh>0), with everyone else hh=0. However, there are a few families with nobody marked as hh, and I'd like to elect one by age. The query I'm trying to do is to pull one person from each household, either the head of household if available, or the eldest if not. I want them sorted by last name, so I'd prefer to find them all in one query, no matter how ugly and nested it has to be. I can pull the list with hh>0 easily enough, but I'm not sure how to pull out the others. I realize that this could be done through some looping in the Perl script, but I'd like to avoid pulling the whole list into memory in case the list gets long. My preference is to just handle one record at a time in Perl if possible. Help? Andrew Ward adward55@yahoo.com __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com
On Sat, Oct 09, 2004 at 01:39:45PM -0700, Andrew Ward wrote: > > I'm running postgres 7.3.2 on linux, and making my > requests from Perl scripts using DBD::Pg. My table > structure is as follows (irrelevant cols removed) > > CREATE TABLE name ( > namecounter integer NOT NULL, > firstmiddle character varying(64) NOT NULL, > lastname character varying(64) NOT NULL, > birthdate date, > hh smallint, > famnu integer, > ); This may not be the fastest query, but it should be quite comprehensible. Lightly tested only on 7.4.5 (do you know why you're not using 7.4.x?). SELECT namecounter -- and whatever other columns you need FROM name outername -- alias so it's clear what the joins do WHERE hh = 1 -- simple case: head of family OR ( hh = 0 -- not a head AND NOT EXISTS ( -- there is no head of this family SELECT namecounter FROM name in1 WHERE hh = 1 AND in1.famnu = outername.famnu) AND birthdate = ( -- this person is as old as theoldest person SELECT MIN(birthdate) FROM name in2 WHERE in2.famnu = outername.famnu) ); Richard
On Sat, Oct 09, 2004 at 01:39:45PM -0700, Andrew Ward wrote: > > CREATE TABLE name ( > namecounter integer NOT NULL, > firstmiddle character varying(64) NOT NULL, > lastname character varying(64) NOT NULL, > birthdate date, > hh smallint, > famnu integer, > ); > > Each row represents a person with a unique > namecounter. Families share a famnu, and usually one > person in a family is marked as head of household > (hh>0), with everyone else hh=0. However, there are a > few families with nobody marked as hh, and I'd like to > elect one by age. The query I'm trying to do is to > pull one person from each household, either the head > of household if available, or the eldest if not. I > want them sorted by last name, so I'd prefer to find > them all in one query, no matter how ugly and nested > it has to be. This should be close to what you need: SELECT DISTINCT ON (lastname, famnu) * FROM name ORDER BY lastname, famnu, COALESCE(hh, 0) DESC, birthdate; The order of the fields in the ORDER BY clause is important, so if you want to sort the results by some other criteria then you might need to do it with a subselect. I used COALESCE on the hh field because a descending sort puts NULL values ahead of non-NULL values -- without COALESCE a person with hh=1 won't be recognized as the head of household if another member of the family has hh=NULL. You might need to modify the query to handle NULL birthdates or to add other tiebreakers. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
This is untested, but it might be enough to get you started: SELECT namecounter FROM name n WHERE NOT EXISTS (SELECT 1FROM nameWHERE hh > 0AND famnu = n.famnu ) GROUP BY famnu HAVING birthdate = min( birthdate ); What I'm trying to do here is grab all families that don't have a head of household, group them by family, and get only the namecounter corresponding to the minimum birthdate for that family. If I recall, I've had some trouble using HAVING with min/max in ways that seem intuitive to me, but this might help get you started. -tfo On Oct 9, 2004, at 3:39 PM, Andrew Ward wrote: > I'm trying to figure out how to do a particular query, > and I'm beating my head against a wall. Here's my > situation: > > I'm running postgres 7.3.2 on linux, and making my > requests from Perl scripts using DBD::Pg. My table > structure is as follows (irrelevant cols removed) > > CREATE TABLE name ( > namecounter integer NOT NULL, > firstmiddle character varying(64) NOT NULL, > lastname character varying(64) NOT NULL, > birthdate date, > hh smallint, > famnu integer, > ); > > Each row represents a person with a unique > namecounter. Families share a famnu, and usually one > person in a family is marked as head of household > (hh>0), with everyone else hh=0. However, there are a > few families with nobody marked as hh, and I'd like to > elect one by age. The query I'm trying to do is to > pull one person from each household, either the head > of household if available, or the eldest if not. I > want them sorted by last name, so I'd prefer to find > them all in one query, no matter how ugly and nested > it has to be. > > I can pull the list with hh>0 easily enough, but I'm > not sure how to pull out the others. > > I realize that this could be done through some looping > in the Perl script, but I'd like to avoid pulling the > whole list into memory in case the list gets long. My > preference is to just handle one record at a time in > Perl if possible. > > Help? > > Andrew Ward > adward55@yahoo.com > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam protection around > http://mail.yahoo.com > > ---------------------------(end of > broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html