Thread: SQL confusion

SQL confusion

From

Andrew Ward

Date:

10 October 2004, 00:39:52

I'm trying to figure out how to do a particular query,
and I'm beating my head against a wall. Here's my
situation:

I'm running postgres 7.3.2 on linux, and making my
requests from Perl scripts using DBD::Pg. My table
structure is as follows (irrelevant cols removed)

CREATE TABLE name ( namecounter integer NOT NULL, firstmiddle character varying(64) NOT NULL, lastname character
varying(64)NOT NULL, birthdate date, hh smallint, famnu integer,

);

Each row represents a person with a unique
namecounter. Families share a famnu, and usually one
person in a family is marked as head of household
(hh>0), with everyone else hh=0. However, there are a
few families with nobody marked as hh, and I'd like to
elect one by age. The query I'm trying to do is to
pull one person from each household, either the head
of household if available, or the eldest if not. I
want them sorted by last name, so I'd prefer to find
them all in one query, no matter how ugly and nested
it has to be.

I can pull the list with hh>0 easily enough, but I'm
not sure how to pull out the others.

I realize that this could be done through some looping
in the Perl script, but I'd like to avoid pulling the
whole list into memory in case the list gets long. My
preference is to just handle one record at a time in
Perl if possible.

Help?

Andrew Ward
adward55@yahoo.com

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Re: SQL confusion

From

Richard Poole

Date:

10 October 2004, 01:12:03

On Sat, Oct 09, 2004 at 01:39:45PM -0700, Andrew Ward wrote:
> 
> I'm running postgres 7.3.2 on linux, and making my
> requests from Perl scripts using DBD::Pg.  My table
> structure is as follows (irrelevant cols removed)
> 
> CREATE TABLE name (
>     namecounter integer NOT NULL,
>     firstmiddle character varying(64) NOT NULL,
>     lastname character varying(64) NOT NULL,
>     birthdate date,
>     hh smallint,
>     famnu integer,
> );

This may not be the fastest query, but it should be quite comprehensible.
Lightly tested only on 7.4.5 (do you know why you're not using 7.4.x?).

SELECT namecounter       -- and whatever other columns you need
FROM name outername      -- alias so it's clear what the joins do
WHERE hh = 1             -- simple case: head of family
OR (   hh = 0           -- not a head   AND NOT EXISTS ( -- there is no head of this family       SELECT namecounter
  FROM name in1        WHERE hh = 1 AND in1.famnu = outername.famnu)   AND birthdate = ( -- this person is as old as
theoldest person       SELECT MIN(birthdate)       FROM name in2        WHERE in2.famnu = outername.famnu)
 
);


Richard

Re: SQL confusion

From

Michael Fuhr

Date:

10 October 2004, 02:00:10

On Sat, Oct 09, 2004 at 01:39:45PM -0700, Andrew Ward wrote:
> 
> CREATE TABLE name (
>     namecounter integer NOT NULL,
>     firstmiddle character varying(64) NOT NULL,
>     lastname character varying(64) NOT NULL,
>     birthdate date,
>     hh smallint,
>     famnu integer,
> );
> 
> Each row represents a person with a unique
> namecounter.  Families share a famnu, and usually one
> person in a family is marked as head of household
> (hh>0), with everyone else hh=0.  However, there are a
> few families with nobody marked as hh, and I'd like to
> elect one by age.  The query I'm trying to do is to
> pull one person from each household, either the head
> of household if available, or the eldest if not.  I
> want them sorted by last name, so I'd prefer to find
> them all in one query, no matter how ugly and nested
> it has to be.

This should be close to what you need:

SELECT DISTINCT ON (lastname, famnu) *
FROM name
ORDER BY lastname, famnu, COALESCE(hh, 0) DESC, birthdate;

The order of the fields in the ORDER BY clause is important, so if
you want to sort the results by some other criteria then you might
need to do it with a subselect.

I used COALESCE on the hh field because a descending sort puts NULL
values ahead of non-NULL values -- without COALESCE a person with
hh=1 won't be recognized as the head of household if another member
of the family has hh=NULL.

You might need to modify the query to handle NULL birthdates or to
add other tiebreakers.

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: SQL confusion

From

Thomas F.O'Connell

Date:

12 October 2004, 21:02:13

This is untested, but it might be enough to get you started:

SELECT namecounter
FROM name n
WHERE NOT EXISTS (SELECT 1FROM nameWHERE hh > 0AND famnu = n.famnu
)
GROUP BY famnu
HAVING birthdate = min( birthdate );

What I'm trying to do here is grab all families that don't have a head 
of household, group them by family, and get only the namecounter 
corresponding to the minimum birthdate for that family.

If I recall, I've had some trouble using HAVING with min/max in ways 
that seem intuitive to me, but this might help get you started.

-tfo

On Oct 9, 2004, at 3:39 PM, Andrew Ward wrote:

> I'm trying to figure out how to do a particular query,
> and I'm beating my head against a wall.  Here's my
> situation:
>
> I'm running postgres 7.3.2 on linux, and making my
> requests from Perl scripts using DBD::Pg.  My table
> structure is as follows (irrelevant cols removed)
>
> CREATE TABLE name (
>     namecounter integer NOT NULL,
>     firstmiddle character varying(64) NOT NULL,
>     lastname character varying(64) NOT NULL,
>     birthdate date,
>     hh smallint,
>     famnu integer,
> );
>
> Each row represents a person with a unique
> namecounter.  Families share a famnu, and usually one
> person in a family is marked as head of household
> (hh>0), with everyone else hh=0.  However, there are a
> few families with nobody marked as hh, and I'd like to
> elect one by age.  The query I'm trying to do is to
> pull one person from each household, either the head
> of household if available, or the eldest if not.  I
> want them sorted by last name, so I'd prefer to find
> them all in one query, no matter how ugly and nested
> it has to be.
>
> I can pull the list with hh>0 easily enough, but I'm
> not sure how to pull out the others.
>
> I realize that this could be done through some looping
> in the Perl script, but I'd like to avoid pulling the
> whole list into memory in case the list gets long.  My
> preference is to just handle one record at a time in
> Perl if possible.
>
> Help?
>
> Andrew Ward
> adward55@yahoo.com
>
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com
>
> ---------------------------(end of 
> broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html