Re: Need a SQL to create sets of hobbies - Mailing list pgsql-sql

From Aaron Bono
Subject Re: Need a SQL to create sets of hobbies
Date
Msg-id bf05e51c0609201138t6e453c5pf85863a3a422cf08@mail.gmail.com
Whole thread Raw
In response to Need a SQL to create sets of hobbies  ("CN" <cnliou9@fastmail.fm>)
List pgsql-sql
On 9/20/06, CN <cnliou9@fastmail.fm> wrote:
Hi!

CREATE TABLE x1 (name TEXT, hobby TEXT, PRIMARY KEY(name,hobby));

INSERT INTO x1 VALUES('John','music');
INSERT INTO x1 VALUES('John','arts');
INSERT INTO x1 VALUES('Bob','arts');
INSERT INTO x1 VALUES('Bob','music');
INSERT INTO x1 VALUES('Rocky','copmputer');
INSERT INTO x1 VALUES('Steve','arts');
INSERT INTO x1 VALUES('Steve','football');
INSERT INTO x1 VALUES('Tom','computer');
INSERT INTO x1 VALUES('Tom','music');

select * from x1;

name  |  hobby
-------+----------
John  | music
John  | arts
Bob   | arts
Bob   | music
Rocky | computer
Steve | arts
Steve | football
Tom   | computer
Tom   | music
(9 rows)

John and Bob have the same hobbies - music and arts. So music and arts
are treated as one set of hobbies.

Rocky has an unique set of interest - computer.

Steve also likes arts just as John and Bob do, but he also has an
exclusive interest - football. Thus, his set of hobbies is unique -
arts, football.

One of Tom's hobbies, music, overlaps those of John, Bob, and Rocky; but
computer does not. Hence his hobbies, computer and music, forms a new
set of hobbies.

Now we have 4 sets of hobbies:

set 1: music, arts
set 2: computer
set 3: arts, football
set 4: computer, music

I am looking for an SQL that creates sets of hobbies in table x2 by
selecting from table x1:

CREATE TABLE x2 (sid INTEGER, hobby TEXT, PRIMARY KEY(sid,hobby));

and makes x2 contain rows:

sid   |  hobby
-------+----------
1      | music
1      | arts
2      | computer
3      | arts
3      | football
4      | computer
4      | music

where gid starts from 1.

 
Your best design is to break this into 3 tables:

person (
   person_id,
   person_name
)

hobby (
   hobby_id,
   hobby_name
)

person_hobby (
   person_id,
   hobby_id
)

Then you can get the list of hobbies for each person like this:

    SELECT
        person.person_id,
        array(
            SELECT hobby_name
            FROM person_hobby
            INNER JOIN hobby ON (hobby.hobby_id = person_hobby.hobby_id)
            WHERE person_hobby.person_id = person.person_id
            ORDER BY hobby_name
        ) AS hobby_list
    FROM person

Then do this to flatten it back out:

SELECT
    -- Gives you a unique id though using a SERIAL on a table would be better
    min(mysub.person_id), 
    hobby.hobby_name
FROM (
    SELECT
        person.person_id,
        array(
            SELECT hobby_name
            FROM person_hobby
            INNER JOIN hobby ON (hobby.hobby_id = person_hobby.hobby_id)
            WHERE person_hobby.person_id = person.person_id
            ORDER BY hobby_name
        ) AS hobby_list
    FROM person
) mysub
INNER JOIN person_hobby ON (person_hobby.person_id = mysub.person_id)
INNER JOIN hobby ON (hobby.hobby_id = person_hobby.hobby_id)
GROUP BY
    mysub.hobby_list,
    hobby.hobby_name

I did not try it so it may require a little tweaking to work.  Also, I don't know what the performance would be like.

==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
   http://codeelixir.com
==================================================================

pgsql-sql by date:

Previous
From: CG
Date:
Subject: Re: Nested loops are killing throughput
Next
From: TJ O'Donnell
Date:
Subject: ERROR: could not write block 196261 of temporary file: No space left