Thread: Query - student, skill

Query - student, skill

From
Jayadevan M
Date:
Hi,
I have three tables - one with student ids and names, second one with skill ids and names, third one listing which students have which skills.
test=# \d stud
                                 Table "public.stud"
 Column |          Type          |                     Modifiers                    
--------+------------------------+---------------------------------------------------
 id     | integer                | not null default nextval('stud_id_seq'::regclass)
 name   | character varying(100) |

test=# \d stud_skill
   Table "public.stud_skill"
  Column  |  Type   | Modifiers
----------+---------+-----------
 stud_id  | integer |
 skill_id | integer |

test=# \d skill
                                 Table "public.skill"
 Column  |          Type          |                     Modifiers                     
---------+------------------------+----------------------------------------------------
 id      | integer                | not null default nextval('skill_id_seq'::regclass)
 sk_name | character varying(100) |

To fetch records of students who know , say, 'Java' and 'Oracle', is this the best way?
test=# select s.id,s.name  from stud s join  stud_skill s_k on s.id=s_k.stud_id join  skill sk on sk.id=s_k.skill_id where sk_name = 'Java' intersect select s.id,s.name  from stud s join  stud_skill s_k on s.id=s_k.stud_id join  skill sk on sk.id=s_k.skill_id where sk_name = 'Oracle';

Re: Query - student, skill

From
Michael Swierczek
Date:
On Wed, Oct 9, 2013 at 4:46 AM, Jayadevan M <maymala.jayadevan@gmail.com> wrote:
> Hi,
> I have three tables - one with student ids and names, second one with skill
> ids and names, third one listing which students have which skills.
> test=# \d stud
>                                  Table "public.stud"
>  Column |          Type          |                     Modifiers
> --------+------------------------+---------------------------------------------------
>  id     | integer                | not null default
> nextval('stud_id_seq'::regclass)
>  name   | character varying(100) |
>
> test=# \d stud_skill
>    Table "public.stud_skill"
>   Column  |  Type   | Modifiers
> ----------+---------+-----------
>  stud_id  | integer |
>  skill_id | integer |
>
> test=# \d skill
>                                  Table "public.skill"
>  Column  |          Type          |                     Modifiers
> ---------+------------------------+----------------------------------------------------
>  id      | integer                | not null default
> nextval('skill_id_seq'::regclass)
>  sk_name | character varying(100) |
>
> To fetch records of students who know , say, 'Java' and 'Oracle', is this
> the best way?
> test=# select s.id,s.name  from stud s join  stud_skill s_k on
> s.id=s_k.stud_id join  skill sk on sk.id=s_k.skill_id where sk_name = 'Java'
> intersect select s.id,s.name  from stud s join  stud_skill s_k on
> s.id=s_k.stud_id join  skill sk on sk.id=s_k.skill_id where sk_name =
> 'Oracle';
>

I think that would work.  I would do this, though - but I'm not sure
if it's any better:
select s.id, s.name
from
stud s
inner join stud_skill s_kj on s.id = s_kj.id
inner join skill skj on s_kj.skill_id = skj.skill_id
inner join stud_skill s_ko on s.id = s_ko.id
inner join skill sko on s_ko.skill_id = sko.skill_id
where
sko.sk_name = 'Oracle' and skj.sk_name = 'Java'

-Mike


Re: Query - student, skill

From
Kevin Grittner
Date:
Jayadevan M <maymala.jayadevan@gmail.com> wrote:

> To fetch records of students who know , say, 'Java' and 'Oracle',
> is this the best way?
>
> select s.id, s.name
>   from stud s
>   join stud_skill s_k on s.id = s_k.stud_id
>   join skill sk on sk.id = s_k.skill_id
>   where sk_name = 'Java'
> intersect
> select s.id, s.name
>   from stud s
>   join stud_skill s_k on s.id = s_k.stud_id
>   join skill sk on sk.id = s_k.skill_id
>   where sk_name = 'Oracle'
> ;

I think that in most (maybe all?) cases the set operations like
INTERSECT cause the queries on both sides to be executed and the
set operation performed on the results.  It should be faster just
to use two joins to the skill table:

select s.id, s.name
  from stud s
  join stud_skill s_k on s.id = s_k.stud_id
  join skill sk1 on sk1.id = s_k.skill_id and sk1.sk_name = 'Java'
  join skill sk2 on sk2.id = s_k.skill_id and sk2.sk_name = 'Oracle'
;

If the skill names are not unique, you might want to throw a
DISTINCT in there, too.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Query - student, skill

From
Jayadevan M
Date:
Thanks. Since the number of conditions and the type of conditions (AND, OR ) etc are not known beforehand, I felt that generating the query as above (aliasing) may be a bit more difficult   compared to generating a list of similar INTERSECTs or UNIONs. Since we have equijoins and and an exact match on skill, it should get executed fast?
I got another way of doing this also.....
WHERE SKILL IN ('JAVA','ORACLE') group by student_id having count(*) = 2. If there are 3 skills, do a count(*)=3.
By the way, combination of student_id/skill will be unique in that table.
Any other ideas? I am collecting options - will check performance and use the best.



On Wed, Oct 9, 2013 at 8:12 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
Jayadevan M <maymala.jayadevan@gmail.com> wrote:

> To fetch records of students who know , say, 'Java' and 'Oracle',
> is this the best way?
>
> select s.id, s.name
>   from stud s
>   join stud_skill s_k on s.id = s_k.stud_id
>   join skill sk on sk.id = s_k.skill_id
>   where sk_name = 'Java'
> intersect
> select s.id, s.name
>   from stud s
>   join stud_skill s_k on s.id = s_k.stud_id
>   join skill sk on sk.id = s_k.skill_id
>   where sk_name = 'Oracle'
> ;

I think that in most (maybe all?) cases the set operations like
INTERSECT cause the queries on both sides to be executed and the
set operation performed on the results.  It should be faster just
to use two joins to the skill table:

select s.id, s.name
  from stud s
  join stud_skill s_k on s.id = s_k.stud_id
  join skill sk1 on sk1.id = s_k.skill_id and sk1.sk_name = 'Java'
  join skill sk2 on sk2.id = s_k.skill_id and sk2.sk_name = 'Oracle'
;

If the skill names are not unique, you might want to throw a
DISTINCT in there, too.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Query - student, skill

From
Kevin Grittner
Date:
Jayadevan M <maymala.jayadevan@gmail.com> wrote:

> Since the number of conditions and the type of conditions (AND,
> OR ) etc are not known beforehand, I felt that generating the
> query as above (aliasing) may be a bit more difficult   compared
> to generating a list of similar INTERSECTs or UNIONs. Since we
> have equijoins and and an exact match on skill, it should get
> executed fast?

The alternative I showed should be faster, but it might not be by
enough to matter.  Populating a database with enough data to match
what you expect in production and running some benchmarks is the
best way to find out.

> I got another way of doing this also.....
> WHERE SKILL IN ('JAVA','ORACLE') group by student_id having
> count(*) = 2. If there are 3 skills, do a count(*)=3.

I considered that before posting my suggestion.  It should work.

> By the way, combination of student_id/skill will be unique in
> that table. Any other ideas?

It probably will generate a plan very much like my earlier
suggestion, especially since the combination is unique, but you
could add EXISTS tests to the WHERE clause instead of having the
JOINs to the skill table.

> I am collecting options - will check performance and use the
> best.

Good plan.  :-)  But make sure you have row sizes, row counts,
distributions, and correlations roughly comparable to what you
expect in production when you compare performance.  And be sure you
have tuned the system.  If any query is not performing as well as
you expected, read this and post to the pgsql-performance list:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company