Thread: Query - student, skill
Hi,
I have three tables - one with student ids and names, second one with skill ids and names, third one listing which students have which skills. test=# \d stud
Table "public.stud"
Column | Type | Modifiers
--------+------------------------+---------------------------------------------------
id | integer | not null default nextval('stud_id_seq'::regclass)
name | character varying(100) |
test=# \d stud_skill
Table "public.stud_skill"
Column | Type | Modifiers
----------+---------+-----------
stud_id | integer |
skill_id | integer |
test=# \d skill
Table "public.skill"
Column | Type | Modifiers
---------+------------------------+----------------------------------------------------
id | integer | not null default nextval('skill_id_seq'::regclass)
sk_name | character varying(100) |
test=# select s.id,s.name from stud s join stud_skill s_k on s.id=s_k.stud_id join skill sk on sk.id=s_k.skill_id where sk_name = 'Java' intersect select s.id,s.name from stud s join stud_skill s_k on s.id=s_k.stud_id join skill sk on sk.id=s_k.skill_id where sk_name = 'Oracle';
On Wed, Oct 9, 2013 at 4:46 AM, Jayadevan M <maymala.jayadevan@gmail.com> wrote: > Hi, > I have three tables - one with student ids and names, second one with skill > ids and names, third one listing which students have which skills. > test=# \d stud > Table "public.stud" > Column | Type | Modifiers > --------+------------------------+--------------------------------------------------- > id | integer | not null default > nextval('stud_id_seq'::regclass) > name | character varying(100) | > > test=# \d stud_skill > Table "public.stud_skill" > Column | Type | Modifiers > ----------+---------+----------- > stud_id | integer | > skill_id | integer | > > test=# \d skill > Table "public.skill" > Column | Type | Modifiers > ---------+------------------------+---------------------------------------------------- > id | integer | not null default > nextval('skill_id_seq'::regclass) > sk_name | character varying(100) | > > To fetch records of students who know , say, 'Java' and 'Oracle', is this > the best way? > test=# select s.id,s.name from stud s join stud_skill s_k on > s.id=s_k.stud_id join skill sk on sk.id=s_k.skill_id where sk_name = 'Java' > intersect select s.id,s.name from stud s join stud_skill s_k on > s.id=s_k.stud_id join skill sk on sk.id=s_k.skill_id where sk_name = > 'Oracle'; > I think that would work. I would do this, though - but I'm not sure if it's any better: select s.id, s.name from stud s inner join stud_skill s_kj on s.id = s_kj.id inner join skill skj on s_kj.skill_id = skj.skill_id inner join stud_skill s_ko on s.id = s_ko.id inner join skill sko on s_ko.skill_id = sko.skill_id where sko.sk_name = 'Oracle' and skj.sk_name = 'Java' -Mike
Jayadevan M <maymala.jayadevan@gmail.com> wrote: > To fetch records of students who know , say, 'Java' and 'Oracle', > is this the best way? > > select s.id, s.name > from stud s > join stud_skill s_k on s.id = s_k.stud_id > join skill sk on sk.id = s_k.skill_id > where sk_name = 'Java' > intersect > select s.id, s.name > from stud s > join stud_skill s_k on s.id = s_k.stud_id > join skill sk on sk.id = s_k.skill_id > where sk_name = 'Oracle' > ; I think that in most (maybe all?) cases the set operations like INTERSECT cause the queries on both sides to be executed and the set operation performed on the results. It should be faster just to use two joins to the skill table: select s.id, s.name from stud s join stud_skill s_k on s.id = s_k.stud_id join skill sk1 on sk1.id = s_k.skill_id and sk1.sk_name = 'Java' join skill sk2 on sk2.id = s_k.skill_id and sk2.sk_name = 'Oracle' ; If the skill names are not unique, you might want to throw a DISTINCT in there, too. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Thanks. Since the number of conditions and the type of conditions (AND, OR ) etc are not known beforehand, I felt that generating the query as above (aliasing) may be a bit more difficult compared to generating a list of similar INTERSECTs or UNIONs. Since we have equijoins and and an exact match on skill, it should get executed fast?
I got another way of doing this also.....
WHERE SKILL IN ('JAVA','ORACLE') group by student_id having count(*) = 2. If there are 3 skills, do a count(*)=3.I got another way of doing this also.....
On Wed, Oct 9, 2013 at 8:12 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
Jayadevan M <maymala.jayadevan@gmail.com> wrote:> select s.id, s.name
> To fetch records of students who know , say, 'Java' and 'Oracle',
> is this the best way?
>> from stud s> where sk_name = 'Oracle'
> join stud_skill s_k on s.id = s_k.stud_id
> join skill sk on sk.id = s_k.skill_id
> where sk_name = 'Java'
> intersect
> select s.id, s.name
> from stud s
> join stud_skill s_k on s.id = s_k.stud_id
> join skill sk on sk.id = s_k.skill_id
> ;
I think that in most (maybe all?) cases the set operations like
INTERSECT cause the queries on both sides to be executed and the
set operation performed on the results. It should be faster just
to use two joins to the skill table:
select s.id, s.name
join skill sk1 on sk1.id = s_k.skill_id and sk1.sk_name = 'Java'
join skill sk2 on sk2.id = s_k.skill_id and sk2.sk_name = 'Oracle'
;
If the skill names are not unique, you might want to throw a
DISTINCT in there, too.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Jayadevan M <maymala.jayadevan@gmail.com> wrote: > Since the number of conditions and the type of conditions (AND, > OR ) etc are not known beforehand, I felt that generating the > query as above (aliasing) may be a bit more difficult compared > to generating a list of similar INTERSECTs or UNIONs. Since we > have equijoins and and an exact match on skill, it should get > executed fast? The alternative I showed should be faster, but it might not be by enough to matter. Populating a database with enough data to match what you expect in production and running some benchmarks is the best way to find out. > I got another way of doing this also..... > WHERE SKILL IN ('JAVA','ORACLE') group by student_id having > count(*) = 2. If there are 3 skills, do a count(*)=3. I considered that before posting my suggestion. It should work. > By the way, combination of student_id/skill will be unique in > that table. Any other ideas? It probably will generate a plan very much like my earlier suggestion, especially since the combination is unique, but you could add EXISTS tests to the WHERE clause instead of having the JOINs to the skill table. > I am collecting options - will check performance and use the > best. Good plan. :-) But make sure you have row sizes, row counts, distributions, and correlations roughly comparable to what you expect in production when you compare performance. And be sure you have tuned the system. If any query is not performing as well as you expected, read this and post to the pgsql-performance list: http://wiki.postgresql.org/wiki/SlowQueryQuestions -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company