Re: subselect syntax - Mailing list pgsql-novice
From | Michael Glaesemann |
---|---|
Subject | Re: subselect syntax |
Date | |
Msg-id | 1821C578-8D90-4837-9894-2F1E0264F7EE@seespotcode.net Whole thread Raw |
In response to | subselect syntax (Steve Lefevre <lefevre.10@osu.edu>) |
List | pgsql-novice |
On Jun 6, 2007, at 16:39 , Steve Lefevre wrote: > I have a table of projects, users, and user_projects. The table > user_projects creates a many-to-many relationship between users and > projects. > > I'm creating a select list on a web form for adding a new user to a > project. I want to select users from the user table, who aren't > already listed in the join table under that project. What your looking for is sometimes called a semijoin: getting everything that *isn't* associated with something. SQL doesn't have a semijoin operator, but there are a couple of workarounds. First, here's the setup I'm using: CREATE TABLE users (user_id TEXT PRIMARY KEY); INSERT INTO users (user_id) VALUES ('fred') , ('wilma') , ('barney') , ('betty'); CREATE TABLE projects ( project_id TEXT PRIMARY KEY); INSERT INTO projects (project_id) VALUES ('foo') , ('bar') , ('baz'); CREATE TABLE user_projects ( user_id TEXT NOT NULL REFERENCES users , project_id TEXT NOT NULL REFERENCES projects , PRIMARY KEY (user_id, project_id) ); INSERT INTO user_projects (user_id, project_id) VALUES ('fred', 'foo') , ('wilma', 'foo') , ('barney', 'bar') , ('betty', 'bar') , ('wilma', 'baz') , ('betty', 'baz'); > Here's my select: > SELECT * FROM users > WHERE user_id <> $current_user_id > AND user_id <> ( SELECT user_id FROM user_projects WHERE project_id = > $project_id ) > > This query returns no rows, even on projects that have no records in > the user_projects table! Oh, really? I get an error when I try to run your example: SELECT * FROM users WHERE user_id <> 'barney' AND user_id <> ( SELECT user_id FROM user_projects WHERE project_id = 'baz'); ERROR: more than one row returned by a subquery used as an expression Note that this will work (at least in 8.2.4) if the subquery only returns a single row, but you'd run into problems as soon as you have more than one user affiliated with a given project. One of the ways to get a semijoin is using EXCEPT. We want to get all of the users *except* those affiliated with a certain project (in this case 'baz'): -- get all users SELECT user_id FROM users EXCEPT -- all of the users who are associated with project baz SELECT user_id FROM user_projects WHERE project_id = 'baz'; user_id --------- barney fred (2 rows) Let's say barney is the current user, so we add a restriction: SELECT user_id FROM ( -- get all users SELECT user_id FROM users EXCEPT -- all of the users who are associated with project baz SELECT user_id FROM user_projects WHERE project_id = 'baz' ) non_baz_users WHERE user_id <> 'barney'; user_id --------- fred (1 row) We can also reformulate that by moving the restriction up into the first query: -- get all users other than barney SELECT user_id FROM users WHERE user_id <> 'barney' EXCEPT -- all of the users who are associated with project baz SELECT user_id FROM user_projects WHERE project_id = 'baz'; user_id --------- fred (1 row) Another way to do it is to use a LEFT JOIN, taking advantage of the fact that columns will be filled with NULL where there is no association. This is often faster than using EXCEPT. SELECT user_id, project_id FROM users LEFT JOIN ( -- users for project baz SELECT * FROM user_projects WHERE project_id = 'baz') u_p USING (user_id); user_id | project_id ---------+------------ fred | wilma | baz barney | betty | baz (4 rows) Note that all users are returned. Now we just restrict the result to those whose project_id is null (because they're not associated with project baz) and those who aren't barney: SELECT user_id FROM users LEFT JOIN ( -- users for project baz SELECT * FROM user_projects WHERE project_id = 'baz') u_p USING (user_id) WHERE project_id IS NULL -- all users who aren't associated with project baz -- so their project_id column is NULL AND user_id <> 'barney'; -- and we don't want barney user_id --------- fred (1 row) A third way would have been to rewrite your query using NOT IN rather than <>: SELECT * FROM users WHERE user_id <> 'barney' AND user_id NOT IN ( SELECT user_id FROM user_projects WHERE project_id = 'baz'); user_id --------- fred (1 row) Hope that helps. Michael Glaesemann grzm seespotcode net
pgsql-novice by date: