Thread: Collapsing multiple subqueries into one
I have two tables: CREATE TABLE items ( root_id integer NOT NULL, id serial NOT NULL, -- Other fields... CONSTRAINT items_pkey PRIMARY KEY (root_id, id) ) CREATE TABLE votes ( root_id integer NOT NULL, item_id integer NOT NULL, user_id integer NOT NULL, type smallint NOT NULL, direction smallint, CONSTRAINT votes_pkey PRIMARY KEY (root_id, item_id, user_id, type), CONSTRAINT votes_root_id_fkey FOREIGN KEY (root_id, item_id) REFERENCES items (root_id, id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE, -- Other constraints... ) I'm trying to, in a single query, pull out all items of a particular root_id along with a few arrays of user_ids of the users who voted in particular ways. The following query does what I need: SELECT *, ARRAY(SELECT user_id from votes where root_id = i.root_id AND item_id = i.id AND type = 0 AND direction = 1) as upvoters, ARRAY(SELECT user_id from votes where root_id = i.root_id AND item_id = i.id AND type = 0 AND direction = -1) as downvoters, ARRAY(SELECT user_id from votes where root_id = i.root_id AND item_id = i.id AND type = 1) as favoriters FROM items i WHERE root_id = 1 ORDER BY id The problem is that I'm using three subqueries to get the information I need when it seems like I should be able to do the same in one. I thought that Postgres (I'm using 8.4) might be smart enough to collapse them all into a single query for me, but looking at the explain output in pgAdmin it looks like that's not happening - it's running multiple primary key lookups on the votes table instead. I feel like I could rework this query to be more efficient, but I'm not sure how. Any pointers?
This might help you: http://www.postgresql.org/docs/8.4/static/queries-with.html On 24/08/2011, at 9:54 AM, Chris Hanks wrote: > I have two tables: > > CREATE TABLE items > ( > root_id integer NOT NULL, > id serial NOT NULL, > -- Other fields... > > CONSTRAINT items_pkey PRIMARY KEY (root_id, id) > ) > > CREATE TABLE votes > ( > root_id integer NOT NULL, > item_id integer NOT NULL, > user_id integer NOT NULL, > type smallint NOT NULL, > direction smallint, > > CONSTRAINT votes_pkey PRIMARY KEY (root_id, item_id, user_id, type), > CONSTRAINT votes_root_id_fkey FOREIGN KEY (root_id, item_id) > REFERENCES items (root_id, id) MATCH SIMPLE > ON UPDATE CASCADE ON DELETE CASCADE, > -- Other constraints... > ) > > I'm trying to, in a single query, pull out all items of a particular > root_id along with a few arrays of user_ids of the users who voted in > particular ways. The following query does what I need: > > SELECT *, > ARRAY(SELECT user_id from votes where root_id = i.root_id AND > item_id = i.id AND type = 0 AND direction = 1) as upvoters, > ARRAY(SELECT user_id from votes where root_id = i.root_id AND > item_id = i.id AND type = 0 AND direction = -1) as downvoters, > ARRAY(SELECT user_id from votes where root_id = i.root_id AND > item_id = i.id AND type = 1) as favoriters > FROM items i > WHERE root_id = 1 > ORDER BY id > > The problem is that I'm using three subqueries to get the information > I need when it seems like I should be able to do the same in one. I > thought that Postgres (I'm using 8.4) might be smart enough to > collapse them all into a single query for me, but looking at the > explain output in pgAdmin it looks like that's not happening - it's > running multiple primary key lookups on the votes table instead. I > feel like I could rework this query to be more efficient, but I'm not > sure how. > > Any pointers? > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
Thanks Royce. I put together another query using a WITH statement that's also working: WITH v AS ( SELECT item_id, type, direction, array_agg(user_id) as user_ids FROM votes WHERE root_id = 5305 GROUP BY type, direction, item_id ORDER BY type, direction, item_id ) SELECT *, (SELECT user_ids from v where item_id = i.id AND type = 0 AND direction = 1) as upvoters, (SELECT user_ids from v where item_id = i.id AND type = 0 AND direction = -1) as downvoters, (SELECT user_ids from v where item_id = i.id AND type = 1) as favoriters FROM items i WHERE root_id = 5305 ORDER BY id It feels more sensible to me, but it's slightly slower than my initial attempt (15 ms vs. 13 ms, when running as a prepared statement to avoid any query parsing overhead, and averaging the time over several thousand queries). I'm not sure why...? On Tue, Aug 23, 2011 at 8:14 PM, Royce Ausburn <royce.ml@inomial.com> wrote: > This might help you: > > http://www.postgresql.org/docs/8.4/static/queries-with.html > > On 24/08/2011, at 9:54 AM, Chris Hanks wrote: > >> I have two tables: >> >> CREATE TABLE items >> ( >> root_id integer NOT NULL, >> id serial NOT NULL, >> -- Other fields... >> >> CONSTRAINT items_pkey PRIMARY KEY (root_id, id) >> ) >> >> CREATE TABLE votes >> ( >> root_id integer NOT NULL, >> item_id integer NOT NULL, >> user_id integer NOT NULL, >> type smallint NOT NULL, >> direction smallint, >> >> CONSTRAINT votes_pkey PRIMARY KEY (root_id, item_id, user_id, type), >> CONSTRAINT votes_root_id_fkey FOREIGN KEY (root_id, item_id) >> REFERENCES items (root_id, id) MATCH SIMPLE >> ON UPDATE CASCADE ON DELETE CASCADE, >> -- Other constraints... >> ) >> >> I'm trying to, in a single query, pull out all items of a particular >> root_id along with a few arrays of user_ids of the users who voted in >> particular ways. The following query does what I need: >> >> SELECT *, >> ARRAY(SELECT user_id from votes where root_id = i.root_id AND >> item_id = i.id AND type = 0 AND direction = 1) as upvoters, >> ARRAY(SELECT user_id from votes where root_id = i.root_id AND >> item_id = i.id AND type = 0 AND direction = -1) as downvoters, >> ARRAY(SELECT user_id from votes where root_id = i.root_id AND >> item_id = i.id AND type = 1) as favoriters >> FROM items i >> WHERE root_id = 1 >> ORDER BY id >> >> The problem is that I'm using three subqueries to get the information >> I need when it seems like I should be able to do the same in one. I >> thought that Postgres (I'm using 8.4) might be smart enough to >> collapse them all into a single query for me, but looking at the >> explain output in pgAdmin it looks like that's not happening - it's >> running multiple primary key lookups on the votes table instead. I >> feel like I could rework this query to be more efficient, but I'm not >> sure how. >> >> Any pointers? >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > >
On 24/08/2011, at 4:44 PM, Chris Hanks wrote: > Thanks Royce. I put together another query using a WITH statement > that's also working: > > WITH v AS ( > SELECT item_id, type, direction, array_agg(user_id) as user_ids > FROM votes > WHERE root_id = 5305 > GROUP BY type, direction, item_id > ORDER BY type, direction, item_id > ) > SELECT *, > (SELECT user_ids from v where item_id = i.id AND type = 0 AND > direction = 1) as upvoters, > (SELECT user_ids from v where item_id = i.id AND type = 0 AND > direction = -1) as downvoters, > (SELECT user_ids from v where item_id = i.id AND type = 1) as favoriters > FROM items i > WHERE root_id = 5305 > ORDER BY id > > It feels more sensible to me, but it's slightly slower than my initial > attempt (15 ms vs. 13 ms, when running as a prepared statement to > avoid any query parsing overhead, and averaging the time over several > thousand queries). I'm not sure why...? I'm not sure, Chris - perhaps others on the mailing list can answer this? > > > > On Tue, Aug 23, 2011 at 8:14 PM, Royce Ausburn <royce.ml@inomial.com> wrote: >> This might help you: >> >> http://www.postgresql.org/docs/8.4/static/queries-with.html >> >> On 24/08/2011, at 9:54 AM, Chris Hanks wrote: >> >>> I have two tables: >>> >>> CREATE TABLE items >>> ( >>> root_id integer NOT NULL, >>> id serial NOT NULL, >>> -- Other fields... >>> >>> CONSTRAINT items_pkey PRIMARY KEY (root_id, id) >>> ) >>> >>> CREATE TABLE votes >>> ( >>> root_id integer NOT NULL, >>> item_id integer NOT NULL, >>> user_id integer NOT NULL, >>> type smallint NOT NULL, >>> direction smallint, >>> >>> CONSTRAINT votes_pkey PRIMARY KEY (root_id, item_id, user_id, type), >>> CONSTRAINT votes_root_id_fkey FOREIGN KEY (root_id, item_id) >>> REFERENCES items (root_id, id) MATCH SIMPLE >>> ON UPDATE CASCADE ON DELETE CASCADE, >>> -- Other constraints... >>> ) >>> >>> I'm trying to, in a single query, pull out all items of a particular >>> root_id along with a few arrays of user_ids of the users who voted in >>> particular ways. The following query does what I need: >>> >>> SELECT *, >>> ARRAY(SELECT user_id from votes where root_id = i.root_id AND >>> item_id = i.id AND type = 0 AND direction = 1) as upvoters, >>> ARRAY(SELECT user_id from votes where root_id = i.root_id AND >>> item_id = i.id AND type = 0 AND direction = -1) as downvoters, >>> ARRAY(SELECT user_id from votes where root_id = i.root_id AND >>> item_id = i.id AND type = 1) as favoriters >>> FROM items i >>> WHERE root_id = 1 >>> ORDER BY id >>> >>> The problem is that I'm using three subqueries to get the information >>> I need when it seems like I should be able to do the same in one. I >>> thought that Postgres (I'm using 8.4) might be smart enough to >>> collapse them all into a single query for me, but looking at the >>> explain output in pgAdmin it looks like that's not happening - it's >>> running multiple primary key lookups on the votes table instead. I >>> feel like I could rework this query to be more efficient, but I'm not >>> sure how. >>> >>> Any pointers? >>> >>> -- >>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >>> To make changes to your subscription: >>> http://www.postgresql.org/mailpref/pgsql-general >> >> > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general