Thread: Optimize query: time of "single * IN(many)" > time of "many * IN(single)"
Hello! Can anyone help me out with the following situation: (a) a single query with 550 id's in the IN-clause resulting into 800+ seconds; (b) 550 queries with a single id in the IN-clause resulting into overall time of <60 seconds; The table consists of 950.000 records, and the resultset consists of 205.000 records. >> Why is there such an extreme difference in time? >> And is there a way to reduce the difference in time? More information about the situation is below. Thank you for your help and time! =) Postgres-version 7.3.1 The query is like: SELECT a_id, b_id, score, c_id, d_id FROM tbl_scores WHERE a_id IN(...) UNION SELECT a_id, b_id, score, c_id, d_id FROM tbl_scores_alike WHERE a_id IN(...) Definition of tables is like: ___Fields CREATE TABLE public.tbl_scores ( id int4 DEFAULT nextval('"tbl_scores_id_seq"'::text) NOT NULL, a_id int4, b_id int4, score int4, d_id int8, CONSTRAINT tbl_scores_pkey UNIQUE (id), dc date DEFAULT now(), c_id int4, INITIALLY IMMEDIATE ) WITHOUT OIDS; ___Index CREATE INDEX tbl_scores_idx10 ON tbl_scores USING btree (a_id, b_id, score, c_id, d_id); Things that I tried to reduce the time of situation (a) - single * IN(many): * vacuum of the database; hardly any improvement. * selecting a single field in the resultset (a_id) instead of all fields; hardly any improvement. * only querying one table, skipping the UNION; hardly any improvement; * ... what would you try? _________________________________________________________________ MSN Zoeken helpt je om de gekste dingen te vinden! http://search.msn.nl
Firstly you should always provide an explain from your query before posting to this list. I think the problem is that in <7.4 PG did not use indexes for IN queries. This has been fixed in 7.4. On Thu, 2004-01-08 at 05:44, Paul Janssen wrote: > Hello! > > Can anyone help me out with the following situation: > (a) a single query with 550 id's in the IN-clause resulting into 800+ > seconds; > (b) 550 queries with a single id in the IN-clause resulting into overall > time of <60 seconds; > The table consists of 950.000 records, and the resultset consists of 205.000 > records. > > >> Why is there such an extreme difference in time? > >> And is there a way to reduce the difference in time? > More information about the situation is below. > > Thank you for your help and time! =) > > Postgres-version > 7.3.1 > > The query is like: > SELECT a_id, b_id, score, c_id, d_id > FROM tbl_scores > WHERE a_id IN(...) > UNION > SELECT a_id, b_id, score, c_id, d_id > FROM tbl_scores_alike > WHERE a_id IN(...) > > Definition of tables is like: > ___Fields > CREATE TABLE public.tbl_scores ( > id int4 DEFAULT nextval('"tbl_scores_id_seq"'::text) NOT NULL, > a_id int4, > b_id int4, > score int4, d_id int8, > CONSTRAINT tbl_scores_pkey UNIQUE (id), > dc date DEFAULT now(), > c_id int4, > INITIALLY IMMEDIATE > ) WITHOUT OIDS; > ___Index > CREATE INDEX tbl_scores_idx10 ON tbl_scores USING btree (a_id, b_id, > score, c_id, d_id); > > Things that I tried to reduce the time of situation (a) - single * IN(many): > * vacuum of the database; hardly any improvement. > * selecting a single field in the resultset (a_id) instead of all fields; > hardly any improvement. > * only querying one table, skipping the UNION; hardly any improvement; > * ... what would you try? > > _________________________________________________________________ > MSN Zoeken helpt je om de gekste dingen te vinden! http://search.msn.nl > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster
On Thu, Jan 08, 2004 at 06:28:14AM -0500, Dave Smith wrote: > Firstly you should always provide an explain from your query before > posting to this list. You mean "while posting", because he can't possible provide the explain before having the means to do so, can he? :-) > I think the problem is that in <7.4 PG did not use > indexes for IN queries. This has been fixed in 7.4. I think what was done is to optimize queries like IN (SELECT ...) but there has not been improvement for IN (1,2,3, ... 550) like he appears to be doing. Maybe something to try is putting the IDs in a (temp?) table and using the first form. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "Cada quien es cada cual y baja las escaleras como quiere" (JMSerrat)
"Paul Janssen" <postgresuser@hotmail.com> writes: > Can anyone help me out with the following situation: > (a) a single query with 550 id's in the IN-clause resulting into 800+ > seconds; > (b) 550 queries with a single id in the IN-clause resulting into overall > time of <60 seconds; > The table consists of 950.000 records, and the resultset consists of 205.000 > records. > Why is there such an extreme difference in time? Most likely the planner is opting not to use an indexscan in the first case. Could we see EXPLAIN ANALYZE results for both cases? Also, try "SET enable_seqscan TO OFF" and then repeat EXPLAIN ANALYZE for case (a). regards, tom lane