Thread: Very slow query
The following query on some small datasets takes over a second to run. I'd be grateful for some help in understanding the explain output, and to remake the code. Looks like the sort is using up quite a bit of resources. I made an index on boards using columns "b.n_type, b.n_id, b.t_name" but the index was not recorded in explain analyze. (see "testindex" below). I am using PostgreSQL 7.4.2 on i386-pc-linux-gnu (Debian). The query is in a psql function (I've changed variables to hard coded integers below). The db is vacuumed every night. Thanks for any help; Rory SELECT DISTINCT b.n_id as id, b.n_type, CASE b.n_type WHEN 0 THEN 'personal' WHEN 1 THEN 'private' WHEN 2 THEN 'blog' ELSE 'public' END as type, b.t_name as title FROM boards b, people p, objects o WHERE b.b_hidden = 'f' AND ( b.n_type = 3 OR b.n_creator = 71 OR ( b.n_id = o.n_board_id AND o.n_creator = 71 AND o.n_joined > 0 ) ) ORDER BY b.n_type, b.n_id ASC, b.t_name; trial=> \d boards Table "public.boards" Column | Type | Modifiers ---------------+-----------------------------+---------------------------------------------------------- n_id | integer | not null default nextval('public.boards_n_id_seq'::text) b_hidden | boolean | default false dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone t_mobile | character varying(15) | t_email | character varying(50) | n_creator | integer | not null n_type | smallint | not null default 0 t_name | character varying(100) | not null t_description | character varying(500) | n_id_photo | integer | n_bg_colour | integer | default 0 Indexes: "boards_pkey" primary key, btree (n_id) "boards_t_email_key" unique, btree (t_email) "boards_t_mobile_key" unique, btree (t_mobile) "testindex" btree (n_type, n_id, t_name) Foreign-key constraints: "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE "$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL Triggers: tr_update_modified_time BEFORE UPDATE ON boards FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time() trial=> \d people Table "public.people" Column | Type | Modifiers -------------------+-----------------------------+---------------------------------------------------------- n_id | integer | not null default nextval('public.people_n_id_seq'::text) n_object_id | integer | n_objects_counter | integer | default 0 b_hidden | boolean | default false dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone t_nickname | character varying(20) | not null t_firstname | character varying(20) | not null t_surname | character varying(25) | not null t_mobile | character varying(15) | t_email | character varying(50) | t_password | character varying(15) | not null b_registered | boolean | default false n_email_status | smallint | default 0 n_txt_status | smallint | default 0 b_work_hours | boolean | default false b_confirmations | boolean | default true Indexes: "people_pkey" primary key, btree (n_id) "people_t_email_key" unique, btree (t_email) "people_t_mobile_key" unique, btree (t_mobile) "people_t_nickname_key" unique, btree (t_nickname) Foreign-key constraints: "object_chk" FOREIGN KEY (n_object_id) REFERENCES objects(n_id) ON UPDATE CASCADE ON DELETE SET NULL Triggers: tr_update_modified_time BEFORE UPDATE ON people FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time() trial=> \d objects Table "public.objects" Column | Type | Modifiers ---------------+-----------------------------+----------------------------------------------------------- n_id | integer | not null default nextval('public.objects_n_id_seq'::text) t_text_id | character varying(25) | not null b_hidden | boolean | default false dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone n_creator | integer | not null n_type | smallint | not null default 0 t_name | character varying(100) | not null t_description | text | t_postcode | character varying(10) | n_id_photo | integer | n_board_id | integer | not null n_joined | smallint | b_trigger | boolean | default true Indexes: "objects_pkey" primary key, btree (n_id) Foreign-key constraints: "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE "$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL "$3" FOREIGN KEY (n_board_id) REFERENCES boards(n_id) ON UPDATE CASCADE ON DELETE CASCADE Unique (cost=3677.26..3711.48 rows=102 width=18) (actual time=12566.422..13045.404 rows=4 loops=1) -> Sort (cost=3677.26..3684.10 rows=2738 width=18) (actual time=12566.413..12809.928 rows=158254 loops=1) Sort Key: b.n_type, b.n_id, b.t_name, CASE WHEN (b.n_type = 0) THEN 'personal'::text WHEN (b.n_type = 1) THEN 'private'::te xt WHEN (b.n_type = 2) THEN 'blog'::text ELSE 'public'::text END -> Nested Loop (cost=3442.79..3520.93 rows=2738 width=18) (actual time=0.244..1052.180 rows=158254 loops=1) -> Seq Scan on people p (cost=0.00..2.67 rows=67 width=0) (actual time=0.012..0.709 rows=67 loops=1) -> Materialize (cost=3442.79..3443.20 rows=41 width=18) (actual time=0.005..6.528 rows=2362 loops=67) -> Nested Loop (cost=3.27..3442.79 rows=41 width=18) (actual time=0.216..273.709 rows=2362 loops=1) Join Filter: ((("inner".n_id = "outer".n_board_id) OR ("inner".n_type = 3) OR ("inner".n_creator= 71)) A ND (("outer".n_creator = 71) OR ("inner".n_type = 3) OR ("inner".n_creator = 71)) AND (("outer".n_joined > 0) OR ("inner".n_type= 3 ) OR ("inner".n_creator = 71))) -> Seq Scan on objects o (cost=0.00..27.87 rows=787 width=10) (actual time=0.005..1.372 rows=787loops= 1) -> Materialize (cost=3.27..4.29 rows=102 width=22) (actual time=0.001..0.105 rows=102 loops=787) -> Seq Scan on boards b (cost=0.00..3.27 rows=102 width=22) (actual time=0.017..0.352rows=102 lo ops=1) Filter: (b_hidden = false) Total runtime: 13051.065 ms -- Rory Campbell-Lange <rory@campbell-lange.net> <www.campbell-lange.net>
Sorry for replying to my own post, but I'm anxious for an answer. Should I provide other information? Thanks Rory On 10/05/04, Rory Campbell-Lange (rory@campbell-lange.net) wrote: > The following query on some small datasets takes over a second to run. > I'd be grateful for some help in understanding the explain output, and > to remake the code. > > Looks like the sort is using up quite a bit of resources. I made an > index on boards using columns "b.n_type, b.n_id, b.t_name" but the index > was not recorded in explain analyze. (see "testindex" below). > > I am using PostgreSQL 7.4.2 on i386-pc-linux-gnu (Debian). The query is > in a psql function (I've changed variables to hard coded integers > below). The db is vacuumed every night. > > Thanks for any help; > Rory > > > SELECT DISTINCT > b.n_id as id, > b.n_type, > CASE b.n_type WHEN 0 THEN 'personal' > WHEN 1 THEN 'private' > WHEN 2 THEN 'blog' > ELSE 'public' > END as type, > b.t_name as title > FROM > boards b, people p, objects o > WHERE > b.b_hidden = 'f' > AND > ( > b.n_type = 3 > OR > b.n_creator = 71 > OR > ( b.n_id = o.n_board_id > AND > o.n_creator = 71 > AND > o.n_joined > 0 > ) > ) > ORDER BY > b.n_type, b.n_id ASC, b.t_name; > > trial=> \d boards > Table "public.boards" > Column | Type | Modifiers > ---------------+-----------------------------+---------------------------------------------------------- > n_id | integer | not null default nextval('public.boards_n_id_seq'::text) > b_hidden | boolean | default false > dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone > dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone > t_mobile | character varying(15) | > t_email | character varying(50) | > n_creator | integer | not null > n_type | smallint | not null default 0 > t_name | character varying(100) | not null > t_description | character varying(500) | > n_id_photo | integer | > n_bg_colour | integer | default 0 > Indexes: > "boards_pkey" primary key, btree (n_id) > "boards_t_email_key" unique, btree (t_email) > "boards_t_mobile_key" unique, btree (t_mobile) > "testindex" btree (n_type, n_id, t_name) > Foreign-key constraints: > "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE > "$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL > Triggers: > tr_update_modified_time BEFORE UPDATE ON boards FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time() > > trial=> \d people > Table "public.people" > Column | Type | Modifiers > -------------------+-----------------------------+---------------------------------------------------------- > n_id | integer | not null default nextval('public.people_n_id_seq'::text) > n_object_id | integer | > n_objects_counter | integer | default 0 > b_hidden | boolean | default false > dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone > dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone > t_nickname | character varying(20) | not null > t_firstname | character varying(20) | not null > t_surname | character varying(25) | not null > t_mobile | character varying(15) | > t_email | character varying(50) | > t_password | character varying(15) | not null > b_registered | boolean | default false > n_email_status | smallint | default 0 > n_txt_status | smallint | default 0 > b_work_hours | boolean | default false > b_confirmations | boolean | default true > Indexes: > "people_pkey" primary key, btree (n_id) > "people_t_email_key" unique, btree (t_email) > "people_t_mobile_key" unique, btree (t_mobile) > "people_t_nickname_key" unique, btree (t_nickname) > Foreign-key constraints: > "object_chk" FOREIGN KEY (n_object_id) REFERENCES objects(n_id) ON UPDATE CASCADE ON DELETE SET NULL > Triggers: > tr_update_modified_time BEFORE UPDATE ON people FOR EACH ROW EXECUTE PROCEDURE fn_update_modified_time() > > trial=> \d objects > Table "public.objects" > Column | Type | Modifiers > ---------------+-----------------------------+----------------------------------------------------------- > n_id | integer | not null default nextval('public.objects_n_id_seq'::text) > t_text_id | character varying(25) | not null > b_hidden | boolean | default false > dt_created | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone > dt_modified | timestamp without time zone | default ('now'::text)::timestamp(6) with time zone > n_creator | integer | not null > n_type | smallint | not null default 0 > t_name | character varying(100) | not null > t_description | text | > t_postcode | character varying(10) | > n_id_photo | integer | > n_board_id | integer | not null > n_joined | smallint | > b_trigger | boolean | default true > Indexes: > "objects_pkey" primary key, btree (n_id) > Foreign-key constraints: > "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE > "$2" FOREIGN KEY (n_id_photo) REFERENCES photo(n_id) ON UPDATE CASCADE ON DELETE SET NULL > "$3" FOREIGN KEY (n_board_id) REFERENCES boards(n_id) ON UPDATE CASCADE ON DELETE CASCADE > > > > Unique (cost=3677.26..3711.48 rows=102 width=18) (actual time=12566.422..13045.404 rows=4 loops=1) > -> Sort (cost=3677.26..3684.10 rows=2738 width=18) (actual time=12566.413..12809.928 rows=158254 loops=1) > Sort Key: b.n_type, b.n_id, b.t_name, CASE WHEN (b.n_type = 0) THEN 'personal'::text WHEN (b.n_type = 1) THEN'private'::te > xt WHEN (b.n_type = 2) THEN 'blog'::text ELSE 'public'::text END > -> Nested Loop (cost=3442.79..3520.93 rows=2738 width=18) (actual time=0.244..1052.180 rows=158254 loops=1) > -> Seq Scan on people p (cost=0.00..2.67 rows=67 width=0) (actual time=0.012..0.709 rows=67 loops=1) > -> Materialize (cost=3442.79..3443.20 rows=41 width=18) (actual time=0.005..6.528 rows=2362 loops=67) > -> Nested Loop (cost=3.27..3442.79 rows=41 width=18) (actual time=0.216..273.709 rows=2362 loops=1) > Join Filter: ((("inner".n_id = "outer".n_board_id) OR ("inner".n_type = 3) OR ("inner".n_creator= 71)) A > ND (("outer".n_creator = 71) OR ("inner".n_type = 3) OR ("inner".n_creator = 71)) AND (("outer".n_joined > 0) OR ("inner".n_type= 3 > ) OR ("inner".n_creator = 71))) > -> Seq Scan on objects o (cost=0.00..27.87 rows=787 width=10) (actual time=0.005..1.372 rows=787loops= > 1) > -> Materialize (cost=3.27..4.29 rows=102 width=22) (actual time=0.001..0.105 rows=102 loops=787) > -> Seq Scan on boards b (cost=0.00..3.27 rows=102 width=22) (actual time=0.017..0.352rows=102 lo > ops=1) > Filter: (b_hidden = false) > Total runtime: 13051.065 ms > > > > -- > Rory Campbell-Lange > <rory@campbell-lange.net> > <www.campbell-lange.net> > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend > -- Rory Campbell-Lange <rory@campbell-lange.net> <www.campbell-lange.net>
[snip] Rory Campbell-Lange wrote: >> >> >>SELECT DISTINCT >> b.n_id as id, >> b.n_type, >> CASE b.n_type WHEN 0 THEN 'personal' >> WHEN 1 THEN 'private' >> WHEN 2 THEN 'blog' >> ELSE 'public' >> END as type, >> b.t_name as title >> FROM >> boards b, people p, objects o >> WHERE >> b.b_hidden = 'f' >> AND >> ( >> b.n_type = 3 >> OR >> b.n_creator = 71 >> OR >> ( b.n_id = o.n_board_id >> AND >> o.n_creator = 71 >> AND >> o.n_joined > 0 >> ) >> ) >> ORDER BY >> b.n_type, b.n_id ASC, b.t_name; >> You don't seem to be joining the people table to the boards or objects table. In fact the people table is never referenced anywhere but the FROM clause. This might be why it is not running so well. People seems to be referenced by both boards and objects, so does there need to be a join between all 3 tables? [snip] >> "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE [snip] >> "$1" FOREIGN KEY (n_creator) REFERENCES people(n_id) ON UPDATE CASCADE ON DELETE CASCADE It is this "missing" join which seems to cause 158254 rows to appear in this nested loop, when PG only reckons it is going to see 2738 rows >> -> Nested Loop (cost=3442.79..3520.93 rows=2738 width=18) (actual time=0.244..1052.180 rows=158254 loops=1) I cant remember what the correct term is, but basically all rows from the seq scan combines with all rows from the materialize, which is what is causing the 150,000 odd rows to appear (is that called a cross product anyone?) i.e 67 rows from the seq x 2362 rows from the materialize = 158254 output rows for the nested loop to chug through. >> -> Seq Scan on people p (cost=0.00..2.67 rows=67 width=0) (actual time=0.012..0.709 rows=67 loops=1) >> -> Materialize (cost=3442.79..3443.20 rows=41 width=18) (actual time=0.005..6.528 rows=2362 loops=67) I reckon fix the unconstrained join and your query might run a little faster. >> -> Nested Loop (cost=3.27..3442.79 rows=41 width=18) (actual time=0.216..273.709 rows=2362 loops=1) The estimate for this nested loop seems a little off, note the guesstimate in the first set of brackets of 41, and compare with the actual result in the second set of 2362. Have you vacuum analyzed recently? HTH Nick
On 10/05/2004 22:30 Rory Campbell-Lange wrote: > Sorry for replying to my own post, but I'm anxious for an answer. Should > I provide other information? Look carefully at your column types. I can see several smallint columns in there WHERE clause which are not expicitely typed as such. -- Paul Thomas +------------------------------+---------------------------------------------+ | Thomas Micro Systems Limited | Software Solutions for Business | | Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk | +------------------------------+---------------------------------------------+
Rory Campbell-Lange wrote: >Wow, this is an education! My php profiler now shows: > >0.02 db->db_board_listing C: /var/www/trial/php/db.php:175 M: 867024 >0.02 ob_start C: /var/www/trial/php/db.php:238 M: 867024 >0.02 pg_exec C: /var/www/trial/php/db.php:239 M: 908328 >1.24 pg_result_status C: /var/www/trial/php/db.php:240 M: 908392 >1.24 ob_end_clean C: /var/www/trial/php/db.php:241 M: 908416 >1.24 db->result_checker C: /var/www/trial/php/db.php:243 M: 867472 >1.24 is_resource C: /var/www/trial/php/db.php:1317 M: 867472 >1.24 pg_numrows C: /var/www/trial/php/db.php:248 M: 867568 >1.24 pg_fetch_array C: /var/www/trial/php/db.php:250 M: 867600 >1.24 pg_fetch_array C: /var/www/trial/php/db.php:250 M: 867760 > >and now shows: > >C: /var/www/trial/php/core.php:151 M: 867040 > 0.02 db->db_board_listing C: /var/www/trial/php/db.php:175 M: 867040 > 0.02 ob_start C: /var/www/trial/php/db.php:238 M: 867040 > 0.02 pg_exec C: /var/www/trial/php/db.php:239 M: 908344 > 0.14 pg_result_status C: /var/www/trial/php/db.php:240 M: 908408 > 0.14 ob_end_clean C: /var/www/trial/php/db.php:241 M: 908432 > 0.14 db->result_checker C: /var/www/trial/php/db.php:243 M: 867488 > 0.14 is_resource C: /var/www/trial/php/db.php:1317 M: 867488 > 0.14 pg_numrows C: /var/www/trial/php/db.php:248 M: 867584 > 0.14 pg_fetch_array C: /var/www/trial/php/db.php:250 M: 867616 > 0.14 pg_fetch_array C: /var/www/trial/php/db.php:250 M: 867776 > >This is a saving of 1.1 seconds! > > > Thats a bit better ;-) >>On 10/05/2004 22:30 Rory Campbell-Lange wrote: >> >> >>>Sorry for replying to my own post, but I'm anxious for an answer. Should >>>I provide other information? >>> >>> >>Look carefully at your column types. I can see several smallint columns in >>there WHERE clause which are not expicitely typed as such. >> >> > >I'm not sure how to do this, Paul. do I do b.n_id::smallint ? Is >smallint not implied? > >Thanks >Rory > > > Not quite. Explicit casts are needed when you have any numbers in the WHERE condition and the columns are not of type integer/int4. For example I have tweaked your query. WHERE b.b_hidden = 'f' AND ( b.n_type = 3::smallint OR b.n_creator = 71 OR ( b.n_id = o.n_board_id AND o.n_creator = 71 AND o.n_joined > 0::smallint ) ) Note that b.n_creator and o.n_creator do not need explicit casts because they are both of type integer anyway. You could of course put them in for clarity. PG only casts the numbers to integer's, and not to smallint or bigint, which basically means it does not use any indexes on that column. This is fixed in 7.4 I believe, which you seem to be running anyway so you might not be affected. HTH Nick
On 11/05/04, Nick Barr (nicky@chuckie.co.uk) wrote: > Rory Campbell-Lange wrote: > > > Look carefully at your column types. I can see several smallint > > > columns in there WHERE clause which are not expicitely typed as > > > such. > > I'm not sure how to do this, Paul. do I do b.n_id::smallint ? Is > > smallint not implied? > Not quite. Explicit casts are needed when you have any numbers in the > WHERE condition and the columns are not of type integer/int4. For > example I have tweaked your query. ... > Note that b.n_creator and o.n_creator do not need explicit casts because > they are both of type integer anyway. You could of course put them in > for clarity. PG only casts the numbers to integer's, and not to smallint > or bigint, which basically means it does not use any indexes on that > column. This is fixed in 7.4 I believe, which you seem to be running > anyway so you might not be affected. Mmm. Seems like I should try profiling with and without the cast. Maybe there is little value in defining a column as a smallint, other than checking the length of input. Thanks again for your help, Rory -- Rory Campbell-Lange <rory@campbell-lange.net> <www.campbell-lange.net>
Rory Campbell-Lange <rory@campbell-lange.net> writes: > The following query on some small datasets takes over a second to run. > I'd be grateful for some help in understanding the explain output, and > to remake the code. I think the problem is you have an unconstrained join against "people p", which is wasting a bunch of time generating duplicate records (and then sorting them, and then discarding them in the DISTINCT step...) regards, tom lane