Thread: Order by (for 15 rows) adds 30 seconds to query time
Dear All,

I don't know if this is a stupid question or not, but I can't understand the following. I have a pretty simple query, which runs in about 7 ms:

  SELECT * FROM h.inventory WHERE demand_id = 289276563;

The result of this is a 15-row x 17-column table. However, I want this sorted by id, so I changed the query to:

  SELECT * FROM h.inventory WHERE demand_id = 289276563 ORDER BY id;

which makes it take 32 seconds! That surprises me - I'd expect the ORDER BY to be the last thing that runs, and a sort of such a small dataset to be almost instantaneous. Indeed, if I do ORDER BY random(), it's fast. The system is running 8.4.1 and is otherwise lightly loaded; I can repeat this with similar results.

Is this normal? Have I hit a bug?

I attach the view definition, the result set, and the output from EXPLAIN ANALYZE (both ways).

Thanks,

Richard


             View "h.inventory"
    Column     |           Type           | Modifiers
---------------+--------------------------+-----------
 id            | bigint                   |
 material_id   | bigint                   |
 material_tag  | text                     |
 material_name | text                     |
 location_id   | bigint                   |
 location_tag  | text                     |
 location_name | text                     |
 qty           | integer                  |
 divergence    | integer                  |
 ctime         | timestamp with time zone |
 actor_id      | bigint                   |
 actor_tag     | text                     |
 actor_name    | text                     |
 demand_id     | bigint                   |
 target_id     | bigint                   |
 target_tag    | text                     |
 target_name   | text                     |

View definition:
 SELECT inventory.id, inventory.material_id,
        h_material.tag AS material_tag, h_material.name AS material_name,
        inventory.location_id,
        h_location.tag AS location_tag, h_location.name AS location_name,
        inventory.qty, inventory.divergence, inventory.ctime,
        inventory.actor_id,
        h_actor.tag AS actor_tag, h_actor.name AS actor_name,
        inventory.demand_id,
        h_demand.target_id, h_demand.target_tag, h_demand.target_name
   FROM core.inventory
   LEFT JOIN h.material h_material ON inventory.material_id = h_material.id
   LEFT JOIN h.location h_location ON inventory.location_id = h_location.id
   LEFT JOIN h.actor h_actor ON inventory.actor_id = h_actor.id
   LEFT JOIN h.demand h_demand ON inventory.demand_id = h_demand.id;

    id     | material_id | material_tag  | material_name | location_id | location_tag | location_name          | qty | divergence | ctime                         | actor_id | actor_tag | actor_name                   | demand_id | target_id | target_tag     | target_name
-----------+-------------+---------------+---------------+-------------+--------------+------------------------+-----+------------+-------------------------------+----------+-----------+------------------------------+-----------+-----------+----------------+----------------------------------------
 292904293 | 289238938 | 1000001113980 | | 280410977 | 1030576 | Container 1030576 | 0 | 0 | 2009-12-01 14:53:35.010023+00 | 5543 | 139664 | Joanna Mikolajczak | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 292904294 | 289238938 | 1000001113980 | | 280410977 | 1030576 | Container 1030576 | 1 | 0 | 2009-12-01 14:53:35.060378+00 | | | | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 292904292 | 289238938 | 1000001113980 | | 4008 | 308 | Chute 308 | 0 | 0 | 2009-12-01 14:53:34.925801+00 | 5543 | 139664 | Joanna Mikolajczak | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 292817907 | 289238938 | 1000001113980 | | 5137 | 991 | Chute 991 (not needed) | 0 | 0 | 2009-12-01 14:38:00.819189+00 | 6282 | CW 991 | Chute 991 worker | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291755251 | 289238938 | 1000001113980 | | 5137 | 991 | Chute 991 (not needed) | 0 | 0 | 2009-12-01 12:03:05.957424+00 | 6282 | CW 991 | Chute 991 worker | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291543235 | 289238938 | 1000001113980 | | 5137 | 991 | Chute 991 (not needed) | 0 | 0 | 2009-12-01 11:35:19.28846+00 | 6282 | CW 991 | Chute 991 worker | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291524046 | 289238938 | 1000001113980 | | 4008 | 308 | Chute 308 | 0 | 0 | 2009-12-01 11:31:49.40378+00 | | | | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291524045 | 289238938 | 1000001113980 | | 4008 | 308 | Chute 308 | 0 | 0 | 2009-12-01 11:31:49.402217+00 | 6300 | FSC | Flow System Controller (FSC) | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291522067 | 289238938 | 1000001113980 | | 5143 | CAM2 | North Camera | 0 | 0 | 2009-12-01 11:31:22.931085+00 | 6300 | FSC | Flow System Controller (FSC) | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291518675 | 289238938 | 1000001113980 | | 5137 | 991 | Chute 991 (not needed) | 0 | 0 | 2009-12-01 11:30:32.10016+00 | 6282 | CW 991 | Chute 991 worker | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291506067 | 289238938 | 1000001113980 | | 5137 | 991 | Chute 991 (not needed) | 0 | 0 | 2009-12-01 11:26:38.065565+00 | 6282 | CW 991 | Chute 991 worker | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291491123 | 289238938 | 1000001113980 | | 5137 | 991 | Chute 991 (not needed) | 0 | 0 | 2009-12-01 11:21:33.009506+00 | 6282 | CW 991 | Chute 991 worker | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291322415 | 289238938 | 1000001113980 | | 4008 | 308 | Chute 308 | 0 | 0 | 2009-12-01 10:45:08.281846+00 | | | | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291322414 | 289238938 | 1000001113980 | | 4008 | 308 | Chute 308 | 0 | 0 | 2009-12-01 10:45:08.280018+00 | 6300 | FSC | Flow System Controller (FSC) | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291319302 | 289238938 | 1000001113980 | | 5143 | CAM2 | North Camera | 0 | 0 | 2009-12-01 10:44:41.807128+00 | 6300 | FSC | Flow System Controller (FSC) | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
(15 rows)

explain analyze select * from h.inventory where demand_id = 289276563;
Time: 7.251 ms
                                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..726857452.94 rows=806903677108 width=195) (actual time=0.108..0.704 rows=15 loops=1)
   Join Filter: (core.inventory.material_id = core.material.id)
   ->  Nested Loop Left Join  (cost=0.00..183236.84 rows=203176856 width=166) (actual time=0.103..0.588 rows=15 loops=1)
         Join Filter: (demand.material_id = core.material.id)
         ->  Nested Loop Left Join  (cost=0.00..260.03 rows=51160 width=174) (actual time=0.090..0.462 rows=15 loops=1)
               Join Filter: (core.inventory.location_id = core.location.id)
               ->  Nested Loop Left Join  (cost=0.00..146.71 rows=28 width=128) (actual time=0.068..0.286 rows=15 loops=1)
                     ->  Nested Loop Left Join  (cost=0.00..125.36 rows=28 width=103) (actual time=0.058..0.225 rows=15 loops=1)
                           Join Filter: (core.inventory.demand_id = demand.id)
                           ->  Index Scan using inventory_demand_id on inventory  (cost=0.00..22.36 rows=28 width=56) (actual time=0.025..0.053 rows=15 loops=1)
                                 Index Cond: (demand_id = 289276563)
                           ->  Nested Loop Left Join  (cost=0.00..3.67 rows=1 width=55) (actual time=0.009..0.010 rows=1 loops=15)
                                 ->  Index Scan using demand_pkey on demand  (cost=0.00..1.89 rows=1 width=24) (actual time=0.005..0.005 rows=1 loops=15)
                                       Index Cond: (id = 289276563)
                                 ->  Index Scan using waypoint_pkey on waypoint  (cost=0.00..1.77 rows=1 width=39) (actual time=0.003..0.003 rows=1 loops=15)
                                       Index Cond: (demand.target_id = waypoint.id)
                     ->  Index Scan using actor_pkey on actor  (cost=0.00..0.75 rows=1 width=33) (actual time=0.003..0.003 rows=1 loops=15)
                           Index Cond: (core.inventory.actor_id = actor.id)
               ->  Append  (cost=0.00..4.00 rows=4 width=50) (actual time=0.005..0.010 rows=1 loops=15)
                     ->  Index Scan using location_pkey on location  (cost=0.00..0.56 rows=1 width=72) (actual time=0.001..0.001 rows=0 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
                     ->  Index Scan using waypoint_pkey on waypoint location  (cost=0.00..1.31 rows=1 width=39) (actual time=0.003..0.003 rows=1 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
                     ->  Index Scan using container_pkey on container location  (cost=0.00..1.78 rows=1 width=54) (actual time=0.004..0.004 rows=0 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
                     ->  Index Scan using supply_pkey on supply location  (cost=0.00..0.35 rows=1 width=36) (actual time=0.001..0.001 rows=0 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
         ->  Append  (cost=0.00..3.55 rows=2 width=8) (actual time=0.004..0.007 rows=1 loops=15)
               ->  Index Scan using material_pkey on material  (cost=0.00..1.78 rows=1 width=8) (actual time=0.004..0.005 rows=1 loops=15)
                     Index Cond: (demand.material_id = core.material.id)
               ->  Index Scan using container_pkey on container material  (cost=0.00..1.78 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=15)
                     Index Cond: (demand.material_id = core.material.id)
   ->  Append  (cost=0.00..3.55 rows=2 width=38) (actual time=0.003..0.006 rows=1 loops=15)
         ->  Index Scan using material_pkey on material  (cost=0.00..1.78 rows=1 width=22) (actual time=0.003..0.003 rows=1 loops=15)
               Index Cond: (core.inventory.material_id = core.material.id)
         ->  Index Scan using container_pkey on container material  (cost=0.00..1.78 rows=1 width=54) (actual time=0.003..0.003 rows=0 loops=15)
               Index Cond: (core.inventory.material_id = core.material.id)
 Total runtime: 0.858 ms
(38 rows)

explain analyze select * from h.inventory where demand_id = 289276563 order by id;
Time: 32868.784 ms

                                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..727737158.77 rows=806903677108 width=195) (actual time=31739.052..32862.322 rows=15 loops=1)
   Join Filter: (core.inventory.material_id = core.material.id)
   ->  Nested Loop Left Join  (cost=0.00..1062942.67 rows=203176856 width=166) (actual time=31739.044..32862.084 rows=15 loops=1)
         Join Filter: (demand.material_id = core.material.id)
         ->  Nested Loop Left Join  (cost=0.00..879965.86 rows=51160 width=174) (actual time=31739.025..32861.812 rows=15 loops=1)
               Join Filter: (core.inventory.location_id = core.location.id)
               ->  Nested Loop Left Join  (cost=0.00..879852.54 rows=28 width=128) (actual time=31739.006..32861.428 rows=15 loops=1)
                     ->  Nested Loop Left Join  (cost=0.00..879831.20 rows=28 width=103) (actual time=31738.994..32861.276 rows=15 loops=1)
                           Join Filter: (core.inventory.demand_id = demand.id)
                           ->  Index Scan using inventory_pkey on inventory  (cost=0.00..879728.20 rows=28 width=56) (actual time=31738.956..32860.738 rows=15 loops=1)
                                 Filter: (demand_id = 289276563)
                           ->  Nested Loop Left Join  (cost=0.00..3.67 rows=1 width=55) (actual time=0.030..0.031 rows=1 loops=15)
                                 ->  Index Scan using demand_pkey on demand  (cost=0.00..1.89 rows=1 width=24) (actual time=0.019..0.019 rows=1 loops=15)
                                       Index Cond: (id = 289276563)
                                 ->  Index Scan using waypoint_pkey on waypoint  (cost=0.00..1.77 rows=1 width=39) (actual time=0.008..0.008 rows=1 loops=15)
                                       Index Cond: (demand.target_id = waypoint.id)
                     ->  Index Scan using actor_pkey on actor  (cost=0.00..0.75 rows=1 width=33) (actual time=0.007..0.007 rows=1 loops=15)
                           Index Cond: (core.inventory.actor_id = actor.id)
               ->  Append  (cost=0.00..4.00 rows=4 width=50) (actual time=0.010..0.019 rows=1 loops=15)
                     ->  Index Scan using location_pkey on location  (cost=0.00..0.56 rows=1 width=72) (actual time=0.003..0.003 rows=0 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
                     ->  Index Scan using waypoint_pkey on waypoint location  (cost=0.00..1.31 rows=1 width=39) (actual time=0.006..0.006 rows=1 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
                     ->  Index Scan using container_pkey on container location  (cost=0.00..1.78 rows=1 width=54) (actual time=0.006..0.006 rows=0 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
                     ->  Index Scan using supply_pkey on supply location  (cost=0.00..0.35 rows=1 width=36) (actual time=0.003..0.003 rows=0 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
         ->  Append  (cost=0.00..3.55 rows=2 width=8) (actual time=0.011..0.014 rows=1 loops=15)
               ->  Index Scan using material_pkey on material  (cost=0.00..1.78 rows=1 width=8) (actual time=0.010..0.011 rows=1 loops=15)
                     Index Cond: (demand.material_id = core.material.id)
               ->  Index Scan using container_pkey on container material  (cost=0.00..1.78 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=15)
                     Index Cond: (demand.material_id = core.material.id)
   ->  Append  (cost=0.00..3.55 rows=2 width=38) (actual time=0.004..0.012 rows=1 loops=15)
         ->  Index Scan using material_pkey on material  (cost=0.00..1.78 rows=1 width=22) (actual time=0.003..0.004 rows=1 loops=15)
               Index Cond: (core.inventory.material_id = core.material.id)
         ->  Index Scan using container_pkey on container material  (cost=0.00..1.78 rows=1 width=54) (actual time=0.008..0.008 rows=0 loops=15)
               Index Cond: (core.inventory.material_id = core.material.id)
 Total runtime: 32862.509 ms
(38 rows)
Richard Neill <rn214@cam.ac.uk> wrote:

> I'd expect the ORDER BY to be the last thing that runs

> Nested Loop Left Join (cost=0.00..727737158.77
> rows=806903677108 width=195) (actual time=31739.052..32862.322
> rows=15 loops=1)

It probably would if it knew there were going to be 15 rows to sort. It is estimating that there will be 806,903,677,108 rows, in which case it thinks that using the index will be faster. The question is why it's 10 or 11 orders of magnitude off in its estimate of the result rows. Could you show us the table definitions underlying that view?

-Kevin
On Tuesday 01 December 2009 at 18:52 +0000, Richard Neill wrote:
> Is this normal? Have I hit a bug?

PostgreSQL's query analyzer needs to run a couple of times before it can rewrite and optimize the query. Make sure demand_id, id and the join IDs carry indexes.

Run EXPLAIN ANALYZE on your query to understand how the planner works, and post it back here.

Kind regards,
Jean-Michel
Jean-Michel Pouré wrote:
> On Tuesday 01 December 2009 at 18:52 +0000, Richard Neill wrote:
>> Is this normal? Have I hit a bug?
>
> PostgreSQL's query analyzer needs to run a couple of times before it
> can rewrite and optimize the query. Make sure demand_id, id and the
> join IDs carry indexes.

I did, and they do. This table has been in place for ages, with autovacuum on, and a manual VACUUM ANALYZE every night. I checked by running ANALYZE explicitly on all the relevant tables just before posting this.

> Run EXPLAIN ANALYZE on your query to understand how the planner
> works, and post it back here.

Already in the previous email :-)

> Kind regards,
> Jean-Michel

Kevin Grittner wrote:
> Richard Neill <rn214@cam.ac.uk> wrote:
>
>> I'd expect the ORDER BY to be the last thing that runs
>
>> Nested Loop Left Join (cost=0.00..727737158.77
>> rows=806903677108 width=195) (actual time=31739.052..32862.322
>> rows=15 loops=1)
>
> It probably would if it knew there were going to be 15 rows to sort.
> It is estimating that there will be 806,903,677,108 rows, in which
> case it thinks that using the index will be faster. The question is
> why it's 10 or 11 orders of magnitude off on the estimate of result
> rows. Could you show us the table definitions underlying that view?
>
> -Kevin

Am I wrong in thinking that ORDER BY is always applied after the main query is run? Even if I run it this way:

  SELECT * FROM
    (SELECT * FROM h.inventory WHERE demand_id = 289276563) AS sqry
  ORDER BY id;

which should(?) surely force it to run the inner select first, then sort, it's still very slow. On the other hand, it's quick if I do ORDER BY id+1.

The table definitions are as follows (sorry there are so many).
Richard


fswcs=# \d h.demand
         View "h.demand"
    Column     |  Type   | Modifiers
---------------+---------+-----------
 id            | bigint  |
 target_id     | bigint  |
 target_tag    | text    |
 target_name   | text    |
 material_id   | bigint  |
 material_tag  | text    |
 material_name | text    |
 qty           | integer |
 benefit       | integer |
View definition:
 SELECT demand.id, demand.target_id,
        h_target_waypoint.tag AS target_tag,
        h_target_waypoint.name AS target_name,
        demand.material_id,
        h_material.tag AS material_tag,
        h_material.name AS material_name,
        demand.qty, demand.benefit
   FROM core.demand
   LEFT JOIN h.waypoint h_target_waypoint ON demand.target_id = h_target_waypoint.id
   LEFT JOIN h.material h_material ON demand.material_id = h_material.id;

fswcs=# \d core.demand
            Table "core.demand"
   Column    |  Type   |           Modifiers
-------------+---------+--------------------------------
 id          | bigint  | not null default core.new_id()
 target_id   | bigint  | not null
 material_id | bigint  | not null
 qty         | integer | not null
 benefit     | integer | not null default 0
Indexes:
    "demand_pkey" PRIMARY KEY, btree (id) CLUSTER
    "demand_target_id_key" UNIQUE, btree (target_id, material_id)
    "demand_material_id" btree (material_id)
    "demand_target_id" btree (target_id)
Foreign-key constraints:
    "demand_material_id_fkey" FOREIGN KEY (material_id) REFERENCES core.__material_id(id)
    "demand_target_id_fkey" FOREIGN KEY (target_id) REFERENCES core.waypoint(id)
Referenced by:
    TABLE "viwcs.du_report_contents" CONSTRAINT "du_report_contents_demand_id_fkey" FOREIGN KEY (demand_id) REFERENCES core.demand(id)
    TABLE "core.inventory" CONSTRAINT "inventory_demand_id_fkey" FOREIGN KEY (demand_id) REFERENCES core.demand(id)
    TABLE "viwcs.wave_demand" CONSTRAINT "wave_demand_demand_id_fkey" FOREIGN KEY (demand_id) REFERENCES core.demand(id)

fswcs=# \d h.waypoint
      View "h.waypoint"
  Column   |  Type   | Modifiers
-----------+---------+-----------
 id        | bigint  |
 tag       | text    |
 name      | text    |
 is_router | boolean |
 is_target | boolean |
 is_packer | boolean |
View definition:
 SELECT waypoint.id,
        waypoint.tag, waypoint.name,
        waypoint.is_router, waypoint.is_target, waypoint.is_packer
   FROM core.waypoint;

fswcs=# \d h.material
      View "h.material"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | bigint  |
 tag    | text    |
 name   | text    |
 mass   | integer |
 volume | integer |
View definition:
 SELECT material.id, material.tag, material.name, material.mass, material.volume
   FROM core.material;

fswcs=# \d core.wa
core.waypoint           core.waypoint_name_key  core.waypoint_pkey      core.waypoint_tag_key
fswcs=# \d core.waypoint
           Table "core.waypoint"
  Column   |  Type   |           Modifiers
-----------+---------+--------------------------------
 id        | bigint  | not null default core.new_id()
 tag       | text    | not null
 name      | text    | not null
 is_router | boolean | not null
 is_target | boolean | not null
 is_packer | boolean | not null
Indexes:
    "waypoint_pkey" PRIMARY KEY, btree (id) CLUSTER
    "waypoint_tag_key" UNIQUE, btree (tag)
    "waypoint_name_key" btree (name)
Referenced by:
    TABLE "core.demand" CONSTRAINT "demand_target_id_fkey" FOREIGN KEY (target_id) REFERENCES core.waypoint(id)
    TABLE "viwcs.du_report" CONSTRAINT "du_report_target_id_fkey" FOREIGN KEY (target_id) REFERENCES core.waypoint(id)
    TABLE "viwcs.mss_actor_state" CONSTRAINT "mss_actor_state_last_demand_tag_fkey" FOREIGN KEY (last_demand_tag) REFERENCES core.waypoint(tag)
    TABLE "viwcs.mss_actor_state" CONSTRAINT "mss_actor_state_last_racking_tag_fkey" FOREIGN KEY (last_racking_tag) REFERENCES core.waypoint(tag)
    TABLE "viwcs.mss_rack_action_queue" CONSTRAINT "mss_rack_action_queue_racking_tag_fkey" FOREIGN KEY (racking_tag) REFERENCES core.waypoint(tag)
    TABLE "core.route_cache" CONSTRAINT "route_cache_next_hop_id_fkey" FOREIGN KEY (next_hop_id) REFERENCES core.waypoint(id) ON DELETE CASCADE
    TABLE "core.route_cache" CONSTRAINT "route_cache_router_id_fkey" FOREIGN KEY (router_id) REFERENCES core.waypoint(id) ON DELETE CASCADE
    TABLE "core.route_cache" CONSTRAINT "route_cache_target_id_fkey" FOREIGN KEY (target_id) REFERENCES core.waypoint(id) ON DELETE CASCADE
    TABLE "core.route" CONSTRAINT "route_dst_id_fkey" FOREIGN KEY (dst_id) REFERENCES core.waypoint(id)
    TABLE "core.route" CONSTRAINT "route_src_id_fkey" FOREIGN KEY (src_id) REFERENCES core.waypoint(id)
    TABLE "viwcs.wave_genreorders_map" CONSTRAINT "wave_genreorders_map_ERR_GENREID_UNKNOWN" FOREIGN KEY (target_id) REFERENCES core.waypoint(id)
Triggers:
    __waypoint__location_id_delete BEFORE DELETE ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__location_id_delete()
    __waypoint__location_id_insert BEFORE INSERT ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__location_id_insert()
    __waypoint__location_id_update BEFORE UPDATE ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__location_id_update()
    __waypoint__tag_id_delete BEFORE DELETE ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__tag_id_delete()
    __waypoint__tag_id_insert BEFORE INSERT ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__tag_id_insert()
    __waypoint__tag_id_update BEFORE UPDATE ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__tag_id_update()
    __waypoint__tag_tag_delete BEFORE DELETE ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__tag_tag_delete()
    __waypoint__tag_tag_insert BEFORE INSERT ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__tag_tag_insert()
    __waypoint__tag_tag_update BEFORE UPDATE ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__tag_tag_update()
Inherits: core.location

fswcs=# \d core.ma
core.material           core.material_name_key  core.material_pkey      core.material_tag_key
fswcs=# \d core.material
           Table "core.material"
 Column |  Type   |           Modifiers
--------+---------+--------------------------------
 id     | bigint  | not null default core.new_id()
 tag    | text    | not null
 name   | text    | not null
 mass   | integer | not null
 volume | integer | not null
Indexes:
    "material_pkey" PRIMARY KEY, btree (id)
    "material_tag_key" UNIQUE, btree (tag)
    "material_name_key" btree (name)
Check constraints:
    "material_mass_check" CHECK (mass >= 0)
    "material_volume_check" CHECK (volume >= 0)
Triggers:
    __material__material_id_delete BEFORE DELETE ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__material_id_delete()
    __material__material_id_insert BEFORE INSERT ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__material_id_insert()
    __material__material_id_update BEFORE UPDATE ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__material_id_update()
    __material__tag_id_delete BEFORE DELETE ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__tag_id_delete()
    __material__tag_id_insert BEFORE INSERT ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__tag_id_insert()
    __material__tag_id_update BEFORE UPDATE ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__tag_id_update()
    __material__tag_tag_delete BEFORE DELETE ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__tag_tag_delete()
    __material__tag_tag_insert BEFORE INSERT ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__tag_tag_insert()
    __material__tag_tag_update BEFORE UPDATE ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__tag_tag_update()
Inherits: core.tag
Richard Neill <rn214@cam.ac.uk> wrote:

> Am I wrong in thinking that ORDER BY is always applied after the
> main query is run?

Yes, you are wrong to think that. It compares the costs of various plans, and when it has an index with the high-order portion matching your ORDER BY clause, it may think that it can scan that index to generate the correct sequence. If the sort is estimated to be expensive enough compared to the index scan, it will use the index scan and skip the sort. Sorting hundreds of billions of rows can be expensive.

> Even if I run it this way:
>
>   select * from (select * from h.inventory where demand_id
>   =289276563) as sqry order by id;
>
> which should(?) surely force it to run the first select, then
> sort,

I wouldn't necessarily assume that. You can EXPLAIN that form of the query and find out easily enough. Does it say:

  ->  Index Scan using inventory_demand_id on inventory
        (cost=0.00..22.36 rows=28 width=56)
        (actual time=0.025..0.053 rows=15 loops=1)
        Index Cond: (demand_id = 289276563)

or:

  ->  Index Scan using inventory_pkey on inventory
        (cost=0.00..879728.20 rows=28 width=56)
        (actual time=31738.956..32860.738 rows=15 loops=1)
        Filter: (demand_id = 289276563)

> it's quick if I do order by id+1

You don't have an index on id+1.

> The table definitions are as follows (sorry there are so many).

I'll poke around to try to get a clue why the estimated result rows are so far off, but I may be in over my head there, so hopefully others will look, too. For one thing, I haven't used inheritance, and I don't know how that might be playing into the bad estimates. (At first glance, it does seem to get into trouble about the time it estimates the rows for the outer joins to those.) The real problem to solve here is that it's estimating the row count for the result so badly.

If you need a short-term work-around, you've already discovered that you can keep it from using the index on id for ordering by creating an expression using id which causes it not to consider the index a match. That's kind of ugly to keep long-term, though.

-Kevin
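[A minimal sketch of the work-around described above, using the query from this thread. The `+ 0` (or the `+ 1` Richard already tried) is simply an expression over id that the planner will not match against inventory_pkey, so it filters on demand_id first and sorts the 15 rows afterwards; it does not change the result order.]

```sql
-- Work-around sketch: wrapping id in an expression prevents the planner
-- from treating the primary-key index as satisfying the ORDER BY, so it
-- uses inventory_demand_id for the filter and sorts the tiny result set.
SELECT *
FROM h.inventory
WHERE demand_id = 289276563
ORDER BY id + 0;   -- "id + 1" behaves the same way
```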
Dear Kevin,

Thanks for a very helpful reply.

Kevin Grittner wrote:
> Richard Neill <rn214@cam.ac.uk> wrote:
>
>> Am I wrong in thinking that ORDER BY is always applied after the
>> main query is run?
>
> Yes, you are wrong to think that. It compares the costs of various
> plans, and when it has an index with the high-order portion matching
> your ORDER BY clause, it may think that it can scan that index to
> generate the correct sequence. If the sort is estimated to be
> expensive enough compared to the index scan, it will use the index
> scan and skip the sort. Sorting hundreds of billions of rows can be
> expensive.

That makes sense now.

>> Even if I run it this way:
>>
>>   select * from (select * from h.inventory where demand_id
>>   =289276563) as sqry order by id;
>>
>> which should(?) surely force it to run the first select, then
>> sort,
>
> I wouldn't necessarily assume that. You can EXPLAIN that form of
> the query and find out easily enough.
[snip]
>> it's quick if I do order by id+1
>
> You don't have an index on id+1.

Your explanation is validated by the EXPLAIN output - it only does the sort last iff I use "ORDER BY id+1", where there is no index for that. [Aside: using "id+0" also forces a sort.]

> The real problem to solve here is that it's estimating the row
> count for the result so badly. If you need a short-term
> work-around, you've already discovered that you can keep it from
> using the index on id for ordering by creating an expression using
> id which causes it not to consider the index a match. That's kind
> of ugly to keep long-term, though.

We seem to have a general case of very bad query plans, where in other cases too, EXPLAIN ANALYZE shows that the query planner's guesses are miles adrift. Others have said that this is symptomatic of a lack of ANALYZE, but we do quite a lot of analyzing (both through autovacuum, and a manual VACUUM VERBOSE ANALYZE every night), and our underlying statistical distribution isn't that changeable.

Thanks,

Richard
On Tue, 1 Dec 2009, Jean-Michel Pouré wrote:
> PostgreSQL's query analyzer needs to run a couple of times before it
> can rewrite and optimize the query. Make sure demand_id, id and the
> join IDs carry indexes.

Huh? At what point does the planner carry over previous plans and use them to further optimise the query?

But perhaps the biggest factor here is calling a five-table join a "pretty simple query".

Matthew

--
Prolog doesn't have enough parentheses. -- Computer Science Lecturer
On 2/12/2009 7:08 PM, Matthew Wakeling wrote:
> On Tue, 1 Dec 2009, Jean-Michel Pouré wrote:
>> PostgreSQL's query analyzer needs to run a couple of times before it
>> can rewrite and optimize the query. Make sure demand_id, id and the
>> join IDs carry indexes.
>
> Huh? At what point does the planner carry over previous plans and use
> them to further optimise the query?
>
> But perhaps the biggest factor here is calling a five-table join a
> "pretty simple query".

Some of those tables are views composed of multiple unions, too, by the looks of things.

Doesn't the planner have some ... issues ... with estimation of row counts on joins over unions? Or is my memory just more faulty than usual?

--
Craig Ringer
Craig Ringer <craig@postnewspapers.com.au> wrote:

> Some of those tables are views composed of multiple unions, too,
> by the looks of things.
>
> Doesn't the planner have some ... issues ... with estimation of
> row counts on joins over unions? Or is my memory just more faulty
> than usual?

So far I can't tell if it's views with unions or (as I suspect) inheritance. The views and tables shown so far reference other objects not yet shown:

  core.inventory
  h.location
  h.actor

However, I'm pretty sure that the problem is that the estimated row count explodes, for no reason that I can see, when a "Nested Loop Left Join" has an "Append" node from a parent table on the right:

  28 rows joined to a 4-row Append yields 51160 rows?
  51160 rows joined to a 2-row Append yields 203176856 rows?
  203176856 rows joined to a 2-row Append yields 806903677108 rows?

Something seems funny with the math. I would have expected 28 times 4 times 2 times 2, equaling 448. Still higher than 15, but only by one order of magnitude -- where it might still make relatively sane plan choices.

-Kevin
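[The arithmetic behind that expectation can be written down directly; under the naive "multiply the input estimates through" model, the top-level estimate would be:]

```sql
-- Expected top-level rowcount if each join simply multiplied its
-- two input estimates (28 base rows, then Appends of 4, 2 and 2 rows):
SELECT 28 * 4 * 2 * 2 AS expected_rows;  -- 448, not 806903677108
```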
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Craig Ringer <craig@postnewspapers.com.au> wrote:
>> Doesn't the planner have some ... issues ... with estimation of
>> row counts on joins over unions? Or is my memory just more faulty
>> than usual?

> So far I can't tell if it's views with unions or (as I suspect)
> inheritance.

As of recent versions there shouldn't be a lot of difference between inheritance and UNION ALL subqueries --- they're both "appendrels" to the planner. And yeah, I think the statistical support is pretty crummy.

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> And yeah, I think the statistical support is pretty crummy.

Do you know, off-hand, why the estimated row count for a "Nested Loop Left Join" is not the product of the estimates for the two sides? (I fear I'm missing something important which led to the current estimates.)

Estimates extracted from the problem plan:

  Nested Loop Left Join (rows=806903677108)
    -> Nested Loop Left Join (rows=203176856)
         -> Nested Loop Left Join (rows=51160)
              -> Nested Loop Left Join (rows=28)
              -> Append (rows=4)
         -> Append (rows=2)
    -> Append (rows=2)

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Estimates extracted from the problem plan:
>
>   Nested Loop Left Join (rows=806903677108)
>     -> Nested Loop Left Join (rows=203176856)
>          -> Nested Loop Left Join (rows=51160)
>               -> Nested Loop Left Join (rows=28)
>               -> Append (rows=4)
>          -> Append (rows=2)
>     -> Append (rows=2)

That does look weird. Do we have a self-contained test case? I wouldn't necessarily expect the join rowcount to be exactly the product of the input rowcounts, but it shouldn't be that far off, I should think.

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> That does look weird. Do we have a self-contained test case?

Richard, could you capture the schema for the affected tables and views with pg_dump -s, and also the related rows from pg_statistic? (The actual table contents aren't needed to see this issue.)

-Kevin
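[For the pg_statistic part of that request, a query along these lines pulls just the rows for the tables involved; the table list here is assumed from the definitions posted in this thread. starelid is the pg_class OID of the table each statistics row describes, so a regclass cast selects by name.]

```sql
-- Sketch: restrict pg_statistic to the tables underlying the view.
SELECT *
FROM pg_statistic
WHERE starelid IN ('core.inventory'::regclass,
                   'core.demand'::regclass,
                   'core.material'::regclass,
                   'core.location'::regclass,
                   'core.waypoint'::regclass,
                   'core.actor'::regclass);
```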
Kevin Grittner wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> That does look weird. Do we have a self-contained test case?

Not at the moment. It seems to occur only with relatively complex joins.

> Richard, could you capture the schema for the affected tables and
> views with pg_dump -s, and also the related rows from pg_statistic?
> (The actual table contents aren't needed to see this issue.)

Here are the relevant parts of the schema - I've cut this out of the source tree rather than pg_dump, since it seems more readable. Regarding pg_statistic, I don't understand how to find the relevant rows - what am I looking for? (The pg_statistic table is 247 MB in size.)

Thanks for your help,

Richard


THE PROBLEM QUERY
-----------------
SELECT * FROM h.inventory WHERE demand_id = 289276563 ORDER BY id;

# Note that using core.inventory (which is simpler) does not have the 30-second problem.
# In general, the h namespace is intended to be a human-readable diagnostic version,
# whereas the core namespace is optimised for the application.

h.inventory and core.inventory
------------------------------

--
-- Inventory
--
CREATE TABLE core.inventory (
    id          bigint  NOT NULL DEFAULT core.new_id(),
    material_id bigint  NOT NULL,
    location_id bigint  NOT NULL,
    qty         integer NOT NULL,
    divergence  integer NOT NULL DEFAULT 0,
    ctime       timestamp with time zone NOT NULL DEFAULT now(),
    actor_id    bigint  NULL,
    demand_id   bigint  NULL,
    PRIMARY KEY ( id ),
    FOREIGN KEY ( material_id ) REFERENCES core.__material_id ( id ),
    FOREIGN KEY ( location_id ) REFERENCES core.__location_id ( id ),
    FOREIGN KEY ( actor_id ) REFERENCES core.actor ( id ),
    FOREIGN KEY ( demand_id ) REFERENCES core.demand ( id )
);

CREATE INDEX inventory_material_id ON core.inventory ( material_id );
CREATE INDEX inventory_location_id ON core.inventory ( location_id );
CREATE INDEX inventory_actor_id ON core.inventory ( actor_id );
CREATE INDEX inventory_demand_id ON core.inventory ( demand_id );

CREATE OR REPLACE VIEW h.inventory AS SELECT
core.inventory.id, core.inventory.material_id, h_material.tag AS material_tag, h_material.name AS material_name, core.inventory.location_id, h_location.tag AS location_tag, h_location.name AS location_name, core.inventory.qty, core.inventory.divergence, core.inventory.ctime, core.inventory.actor_id, h_actor.tag AS actor_tag, h_actor.name AS actor_name, core.inventory.demand_id, h_demand.target_id, h_demand.target_tag, h_demand.target_name FROM core.inventory LEFT OUTER JOIN h.material AS h_material ON core.inventory.material_id = h_material.id LEFT OUTER JOIN h.location AS h_location ON core.inventory.location_id = h_location.id LEFT OUTER JOIN h.actor AS h_actor ON core.inventory.actor_id = h_actor.id LEFT OUTER JOIN h.demand AS h_demand ON core.inventory.demand_id = h_demand.id; h.material and core.material ---------------------------- -- -- Weights, dimensions, and other material data -- -- -- Materials -- CREATE TABLE core.material ( LIKE core.tag INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES, mass integer NOT NULL CHECK ( mass >= 0 ), volume integer NOT NULL CHECK ( volume >= 0 ) ) INHERITS ( core.tag ); SELECT core.inherit_unique_index ( 'material', 'tag', 'id' ); SELECT core.inherit_unique_index ( 'material', 'tag', 'tag' ); SELECT core.create_inheritable_unique_index ( 'material', 'id', 'bigint' ); COMMENT ON COLUMN core.material.mass IS 'Mass in grams'; COMMENT ON COLUMN core.material.volume IS 'Volume in ml'; CREATE OR REPLACE VIEW h.material AS SELECT core.material.id, core.material.tag, core.material.name, core.material.mass, core.material.volume FROM core.material; h.location and core.location ---------------------------- -- -- Locations -- CREATE TABLE core.location ( LIKE core.tag INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES ) INHERITS ( core.tag ); SELECT core.inherit_unique_index ( 'location', 'tag', 'id' ); SELECT core.inherit_unique_index ( 'location', 'tag', 'tag' ); SELECT core.create_inheritable_unique_index ( 
'location', 'id', 'bigint' ); CREATE OR REPLACE VIEW h.location AS SELECT core.location.id, core.location.tag, core.location.name FROM core.location; h.actor and core.actor ---------------------- -- -- Actors -- CREATE TABLE core.actor ( LIKE core.tag INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES ) INHERITS ( core.tag ); SELECT core.inherit_unique_index ( 'actor', 'tag', 'id' ); SELECT core.inherit_unique_index ( 'actor', 'tag', 'tag' ); CREATE OR REPLACE VIEW h.actor AS SELECT core.actor.id, core.actor.tag, core.actor.name FROM core.actor; h.demand and core.demand ------------------------- -- -- Demand -- CREATE TABLE core.demand ( id bigint NOT NULL DEFAULT core.new_id(), target_id bigint NOT NULL, material_id bigint NOT NULL, qty integer NOT NULL, -- HACK benefit integer NOT NULL DEFAULT 0, PRIMARY KEY ( id ), UNIQUE ( target_id, material_id ), FOREIGN KEY ( target_id ) REFERENCES core.waypoint ( id ), FOREIGN KEY ( material_id ) REFERENCES core.__material_id ( id ) ); CREATE INDEX demand_target_id ON core.demand ( target_id ); CREATE INDEX demand_material_id ON core.demand ( material_id ); CREATE OR REPLACE VIEW h.demand AS SELECT core.demand.id, core.demand.target_id, h_target_waypoint.tag AS target_tag, h_target_waypoint.name AS target_name, core.demand.material_id, h_material.tag AS material_tag, h_material.name AS material_name, core.demand.qty, core.demand.benefit FROM core.demand LEFT OUTER JOIN h.waypoint AS h_target_waypoint ON core.demand.target_id = h_target_waypoint.id LEFT OUTER JOIN h.material AS h_material ON core.demand.material_id = h_material.id; h.waypoint and core.waypoint ---------------------------- -- -- Waypoints -- CREATE TABLE core.waypoint ( LIKE core.location INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES, is_router boolean NOT NULL, is_target boolean NOT NULL, is_packer boolean NOT NULL ) INHERITS ( core.location ); SELECT core.inherit_unique_index ( 'waypoint', 'tag', 'id' ); SELECT core.inherit_unique_index 
( 'waypoint', 'tag', 'tag' ); SELECT core.inherit_unique_index ( 'waypoint', 'location', 'id' ); COMMENT ON COLUMN core.waypoint.is_router IS 'Routing decisions may be made at this waypoint'; COMMENT ON COLUMN core.waypoint.is_target IS 'Routing may be aimed towards this waypoint'; COMMENT ON COLUMN core.waypoint.is_packer IS 'Containerisation takes place at this waypoint'; CREATE OR REPLACE VIEW h.waypoint AS SELECT core.waypoint.id, core.waypoint.tag, core.waypoint.name, core.waypoint.is_router, core.waypoint.is_target, core.waypoint.is_packer FROM core.waypoint;
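For context, the two plans attached earlier in the thread come from the same query with and without the sort; the timings below are those reported at the start of the thread:

```sql
-- Fast variant: ~7 ms
EXPLAIN ANALYZE SELECT * FROM h.inventory WHERE demand_id = 289276563;

-- Slow variant: ~32 s; the ORDER BY changes the plan the optimiser picks
EXPLAIN ANALYZE SELECT * FROM h.inventory WHERE demand_id = 289276563 ORDER BY id;
```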
Richard Neill <rn214@cam.ac.uk> wrote:

> Regarding pg_statistic, I don't understand how to find the
> relevant rows - what am I looking for? (the pg_statistic table is
> 247M in size).

I think the only relevant rows would be the ones with starelid = pg_class.oid for a table used in the query, and I think you could further limit it to rows where staattnum = pg_attribute.attnum for a column referenced in the WHERE clause or a JOIN's ON clause (including in the views). To help match them up, and to cover all the bases, listing the related pg_class and pg_attribute rows would help.

Hopefully that will allow us to generate the same plan in an EXPLAIN, and then see how it gets such an overblown estimate of the result rows.

-Kevin
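Kevin's description translates into roughly the following catalog query. This is a sketch only: the table names come from the schema posted earlier in the thread, and the column list is an assumption about which attributes appear in the WHERE and ON clauses.

```sql
-- Statistics rows for the tables involved, limited to the join/filter columns.
SELECT c.relname, a.attname, s.*
FROM pg_statistic s
JOIN pg_class c ON c.oid = s.starelid
JOIN pg_attribute a ON a.attrelid = s.starelid AND a.attnum = s.staattnum
WHERE c.relname IN ( 'inventory', 'material', 'location', 'actor', 'demand', 'waypoint' )
  AND a.attname IN ( 'id', 'material_id', 'location_id', 'actor_id', 'demand_id', 'target_id' );
```

If raw pg_statistic is unwieldy, the pg_stats system view presents the same data in a more readable form, keyed by tablename and attname.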
Kevin Grittner wrote:
> Richard Neill <rn214@cam.ac.uk> wrote:
>> Regarding pg_statistic, I don't understand how to find the
>> relevant rows - what am I looking for? (the pg_statistic table is
>> 247M in size).
>
> I think the only relevant rows would be the ones with starelid =
> pg_class.oid for a table used in the query, and I think you could
> further limit it to rows where staattnum = pg_attribute.attnum for a
> column referenced in the WHERE clause or a JOIN's ON clause
> (including in the views). To help match them up, and to cover all
> the bases, listing the related pg_class and pg_attribute rows would
> help.
>
> Hopefully that will allow us to generate the same plan in an
> EXPLAIN, and then see how it gets such an overblown estimate of the
> result rows.

Thanks for your explanation. I ran the query:

SELECT * FROM pg_statistic
WHERE starelid IN (
    SELECT oid FROM pg_class
    WHERE relname IN ( 'demand', 'waypoint', 'actor', 'location', 'material', 'inventory' )
);

and it's 228kB compressed, so rather than attaching it, I'm placing it here:

http://www.richardneill.org/tmp/pg_statistic.bz2

Likewise, the much smaller (16kB) output from:

SELECT * FROM pg_class
WHERE relname IN ( 'demand', 'waypoint', 'actor', 'location', 'material', 'inventory' );

SELECT * FROM pg_attribute;

is at:

http://www.richardneill.org/tmp/pg_attribute_pg_class.bz2

P.S. Would it be easier for you if I set up SSH access to a spare machine with a copy of the database?

Thanks very much for your help,

Richard