Thread: Order by (for 15 rows) adds 30 seconds to query time
Dear All,

I don't know if this is a stupid question or not, but I can't understand the following. I have a pretty simple query, which runs in about 7 ms:

  SELECT * FROM h.inventory WHERE demand_id = 289276563;

The result of this is a 15-row x 17-column table. However, I want this sorted by id, so I changed the query to:

  SELECT * FROM h.inventory WHERE demand_id = 289276563 ORDER BY id;

which makes it take 32 seconds! That surprises me - I'd expect the ORDER BY to be the last thing that runs, and a sort of such a small dataset to be almost instantaneous. Indeed, if I do ORDER BY random(), it's fast. The system is running 8.4.1 and is otherwise lightly loaded; I can repeat this with similar results.

Is this normal? Have I hit a bug?

I attach the view definition, the result set, and the output from EXPLAIN ANALYZE (both ways).

Thanks,

Richard


             View "h.inventory"
    Column     |           Type           | Modifiers
---------------+--------------------------+-----------
 id            | bigint                   |
 material_id   | bigint                   |
 material_tag  | text                     |
 material_name | text                     |
 location_id   | bigint                   |
 location_tag  | text                     |
 location_name | text                     |
 qty           | integer                  |
 divergence    | integer                  |
 ctime         | timestamp with time zone |
 actor_id      | bigint                   |
 actor_tag     | text                     |
 actor_name    | text                     |
 demand_id     | bigint                   |
 target_id     | bigint                   |
 target_tag    | text                     |
 target_name   | text                     |

View definition:
 SELECT inventory.id, inventory.material_id,
        h_material.tag AS material_tag, h_material.name AS material_name,
        inventory.location_id,
        h_location.tag AS location_tag, h_location.name AS location_name,
        inventory.qty, inventory.divergence, inventory.ctime,
        inventory.actor_id,
        h_actor.tag AS actor_tag, h_actor.name AS actor_name,
        inventory.demand_id,
        h_demand.target_id, h_demand.target_tag, h_demand.target_name
   FROM core.inventory
   LEFT JOIN h.material h_material ON inventory.material_id = h_material.id
   LEFT JOIN h.location h_location ON inventory.location_id = h_location.id
   LEFT JOIN h.actor h_actor ON inventory.actor_id = h_actor.id
   LEFT JOIN h.demand h_demand ON inventory.demand_id = h_demand.id;

    id     | material_id | material_tag  | material_name | location_id | location_tag | location_name          | qty | divergence | ctime                         | actor_id | actor_tag | actor_name                   | demand_id | target_id | target_tag     | target_name
-----------+-------------+---------------+---------------+-------------+--------------+------------------------+-----+------------+-------------------------------+----------+-----------+------------------------------+-----------+-----------+----------------+----------------------------------------
 292904293 | 289238938 | 1000001113980 | | 280410977 | 1030576 | Container 1030576 | 0 | 0 | 2009-12-01 14:53:35.010023+00 | 5543 | 139664 | Joanna Mikolajczak | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 292904294 | 289238938 | 1000001113980 | | 280410977 | 1030576 | Container 1030576 | 1 | 0 | 2009-12-01 14:53:35.060378+00 | | | | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 292904292 | 289238938 | 1000001113980 | | 4008 | 308 | Chute 308 | 0 | 0 | 2009-12-01 14:53:34.925801+00 | 5543 | 139664 | Joanna Mikolajczak | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 292817907 | 289238938 | 1000001113980 | | 5137 | 991 | Chute 991 (not needed) | 0 | 0 | 2009-12-01 14:38:00.819189+00 | 6282 | CW 991 | Chute 991 worker | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291755251 | 289238938 | 1000001113980 | | 5137 | 991 | Chute 991 (not needed) | 0 | 0 | 2009-12-01 12:03:05.957424+00 | 6282 | CW 991 | Chute 991 worker | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291543235 | 289238938 | 1000001113980 | | 5137 | 991 | Chute 991 (not needed) | 0 | 0 | 2009-12-01 11:35:19.28846+00 | 6282 | CW 991 | Chute 991 worker | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291524046 | 289238938 | 1000001113980 | | 4008 | 308 | Chute 308 | 0 | 0 | 2009-12-01 11:31:49.40378+00 | | | | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291524045 | 289238938 | 1000001113980 | | 4008 | 308 | Chute 308 | 0 | 0 | 2009-12-01 11:31:49.402217+00 | 6300 | FSC | Flow System Controller (FSC) | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291522067 | 289238938 | 1000001113980 | | 5143 | CAM2 | North Camera | 0 | 0 | 2009-12-01 11:31:22.931085+00 | 6300 | FSC | Flow System Controller (FSC) | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291518675 | 289238938 | 1000001113980 | | 5137 | 991 | Chute 991 (not needed) | 0 | 0 | 2009-12-01 11:30:32.10016+00 | 6282 | CW 991 | Chute 991 worker | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291506067 | 289238938 | 1000001113980 | | 5137 | 991 | Chute 991 (not needed) | 0 | 0 | 2009-12-01 11:26:38.065565+00 | 6282 | CW 991 | Chute 991 worker | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291491123 | 289238938 | 1000001113980 | | 5137 | 991 | Chute 991 (not needed) | 0 | 0 | 2009-12-01 11:21:33.009506+00 | 6282 | CW 991 | Chute 991 worker | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291322415 | 289238938 | 1000001113980 | | 4008 | 308 | Chute 308 | 0 | 0 | 2009-12-01 10:45:08.281846+00 | | | | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291322414 | 289238938 | 1000001113980 | | 4008 | 308 | Chute 308 | 0 | 0 | 2009-12-01 10:45:08.280018+00 | 6300 | FSC | Flow System Controller (FSC) | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
 291319302 | 289238938 | 1000001113980 | | 5143 | CAM2 | North Camera | 0 | 0 | 2009-12-01 10:44:41.807128+00 | 6300 | FSC | Flow System Controller (FSC) | 289276563 | 3153 | 300244 EMBARGO | 300244 308/09 HAYWARDS HEATH / EMBARGO
(15 rows)

explain analyze select * from h.inventory where demand_id = 289276563;
Time: 7.251 ms
                                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..726857452.94 rows=806903677108 width=195) (actual time=0.108..0.704 rows=15 loops=1)
   Join Filter: (core.inventory.material_id = core.material.id)
   ->  Nested Loop Left Join  (cost=0.00..183236.84 rows=203176856 width=166) (actual time=0.103..0.588 rows=15 loops=1)
         Join Filter: (demand.material_id = core.material.id)
         ->  Nested Loop Left Join  (cost=0.00..260.03 rows=51160 width=174) (actual time=0.090..0.462 rows=15 loops=1)
               Join Filter: (core.inventory.location_id = core.location.id)
               ->  Nested Loop Left Join  (cost=0.00..146.71 rows=28 width=128) (actual time=0.068..0.286 rows=15 loops=1)
                     ->  Nested Loop Left Join  (cost=0.00..125.36 rows=28 width=103) (actual time=0.058..0.225 rows=15 loops=1)
                           Join Filter: (core.inventory.demand_id = demand.id)
                           ->  Index Scan using inventory_demand_id on inventory  (cost=0.00..22.36 rows=28 width=56) (actual time=0.025..0.053 rows=15 loops=1)
                                 Index Cond: (demand_id = 289276563)
                           ->  Nested Loop Left Join  (cost=0.00..3.67 rows=1 width=55) (actual time=0.009..0.010 rows=1 loops=15)
                                 ->  Index Scan using demand_pkey on demand  (cost=0.00..1.89 rows=1 width=24) (actual time=0.005..0.005 rows=1 loops=15)
                                       Index Cond: (id = 289276563)
                                 ->  Index Scan using waypoint_pkey on waypoint  (cost=0.00..1.77 rows=1 width=39) (actual time=0.003..0.003 rows=1 loops=15)
                                       Index Cond: (demand.target_id = waypoint.id)
                     ->  Index Scan using actor_pkey on actor  (cost=0.00..0.75 rows=1 width=33) (actual time=0.003..0.003 rows=1 loops=15)
                           Index Cond: (core.inventory.actor_id = actor.id)
               ->  Append  (cost=0.00..4.00 rows=4 width=50) (actual time=0.005..0.010 rows=1 loops=15)
                     ->  Index Scan using location_pkey on location  (cost=0.00..0.56 rows=1 width=72) (actual time=0.001..0.001 rows=0 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
                     ->  Index Scan using waypoint_pkey on waypoint location  (cost=0.00..1.31 rows=1 width=39) (actual time=0.003..0.003 rows=1 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
                     ->  Index Scan using container_pkey on container location  (cost=0.00..1.78 rows=1 width=54) (actual time=0.004..0.004 rows=0 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
                     ->  Index Scan using supply_pkey on supply location  (cost=0.00..0.35 rows=1 width=36) (actual time=0.001..0.001 rows=0 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
         ->  Append  (cost=0.00..3.55 rows=2 width=8) (actual time=0.004..0.007 rows=1 loops=15)
               ->  Index Scan using material_pkey on material  (cost=0.00..1.78 rows=1 width=8) (actual time=0.004..0.005 rows=1 loops=15)
                     Index Cond: (demand.material_id = core.material.id)
               ->  Index Scan using container_pkey on container material  (cost=0.00..1.78 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=15)
                     Index Cond: (demand.material_id = core.material.id)
   ->  Append  (cost=0.00..3.55 rows=2 width=38) (actual time=0.003..0.006 rows=1 loops=15)
         ->  Index Scan using material_pkey on material  (cost=0.00..1.78 rows=1 width=22) (actual time=0.003..0.003 rows=1 loops=15)
               Index Cond: (core.inventory.material_id = core.material.id)
         ->  Index Scan using container_pkey on container material  (cost=0.00..1.78 rows=1 width=54) (actual time=0.003..0.003 rows=0 loops=15)
               Index Cond: (core.inventory.material_id = core.material.id)
 Total runtime: 0.858 ms
(38 rows)

explain analyze select * from h.inventory where demand_id = 289276563 order by id;
Time: 32868.784 ms

                                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..727737158.77 rows=806903677108 width=195) (actual time=31739.052..32862.322 rows=15 loops=1)
   Join Filter: (core.inventory.material_id = core.material.id)
   ->  Nested Loop Left Join  (cost=0.00..1062942.67 rows=203176856 width=166) (actual time=31739.044..32862.084 rows=15 loops=1)
         Join Filter: (demand.material_id = core.material.id)
         ->  Nested Loop Left Join  (cost=0.00..879965.86 rows=51160 width=174) (actual time=31739.025..32861.812 rows=15 loops=1)
               Join Filter: (core.inventory.location_id = core.location.id)
               ->  Nested Loop Left Join  (cost=0.00..879852.54 rows=28 width=128) (actual time=31739.006..32861.428 rows=15 loops=1)
                     ->  Nested Loop Left Join  (cost=0.00..879831.20 rows=28 width=103) (actual time=31738.994..32861.276 rows=15 loops=1)
                           Join Filter: (core.inventory.demand_id = demand.id)
                           ->  Index Scan using inventory_pkey on inventory  (cost=0.00..879728.20 rows=28 width=56) (actual time=31738.956..32860.738 rows=15 loops=1)
                                 Filter: (demand_id = 289276563)
                           ->  Nested Loop Left Join  (cost=0.00..3.67 rows=1 width=55) (actual time=0.030..0.031 rows=1 loops=15)
                                 ->  Index Scan using demand_pkey on demand  (cost=0.00..1.89 rows=1 width=24) (actual time=0.019..0.019 rows=1 loops=15)
                                       Index Cond: (id = 289276563)
                                 ->  Index Scan using waypoint_pkey on waypoint  (cost=0.00..1.77 rows=1 width=39) (actual time=0.008..0.008 rows=1 loops=15)
                                       Index Cond: (demand.target_id = waypoint.id)
                     ->  Index Scan using actor_pkey on actor  (cost=0.00..0.75 rows=1 width=33) (actual time=0.007..0.007 rows=1 loops=15)
                           Index Cond: (core.inventory.actor_id = actor.id)
               ->  Append  (cost=0.00..4.00 rows=4 width=50) (actual time=0.010..0.019 rows=1 loops=15)
                     ->  Index Scan using location_pkey on location  (cost=0.00..0.56 rows=1 width=72) (actual time=0.003..0.003 rows=0 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
                     ->  Index Scan using waypoint_pkey on waypoint location  (cost=0.00..1.31 rows=1 width=39) (actual time=0.006..0.006 rows=1 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
                     ->  Index Scan using container_pkey on container location  (cost=0.00..1.78 rows=1 width=54) (actual time=0.006..0.006 rows=0 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
                     ->  Index Scan using supply_pkey on supply location  (cost=0.00..0.35 rows=1 width=36) (actual time=0.003..0.003 rows=0 loops=15)
                           Index Cond: (core.inventory.location_id = core.location.id)
         ->  Append  (cost=0.00..3.55 rows=2 width=8) (actual time=0.011..0.014 rows=1 loops=15)
               ->  Index Scan using material_pkey on material  (cost=0.00..1.78 rows=1 width=8) (actual time=0.010..0.011 rows=1 loops=15)
                     Index Cond: (demand.material_id = core.material.id)
               ->  Index Scan using container_pkey on container material  (cost=0.00..1.78 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=15)
                     Index Cond: (demand.material_id = core.material.id)
   ->  Append  (cost=0.00..3.55 rows=2 width=38) (actual time=0.004..0.012 rows=1 loops=15)
         ->  Index Scan using material_pkey on material  (cost=0.00..1.78 rows=1 width=22) (actual time=0.003..0.004 rows=1 loops=15)
               Index Cond: (core.inventory.material_id = core.material.id)
         ->  Index Scan using container_pkey on container material  (cost=0.00..1.78 rows=1 width=54) (actual time=0.008..0.008 rows=0 loops=15)
               Index Cond: (core.inventory.material_id = core.material.id)
 Total runtime: 32862.509 ms
(38 rows)
Richard Neill <rn214@cam.ac.uk> wrote:

> I'd expect the ORDER BY to be the last thing that runs

> Nested Loop Left Join (cost=0.00..727737158.77
> rows=806903677108 width=195) (actual time=31739.052..32862.322
> rows=15 loops=1)

It probably would if it knew there were going to be 15 rows to sort. It is estimating that there will be 806,903,677,108 rows, in which case it thinks that using the index will be faster. The question is why it's 10 or 11 orders of magnitude off in its estimate of the result rows. Could you show us the table definitions underlying that view?

-Kevin
On Tuesday 01 December 2009 at 18:52 +0000, Richard Neill wrote:
> Is this normal? Have I hit a bug?

PostgreSQL's query analyzer needs to run a couple of times before it can rewrite and optimize the query. Make sure demand_id, id and the join IDs carry indexes.

Run EXPLAIN ANALYZE on your query to understand how the planner works, and post it back here.

Kind regards,
Jean-Michel
Jean-Michel Pouré wrote:
> On Tuesday 01 December 2009 at 18:52 +0000, Richard Neill wrote:
>> Is this normal? Have I hit a bug?
>
> PostgreSQL's query analyzer needs to run a couple of times before it
> can rewrite and optimize the query. Make sure demand_id, id and the
> join IDs carry indexes.

I did, and they do. This table has been in place for ages, with autovacuum on, and a manual VACUUM ANALYZE every night. I checked by running ANALYZE explicitly on all the relevant tables just before posting this.

> Run EXPLAIN ANALYZE on your query to understand how the planner
> works, and post it back here.

Already in the previous email :-)

> Kind regards,
> Jean-Michel

Kevin Grittner wrote:
> Richard Neill <rn214@cam.ac.uk> wrote:
>
>> I'd expect the ORDER BY to be the last thing that runs
>
>> Nested Loop Left Join (cost=0.00..727737158.77
>> rows=806903677108 width=195) (actual time=31739.052..32862.322
>> rows=15 loops=1)
>
> It probably would if it knew there were going to be 15 rows to sort.
> It is estimating that there will be 806,903,677,108 rows, in which
> case it thinks that using the index will be faster. The question is
> why it's 10 or 11 orders of magnitude off on the estimate of result
> rows. Could you show us the table definitions underlying that view?
>
> -Kevin

Am I wrong in thinking that ORDER BY is always applied after the main query is run? Even if I run it this way:

  SELECT * FROM
    (SELECT * FROM h.inventory WHERE demand_id = 289276563) AS sqry
  ORDER BY id;

which should(?) surely force it to run the inner select first, then sort, it's still very slow. On the other hand, it's quick if I do ORDER BY id+1.

The table definitions are as follows (sorry there are so many).
Richard


fswcs=# \d h.demand
         View "h.demand"
    Column     |  Type   | Modifiers
---------------+---------+-----------
 id            | bigint  |
 target_id     | bigint  |
 target_tag    | text    |
 target_name   | text    |
 material_id   | bigint  |
 material_tag  | text    |
 material_name | text    |
 qty           | integer |
 benefit       | integer |
View definition:
 SELECT demand.id, demand.target_id,
        h_target_waypoint.tag AS target_tag,
        h_target_waypoint.name AS target_name,
        demand.material_id,
        h_material.tag AS material_tag,
        h_material.name AS material_name,
        demand.qty, demand.benefit
   FROM core.demand
   LEFT JOIN h.waypoint h_target_waypoint ON demand.target_id = h_target_waypoint.id
   LEFT JOIN h.material h_material ON demand.material_id = h_material.id;

fswcs=# \d core.demand
            Table "core.demand"
   Column    |  Type   |           Modifiers
-------------+---------+--------------------------------
 id          | bigint  | not null default core.new_id()
 target_id   | bigint  | not null
 material_id | bigint  | not null
 qty         | integer | not null
 benefit     | integer | not null default 0
Indexes:
    "demand_pkey" PRIMARY KEY, btree (id) CLUSTER
    "demand_target_id_key" UNIQUE, btree (target_id, material_id)
    "demand_material_id" btree (material_id)
    "demand_target_id" btree (target_id)
Foreign-key constraints:
    "demand_material_id_fkey" FOREIGN KEY (material_id) REFERENCES core.__material_id(id)
    "demand_target_id_fkey" FOREIGN KEY (target_id) REFERENCES core.waypoint(id)
Referenced by:
    TABLE "viwcs.du_report_contents" CONSTRAINT "du_report_contents_demand_id_fkey" FOREIGN KEY (demand_id) REFERENCES core.demand(id)
    TABLE "core.inventory" CONSTRAINT "inventory_demand_id_fkey" FOREIGN KEY (demand_id) REFERENCES core.demand(id)
    TABLE "viwcs.wave_demand" CONSTRAINT "wave_demand_demand_id_fkey" FOREIGN KEY (demand_id) REFERENCES core.demand(id)

fswcs=# \d h.waypoint
      View "h.waypoint"
  Column   |  Type   | Modifiers
-----------+---------+-----------
 id        | bigint  |
 tag       | text    |
 name      | text    |
 is_router | boolean |
 is_target | boolean |
 is_packer | boolean |
View definition:
 SELECT waypoint.id,
        waypoint.tag, waypoint.name,
        waypoint.is_router, waypoint.is_target, waypoint.is_packer
   FROM core.waypoint;

fswcs=# \d h.material
      View "h.material"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | bigint  |
 tag    | text    |
 name   | text    |
 mass   | integer |
 volume | integer |
View definition:
 SELECT material.id, material.tag, material.name, material.mass, material.volume
   FROM core.material;

fswcs=# \d core.wa
core.waypoint           core.waypoint_name_key  core.waypoint_pkey      core.waypoint_tag_key
fswcs=# \d core.waypoint
           Table "core.waypoint"
  Column   |  Type   |           Modifiers
-----------+---------+--------------------------------
 id        | bigint  | not null default core.new_id()
 tag       | text    | not null
 name      | text    | not null
 is_router | boolean | not null
 is_target | boolean | not null
 is_packer | boolean | not null
Indexes:
    "waypoint_pkey" PRIMARY KEY, btree (id) CLUSTER
    "waypoint_tag_key" UNIQUE, btree (tag)
    "waypoint_name_key" btree (name)
Referenced by:
    TABLE "core.demand" CONSTRAINT "demand_target_id_fkey" FOREIGN KEY (target_id) REFERENCES core.waypoint(id)
    TABLE "viwcs.du_report" CONSTRAINT "du_report_target_id_fkey" FOREIGN KEY (target_id) REFERENCES core.waypoint(id)
    TABLE "viwcs.mss_actor_state" CONSTRAINT "mss_actor_state_last_demand_tag_fkey" FOREIGN KEY (last_demand_tag) REFERENCES core.waypoint(tag)
    TABLE "viwcs.mss_actor_state" CONSTRAINT "mss_actor_state_last_racking_tag_fkey" FOREIGN KEY (last_racking_tag) REFERENCES core.waypoint(tag)
    TABLE "viwcs.mss_rack_action_queue" CONSTRAINT "mss_rack_action_queue_racking_tag_fkey" FOREIGN KEY (racking_tag) REFERENCES core.waypoint(tag)
    TABLE "core.route_cache" CONSTRAINT "route_cache_next_hop_id_fkey" FOREIGN KEY (next_hop_id) REFERENCES core.waypoint(id) ON DELETE CASCADE
    TABLE "core.route_cache" CONSTRAINT "route_cache_router_id_fkey" FOREIGN KEY (router_id) REFERENCES core.waypoint(id) ON DELETE CASCADE
    TABLE "core.route_cache" CONSTRAINT "route_cache_target_id_fkey" FOREIGN KEY (target_id) REFERENCES core.waypoint(id) ON DELETE CASCADE
    TABLE "core.route" CONSTRAINT "route_dst_id_fkey" FOREIGN KEY (dst_id) REFERENCES core.waypoint(id)
    TABLE "core.route" CONSTRAINT "route_src_id_fkey" FOREIGN KEY (src_id) REFERENCES core.waypoint(id)
    TABLE "viwcs.wave_genreorders_map" CONSTRAINT "wave_genreorders_map_ERR_GENREID_UNKNOWN" FOREIGN KEY (target_id) REFERENCES core.waypoint(id)
Triggers:
    __waypoint__location_id_delete BEFORE DELETE ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__location_id_delete()
    __waypoint__location_id_insert BEFORE INSERT ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__location_id_insert()
    __waypoint__location_id_update BEFORE UPDATE ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__location_id_update()
    __waypoint__tag_id_delete BEFORE DELETE ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__tag_id_delete()
    __waypoint__tag_id_insert BEFORE INSERT ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__tag_id_insert()
    __waypoint__tag_id_update BEFORE UPDATE ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__tag_id_update()
    __waypoint__tag_tag_delete BEFORE DELETE ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__tag_tag_delete()
    __waypoint__tag_tag_insert BEFORE INSERT ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__tag_tag_insert()
    __waypoint__tag_tag_update BEFORE UPDATE ON core.waypoint FOR EACH ROW EXECUTE PROCEDURE core.__tag_tag_update()
Inherits: core.location

fswcs=# \d core.ma
core.material           core.material_name_key  core.material_pkey      core.material_tag_key
fswcs=# \d core.material
           Table "core.material"
 Column |  Type   |           Modifiers
--------+---------+--------------------------------
 id     | bigint  | not null default core.new_id()
 tag    | text    | not null
 name   | text    | not null
 mass   | integer | not null
 volume | integer | not null
Indexes:
    "material_pkey" PRIMARY KEY, btree (id)
    "material_tag_key" UNIQUE, btree (tag)
    "material_name_key" btree (name)
Check constraints:
    "material_mass_check" CHECK (mass >= 0)
    "material_volume_check" CHECK (volume >= 0)
Triggers:
    __material__material_id_delete BEFORE DELETE ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__material_id_delete()
    __material__material_id_insert BEFORE INSERT ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__material_id_insert()
    __material__material_id_update BEFORE UPDATE ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__material_id_update()
    __material__tag_id_delete BEFORE DELETE ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__tag_id_delete()
    __material__tag_id_insert BEFORE INSERT ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__tag_id_insert()
    __material__tag_id_update BEFORE UPDATE ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__tag_id_update()
    __material__tag_tag_delete BEFORE DELETE ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__tag_tag_delete()
    __material__tag_tag_insert BEFORE INSERT ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__tag_tag_insert()
    __material__tag_tag_update BEFORE UPDATE ON core.material FOR EACH ROW EXECUTE PROCEDURE core.__tag_tag_update()
Inherits: core.tag
Richard Neill <rn214@cam.ac.uk> wrote:

> Am I wrong in thinking that ORDER BY is always applied after the
> main query is run?

Yes, you are wrong to think that. It compares the costs of various plans, and when it has an index with the high-order portion matching your ORDER BY clause, it may think that it can scan that index to generate the correct sequence. If the sort is estimated to be expensive enough compared to the index scan, it will use the index scan and skip the sort. Sorting hundreds of billions of rows can be expensive.

> Even if I run it this way:
>
>   select * from (select * from h.inventory where demand_id
>   =289276563) as sqry order by id;
>
> which should(?) surely force it to run the first select, then
> sort,

I wouldn't necessarily assume that. You can EXPLAIN that form of the query and find out easily enough. Does it say:

  ->  Index Scan using inventory_demand_id on inventory
        (cost=0.00..22.36 rows=28 width=56)
        (actual time=0.025..0.053 rows=15 loops=1)
        Index Cond: (demand_id = 289276563)

or:

  ->  Index Scan using inventory_pkey on inventory
        (cost=0.00..879728.20 rows=28 width=56)
        (actual time=31738.956..32860.738 rows=15 loops=1)
        Filter: (demand_id = 289276563)

> it's quick if I do order by id+1

You don't have an index on id+1.

> The table definitions are as follows (sorry there are so many).

I'll poke around to try to get a clue why the estimated result rows are so far off, but I may be in over my head there, so hopefully others will look, too. For one thing, I haven't used inheritance, and I don't know how that might be playing into the bad estimates. (At first glance, it does seem to get into trouble about the time it estimates the rows for the outer joins to those.) The real problem to solve here is that it's estimating the row count for the result so badly.

If you need a short-term work-around, you've already discovered that you can keep it from using the index on id for ordering by creating an expression using id which causes it not to consider the index a match. That's kind of ugly to keep long-term, though.

-Kevin
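[A minimal sketch of the work-around described above, using the query from this thread. The `+ 0` (or the `+ 1` Richard already tried) is simply an expression over id that the planner will not match against inventory_pkey, so it filters on demand_id first and sorts the 15 rows afterwards; it does not change the result order.]

```sql
-- Work-around sketch: wrapping id in an expression prevents the planner
-- from treating the primary-key index as satisfying the ORDER BY, so it
-- uses inventory_demand_id for the filter and sorts the tiny result set.
SELECT *
FROM h.inventory
WHERE demand_id = 289276563
ORDER BY id + 0;   -- "id + 1" behaves the same way
```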
Dear Kevin,

Thanks for a very helpful reply.

Kevin Grittner wrote:
> Richard Neill <rn214@cam.ac.uk> wrote:
>
>> Am I wrong in thinking that ORDER BY is always applied after the
>> main query is run?
>
> Yes, you are wrong to think that. It compares the costs of various
> plans, and when it has an index with the high-order portion matching
> your ORDER BY clause, it may think that it can scan that index to
> generate the correct sequence. If the sort is estimated to be
> expensive enough compared to the index scan, it will use the index
> scan and skip the sort. Sorting hundreds of billions of rows can be
> expensive.

That makes sense now.

>> Even if I run it this way:
>>
>>   select * from (select * from h.inventory where demand_id
>>   =289276563) as sqry order by id;
>>
>> which should(?) surely force it to run the first select, then
>> sort,
>
> I wouldn't necessarily assume that. You can EXPLAIN that form of
> the query and find out easily enough.
[snip]
>> it's quick if I do order by id+1
>
> You don't have an index on id+1.

Your explanation is validated by the EXPLAIN output - it only does the sort last iff I use "ORDER BY id+1", where there is no index for that. [Aside: using "id+0" also forces a sort.]

> The real problem to solve here is that it's estimating the row
> count for the result so badly. If you need a short-term
> work-around, you've already discovered that you can keep it from
> using the index on id for ordering by creating an expression using
> id which causes it not to consider the index a match. That's kind
> of ugly to keep long-term, though.

We seem to have a general case of very bad query plans, where in other cases too, EXPLAIN ANALYZE shows that the query planner's guesses are miles adrift. Others have said that this is symptomatic of a lack of ANALYZE, but we do quite a lot of analyzing (both through autovacuum, and a manual VACUUM VERBOSE ANALYZE every night), and our underlying statistical distribution isn't that changeable.

Thanks,

Richard
On Tue, 1 Dec 2009, Jean-Michel Pouré wrote:
> PostgreSQL's query analyzer needs to run a couple of times before it
> can rewrite and optimize the query. Make sure demand_id, id and the
> join IDs carry indexes.

Huh? At what point does the planner carry over previous plans and use them to further optimise the query?

But perhaps the biggest factor here is calling a five-table join a "pretty simple query".

Matthew

--
Prolog doesn't have enough parentheses. -- Computer Science Lecturer
On 2/12/2009 7:08 PM, Matthew Wakeling wrote:
> On Tue, 1 Dec 2009, Jean-Michel Pouré wrote:
>> PostgreSQL's query analyzer needs to run a couple of times before it
>> can rewrite and optimize the query. Make sure demand_id, id and the
>> join IDs carry indexes.
>
> Huh? At what point does the planner carry over previous plans and use
> them to further optimise the query?
>
> But perhaps the biggest factor here is calling a five-table join a
> "pretty simple query".

Some of those tables are views composed of multiple unions, too, by the looks of things.

Doesn't the planner have some ... issues ... with estimation of row counts on joins over unions? Or is my memory just more faulty than usual?

--
Craig Ringer
Craig Ringer <craig@postnewspapers.com.au> wrote:

> Some of those tables are views composed of multiple unions, too,
> by the looks of things.
>
> Doesn't the planner have some ... issues ... with estimation of
> row counts on joins over unions? Or is my memory just more faulty
> than usual?

So far I can't tell if it's views with unions or (as I suspect) inheritance. The views and tables shown so far reference other objects not yet shown:

  core.inventory
  h.location
  h.actor

However, I'm pretty sure that the problem is that the estimated row count explodes, for no reason that I can see, when a "Nested Loop Left Join" has an "Append" node from a parent table on the right:

  28 rows joined to a 4-row Append yields 51160 rows?
  51160 rows joined to a 2-row Append yields 203176856 rows?
  203176856 rows joined to a 2-row Append yields 806903677108 rows?

Something seems funny with the math. I would have expected 28 times 4 times 2 times 2, equaling 448. Still higher than 15, but only by one order of magnitude -- where it might still make relatively sane plan choices.

-Kevin
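[The arithmetic behind that expectation can be written down directly; under the naive "multiply the input estimates through" model, the top-level estimate would be:]

```sql
-- Expected top-level rowcount if each join simply multiplied its
-- two input estimates (28 base rows, then Appends of 4, 2 and 2 rows):
SELECT 28 * 4 * 2 * 2 AS expected_rows;  -- 448, not 806903677108
```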
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Craig Ringer <craig@postnewspapers.com.au> wrote:
>> Doesn't the planner have some ... issues ... with estimation of
>> row counts on joins over unions? Or is my memory just more faulty
>> than usual?

> So far I can't tell if it's views with unions or (as I suspect)
> inheritance.

As of recent versions there shouldn't be a lot of difference between inheritance and UNION ALL subqueries --- they're both "appendrels" to the planner. And yeah, I think the statistical support is pretty crummy.

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> And yeah, I think the statistical support is pretty crummy.

Do you know, off-hand, why the estimated row count for a "Nested Loop Left Join" is not the product of the estimates for the two sides? (I fear I'm missing something important which led to the current estimates.)

Estimates extracted from the problem plan:

  Nested Loop Left Join (rows=806903677108)
    -> Nested Loop Left Join (rows=203176856)
         -> Nested Loop Left Join (rows=51160)
              -> Nested Loop Left Join (rows=28)
              -> Append (rows=4)
         -> Append (rows=2)
    -> Append (rows=2)

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Estimates extracted from the problem plan:
>
>   Nested Loop Left Join (rows=806903677108)
>     -> Nested Loop Left Join (rows=203176856)
>          -> Nested Loop Left Join (rows=51160)
>               -> Nested Loop Left Join (rows=28)
>               -> Append (rows=4)
>          -> Append (rows=2)
>     -> Append (rows=2)

That does look weird. Do we have a self-contained test case? I wouldn't necessarily expect the join rowcount to be exactly the product of the input rowcounts, but it shouldn't be that far off, I should think.

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> That does look weird. Do we have a self-contained test case?

Richard, could you capture the schema for the affected tables and views with pg_dump -s, and also the related rows from pg_statistic? (The actual table contents aren't needed to see this issue.)

-Kevin
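[For the pg_statistic part of that request, a query along these lines pulls just the rows for the tables involved; the table list here is assumed from the definitions posted in this thread. starelid is the pg_class OID of the table each statistics row describes, so a regclass cast selects by name.]

```sql
-- Sketch: restrict pg_statistic to the tables underlying the view.
SELECT *
FROM pg_statistic
WHERE starelid IN ('core.inventory'::regclass,
                   'core.demand'::regclass,
                   'core.material'::regclass,
                   'core.location'::regclass,
                   'core.waypoint'::regclass,
                   'core.actor'::regclass);
```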
Kevin Grittner wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> That does look weird. Do we have a self-contained test case?

Not at the moment. It seems to occur only with relatively complex joins.

> Richard, could you capture the schema for the affected tables and
> views with pg_dump -s, and also the related rows from pg_statistic?
> (The actual table contents aren't needed to see this issue.)

Here are the relevant parts of the schema - I've cut this out of the source tree rather than pg_dump, since it seems more readable. Regarding pg_statistic, I don't understand how to find the relevant rows - what am I looking for? (The pg_statistic table is 247 MB in size.)

Thanks for your help,

Richard


THE PROBLEM QUERY
-----------------
SELECT * FROM h.inventory WHERE demand_id = 289276563 ORDER BY id;

# Note that using core.inventory (which is simpler) does not have the 30-second problem.
# In general, the h namespace is intended to be a human-readable diagnostic version,
# whereas the core namespace is optimised for the application.

h.inventory and core.inventory
------------------------------

--
-- Inventory
--
CREATE TABLE core.inventory (
    id          bigint  NOT NULL DEFAULT core.new_id(),
    material_id bigint  NOT NULL,
    location_id bigint  NOT NULL,
    qty         integer NOT NULL,
    divergence  integer NOT NULL DEFAULT 0,
    ctime       timestamp with time zone NOT NULL DEFAULT now(),
    actor_id    bigint  NULL,
    demand_id   bigint  NULL,
    PRIMARY KEY ( id ),
    FOREIGN KEY ( material_id ) REFERENCES core.__material_id ( id ),
    FOREIGN KEY ( location_id ) REFERENCES core.__location_id ( id ),
    FOREIGN KEY ( actor_id ) REFERENCES core.actor ( id ),
    FOREIGN KEY ( demand_id ) REFERENCES core.demand ( id )
);

CREATE INDEX inventory_material_id ON core.inventory ( material_id );
CREATE INDEX inventory_location_id ON core.inventory ( location_id );
CREATE INDEX inventory_actor_id ON core.inventory ( actor_id );
CREATE INDEX inventory_demand_id ON core.inventory ( demand_id );

CREATE OR REPLACE VIEW h.inventory AS SELECT
core.inventory.id, core.inventory.material_id, h_material.tag AS material_tag, h_material.name AS material_name, core.inventory.location_id, h_location.tag AS location_tag, h_location.name AS location_name, core.inventory.qty, core.inventory.divergence, core.inventory.ctime, core.inventory.actor_id, h_actor.tag AS actor_tag, h_actor.name AS actor_name, core.inventory.demand_id, h_demand.target_id, h_demand.target_tag, h_demand.target_name FROM core.inventory LEFT OUTER JOIN h.material AS h_material ON core.inventory.material_id = h_material.id LEFT OUTER JOIN h.location AS h_location ON core.inventory.location_id = h_location.id LEFT OUTER JOIN h.actor AS h_actor ON core.inventory.actor_id = h_actor.id LEFT OUTER JOIN h.demand AS h_demand ON core.inventory.demand_id = h_demand.id; h.material and core.material ---------------------------- -- -- Weights, dimensions, and other material data -- -- -- Materials -- CREATE TABLE core.material ( LIKE core.tag INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES, mass integer NOT NULL CHECK ( mass >= 0 ), volume integer NOT NULL CHECK ( volume >= 0 ) ) INHERITS ( core.tag ); SELECT core.inherit_unique_index ( 'material', 'tag', 'id' ); SELECT core.inherit_unique_index ( 'material', 'tag', 'tag' ); SELECT core.create_inheritable_unique_index ( 'material', 'id', 'bigint' ); COMMENT ON COLUMN core.material.mass IS 'Mass in grams'; COMMENT ON COLUMN core.material.volume IS 'Volume in ml'; CREATE OR REPLACE VIEW h.material AS SELECT core.material.id, core.material.tag, core.material.name, core.material.mass, core.material.volume FROM core.material; h.location and core.location ---------------------------- -- -- Locations -- CREATE TABLE core.location ( LIKE core.tag INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES ) INHERITS ( core.tag ); SELECT core.inherit_unique_index ( 'location', 'tag', 'id' ); SELECT core.inherit_unique_index ( 'location', 'tag', 'tag' ); SELECT core.create_inheritable_unique_index ( 
'location', 'id', 'bigint' ); CREATE OR REPLACE VIEW h.location AS SELECT core.location.id, core.location.tag, core.location.name FROM core.location; h.actor and core.actor ---------------------- -- -- Actors -- CREATE TABLE core.actor ( LIKE core.tag INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES ) INHERITS ( core.tag ); SELECT core.inherit_unique_index ( 'actor', 'tag', 'id' ); SELECT core.inherit_unique_index ( 'actor', 'tag', 'tag' ); CREATE OR REPLACE VIEW h.actor AS SELECT core.actor.id, core.actor.tag, core.actor.name FROM core.actor; h.demand and core.demand ------------------------- -- -- Demand -- CREATE TABLE core.demand ( id bigint NOT NULL DEFAULT core.new_id(), target_id bigint NOT NULL, material_id bigint NOT NULL, qty integer NOT NULL, -- HACK benefit integer NOT NULL DEFAULT 0, PRIMARY KEY ( id ), UNIQUE ( target_id, material_id ), FOREIGN KEY ( target_id ) REFERENCES core.waypoint ( id ), FOREIGN KEY ( material_id ) REFERENCES core.__material_id ( id ) ); CREATE INDEX demand_target_id ON core.demand ( target_id ); CREATE INDEX demand_material_id ON core.demand ( material_id ); CREATE OR REPLACE VIEW h.demand AS SELECT core.demand.id, core.demand.target_id, h_target_waypoint.tag AS target_tag, h_target_waypoint.name AS target_name, core.demand.material_id, h_material.tag AS material_tag, h_material.name AS material_name, core.demand.qty, core.demand.benefit FROM core.demand LEFT OUTER JOIN h.waypoint AS h_target_waypoint ON core.demand.target_id = h_target_waypoint.id LEFT OUTER JOIN h.material AS h_material ON core.demand.material_id = h_material.id; h.waypoint and core.waypoint ---------------------------- -- -- Waypoints -- CREATE TABLE core.waypoint ( LIKE core.location INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES, is_router boolean NOT NULL, is_target boolean NOT NULL, is_packer boolean NOT NULL ) INHERITS ( core.location ); SELECT core.inherit_unique_index ( 'waypoint', 'tag', 'id' ); SELECT core.inherit_unique_index 
( 'waypoint', 'tag', 'tag' ); SELECT core.inherit_unique_index ( 'waypoint', 'location', 'id' ); COMMENT ON COLUMN core.waypoint.is_router IS 'Routing decisions may be made at this waypoint'; COMMENT ON COLUMN core.waypoint.is_target IS 'Routing may be aimed towards this waypoint'; COMMENT ON COLUMN core.waypoint.is_packer IS 'Containerisation takes place at this waypoint'; CREATE OR REPLACE VIEW h.waypoint AS SELECT core.waypoint.id, core.waypoint.tag, core.waypoint.name, core.waypoint.is_router, core.waypoint.is_target, core.waypoint.is_packer FROM core.waypoint;
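For context, the two plans attached earlier in the thread come from the same query with and without the sort; the timings below are those reported at the start of the thread:

```sql
-- Fast variant: ~7 ms
EXPLAIN ANALYZE SELECT * FROM h.inventory WHERE demand_id = 289276563;

-- Slow variant: ~32 s; the ORDER BY changes the plan the optimiser picks
EXPLAIN ANALYZE SELECT * FROM h.inventory WHERE demand_id = 289276563 ORDER BY id;
```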
Richard Neill <rn214@cam.ac.uk> wrote:

> Regarding pg_statistic, I don't understand how to find the
> relevant rows - what am I looking for? (the pg_statistic table is
> 247M in size).

I think the only relevant rows would be the ones with starelid = pg_class.oid for a table used in the query, and I think you could further limit it to rows where staattnum = pg_attribute.attnum for a column referenced in the WHERE clause or a JOIN's ON clause (including in the views). To help match them up, and to cover all the bases, listing the related pg_class and pg_attribute rows would help.

Hopefully that will allow us to generate the same plan in an EXPLAIN, and then see how it gets such an overblown estimate of the result rows.

-Kevin
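Kevin's description translates into roughly the following catalog query. This is a sketch only: the table names come from the schema posted earlier in the thread, and the column list is an assumption about which attributes appear in the WHERE and ON clauses.

```sql
-- Statistics rows for the tables involved, limited to the join/filter columns.
SELECT c.relname, a.attname, s.*
FROM pg_statistic s
JOIN pg_class c ON c.oid = s.starelid
JOIN pg_attribute a ON a.attrelid = s.starelid AND a.attnum = s.staattnum
WHERE c.relname IN ( 'inventory', 'material', 'location', 'actor', 'demand', 'waypoint' )
  AND a.attname IN ( 'id', 'material_id', 'location_id', 'actor_id', 'demand_id', 'target_id' );
```

If raw pg_statistic is unwieldy, the pg_stats system view presents the same data in a more readable form, keyed by tablename and attname.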
Kevin Grittner wrote:
> Richard Neill <rn214@cam.ac.uk> wrote:
>> Regarding pg_statistic, I don't understand how to find the
>> relevant rows - what am I looking for? (the pg_statistic table is
>> 247M in size).
>
> I think the only relevant rows would be the ones with starelid =
> pg_class.oid for a table used in the query, and I think you could
> further limit it to rows where staattnum = pg_attribute.attnum for a
> column referenced in the WHERE clause or a JOIN's ON clause
> (including in the views). To help match them up, and to cover all
> the bases, listing the related pg_class and pg_attribute rows would
> help.
>
> Hopefully that will allow us to generate the same plan in an
> EXPLAIN, and then see how it gets such an overblown estimate of the
> result rows.

Thanks for your explanation. I ran the query:

SELECT * FROM pg_statistic
WHERE starelid IN (
    SELECT oid FROM pg_class
    WHERE relname IN ( 'demand', 'waypoint', 'actor', 'location', 'material', 'inventory' )
);

and it's 228kB compressed, so rather than attaching it, I'm placing it here:

http://www.richardneill.org/tmp/pg_statistic.bz2

Likewise, the much smaller (16kB) output from:

SELECT * FROM pg_class
WHERE relname IN ( 'demand', 'waypoint', 'actor', 'location', 'material', 'inventory' );

SELECT * FROM pg_attribute;

is at:

http://www.richardneill.org/tmp/pg_attribute_pg_class.bz2

P.S. Would it be easier for you if I set up SSH access to a spare machine with a copy of the database?

Thanks very much for your help,

Richard