Thread: Performance problems, bad estimates and plan

Performance problems, bad estimates and plan

From
Allan Wang
Date:
It seems that Postgres is estimating that all rows in a 50k row table
will be returned, but only one should match. The query runs slow because
of the seqscan. When I set enable_seqscan to off, then it does an index
scan and it runs quickly.

I've set the statistics target on the index to 100 and 1000, and they
don't make a difference in the plan. I've also ran VACUUM ANALYZE right
before the query.

Here is my query, output of EXPLAIN ANALYZE, and my tables:
I'm not sure how wrapping will make this look, so I've put it into a
pastebin also, if it makes it easier to read:
http://rafb.net/paste/results/RqeyX523.nln.html

talluria=# explain analyze SELECT t.*, p.name AS owner, c.name FROM
tiles AS t LEFT JOIN cities AS c USING (cityid) LEFT JOIN players p
USING (playerid) WHERE box(t.coord, t.coord) ~= box(point (4,3), point
(4,3));
                                                                  QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------
 Merge Right Join  (cost=119.07..122.13 rows=52 width=55) (actual time=232.777..232.780 rows=1 loops=1)
   Merge Cond: ("outer".playerid = "inner".playerid)
   ->  Index Scan using users_pkey on players p  (cost=0.00..4138.82 rows=56200 width=8) (actual time=0.017..122.409
rows=56200loops=1) 
   ->  Sort  (cost=119.07..119.20 rows=52 width=55) (actual time=0.070..0.072 rows=1 loops=1)
         Sort Key: c.playerid
         ->  Hash Left Join  (cost=1.03..117.59 rows=52 width=55) (actual time=0.045..0.059 rows=1 loops=1)
               Hash Cond: ("outer".cityid = "inner".cityid)
               ->  Index Scan using tiles_coord_key on tiles t  (cost=0.00..116.29 rows=52 width=37) (actual
time=0.014..0.026rows=1 loops=1) 
                     Index Cond: (box(coord, coord) ~= '(4,3),(4,3)'::box)
               ->  Hash  (cost=1.02..1.02 rows=2 width=22) (actual time=0.017..0.017 rows=0 loops=1)
                     ->  Seq Scan on cities c  (cost=0.00..1.02 rows=2 width=22) (actual time=0.008..0.012 rows=2
loops=1)
 Total runtime: 232.893 ms
(12 rows)

talluria=# set enable_seqscan = false;
SET
talluria=# explain analyze SELECT t.*, p.name AS owner, c.name FROM tiles AS t LEFT JOIN cities AS c USING (cityid)
LEFTJOIN players p USING (playerid) WHERE box(t.coord, t.coord) ~= box(point (4,3), point (4,3)); 
                                                                     QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Left Join  (cost=121.07..124.14 rows=52 width=55) (actual time=0.102..0.105 rows=1 loops=1)
   Merge Cond: ("outer".playerid = "inner".playerid)
   ->  Sort  (cost=121.07..121.20 rows=52 width=55) (actual time=0.076..0.077 rows=1 loops=1)
         Sort Key: c.playerid
         ->  Hash Left Join  (cost=3.03..119.59 rows=52 width=55) (actual time=0.053..0.066 rows=1 loops=1)
               Hash Cond: ("outer".cityid = "inner".cityid)
               ->  Index Scan using tiles_coord_key on tiles t  (cost=0.00..116.29 rows=52 width=37) (actual
time=0.014..0.026rows=1 loops=1) 
                     Index Cond: (box(coord, coord) ~= '(4,3),(4,3)'::box)
               ->  Hash  (cost=3.02..3.02 rows=2 width=22) (actual time=0.026..0.026 rows=0 loops=1)
                     ->  Index Scan using cities_userid_key on cities c  (cost=0.00..3.02 rows=2 width=22) (actual
time=0.016..0.021rows=2 loops=1) 
   ->  Index Scan using users_pkey on players p  (cost=0.00..4138.82 rows=56200 width=8) (actual time=0.012..0.012
rows=1loops=1) 
 Total runtime: 0.200 ms
(12 rows)

talluria=# \d tiles
                                       Table "public.tiles"
 Column |       Type        |                              Modifiers
--------+-------------------+----------------------------------------------------------------------
 tileid | integer           | not null default nextval('tiles_tileid_seq'::text)
 mapid  | integer           | not null default 1
 tile   | character varying | not null default 'field'::character varying
 coord  | point             | not null default point((0)::double precision, (0)::double precision)
 cityid | integer           |
Indexes:
    "times_pkey" PRIMARY KEY, btree (tileid) CLUSTER
    "tiles_cityid_key" btree (cityid)
    "tiles_coord_key" rtree (box(coord, coord))
Foreign-key constraints:
    "tiles_cityid_fkey" FOREIGN KEY (cityid) REFERENCES cities(cityid) ON UPDATE CASCADE ON DELETE SET NULL

talluria=# \d cities
                                   Table "public.cities"
   Column    |         Type          |                      Modifiers
-------------+-----------------------+-----------------------------------------------------
 cityid      | integer               | not null default nextval('cities_cityid_seq'::text)
 playerid    | integer               | not null default 0
 bordercolor | character(6)          | not null default '0000ff'::bpchar
 citystatus  | smallint              | not null default 0
 name        | character varying(30) | not null
Indexes:
    "cities_pkey" PRIMARY KEY, btree (cityid)
    "cities_cityname_uikey" UNIQUE, btree (lower(name::text))
    "cities_userid_key" btree (playerid)
Foreign-key constraints:
    "cities_userid_fkey" FOREIGN KEY (playerid) REFERENCES players(playerid) ON UPDATE CASCADE ON DELETE CASCADE

talluria=# \d players
                                             Table "public.players"
     Column      |          Type          |                              Modifiers
-----------------+------------------------+----------------------------------------------------------------------
 playerid        | integer                | not null default nextval('players_playerid_seq'::text)
 username        | character varying(30)  | not null default ''::character varying
 md5password     | character(32)          | not null default (''::bpchar)::character(1)
 name            | character varying(100) | not null default ''::character varying
 email           | character varying(50)  | not null default ''::character varying
(snipped a few irrelavent columns)
Indexes:
    "users_pkey" PRIMARY KEY, btree (playerid)
    "players_username_key" UNIQUE, btree (username, md5password)
    "users_username_lkey" UNIQUE, btree (lower(username::text))
    "users_coord_key" rtree (box(coord, coord))
Foreign-key constraints:
    "players_stylesheet_fkey" FOREIGN KEY (stylesheet) REFERENCES stylesheets(stylesheetid) ON UPDATE CASCADE ON DELETE
SETDEFAULT 
    "users_arm" FOREIGN KEY (arm) REFERENCES items(itemid) ON UPDATE CASCADE ON DELETE SET NULL
    "users_activefight_pkey" FOREIGN KEY (activefight) REFERENCES monsters(monsterid) ON UPDATE CASCADE ON DELETE SET
NULL
    "players_map_fkey" FOREIGN KEY (map) REFERENCES maps(mapid) ON UPDATE CASCADE ON DELETE SET DEFAULT
    "users_belt" FOREIGN KEY (belt) REFERENCES items(itemid) ON UPDATE CASCADE ON DELETE SET NULL
    "users_body" FOREIGN KEY (body) REFERENCES items(itemid) ON UPDATE CASCADE ON DELETE SET NULL
    "users_head" FOREIGN KEY (head) REFERENCES items(itemid) ON UPDATE CASCADE ON DELETE SET NULL
    "users_lefthand" FOREIGN KEY (lefthand) REFERENCES items(itemid) ON UPDATE CASCADE ON DELETE SET NULL
    "users_leg" FOREIGN KEY (leg) REFERENCES items(itemid) ON UPDATE CASCADE ON DELETE SET NULL
    "users_righthand" FOREIGN KEY (righthand) REFERENCES items(itemid) ON UPDATE CASCADE ON DELETE SET NULL

Thanks in advance for any help,
Allan Wang


Re: Performance problems, bad estimates and plan

From
Tom Lane
Date:
Allan Wang <allanvv@gmail.com> writes:
> It seems that Postgres is estimating that all rows in a 50k row table
> will be returned, but only one should match.

I think this is the same issue fixed here:

2005-04-03 21:43  tgl

    * src/backend/optimizer/path/: costsize.c (REL7_4_STABLE),
    costsize.c (REL8_0_STABLE), costsize.c: In cost_mergejoin, the
    early-exit effect should not apply to the outer side of an outer
    join.  Per andrew@supernews.

Are you running 7.4.8 or 8.0.2 or later?

            regards, tom lane

Re: Performance problems, bad estimates and plan

From
Tom Lane
Date:
Allan Wang <allanvv@gmail.com> writes:
> On Wed, 2005-06-08 at 13:02 -0400, Tom Lane wrote:
>> Are you running 7.4.8 or 8.0.2 or later?

> I'm running 8.0.2 on Gentoo.

Oh, OK [ looks again ... ]  I read the join backward, the issue I was
concerned about would've applied to a right join there not left.

The seqscan vs indexscan difference is a red herring: if you look at the
explain output, the only thing that changes to an indexscan is the scan
on cities, which is only two rows and is not taking any time anyway.
The thing that is taking a long time (or not) is the indexscan over
players.  The planner is expecting that to stop short of completion
(presumably based on comparing the maximum values of playerid in
the two tables) --- and in one plan it does so, so the planner's
logic is apparently correct.

Are there any NULLs in c.playerid?  We found an executor issue recently
that it would not figure out it could stop the scan if there was a NULL
involved.

            regards, tom lane

Re: Performance problems, bad estimates and plan

From
Tom Lane
Date:
[ Please cc your responses to the list; other people may be interested
  in the same problem ]

Allan Wang <allanvv@gmail.com> writes:
> On Wed, 2005-06-08 at 13:39 -0400, Tom Lane wrote:
>> Are there any NULLs in c.playerid?

> Here is the contents of cities:

I'm sorry, what I should've said is "are there any NULLs in c.playerid
in the output of the first LEFT JOIN?"  In practice that means "does
the selected row of tiles actually join to cities?"

            regards, tom lane

Re: Performance problems, bad estimates and plan

From
Tom Lane
Date:
Allan Wang <allanvv@gmail.com> writes:
> No, the tiles row doesn't join with cities:

Uh-huh, so it's the same issue described here:
http://archives.postgresql.org/pgsql-performance/2005-05/msg00219.php

This is fixed in CVS tip but the change was large enough that I'm
disinclined to try to back-port it ...

            regards, tom lane