what's going on here? - Mailing list pgsql-general

From Ben
Subject what's going on here?
Date
Msg-id Pine.LNX.4.10.10103091131040.28803-100000@gilgamesh.eos.SilentMedia.com
Whole thread Raw
Responses Re: what's going on here?
Re: what's going on here?
List pgsql-general
We have two postgres-driven web pages, and we've noticed in the last few
days that one is loading about 10 times faster than the other. Looking
into it, we found out that postgres has very different execution paths for
what is essentially the same query, and this makes no sense to me.

Sorry for the deludge of info here, but maybe it'll explain why this is
happening....

First, here are the explains. Note the only parameter that changes is
playlist.stream, and yet for some reason the second query makes use of
indices which are unrelated to playlist.stream.

music=# explain select users.name as username, playlist.id as id,
paths.len as len, paths.path as path, timestamp(playlist.added) as added,
timestamp(playlist.played) as played from users, playlist, paths where
paths.md5=files.md5 and files.song=playlist.songid and
playlist.userid=users.id and playlist.played is null and playlist.stream=1
order by playlist.added, playlist.id;
NOTICE:  QUERY PLAN:

Sort  (cost=10845.28..10845.28 rows=1857 width=80)
  ->  Hash Join  (cost=10007.93..10744.44 rows=1857 width=80)
        ->  Merge Join  (cost=10006.44..10668.17 rows=1857 width=64)
              ->  Sort  (cost=3754.03..3754.03 rows=2321 width=36)
                    ->  Hash Join  (cost=306.62..3624.28 rows=2321 width=36)
                          ->  Seq Scan on files  (cost=0.00..1072.13 rows=55413 width=16)
                          ->  Hash  (cost=300.81..300.81 rows=2321 width=20)
                                ->  Seq Scan on playlist (cost=0.00..300.81 rows=2321 width=20)
              ->  Sort  (cost=6252.41..6252.41 rows=50617 width=28)
                    ->  Seq Scan on paths  (cost=0.00..1677.17 rows=50617 width=28)
        ->  Hash  (cost=1.39..1.39 rows=39 width=16)
              ->  Seq Scan on users  (cost=0.00..1.39 rows=39 width=16)

EXPLAIN
music=# explain select users.name as username, playlist.id as id,
paths.len as len, paths.path as path, timestamp(playlist.added) as added,
timestamp(playlist.played) as played from users, playlist, paths where
paths.md5=files.md5 and files.song=playlist.songid and
playlist.userid=users.id and playlist.played is null and playlist.stream=2
order by playlist.added, playlist.id;
NOTICE:  QUERY PLAN:

Sort  (cost=1467.44..1467.44 rows=164 width=80)
  ->  Hash Join  (cost=1.49..1461.40 rows=164 width=80)
        ->  Nested Loop  (cost=0.00..1452.85 rows=164 width=64)
              ->  Nested Loop  (cost=0.00..771.76 rows=205 width=36)
                    ->  Seq Scan on playlist  (cost=0.00..300.81 rows=205 width=20)
                    ->  Index Scan using files_song_btree_key on files (cost=0.00..2.28 rows=1 width=16)
              ->  Index Scan using paths_md5_btree_key on paths (cost=0.00..3.31 rows=1 width=28)
        ->  Hash  (cost=1.39..1.39 rows=39 width=16)
              ->  Seq Scan on users  (cost=0.00..1.39 rows=39 width=16)

EXPLAIN


Okay, so now the relevant schema:

                              Table "users"
  Attribute  |   Type    |                    Modifier
-------------+-----------+------------------------------------------------
 id          | integer   | not null default nextval('users_id_seq'::text)
 name        | text      | not null
Indices: users_name_key,
         users_pkey

Index "users_name_key"
 Attribute | Type
-----------+------
 name      | text
unique btree

 Index "users_pkey"
 Attribute |  Type
-----------+---------
 id        | integer
unique btree (primary key)




                            Table "playlist"
 Attribute |  Type   |                     Modifier
-----------+---------+---------------------------------------------------
 id        | integer | not null default nextval('playlist_id_seq'::text)
 songid    | integer |
 userid    | integer |
 added     | integer |
 played    | integer |
 stream    | integer |
Indices: playlist_added_key,
         playlist_pkey,
         playlist_played_key,
         playlist_songid_key

Index "playlist_added_key"
 Attribute |  Type
-----------+---------
 added     | integer
btree

Index "playlist_pkey"
 Attribute |  Type
-----------+---------
 id        | integer
unique btree (primary key)

Index "playlist_played_key"
 Attribute |  Type
-----------+---------
 played    | integer
btree

Index "playlist_songid_key"
 Attribute |  Type
-----------+---------
 songid    | integer
btree


          Table "paths"
 Attribute |   Type    | Modifier
-----------+-----------+----------
 md5       | char(32)  | not null
 path      | text      | not null
 len       | integer   |
Indices: paths_md5_btree_key,
         paths_md5_key,
         paths_path_key

Index "paths_md5_btree_key"
 Attribute |   Type
-----------+----------
 md5       | char(32)
btree

Index "paths_md5_key"
 Attribute |   Type
-----------+----------
 md5       | char(32)
hash

Index "paths_path_key"
 Attribute | Type
-----------+------
 path      | text
btree



          Table "files"
 Attribute |   Type   | Modifier
-----------+----------+----------
 song      | integer  | not null
 md5       | char(32) | not null
Indices: files_md5_btree_key,
         files_md5_key,
         files_song_btree_key,
         files_song_key

Index "files_md5_btree_key"
 Attribute |   Type
-----------+----------
 md5       | char(32)
btree

Index "files_md5_key"
 Attribute |   Type
-----------+----------
 md5       | char(32)
hash

Index "files_song_btree_key"
 Attribute |  Type
-----------+---------
 song      | integer
btree

Index "files_song_key"
 Attribute |  Type
-----------+---------
 song      | integer
hash



Interestingly, the sequential scan on playlist claims to be returning 2000
results for stream=1 and only 200 for stream=2. I'm not sure which part of
the where clause this guess comes from, because the playlist table has
equal numbers of entries for both streams.



pgsql-general by date:

Previous
From: Stephan Szabo
Date:
Subject: Re: functions for triggers: passing parameters
Next
From: Stephan Szabo
Date:
Subject: Re: what's going on here?