Thread: why am I getting a seq scan on this query?

why am I getting a seq scan on this query?

From
Mark Harrison
Date:
I'm expecting this to do an indexed scan... any clue why it's not?  This is
with PG 7.4.

Thanks!!

planb=# explain select id,shotname from df_files where showid=30014515::bigint;
                             QUERY PLAN
-------------------------------------------------------------------
  Seq Scan on df_files  (cost=0.00..791035.45 rows=540370 width=22)
    Filter: (showid = 30014515::bigint)
(2 rows)



planb=# \d df_files;
                Table "public.df_files"
   Column   |            Type             | Modifiers
-----------+-----------------------------+-----------
  id        | bigint                      | not null
  showid    | bigint                      | not null
  shotname  | character varying(256)      | not null
  elemname  | character varying(256)      | not null
  frameno   | character varying(12)       | not null
  ext       | character varying(12)       | not null
  filename  | character varying(256)      | not null
  filesize  | bigint                      |
  locked    | boolean                     |
  timestamp | timestamp without time zone |
Indexes:
     "df_files_pkey" primary key, btree (id)
     "df_files_elemname" btree (elemname)
     "df_files_ext" btree (ext)
     "df_files_filename" btree (filename)
     "df_files_frameno" btree (frameno)
     "df_files_shotname" btree (shotname)
     "df_files_show" btree (showid)
     "df_files_showid" btree (showid)

planb=# select count(*) from df_files where showid=30014515::bigint;
  count
--------
  528362
(1 row)

Time: 420598.071 ms
planb=# select count(*) from df_files;
   count
----------
  24415513
(1 row)

Time: 306554.085 ms

Re: why am I getting a seq scan on this query?

From
"Joshua D. Drake"
Date:
Mark Harrison wrote:
> I'm expecting this to do an indexed scan... any clue why it's not?  This is
> with PG 7.4.

Someone might have a better idea but my guess is that PG things the
seq_scan would be faster. You could try decreasing your
random_page_cost. I have also heard that setting your (although I
haven't tested this) effective_cache_size higher then normal helps in
these scenarios but your mileage may vary.

Sincerely,

Joshua D. Drake



>
> Thanks!!
>
> planb=# explain select id,shotname from df_files where
> showid=30014515::bigint;
>                             QUERY PLAN
> -------------------------------------------------------------------
>  Seq Scan on df_files  (cost=0.00..791035.45 rows=540370 width=22)
>    Filter: (showid = 30014515::bigint)
> (2 rows)
>
>
>
> planb=# \d df_files;
>                Table "public.df_files"
>   Column   |            Type             | Modifiers
> -----------+-----------------------------+-----------
>  id        | bigint                      | not null
>  showid    | bigint                      | not null
>  shotname  | character varying(256)      | not null
>  elemname  | character varying(256)      | not null
>  frameno   | character varying(12)       | not null
>  ext       | character varying(12)       | not null
>  filename  | character varying(256)      | not null
>  filesize  | bigint                      |
>  locked    | boolean                     |
>  timestamp | timestamp without time zone |
> Indexes:
>     "df_files_pkey" primary key, btree (id)
>     "df_files_elemname" btree (elemname)
>     "df_files_ext" btree (ext)
>     "df_files_filename" btree (filename)
>     "df_files_frameno" btree (frameno)
>     "df_files_shotname" btree (shotname)
>     "df_files_show" btree (showid)
>     "df_files_showid" btree (showid)
>
> planb=# select count(*) from df_files where showid=30014515::bigint;
>  count
> --------
>  528362
> (1 row)
>
> Time: 420598.071 ms
> planb=# select count(*) from df_files;
>   count
> ----------
>  24415513
> (1 row)
>
> Time: 306554.085 ms
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend


--
The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/

Re: why am I getting a seq scan on this query?

From
Tom Lane
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
> Someone might have a better idea but my guess is that PG things the
> seq_scan would be faster.

That's what it thinks, and it might be right.  This query is fetching 2%
of the table, which is near the crossover point where a seqscan is
faster, assuming that the rows aren't very wide and the target rows are
fairly randomly distributed through the table's pages.

> You could try decreasing your random_page_cost.

First thing to do is force the plan choice (set enable_seqscan = off)
and see what timings you actually get each way.  If the planner really
is guessing materially wrong, then adjusting the cost parameters is
called for.  Don't set them on the basis of a single test case though...

BTW, the bitmap indexscan method available in PG 8.1 can do a lot better
than plain indexscan for scenarios like this, so updating to 8.1 might
be a good answer too.

            regards, tom lane