Re: sequence scan on PK - Mailing list pgsql-performance

From John A Meinel
Subject Re: sequence scan on PK
Date
Msg-id 427E17BC.7050407@arbash-meinel.com
Whole thread Raw
In response to sequence scan on PK  (Jeroen van Iddekinge <iddekingej@lycos.com>)
Responses Re: sequence scan on PK  (Jeroen van Iddekinge <iddekingej@lycos.com>)
List pgsql-performance
Jeroen van Iddekinge wrote:
> Hi,
>
>
> I understand that when a table contains only a few rows it is better to
> do a sequence scan than an index scan. But is this also for a table with
> 99 records?
>

...

> explain select * from tblFolders where id=90;
>                        QUERY PLAN
> -----------------------------------------------------------
> Seq Scan on tblfolders  (cost=0.00..3.24 rows=1 width=50)
>   Filter: (id = 90)
>
>
> (I have analyze table bit still a sequence scan).
>
> With how manys rows it is  ok to do an index scan or sequence scan? How
> is this calculated in pg?
>
> Regards
> Jer

It depends on how many pages need to be read. To do an index scan you
need to read the index pages, and then you read the page where the
actual row resides.

Usually the comment is if you are selecting >5% of the rows, seqscan is
faster than an index scan. If I'm reading your query correctly, it is
estimating needing to read about 3 pages to get the row you are asking
for. If you used an index, it probably would have to read at least that
many pages, and they would not be a sequential read, so it should be slower.

If you want to prove it, try:

\timing
EXPLAIN ANALYZE SELECT * FROM tblFolders WHERE id=90;
EXPLAIN ANALYZE SELECT * FROM tblFolders WHERE id=90;
EXPLAIN ANALYZE SELECT * FROM tblFolders WHERE id=90;

SET enable_seqscan TO OFF;

EXPLAIN ANALYZE SELECT * FROM tblFolders WHERE id=90;
EXPLAIN ANALYZE SELECT * FROM tblFolders WHERE id=90;
EXPLAIN ANALYZE SELECT * FROM tblFolders WHERE id=90;


Run multiple times to make sure everything is cached, and take the
fastest time. On your machine it might be true that the index scan is
slightly faster than the seqscan in this exact circumstance. But I have
the feeling the time is trivially different, and if you had say 70 rows
it would favor seqscan. Probably somewhere at 150-200 rows it will
switch on it's own.

You could tweak with several settings to get it to do an index scan
earlier, but these would probably break other queries. You don't need to
tune for 100 rows, more like 100k or 100M.

John
=:->

Attachment

pgsql-performance by date:

Previous
From: Jeroen van Iddekinge
Date:
Subject: sequence scan on PK
Next
From: Jeroen van Iddekinge
Date:
Subject: Re: sequence scan on PK