Home > mailing lists

Re: index scan on =, but not < ? - Mailing list pgsql-performance

From	John Arbash Meinel
Subject	Re: index scan on =, but not < ?
Date	March 8, 2005 19:01:41
Msg-id	422DF680.1000409@arbash-meinel.com Whole thread Raw
In response to	index scan on =, but not < ? ("Rick Schumeyer" <rschumeyer@ieee.org>)
List	pgsql-performance

Tree view

Rick Schumeyer wrote:

> I have two index questions. The first is about an issue that has been
> recently discussed,
>
> and I just wanted to be sure of my understanding. Functions like
> count(), max(), etc. will
>
> use sequential scans instead of index scans because the index doesn’t
> know which rows
>
> are actually visible…is this correct?
>
Actually, index scans are chosen whenever the cost is expected to be
cheaper than a sequential scan. This is generally about < 10% of the
total number of rows.

> Second:
>
> I created an index in a table with over 10 million rows.
>
> The index is on field x, which is a double.
>
> The following command, as I expected, results in an index scan:
>
> =# explain select * from data where x = 0;
>
> QUERY PLAN
>
> -------------------------------------------------------------------------
>
> Index Scan using data_x_ix on data (cost=0.00..78.25 rows=19 width=34)
>
> Index Cond: (x = 0::double precision)
>
> (2 rows)
>
Since you have 10m rows, when it expects to get only 19 rows, it is much
faster to use an index.

> But this command, in which the only difference if > instead of =, is a
> sequential scan.
>
> =# explain select * from data where x > 0;
>
> QUERY PLAN
>
> ------------------------------------------------------------------
>
> Seq Scan on data (cost=0.00..1722605.20 rows=62350411 width=34)
>
> Filter: (x > 0::double precision)
>
> (2 rows)
>
Here, pg expects to find 62M rows (you must have significantly more than
10M rows). In this case a sequential scan is much faster than an indexed
one, so that's what pg does.

> Why is this?
>
> (This is with pg 8.0.1 on a PC running FC3 with 1GB ram…if it matters)
>
If you think there is truly a performance problem, try attaching the
results of "explain analyze" in which we might be able to tell you that
your statistics inaccurate (run vacuum analyze if you haven't).

John
=:->

Attachment

signature.asc

pgsql-performance by date:

From: Thomas F.O'Connell
Date: 08 March 2005, 18:52:48
Subject: Re: index scan on =, but not < ?

From: "Rick Schumeyer"
Date: 08 March 2005, 19:02:56
Subject: Re: index scan on =, but not < ?

Re: index scan on =, but not < ? - Mailing list pgsql-performance

Attachment

Previous

Next