Home > mailing lists

Re: Sequential Scan with LIMIT - Mailing list pgsql-performance

From	Tom Lane
Subject	Re: Sequential Scan with LIMIT
Date	October 24, 2004 21:11:58
Msg-id	17689.1098648713@sss.pgh.pa.us Whole thread Raw
In response to	Sequential Scan with LIMIT (John Meinel <john@johnmeinel.com>)
Responses	Re: Sequential Scan with LIMIT Re: Sequential Scan with LIMIT
List	pgsql-performance

Tree view

John Meinel <john@johnmeinel.com> writes:
> I was looking into another problem, and I found something that surprised
> me. If I'm doing "SELECT * FROM mytable WHERE col = 'myval' LIMIT 1.".
> Now "col" is indexed, by mytable has 500,000 rows, and 'myval' occurs
> maybe 100,000 times. Without the LIMIT, this query should definitely do
> a sequential scan.

> But with the LIMIT, doesn't it know that it will return at max 1 value,
> and thus be able to use the index?

But the LIMIT will cut the cost of the seqscan case too.  Given the
numbers you posit above, about one row in five will have 'myval', so a
seqscan can reasonably expect to hit the first matching row in the first
page of the table.  This is still cheaper than doing an index scan
(which must require reading at least one index page plus at least one
table page).

The test case you are showing is probably suffering from nonrandom
placement of this particular data value; which is something that the
statistics we keep are too crude to detect.

            regards, tom lane

pgsql-performance by date:

From: John Meinel
Date: 24 October 2004, 20:28:00
Subject: Sequential Scan with LIMIT

From: John Meinel
Date: 24 October 2004, 22:51:59
Subject: Re: Sequential Scan with LIMIT

Re: Sequential Scan with LIMIT - Mailing list pgsql-performance

Previous

Next