Re: Why LIMIT after scanning the table? - Mailing list pgsql-performance

From	Stephan Szabo
Subject	Re: Why LIMIT after scanning the table?
Date	April 30, 2003 13:19:41
Msg-id	20030430071605.X81999-100000@megazone23.bigpanda.com
In response to	Why LIMIT after scanning the table? ("Jim C. Nasby" <jim@nasby.net>)
List	pgsql-performance

On Wed, 30 Apr 2003, Jim C. Nasby wrote:

> I'm doing something where I just need to know if we have more than 100
> rows in a table. Not wanting to scan the whole table, I thought I'd get
> cute...
>
> explain select count(*)
>     FROM (SELECT * FROM email_rank WHERE project_id = :ProjectID LIMIT 100) AS t1;
>                                      QUERY PLAN
> -------------------------------------------------------------------------------------
>  Aggregate  (cost=111.32..111.32 rows=1 width=48)
>    ->  Subquery Scan t1  (cost=0.00..111.07 rows=100 width=48)
>          ->  Limit  (cost=0.00..111.07 rows=100 width=48)
>                ->  Seq Scan on email_rank  (cost=0.00..76017.40 rows=68439 width=48)
>                      Filter: (project_id = 24)
>
> The idea is that the inner-most query would only read the first 100 rows
> it finds, then stop. Instead, if explain is to be believed (and speed
> testing seems to indicate it's accurate), we'll read the entire table,
> *then* pick the first 100 rows. Why is that?

I'd suggest looking at explain analyze rather than explain.  In most cases
I've seen, the actual rows it grabs from the scan is limit+1 (I think cvs
will only grab limit).  Plain explain shows you the full estimated row
count and cost for the sequential scan, as if it ran to completion, but
notice that the limit's cost (111.07) is far lower than that of the
sequential scan (76017.40): the planner expects the scan to stop early.
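
The two cost figures in the quoted plan can be checked by hand.  What
follows is a minimal sketch, not the planner's actual code: it assumes the
Limit node's cost is the scan's cost prorated linearly by the fraction of
estimated rows that will be fetched.  The function name limit_cost is
hypothetical; the numbers are taken from the plan above.

```python
def limit_cost(startup_cost, total_cost, est_rows, limit):
    """Prorate a child node's cost for fetching only `limit` rows.

    Assumption: cost grows linearly with the number of rows returned,
    so fetching limit/est_rows of the rows costs that fraction of the
    run cost (total minus startup), plus the startup cost.
    """
    fraction = min(limit, est_rows) / est_rows
    return startup_cost + (total_cost - startup_cost) * fraction


# Figures from the Seq Scan line of the plan: cost=0.00..76017.40 rows=68439
cost = limit_cost(startup_cost=0.00, total_cost=76017.40,
                  est_rows=68439, limit=100)
print(f"{cost:.2f}")  # compare with the Limit line: cost=0.00..111.07
```

Under that assumption the result agrees with the 111.07 shown on the Limit
line, which is consistent with the scan being expected to stop after about
100 matching rows rather than reading the whole table.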
