Re: Sequential Scans - Mailing list pgsql-general

From: Richard Huxton
Subject: Re: Sequential Scans
Date:
Msg-id: 200303070958.16916.dev@archonet.com
In response to: Sequential Scans (Ericson Smith <eric@did-it.com>)
List: pgsql-general
On Thursday 06 Mar 2003 9:15 pm, Ericson Smith wrote:
> Hi,
>
> I have a table with about 3.2 Million records.
>
> There is one process that I run that needs to process batches of records
> 1000 at a time out of a set of approximately 220,000 records.
>
> So my query looks like this:
> SELECT a.*, b.url FROM listings a, urls b WHERE a.urlindex=b.index AND
> a.haslid=1 ORDER BY a.index LIMIT 1000 OFFSET 0;
>
> Doing the above query with an offset of up to 5000 (the 5th batch) shows
> (with EXPLAIN) that index scans are being used.
>
> Exceeding an OFFSET of 5000 produces sequential scans. The whole process
> goes horribly slow at that point.

To get to an offset of 5000 it has to find the first 5000 rows and then
throw them away. There comes a point at which the planner decides the cost of
fetching that many index entries exceeds the cost of just reading the whole
table sequentially. It might be that in your case the planner is estimating
the wrong changeover point - read up on the run-time configuration settings;
I think cpu_index_tuple_cost may be your friend (see the archives for
discussion).
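As a rough sketch only (the cost value below is made up for experimentation,
not a recommendation; the table and column names are copied from your query),
you can lower that setting in your session and re-run EXPLAIN on a
high-offset query to see whether the plan flips back to an index scan:

  -- Illustrative only: the value here is arbitrary, just to see whether
  -- the plan for a high-offset query changes back to an index scan.
  SET cpu_index_tuple_cost = 0.0005;

  EXPLAIN
  SELECT a.*, b.url
  FROM listings a, urls b
  WHERE a.urlindex = b.index
    AND a.haslid = 1
  ORDER BY a.index
  LIMIT 1000 OFFSET 6000;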

In your case though, you might want to look at using a cursor and then
fetching blocks of 1000 rows at a time.
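A minimal sketch of that approach (the cursor name is just illustrative; the
query itself is copied from your message):

  BEGIN;

  -- Declare a cursor over the full result set, then pull it down
  -- 1000 rows at a time instead of re-running the query with OFFSET.
  DECLARE listing_batch CURSOR FOR
    SELECT a.*, b.url
    FROM listings a, urls b
    WHERE a.urlindex = b.index
      AND a.haslid = 1
    ORDER BY a.index;

  FETCH 1000 FROM listing_batch;  -- first batch
  FETCH 1000 FROM listing_batch;  -- next batch, and so on

  CLOSE listing_batch;
  COMMIT;

Each FETCH picks up where the previous one left off, so the server never has
to skip over rows it has already produced.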

--
  Richard Huxton
