Re: overzealous sorting? - Mailing list pgsql-performance

From Mark Kirkwood
Subject Re: overzealous sorting?
Date
Msg-id 4E8195D3.8010809@catalyst.net.nz
Whole thread Raw
In response to Re: overzealous sorting?  (anthony.shipman@symstream.com)
Responses Re: overzealous sorting?  (anthony.shipman@symstream.com)
List pgsql-performance
On 27/09/11 22:05, anthony.shipman@symstream.com wrote:
>
> What I really want is to just read a sequence of records in timestamp order
> between two timestamps. The number of records to be read may be in the
> millions totalling more than 1GB of data so I'm trying to read them a slice
> at a time but I can't get PG to do just this.
>
> If I use offset and limit to grab a slice of the records from a large
> timestamp range then PG will grab all of the records in the range, sort them
> on disk and return just the slice I want. This is absurdly slow.
>
> The query that I've shown is one of a sequence of queries with the timestamp
> range progressing in steps of 1 hour through the timestamp range. All I want
> PG to do is find the range in the index, find the matching records in the
> table and return them. All of the planner's cleverness just seems to get in
> the way.
>

It is not immediately clear that the planner is making the wrong choices
here. Index scans are not always the best choice, it depends heavily on
the correlation of the column concerned to the physical order of the
table's heap file. I suspect the reason for the planner choosing the
bitmap scan is that said correlation is low (consult pg_stats to see).
Now if you think that the table's heap data is cached anyway, then this
is not such an issue - but you have to tell the planner that by reducing
random_page_cost (as advised previously). Give it a try and report back!

regards

Mark

pgsql-performance by date:

Previous
From: anthony.shipman@symstream.com
Date:
Subject: Re: overzealous sorting?
Next
From: Marti Raudsepp
Date:
Subject: Re: Ineffective autovacuum