Re: overzealous sorting? - Mailing list pgsql-performance

From Marc Cousin
Subject Re: overzealous sorting?
Date
Msg-id 20110927170004.784d7814@marco-dalibo
In response to Re: overzealous sorting?  (anthony.shipman@symstream.com)
List pgsql-performance
On Tue, 27 Sep 2011 19:05:09 +1000,
anthony.shipman@symstream.com wrote:

> On Tuesday 27 September 2011 18:54, Marc Cousin wrote:
> > The thing is, the optimizer doesn't know whether your data will be
> > in cache when you run your query… If you are sure most of your data
> > is cached most of the time, you could try lowering random_page_cost
> > to reflect that the data is cached. But if the win on this query is
> > small, it may not be worth it.
>
> What I really want is to just read a sequence of records in timestamp
> order between two timestamps. The number of records to be read may be
> in the millions totalling more than 1GB of data so I'm trying to read
> them a slice at a time but I can't get PG to do just this.
>
> If I use offset and limit to grab a slice of the records from a large
> timestamp range then PG will grab all of the records in the range,
> sort them on disk and return just the slice I want. This is absurdly
> slow.
>
> The query that I've shown is one of a sequence of queries with the
> timestamp range progressing in steps of 1 hour through the timestamp
> range. All I want PG to do is find the range in the index, find the
> matching records in the table and return them. All of the planner's
> cleverness just seems to get in the way.
>
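For reference, the slow pattern being described is presumably of this
shape (table and column names are invented for illustration):

```sql
-- Hypothetical table/column names, for illustration only.
-- With a large OFFSET, PostgreSQL must fetch (and here sort) every
-- row in the range before it can skip ahead to the requested slice,
-- so each successive slice gets more expensive.
SELECT *
FROM   samples
WHERE  ts >= '2011-09-27 00:00' AND ts < '2011-09-28 00:00'
ORDER  BY ts
LIMIT  1000 OFFSET 500000;
```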

Maybe you should try using a cursor, since you don't know in advance
where you'll stop. That, combined with a very low
cursor_tuple_fraction, will probably give you what you want (a
fast-start plan).
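A minimal sketch of that approach (the cursor name and table/column
names are invented; the timestamps are placeholders):

```sql
-- Bias the planner toward fast-start plans: tell it we may fetch
-- only a small fraction of the cursor's rows (default is 0.1).
SET cursor_tuple_fraction = 0.001;

BEGIN;
-- Hypothetical table/column names.
DECLARE slice CURSOR FOR
    SELECT *
    FROM   samples
    WHERE  ts >= '2011-09-27 00:00' AND ts < '2011-09-28 00:00'
    ORDER  BY ts;

FETCH 1000 FROM slice;   -- repeat FETCH until done, or stop early
CLOSE slice;
COMMIT;
```

With a low cursor_tuple_fraction the planner should prefer walking the
timestamp index over a full sort, so the first rows come back quickly
even if you never read the whole range.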
