Thread: Speed with offset clause

Speed with offset clause

From

Yves Vindevogel

Date:

24 June 2005, 18:18:56

Hi again all,


My queries are now optimised.  They all use the indexes like they
should.

However, there's still a slight problem when I issue the "offset"
clause.


We have a table that contains 600.000 records

We display them by 25 in the webpage.

So, when I want the last page, which is: 600k / 25 = page 24000 - 1 =
23999, I issue the offset of 23999 * 25

This take a long time to run, about 5-10 seconds whereas offset below
100 take less than a second.


Can I speed this up ?



Met vriendelijke groeten,

Bien à vous,

Kind regards,


<bold>Yves Vindevogel</bold>

<bold>Implements</bold>

<smaller>

</smaller>Hi again all,

My queries are now optimised.  They all use the indexes like they
should.
However, there's still a slight problem when I issue the "offset"
clause.

We have a table that contains 600.000 records
We display them by 25 in the webpage.
So, when I want the last page, which is: 600k / 25 = page 24000 - 1 =
23999, I issue the offset of 23999 * 25
This take a long time to run, about 5-10 seconds whereas offset below
100 take less than a second.

Can I speed this up ?

Met vriendelijke groeten,
Bien à vous,
Kind regards,

Yves Vindevogel
Implements

<smaller>


Mail: yves.vindevogel@implements.be  - Mobile: +32 (478) 80 82 91


Kempische Steenweg 206 - 3500 Hasselt - Tel-Fax: +32 (11) 43 55 76


Web: http://www.implements.be

<italic><x-tad-smaller>

First they ignore you.  Then they laugh at you.  Then they fight you.
Then you win.

Mahatma Ghandi.</x-tad-smaller></italic></smaller>


Mail: yves.vindevogel@implements.be  - Mobile: +32 (478) 80 82 91

Kempische Steenweg 206 - 3500 Hasselt - Tel-Fax: +32 (11) 43 55 76

Web: http://www.implements.be

First they ignore you.  Then they laugh at you.  Then they fight you.
Then you win.
Mahatma Ghandi.

Attachment

Pasted Graphic 2.tiff

Re: Speed with offset clause

From

hubert depesz lubaczewski

Date:

24 June 2005, 18:55:03

On 6/24/05, Yves Vindevogel <yves.vindevogel@implements.be> wrote:
> So, when I want the last page, which is: 600k / 25 = page 24000 - 1 =
> 23999, I issue the offset of 23999 * 25

improving this is hard, but not impossible.
if you have right index created, try to reverse the order and fetch
first adverts, and then resort it (just the 25 adverts) in correct
order.
it will be faster.

depesz

Re: Speed with offset clause

From

John A Meinel

Date:

24 June 2005, 19:22:34

Yves Vindevogel wrote:

> Hi again all,
>
> My queries are now optimised. They all use the indexes like they should.
> However, there's still a slight problem when I issue the "offset" clause.
>
> We have a table that contains 600.000 records
> We display them by 25 in the webpage.
> So, when I want the last page, which is: 600k / 25 = page 24000 - 1 =
> 23999, I issue the offset of 23999 * 25
> This take a long time to run, about 5-10 seconds whereas offset below
> 100 take less than a second.
>
> Can I speed this up ?
>
>
> Met vriendelijke groeten,
> Bien à vous,
> Kind regards,
>
> *Yves Vindevogel*
> *Implements*
>
Postgres has the optimization that it will plan a query, and once it
reaches the limit, it can stop even though there is more data available.
The problem you are having is that it has to go through "offset" rows
first, before it can apply the limit.
If you can, (as mentioned in the other post), try to refine your index
so that you can reverse it for the second half of the data.

This is probably tricky, as you may not know how many rows you have (or
the amount might be changing).

A potentially better thing, is if you have an index you are using, you
could use a subselect so that the only portion that needs to have 60k
rows is a single column.

Maybe an example:
Instead of saying:

SELECT * FROM table1, table2 WHERE table1.id = table2.id ORDER BY
table1.date OFFSET x LIMIT 25;

You could do:

SELECT * FROM
    (SELECT id FROM table1 OFFSET x LIMIT 25) as subselect
    JOIN table1 ON subselect.id = table1.id
    , table2
    WHERE table1.id = table2.id;

That means that the culling process is done on only a few rows of one
table, and the rest of the real merging work is done on only a few rows.

It really depends on you query, though, as what rows you are sorting on
has a big influence on how well this will work.

John
=:->

Attachment

signature.asc

Re: Speed with offset clause

From

Yves Vindevogel

Date:

24 June 2005, 20:24:56

Hi,

Indeed, I would have to do it through a function, where I check the
number of pages, ....

It puts my weakest point in the middle then.

I could simply rewrite my query like you state, just to check.

I think all my queries are on one table only.  (I report in a website
on one table, that has been denormalized into other smaller tables for
speed)

But the problem is on the big table.

I'm currently looking at another possibility, and that is generating
XML files based upon my database.  This would increase disk space
enormously, but limit my problems with the database.

Since I am using Cocoon for the website, this is not such a
problematic decision, disks are cheap and I need only a few
modifications to my code.

On 24 Jun 2005, at 21:22, John A Meinel wrote:

<excerpt>Yves Vindevogel wrote:

<excerpt>Hi again all,

My queries are now optimised. They all use the indexes like they
should.

However, there's still a slight problem when I issue the "offset"
clause.

We have a table that contains 600.000 records

We display them by 25 in the webpage.

So, when I want the last page, which is: 600k / 25 = page 24000 - 1 =
23999, I issue the offset of 23999 * 25

This take a long time to run, about 5-10 seconds whereas offset below
100 take less than a second.

Can I speed this up ?

Met vriendelijke groeten,

Bien à vous,

Kind regards,

*Yves Vindevogel*

*Implements*

</excerpt>Postgres has the optimization that it will plan a query, and
once it reaches the limit, it can stop even though there is more data
available.

The problem you are having is that it has to go through "offset" rows
first, before it can apply the limit.

If you can, (as mentioned in the other post), try to refine your index
so that you can reverse it for the second half of the data.

This is probably tricky, as you may not know how many rows you have
(or the amount might be changing).

A potentially better thing, is if you have an index you are using, you
could use a subselect so that the only portion that needs to have 60k
rows is a single column.

Maybe an example:

Instead of saying:

SELECT * FROM table1, table2 WHERE table1.id = table2.id ORDER BY
table1.date OFFSET x LIMIT 25;

You could do:

SELECT * FROM

   (SELECT id FROM table1 OFFSET x LIMIT 25) as subselect

   JOIN table1 ON subselect.id = table1.id

   , table2

   WHERE table1.id = table2.id;

That means that the culling process is done on only a few rows of one
table, and the rest of the real merging work is done on only a few
rows.

It really depends on you query, though, as what rows you are sorting
on has a big influence on how well this will work.

John

=:->

</excerpt>Met vriendelijke groeten,

Bien à vous,

Kind regards,

<bold>Yves Vindevogel</bold>

<bold>Implements</bold>

<smaller>

</smaller>Hi,

Indeed, I would have to do it through a function, where I check the
number of pages, ....
It puts my weakest point in the middle then.

I could simply rewrite my query like you state, just to check.
I think all my queries are on one table only.  (I report in a website
on one table, that has been denormalized into other smaller tables for
speed)
But the problem is on the big table.

I'm currently looking at another possibility, and that is generating
XML files based upon my database.  This would increase disk space
enormously, but limit my problems with the database.
Since I am using Cocoon for the website, this is not such a problematic
decision, disks are cheap and I need only a few modifications to my
code.

On 24 Jun 2005, at 21:22, John A Meinel wrote:

> Yves Vindevogel wrote:
>
>> Hi again all,
>>
>> My queries are now optimised. They all use the indexes like they
>> should.
>> However, there's still a slight problem when I issue the "offset"
>> clause.
>>
>> We have a table that contains 600.000 records
>> We display them by 25 in the webpage.
>> So, when I want the last page, which is: 600k / 25 = page 24000 - 1 =
>> 23999, I issue the offset of 23999 * 25
>> This take a long time to run, about 5-10 seconds whereas offset below
>> 100 take less than a second.
>>
>> Can I speed this up ?
>>
>>
>> Met vriendelijke groeten,
>> Bien à vous,
>> Kind regards,
>>
>> *Yves Vindevogel*
>> *Implements*
>>
> Postgres has the optimization that it will plan a query, and once it
> reaches the limit, it can stop even though there is more data
> available.
> The problem you are having is that it has to go through "offset" rows
> first, before it can apply the limit.
> If you can, (as mentioned in the other post), try to refine your index
> so that you can reverse it for the second half of the data.
>
> This is probably tricky, as you may not know how many rows you have
> (or the amount might be changing).
>
> A potentially better thing, is if you have an index you are using, you
> could use a subselect so that the only portion that needs to have 60k
> rows is a single column.
>
> Maybe an example:
> Instead of saying:
>
> SELECT * FROM table1, table2 WHERE table1.id = table2.id ORDER BY
> table1.date OFFSET x LIMIT 25;
>
> You could do:
>
> SELECT * FROM
>    (SELECT id FROM table1 OFFSET x LIMIT 25) as subselect
>    JOIN table1 ON subselect.id = table1.id
>    , table2
>    WHERE table1.id = table2.id;
>
> That means that the culling process is done on only a few rows of one
> table, and the rest of the real merging work is done on only a few
> rows.
>
> It really depends on you query, though, as what rows you are sorting
> on has a big influence on how well this will work.
>
> John
> =:->
>
>
>
Met vriendelijke groeten,
Bien à vous,
Kind regards,

Yves Vindevogel
Implements

<smaller>

Mail: yves.vindevogel@implements.be  - Mobile: +32 (478) 80 82 91

Kempische Steenweg 206 - 3500 Hasselt - Tel-Fax: +32 (11) 43 55 76

Web: http://www.implements.be

<italic><x-tad-smaller>

First they ignore you.  Then they laugh at you.  Then they fight you.
Then you win.

Mahatma Ghandi.</x-tad-smaller></italic></smaller>

Mail: yves.vindevogel@implements.be  - Mobile: +32 (478) 80 82 91

Kempische Steenweg 206 - 3500 Hasselt - Tel-Fax: +32 (11) 43 55 76

Web: http://www.implements.be

First they ignore you.  Then they laugh at you.  Then they fight you.
Then you win.
Mahatma Ghandi.

Attachment

Pasted Graphic 2.tiff