Thread: Query slows after offset of 100K
Hi all,

I've been reading through the performance list for the last few months, and haven't been able to find a solution to my problem yet, so I'm posting the specifics here now. If anyone can suggest what might work (or point me to where this has been covered before), that would be great. My current suspicion is that the shared_buffers setting is far too low.

My query is as follows:

SELECT o.objectid, o.objectname, o.isactive, o.modificationtime
FROM object o
WHERE ( o.deleted = false OR o.deleted IS NULL )
AND o.accountid = 111
ORDER BY 2
LIMIT 20 OFFSET 10000;

The object table has primary key objectid, an index on objectname, and a unique constraint on ( accountid, objectname ).

What I'm trying to do is show only 20 records to the user at a time, sorting on objectname, and the ones I display depend on the page they're on (that's why I've got LIMIT plus OFFSET, of course).

When offsetting up to about 90K records, the EXPLAIN ANALYZE is similar to the following:

 Limit  (cost=15357.06..15387.77 rows=20 width=35) (actual time=19.235..19.276 rows=20 loops=1)
   ->  Index Scan using account_objectname on "object" o  (cost=0.00..1151102.10 rows=749559 width=35) (actual time=0.086..14.981 rows=10020 loops=1)
         Index Cond: (accountid = 354)
         Filter: ((NOT deleted) OR (deleted IS NULL))
 Total runtime: 19.315 ms

If I move the offset up to 100K records or higher, I get:

 Limit  (cost=145636.26..145636.31 rows=20 width=35) (actual time=13524.327..13524.355 rows=20 loops=1)
   ->  Sort  (cost=145386.26..147260.16 rows=749559 width=35) (actual time=13409.216..13481.793 rows=100020 loops=1)
         Sort Key: objectname
         ->  Seq Scan on "object" o  (cost=0.00..16685.49 rows=749559 width=35) (actual time=0.011..1600.683 rows=749549 loops=1)
               Filter: (((NOT deleted) OR (deleted IS NULL)) AND (accountid = 354))
 Total runtime: 14452.374 ms

That's a huge decrease in performance, and I'm wondering if there's a way around it.

Right now there are about 750K records in the object table, and that number will only increase with time. I've already run a VACUUM FULL on the table and played with changing work_mem, but so far am not seeing any improvement.

Are there any other settings I can change to get back to that super-fast index scan? Is the shared_buffers = 2000 setting way too low? The reason I haven't actually changed that setting is due to some system limitations, etc., that require more work than just a change in the config file. If I can get confirmation that this is a likely cause/solution, then I can get the extra changes made.

I'm running a quad core 2.33GHz Xeon with 4GB memory (1.2GB free), using Postgres 8.1.11.

Thanks,
Michael Lorenz
Michael Lorenz <mlorenz1@hotmail.com> writes:
> My query is as follows:
> SELECT o.objectid, o.objectname, o.isactive, o.modificationtime
> FROM object o
> WHERE ( o.deleted = false OR o.deleted IS NULL )
> AND o.accountid = 111
> ORDER BY 2
> LIMIT 20 OFFSET 10000;

This is guaranteed to lose --- huge OFFSET values are never a good idea (hint: the database still has to fetch those rows it's skipping over).

A saner way to do pagination is to remember the last key you displayed and do something like "WHERE key > $lastkey ORDER BY key LIMIT 20", which will allow the database to go directly to the desired rows, as long as you have an index on the key. You do need a unique ordering key for this to work, though.

			regards, tom lane
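For reference, a minimal sketch of what that suggestion could look like against the schema from the original post, sorting on objectname. Here $lastname is a placeholder for the objectname of the last row shown on the previous page (to be bound by the application); within one account it is a unique ordering key thanks to the (accountid, objectname) unique constraint.

-- Keyset pagination sketch; $lastname is an application-supplied placeholder
-- for the last objectname already displayed for this account.
SELECT o.objectid, o.objectname, o.isactive, o.modificationtime
FROM object o
WHERE ( o.deleted = false OR o.deleted IS NULL )
  AND o.accountid = 111
  AND o.objectname > $lastname
ORDER BY o.objectname
LIMIT 20;

Because the scan starts at $lastname instead of counting past an OFFSET, the cost stays flat no matter how deep into the result set the user pages.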
Fair enough, and I did think of this as well. However, I didn't think this was a viable option in my case, since we're currently allowing the user to randomly access the pages (so $lastkey wouldn't really have any meaning). The user can choose to sort on object ID, name or modification time, and then go straight to any page in the list. With 750K records, that's around 37K pages.

Maybe a better way to phrase my question is: how can I paginate my data on 3 different keys which allow random access to any given page, and still get reasonable performance? Should I just force the user to limit their result set to some given number of records before allowing any paginated access? Or is it just not practical, period?

Thanks,
Michael Lorenz

----------------------------------------
> To: mlorenz1@hotmail.com
> CC: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Query slows after offset of 100K
> Date: Thu, 14 Feb 2008 14:08:15 -0500
> From: tgl@sss.pgh.pa.us
>
> This is guaranteed to lose --- huge OFFSET values are never a good idea
> (hint: the database still has to fetch those rows it's skipping over).
>
> A saner way to do pagination is to remember the last key you displayed
> and do something like "WHERE key > $lastkey ORDER BY key LIMIT 20",
> which will allow the database to go directly to the desired rows,
> as long as you have an index on the key. You do need a unique ordering
> key for this to work, though.
>
> 			regards, tom lane
Michael,

Our application had a similar problem, and what we did to avoid having people click into the middle of 750K records was to show the first page with forward/back links but no link to go to the middle. So people could manually page forward as far as they want, but nobody is going to sit there clicking next 37K times.

We have several thousand users and none of them complained about the change. Maybe it's because at the same time as we made that change we also improved the rest of the searching/filtering interface. But I think that really people don't need to jump to the middle of the records anyway, as long as you have decent search abilities.

If you wanted to keep your same GUI, one workaround would be to periodically update a table which maps "page number" to "first unique key on page". That decouples the expensive work to generate the page offsets from the query itself, so if your data changes fairly infrequently it might be appropriate. Sort of a materialized-view type approach; a rough sketch follows below.

If you can be approximate in your GUI you can do a lot more with this optimization -- if people don't necessarily need to be able to go directly to page 372898, but instead would be satisfied with a page roughly 47% of the way into the massive result set (think of a GUI slider), then you wouldn't need to update the lookup table as often even if the data changed frequently. Adding a few thousand records to a 750K-row result set is statistically insignificant, so your markers wouldn't need to be updated very often, and you wouldn't need to store a marker for each page -- maybe only 100 markers spread evenly across the result set would be sufficient.

-- Mark Lewis

On Thu, 2008-02-14 at 19:49 +0000, Michael Lorenz wrote:
> Maybe a better way to phrase my question is: how can I paginate my data on 3 different keys which allow random access to any given page, and still get reasonable performance? Should I just force the user to limit their result set to some given number of records before allowing any paginated access? Or is it just not practical, period?
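A rough sketch of that lookup-table idea, with made-up names (objectname_page_marker, page_start) purely for illustration; a periodic job would repopulate it, storing either one row per page or only every Nth marker for the approximate-slider variant.

-- Hypothetical marker table: maps a page (or slider position) to the first
-- objectname on that page for a given account. Refreshed by a periodic job.
CREATE TABLE objectname_page_marker (
    accountid  integer NOT NULL,
    pageno     integer NOT NULL,
    page_start text    NOT NULL,  -- objectname of the first row on that page
    PRIMARY KEY (accountid, pageno)
);

-- Serving page 5000 then becomes a marker lookup plus the usual keyset query:
SELECT o.objectid, o.objectname, o.isactive, o.modificationtime
FROM object o
WHERE ( o.deleted = false OR o.deleted IS NULL )
  AND o.accountid = 111
  AND o.objectname >= (SELECT page_start
                       FROM objectname_page_marker
                       WHERE accountid = 111 AND pageno = 5000)
ORDER BY o.objectname
LIMIT 20;

How the refresh job numbers the rows is left open here; 8.1 has no row_number(), so the numbering would likely happen in the application or via a temporary sequence.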
Michael Lorenz <mlorenz1@hotmail.com> writes:
> Fair enough, and I did think of this as well. However, I didn't think this was a viable option in my case, since we're currently allowing the user to randomly access the pages (so $lastkey wouldn't really have any meaning). The user can choose to sort on object ID, name or modification time, and then go straight to any page in the list. With 750K records, that's around 37K pages.

Well, my first question is whether that user interface is actually useful to anyone. If you have a dictionary and you want to look up "foosball", do you start by guessing that it's on page 432? No, you look for the "F" tab.

I'd suggest that what you want is to present links that let people go to specific spots in the key distribution, rather than expressing it in terms of so-many-pages.

			regards, tom lane
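As a sketch, a "go to the Fs" link in the GUI could translate into a plain range condition on the ordering key; the 'F' boundary here is just an example value the interface would supply.

-- Jump to the part of the key space starting at 'F' for this account,
-- instead of asking for an arbitrary page number.
SELECT o.objectid, o.objectname, o.isactive, o.modificationtime
FROM object o
WHERE ( o.deleted = false OR o.deleted IS NULL )
  AND o.accountid = 111
  AND o.objectname >= 'F'
ORDER BY o.objectname
LIMIT 20;

This keeps using the account_objectname index, just like the fast 90K-offset plan, regardless of where in the alphabet the user jumps.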
On Thu, 14 Feb 2008, Michael Lorenz wrote:

> When offsetting up to about 90K records, the EXPLAIN ANALYZE is similar to the following:
> Limit  (cost=15357.06..15387.77 rows=20 width=35) (actual time=19.235..19.276 rows=20 loops=1)
>   ->  Index Scan using account_objectname on "object" o  (cost=0.00..1151102.10 rows=749559 width=35) (actual time=0.086..14.981 rows=10020 loops=1)

It looks like the planner thinks that index scan will have to go through 749559 rows, but there are actually only 10020 there. Is this table getting ANALYZE'd usefully? VACUUM FULL doesn't do that. If the row estimates are so far off, that might explain why it thinks the index scan is going to be so huge it might as well just walk the whole thing. Actually, VACUUM FULL can be its own problem--you probably want a plain VACUUM, run regularly, instead.

> Is the shared_buffers = 2000 setting way too low?

Quite; with 4GB of RAM that could easily be 100,000+ instead. I wouldn't make that whole jump at once, but 2000 is only a mere 16MB of memory dedicated to the database. Also, be sure to set effective_cache_size to something reflective of your total memory minus application+OS, as it also has an impact here; you've probably also got that set extremely low, and if this server is mostly for PostgreSQL a good starting point would be something like 300000 (=2.4GB).

> Are there any other settings I can change to get back to that super-fast
> index scan?

Well, you can try to turn off sequential scans for the query. You can test if that makes a difference like this:

SET enable_seqscan TO off;
EXPLAIN ANALYZE <x>;
SET enable_seqscan TO on;

It's also possible to tweak parameters like random_page_cost to similarly prefer indexes. Far better to fix the true underlying issues though (above and below).

--
* Greg Smith  gsmith@gregsmith.com  http://www.gregsmith.com  Baltimore, MD
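Roughly what those suggestions amount to, as a sketch rather than tuned values. On 8.1 these settings are expressed in 8kB buffers/pages (no MB/GB suffixes), and raising shared_buffers needs a server restart (and possibly a larger kernel SHMMAX).

-- Refresh planner statistics (VACUUM FULL alone doesn't do this):
ANALYZE "object";

# postgresql.conf sketch for 8.1:
# shared_buffers = 50000          # ~400MB, a first step up from 2000 (16MB)
# effective_cache_size = 300000   # ~2.4GB, reflecting the OS cache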
Greg Smith <gsmith@gregsmith.com> writes:
> On Thu, 14 Feb 2008, Michael Lorenz wrote:
>> When offsetting up to about 90K records, the EXPLAIN ANALYZE is similar to the following:
>> Limit  (cost=15357.06..15387.77 rows=20 width=35) (actual time=19.235..19.276 rows=20 loops=1)
>>   ->  Index Scan using account_objectname on "object" o  (cost=0.00..1151102.10 rows=749559 width=35) (actual time=0.086..14.981 rows=10020 loops=1)

> It looks like the planner thinks that index scan will have to go through
> 749559 rows, but there are actually only 10020 there.

No, you have to be careful about that. The estimated rowcount is for the case where the plan node is run to completion, but when there's a LIMIT over it, it may not get run to completion. In this case the limit was satisfied after pulling 10020 rows from the indexscan, but we can't tell what fraction of the underlying scan was actually completed.

			regards, tom lane
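One way to see how close that 749559 estimate is to reality, independent of the LIMIT cutting the scan short, would be to count the qualifying rows directly (a diagnostic only, and it will read a large part of the table):

-- Actual number of rows the index scan would produce if run to completion:
SELECT count(*)
FROM object o
WHERE ( o.deleted = false OR o.deleted IS NULL )
  AND o.accountid = 354;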
On Thu, 14 Feb 2008, Michael Lorenz wrote:

> When offsetting up to about 90K records, the EXPLAIN ANALYZE is similar to the following:
> Limit  (cost=15357.06..15387.77 rows=20 width=35) (actual time=19.235..19.276 rows=20 loops=1)
>   ->  Index Scan using account_objectname on "object" o  (cost=0.00..1151102.10 rows=749559 width=35) (actual time=0.086..14.981 rows=10020 loops=1)
>         Index Cond: (accountid = 354)
>         Filter: ((NOT deleted) OR (deleted IS NULL))
> Total runtime: 19.315 ms

Since this is scanning through 10,000 random rows in 19 milliseconds, I say all this data is already in the cache. If it wasn't, you'd be looking at 10,000 random seeks on disk, at about 7ms each, which is 70 seconds. Try dropping the OS caches (on Linux, echo "1" > /proc/sys/vm/drop_caches) and see if the performance is worse.

> If I move the offset up to 100K records or higher, I get:
> Limit  (cost=145636.26..145636.31 rows=20 width=35) (actual time=13524.327..13524.355 rows=20 loops=1)
>   ->  Sort  (cost=145386.26..147260.16 rows=749559 width=35) (actual time=13409.216..13481.793 rows=100020 loops=1)
>         Sort Key: objectname
>         ->  Seq Scan on "object" o  (cost=0.00..16685.49 rows=749559 width=35) (actual time=0.011..1600.683 rows=749549 loops=1)
>               Filter: (((NOT deleted) OR (deleted IS NULL)) AND (accountid = 354))
> Total runtime: 14452.374 ms

And here, it only takes 1.5 seconds to fetch the entire table from disc (or it's already in the cache or something), but 14 seconds to sort the whole lot in memory. In any case, Postgres is making a good choice - it's just that you have an unexpected benefit in the first case that the data is in cache. Setting the effective cache size correctly will help the planner in this case. Setting work_mem higher will improve the performance of the sort in the second case.

Of course, what others have said about trying to avoid large offsets is good advice. You don't actually need a unique index, but it makes it simpler if you do.

Matthew

--
The early bird gets the worm. If you want something else for breakfast, get up later.
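To test the work_mem point, one could bump it for a single session before re-running the slow query; on 8.1 the value is in kilobytes, and 131072 (128MB) is only an illustrative figure, not a recommendation for the global config.

-- Per-session experiment only; every sort/hash in every backend can use
-- up to work_mem, so don't set a value this large globally without thought.
SET work_mem = 131072;   -- 128MB, illustrative
EXPLAIN ANALYZE
SELECT o.objectid, o.objectname, o.isactive, o.modificationtime
FROM object o
WHERE ( o.deleted = false OR o.deleted IS NULL )
  AND o.accountid = 354
ORDER BY 2
LIMIT 20 OFFSET 100000;
RESET work_mem;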