Re: [SPAM?] Re: Optimize ORDER BY ... LIMIT - Mailing list pgsql-hackers

From mark@mark.mielke.cc
Subject Re: [SPAM?] Re: Optimize ORDER BY ... LIMIT
Date
Msg-id 20060916003058.GA26439@mark.mielke.cc
Whole thread Raw
In response to Re: Optimize ORDER BY ... LIMIT  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-hackers
On Fri, Sep 15, 2006 at 10:06:16PM +0100, Gregory Stark wrote:
> > I'm curious, as I may be such an offender. What alternatives exist?
> > ...
> > What alternatives to limit/offset exist? If there are thousands or
> > more results, I have trouble with an idea that the entire results
> > should be queried, and cached, displaying only a fraction.

> If you have a unique index and instead of using OFFSET you pass
> along the last key of the previous page then you can use a WHERE
> clause on the indexed column to go straight to the correct page
> rather than using OFFSET.  So for example if you're displaying bank
> transactions sorted by transaction_id you have the "next page"
> button pass along the "start_transaction_id=nnn" where nnn is the
> last transaction_id of the previous page. Then on the next page you
> do a query with "WHERE transaction_id > ?" and pass that column. You
> still use ORDER BY transaction_id and LIMIT.

I found benefits to doing things this way that were not related to
performance. If the number of items leading up to your page changes,
remembering the offset can result in listing a completely different
page than you intended when paging forward or backwards. On my pages,
I prefer to define one of the items as the item I am looking at, and
page seeking is always +/- 1 page from that item. This means that I
am pretty close to what you are suggesting - except - because I do
this for functional reasons, and not for performance reasons, I am
doing something worse.

I use COUNT(*) and WHERE as you describe above to map this identifier
to an offset, and then a second SELECT with LIMIT/OFFSET to describe
the object and the those that follow on the page.

According to your suggestion, I think this means I should track the
identifier with the last offset, displaying the offset to the user for
information purposes only, not using it for any queries, and then use
WHERE and LIMIT?

I tried this out. EXPLAIN ANALYZE tells me that for a random
offset=200, limit=20 case I tried, the simple change changes it from
index scanning 207 rows to find 7 rows, to index scanning 7 rows to
find 7 rows. Sweet. Unfortunately, the time to complete is unchanged
around 1.3+/-0.2 milliseconds. Looks like my system has bigger
bottlenecks. :-)

Thanks,
mark

-- 
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada
 One ring to rule them all, one ring to find them, one ring to bring them all                      and in the darkness
bindthem...
 
                          http://mark.mielke.cc/



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Emacs local vars at the tail of every file
Next
From: Bruce Momjian
Date:
Subject: Re: [COMMITTERS] pgsql: sslinfo contrib module - information