Re: LIMIT Optimization - Mailing list pgsql-sql

From Oleg Bartunov
Subject Re: LIMIT Optimization
Date
Msg-id Pine.GSO.4.33.0201281615350.7250-100000@ra.sai.msu.su
Whole thread Raw
In response to Re: LIMIT Optimization  ("Ross J. Reedstrom" <reedstrm@rice.edu>)
Responses Re: LIMIT Optimization
List pgsql-sql
On Sun, 27 Jan 2002, Ross J. Reedstrom wrote:

> On Sat, Jan 26, 2002 at 11:19:21PM -0500, Bruce Momjian wrote:
> > Tom Lane wrote:
> > > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > > I am confused.  I thought we already did optimization for LIMIT that
> > > > assumed you only wanted a few values.  Is there something we are missing
> > > > there?
> > >
> > > Yeah, he was proposing an alternative implementation of sorting that
> > > would win in a scenario like
> > >
> > >     SELECT ... ORDER BY foo LIMIT <something small>
> > >
> > > If you have an index on foo then there's no problem, but if you're
> > > forced to do an explicit sort then the system does a complete sort
> > > before you can get any data out.  If the limit is small enough you
> > > can instead do a one-pass "select top N" scan.
> > >
> > > Note that this is only workable in the non-cursor case, where you
> > > know the limit for sure.
> >
> > Oh, boy, so we would scan through and grab the top X value from the
> > table without a sort.  Interesting.  Add to TODO:
> >
> >    Allow ORDER BY ... LIMIT to select top values without sort or index
>
> Note that it's not as big a win as one might think at first, since you
> stil have to scan the entire table to make sure that last tuple isn't
> in the LIMIT in the sort order. Big (potential) savingings in sort space
> storage, however. And you're O(N) compares, rather than anything larger.

It's still a big win ! think about difference between full sorting of  1mln
rows and partial sorting when you stop sorting after getting desirable
first 100 (or so) rows. This is most common situation in web applications
at least - you always display search results page by page. But statistics
shows that 90% of hits is the first 1-2 pages. If we intend to be more
friendly and compete with MySQL on Web apps. we should consider this
optimization. We had quick and dirty patch for 7.1 but it was just a
sketch and we didn't surprised core developers reject it. Now we have
libpsort - a library which implements partial sorting, and we use it
extensively  in our apps. I think we should add partial sorting
in TODO list for 7.3.

>
> Ross
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>
Regards,    Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83



pgsql-sql by date:

Previous
From: "Ross J. Reedstrom"
Date:
Subject: Re: LIMIT Optimization
Next
From: "Ross J. Reedstrom"
Date:
Subject: Re: LIMIT Optimization