Re: [HACKERS] What about LIMIT in SELECT ? - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [HACKERS] What about LIMIT in SELECT ?
Date
Msg-id 199810201644.MAA07920@candle.pha.pa.us
Whole thread Raw
In response to RE: [HACKERS] What about LIMIT in SELECT ?  ("Hiroshi Inoue" <Inoue@tpf.co.jp>)
List pgsql-hackers
[Charset iso-8859-1 unsupported, filtering to ASCII...]
> >
> > I know the good part of the posting is coming.
> >
> > >     Last with the btree index on key  and  the  patched  backend.
> > >     This  time the plan is a plain IndexScan because the ORDER BY
> > >     clause exactly matches the sort order of the  chosen  index.
> > >     S1  needs  13  seconds  and  S2 less than 0.2!  This dramatic
> > >     speedup comes from the fact, that this time the index scan is
> > >     the  toplevel  executor  node and the executor run is stopped
> > >     after 20 tuples have been selected.
> >
> > OK, seems like in the S1 case, the use of the psort/ORDER BY code on top
> > of the index was taking and extra 3 seconds, which is 23%.  That is a
> > lot more than I thought for the psort code, and shows we could gain a
> > lot by removing unneeded sorts from queries that are already using
> > matching indexes.
> >
> > Just for clarity, added to TODO.  I think everyone is clear on this one,
> > and its magnitude is a surprise to me:
> >
> >   * Prevent psort() usage when query already using index matching ORDER BY
> >
> >

In a multi-column ORDER BY, the direction of the sorts will have to be
identical too.  That is assumed, I think.  If all are descending, I
think we can traverse the index in reverse order, or can't we do that. 
I am not sure, but if we can't, descending would fail, and require a
psort.


> 
> I can't find the reference to descending order cases except my posting.
> If  we use an index scan to remove sorts in those cases,backward positioning
> and scanning are necessary.
> 
> > >     Analysis of the above timings:
> > >
> > >     If there is an ORDER BY clause, using an index  scan  is  the
> > >     clever  way  if  the  indexqual  dramatically reduces the the
> > >     amount of data selected and sorted.   I  think  this  is  the
> > >     normal case (who really selects nearly all rows from a 5M row
> > >     table?). So choosing the index path  is  correct.  This  will
> > >     hurt if someone really selects most of the rows and the index
> > >     scan jumps over the disc.  But here the programmer should use
> > >     an  unqualified  query  to  perform  a  seqscan  and  do  the
> > >     qualification in the frontend application.
> >
> > Fortunately, the optimizer already does the index selection for us, and
> > guesses pretty well if the index or sequential scan is better.  Once we
> > implement the above removal of psort(), we will have to change the
> > timings because now you have to compare index scan against sequential
> > scan AND psort(), because in the index scan situation, you don't need
> > the psort(), assuming the ORDER BY matches the index exactly.
> >
> 
> Let t be a table with 2 indices, index1(key1,key2), index2(key1,key3).
> i.e. key1 is common to index1 and index2.
> 
> And for the query
>   select * from t where key1>....;
> 
> If   PosgreSQL optimizer choose [ index scan on index1 ] we can't remove
> sorts from the following query.
>     select * from t where key1>... order by key1,key3;
> 
> Similarly if  [ index scan on index2 ] are chosen  we can't remove sorts
> from the following query.
>     select * from t where key1>... order by key1,key2;
> 
> But in both cases (clever) optimizer can choose another index for scan.

Yes, the optimizer is going to have to be smart by looking at the ORDER
BY, and nudging the code to favor a certain index.  This is also true in
a join, where we will want to use an index in cases we would normally
not use it, and prefer a certain index over others.


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] What about LIMIT in SELECT ?
Next
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] Re: inet/cidr/bind