Re: [HACKERS] What about LIMIT in SELECT ? - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [HACKERS] What about LIMIT in SELECT ? |
Date | |
Msg-id | 199810201644.MAA07920@candle.pha.pa.us Whole thread Raw |
In response to | RE: [HACKERS] What about LIMIT in SELECT ? ("Hiroshi Inoue" <Inoue@tpf.co.jp>) |
List | pgsql-hackers |
[Charset iso-8859-1 unsupported, filtering to ASCII...] > > > > I know the good part of the posting is coming. > > > > > Last with the btree index on key and the patched backend. > > > This time the plan is a plain IndexScan because the ORDER BY > > > clause exactly matches the sort order of the chosen index. > > > S1 needs 13 seconds and S2 less than 0.2! This dramatic > > > speedup comes from the fact, that this time the index scan is > > > the toplevel executor node and the executor run is stopped > > > after 20 tuples have been selected. > > > > OK, seems like in the S1 case, the use of the psort/ORDER BY code on top > > of the index was taking and extra 3 seconds, which is 23%. That is a > > lot more than I thought for the psort code, and shows we could gain a > > lot by removing unneeded sorts from queries that are already using > > matching indexes. > > > > Just for clarity, added to TODO. I think everyone is clear on this one, > > and its magnitude is a surprise to me: > > > > * Prevent psort() usage when query already using index matching ORDER BY > > > > In a multi-column ORDER BY, the direction of the sorts will have to be identical too. That is assumed, I think. If all are descending, I think we can traverse the index in reverse order, or can't we do that. I am not sure, but if we can't, descending would fail, and require a psort. > > I can't find the reference to descending order cases except my posting. > If we use an index scan to remove sorts in those cases,backward positioning > and scanning are necessary. > > > > Analysis of the above timings: > > > > > > If there is an ORDER BY clause, using an index scan is the > > > clever way if the indexqual dramatically reduces the the > > > amount of data selected and sorted. I think this is the > > > normal case (who really selects nearly all rows from a 5M row > > > table?). So choosing the index path is correct. This will > > > hurt if someone really selects most of the rows and the index > > > scan jumps over the disc. But here the programmer should use > > > an unqualified query to perform a seqscan and do the > > > qualification in the frontend application. > > > > Fortunately, the optimizer already does the index selection for us, and > > guesses pretty well if the index or sequential scan is better. Once we > > implement the above removal of psort(), we will have to change the > > timings because now you have to compare index scan against sequential > > scan AND psort(), because in the index scan situation, you don't need > > the psort(), assuming the ORDER BY matches the index exactly. > > > > Let t be a table with 2 indices, index1(key1,key2), index2(key1,key3). > i.e. key1 is common to index1 and index2. > > And for the query > select * from t where key1>....; > > If PosgreSQL optimizer choose [ index scan on index1 ] we can't remove > sorts from the following query. > select * from t where key1>... order by key1,key3; > > Similarly if [ index scan on index2 ] are chosen we can't remove sorts > from the following query. > select * from t where key1>... order by key1,key2; > > But in both cases (clever) optimizer can choose another index for scan. Yes, the optimizer is going to have to be smart by looking at the ORDER BY, and nudging the code to favor a certain index. This is also true in a join, where we will want to use an index in cases we would normally not use it, and prefer a certain index over others. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
pgsql-hackers by date: