Thread: sorting problem
Problem i am having at the moment i cant get a true alpha sort to work as Order By is sorting A..Z then a..z where i need aA..zZ sort independant of case. SQL Query SELECT * FROM person WHERE (salutation LIKE '%To%') ORDER BY person.lastname Results Ahsteit Bloggs Cap Carrey Diver Duckula Goldsworthy Gruff Harmony Hassleberry-flop-flop Heyheagle Jahoosal Straindove Yorrick of Finchery
Jamie Deppeler wrote: > Problem i am having at the moment i cant get a true alpha sort to > work as Order By is sorting A..Z then a..z where i need aA..zZ sort > independant of case. Initialize the database cluster with a locale setting other than "C". -- Peter Eisentraut http://developer.postgresql.org/~petere/
On Fri, Dec 17, 2004 at 11:28:36AM +1100, Jamie Deppeler wrote: > Problem i am having at the moment i cant get a true alpha sort to work > as Order By is sorting A..Z then a..z where i need aA..zZ sort > independant of case. ORDER BY LOWER(person.lastname) or ORDER BY UPPER(person.lastname) -- Michael Fuhr http://www.fuhr.org/~mfuhr/
On Fri, Dec 17, 2004 at 01:45:36AM +0100, Peter Eisentraut wrote: > Jamie Deppeler wrote: > > Problem i am having at the moment i cant get a true alpha sort to > > work as Order By is sorting A..Z then a..z where i need aA..zZ sort > > independant of case. > > Initialize the database cluster with a locale setting other than "C". Hmmm...did I misunderstand something when I recommended using ORDER BY LOWER(person.lastname)? -- Michael Fuhr http://www.fuhr.org/~mfuhr/
Would doing it this way require an index: create index lower_lastname on table x lower(lastname); ? Regards, Chris Smith Suite 30, 45-51 Huntley St, Alexandria, NSW 2015 Australia Ph: +61 2 9517 2505 Fx: +61 2 9517 1915 email: info@interspire.com web: www.interspire.com Michael Fuhr wrote: > On Fri, Dec 17, 2004 at 11:28:36AM +1100, Jamie Deppeler wrote: > > >>Problem i am having at the moment i cant get a true alpha sort to work >>as Order By is sorting A..Z then a..z where i need aA..zZ sort >>independant of case. > > > ORDER BY LOWER(person.lastname) > > or > > ORDER BY UPPER(person.lastname) > -- No virus found in this outgoing message. Checked by AVG Anti-Virus. Version: 7.0.296 / Virus Database: 265.5.4 - Release Date: 12/15/2004
Chris Smith <chris@interspire.com> writes: > Would doing it this way require an index: > > create index lower_lastname on table x lower(lastname); Well it doesn't *require* but it may be a good idea. It depends on your queries. It will NOT be useful for a query like: select * from x order by lower(lastname) where postgres won't bother with the index since it will be slower than just resorting the entire table. The way this index is useful is if you have queries of the form: select * from x where lower(lastname) between ? and ? order by lower(lastname) or select * from x order by lower(lastname) offset ? limit ? Though this will eventually switch to sorting when the offset is large. Better is to use something like: select * from x where lower(lastname) > ? order by lower(lastname) limit ? or perhaps something like this if a merge join with fast start is useful: select * from x join y on (x.lower(lastname)=y.lower(lastname)) But -- greg
Am Freitag, 17. Dezember 2004 01:59 schrieb Michael Fuhr: > > Initialize the database cluster with a locale setting other than "C". > > Hmmm...did I misunderstand something when I recommended using > ORDER BY LOWER(person.lastname)? Those are two different ways to achieve the same effect in this particular case. -- Peter Eisentraut http://developer.postgresql.org/~petere/
On Thu, Dec 16, 2004 at 23:33:00 -0500, Greg Stark <gsstark@mit.edu> wrote: > > Chris Smith <chris@interspire.com> writes: > > > Would doing it this way require an index: > > > > create index lower_lastname on table x lower(lastname); > > Well it doesn't *require* but it may be a good idea. It depends on your > queries. It will NOT be useful for a query like: > > select * from x order by lower(lastname) > > where postgres won't bother with the index since it will be slower than just > resorting the entire table. The way this index is useful is if you have > queries of the form: Using an index to do an order by is an order N operation. Doing a sort is an order N log N operation. For large values of N, a index will be faster. The index will slow down write operations, so it may still be a bad idea in some cases.
Bruno Wolff III <bruno@wolff.to> writes: > Using an index to do an order by is an order N operation. No, using an index to do an order by is actually still n*log(n). You have to traverse all the parent pages in the binary tree of the index as well. This only goes to show how small the log(n) component of the order is. It's easily dwarfed by large constants such as the difference in i/o efficiency from non-contiguous reads. -- greg
Bruno Wolff III <bruno@wolff.to> writes: > Greg Stark <gsstark@mit.edu> wrote: >> where postgres won't bother with the index since it will be slower than just >> resorting the entire table. > Using an index to do an order by is an order N operation. Doing a sort > is an order N log N operation. For large values of N, a index will be > faster. Unfortunately not ... the constant factors are such that the index solution isn't very competitive at large N, unless the table is already well ordered (ie clustered). The sort code is a lot better at avoiding random-seeks-all-over-the-disk syndrome. I suppose your argument is good as N goes to infinity, but for real-world cases we don't seem to reach the asymptotic regime. regards, tom lane
Greg Stark <gsstark@mit.edu> writes: > Bruno Wolff III <bruno@wolff.to> writes: >> Using an index to do an order by is an order N operation. > No, using an index to do an order by is actually still n*log(n). You have to > traverse all the parent pages in the binary tree of the index as well. Only if you searched afresh from the root for each key, which an indexscan is not going to do. regards, tom lane
--=======6C297FEF======= Content-Type: text/plain; x-avg-checked=avg-ok-43C64EFF; charset=us-ascii; format=flowed Content-Transfer-Encoding: 8bit At 12:14 PM 12/17/2004 -0500, Tom Lane wrote: >Bruno Wolff III <bruno@wolff.to> writes: > > Greg Stark <gsstark@mit.edu> wrote: > >> where postgres won't bother with the index since it will be slower > than just > >> resorting the entire table. > > > Using an index to do an order by is an order N operation. Doing a sort > > is an order N log N operation. For large values of N, a index will be > > faster. > >Unfortunately not ... the constant factors are such that the index >solution isn't very competitive at large N, unless the table is already >well ordered (ie clustered). The sort code is a lot better at avoiding >random-seeks-all-over-the-disk syndrome. Which would involve reading less data? In some cases I'd like to use the method that is more likely to fit within RAM. Also if there are multiple parallel queries, one could end up with something similar to random-seeks, so reading/writing less data could still be better. Regards, Link. --=======6C297FEF=======--
Tom Lane <tgl@sss.pgh.pa.us> writes: > Greg Stark <gsstark@mit.edu> writes: > > Bruno Wolff III <bruno@wolff.to> writes: > >> Using an index to do an order by is an order N operation. > > > No, using an index to do an order by is actually still n*log(n). You have to > > traverse all the parent pages in the binary tree of the index as well. > > Only if you searched afresh from the root for each key, which an > indexscan is not going to do. Isn't that still nlog(n)? In the end you're going to have read in every page of the index including all those non-leaf pages. Aren't there nlog(n) pages? -- greg
On Fri, Dec 17, 2004 at 15:36:58 -0500, Greg Stark <gsstark@mit.edu> wrote: > > Isn't that still nlog(n)? In the end you're going to have read in every page > of the index including all those non-leaf pages. Aren't there nlog(n) pages? The depth of the tree is log N, but there are only N nodes. I think you can treat the amount of information at each node as constant.
On Fri, Dec 17, 2004 at 15:12:18 -0600, Bruno Wolff III <bruno@wolff.to> wrote: > On Fri, Dec 17, 2004 at 15:36:58 -0500, > Greg Stark <gsstark@mit.edu> wrote: > > > > Isn't that still nlog(n)? In the end you're going to have read in every page > > of the index including all those non-leaf pages. Aren't there nlog(n) pages? > > The depth of the tree is log N, but there are only N nodes. I think you > can treat the amount of information at each node as constant. That should have been 2N-1 nodes.
Bruno Wolff III <bruno@wolff.to> writes: > Greg Stark <gsstark@mit.edu> wrote: >> Isn't that still nlog(n)? In the end you're going to have read in every page >> of the index including all those non-leaf pages. Aren't there nlog(n) pages? > The depth of the tree is log N, but there are only N nodes. I think you > can treat the amount of information at each node as constant. Besides, we don't read any non-leaf pages except the ones down the left edge of the btree. An indexscan just visits the leaf pages using their sibling links to get from one to the next. (If it did otherwise, we'd not produce the output in sorted order, which would make this discussion moot...) regards, tom lane