Thread: Index order
Hi chaps, I was just wondering if there's any way to tweak the way an an index is ordered on a text field? I have a table with a varchar field "artist" (see table def below), and if I order by the field "artist" it does not orderas expected when there are characters such as ^ at the start of the text. For example, the query "select artist fromartist order by artist;" orders as follows: BILLY BRAGG BILLY COBHAM & ASERE ^ BILLY ELLIOT 40 OFFER ^ BILLY ELLIOT HARD ROCK CAFE MEAL OFFER BILLY ELLIOT THE MUSICAL BILLY OCEAN BIOHAZARD I'd expect the rows starting with the caret to appear either at the start or end of, rather than in the middle, it appearsas if the index ignores them. Database locale is Latin1 Am I missing something obvious here? Thanks Glyn CREATE TABLE events.artist ( recnum bigint NOT NULL DEFAULT nextval(('"events"."artist_dfseq"'::text)::regclass), artist character varying(50) NOT NULL DEFAULT ' '::character varying, price_list character varying(4) NOT NULL DEFAULT ' '::character varying, CONSTRAINT artist_index01 PRIMARY KEY (artist) ) ALTER TABLE events.artist ADD CONSTRAINT artist_index01 PRIMARY KEY(artist);
Use a functional index! Il Tuesday 16 September 2008 14:34:44 Glyn Astill ha scritto: > Hi chaps, > > I was just wondering if there's any way to tweak the way an an index is > ordered on a text field? > > I have a table with a varchar field "artist" (see table def below), and if > I order by the field "artist" it does not order as expected when there are > characters such as ^ at the start of the text. For example, the query > "select artist from artist order by artist;" orders as follows: > > BILLY BRAGG > BILLY COBHAM & ASERE > ^ BILLY ELLIOT 40 OFFER > ^ BILLY ELLIOT HARD ROCK CAFE MEAL OFFER > BILLY ELLIOT THE MUSICAL > BILLY OCEAN > BIOHAZARD > > > I'd expect the rows starting with the caret to appear either at the start > or end of, rather than in the middle, it appears as if the index ignores > them. > > Database locale is Latin1 > > Am I missing something obvious here? > > Thanks > Glyn > > > CREATE TABLE events.artist > ( > recnum bigint NOT NULL DEFAULT > nextval(('"events"."artist_dfseq"'::text)::regclass), artist character > varying(50) NOT NULL DEFAULT ' '::character varying, price_list character > varying(4) NOT NULL DEFAULT ' '::character varying, CONSTRAINT > artist_index01 PRIMARY KEY (artist) > ) > > ALTER TABLE events.artist > ADD CONSTRAINT artist_index01 PRIMARY KEY(artist);
On Tue, 16 Sep 2008, Glyn Astill wrote: > Hi chaps, > > I was just wondering if there's any way to tweak the way an an index is ordered on a text field? Yes, it's called functional index. Write your function, which does whatever you want with your data and create index (foo(artist)) > > I have a table with a varchar field "artist" (see table def below), and if I order by the field "artist" it does not orderas expected when there are characters such as ^ at the start of the text. For example, the query "select artist fromartist order by artist;" orders as follows: > > BILLY BRAGG > BILLY COBHAM & ASERE > ^ BILLY ELLIOT 40 OFFER > ^ BILLY ELLIOT HARD ROCK CAFE MEAL OFFER > BILLY ELLIOT THE MUSICAL > BILLY OCEAN > BIOHAZARD > > > I'd expect the rows starting with the caret to appear either at the start or end of, rather than in the middle, it appearsas if the index ignores them. > > Database locale is Latin1 > > Am I missing something obvious here? > > Thanks > Glyn > > > CREATE TABLE events.artist > ( > recnum bigint NOT NULL DEFAULT nextval(('"events"."artist_dfseq"'::text)::regclass), > artist character varying(50) NOT NULL DEFAULT ' '::character varying, > price_list character varying(4) NOT NULL DEFAULT ' '::character varying, > CONSTRAINT artist_index01 PRIMARY KEY (artist) > ) > > ALTER TABLE events.artist > ADD CONSTRAINT artist_index01 PRIMARY KEY(artist); > > > > > Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
> > > Hi chaps, > > > > I was just wondering if there's any way to tweak > the way an an index is ordered on a text field? > > Yes, it's called functional index. Write your function, > which does > whatever you want with your data and create index > (foo(artist)) > Hmm, I understand what you're saying, but how ion earth do I create a function that reorders the result based on all thedifferent characters ^ . * etc that could cause this? Also does anyone know why this happens? Why is the caret ignored ibn the index order - surely that just plain wrong?
Glyn Astill <glynastill@yahoo.co.uk> writes: > I was just wondering if there's any way to tweak the way an an index is ordered on a text field? > I have a table with a varchar field "artist" (see table def below), and if I order by the field "artist" it does not order as expected when there are characters such as ^ at the start of the text. You didn't say what you consider "expected", but I suspect that the answer is you want to use a different locale setting ... probably C. Unfortunately there's currently no way to change that short of initdb. regards, tom lane
> > > I have a table with a varchar field "artist" > (see table def below), > and if I order by the field "artist" it does not > order as expected when > there are characters such as ^ at the start of the text. > > You didn't say what you consider "expected", > but I suspect that the > answer is you want to use a different locale setting ... > probably C. > Unfortunately there's currently no way to change that > short of initdb. > Sorry. I expected the index to order A-Z with anything starting with the carret (or other characters) being either at theend or at the start, not in the middle. I know it's naive, but for some reason I expected the index would order any textin alphabetical order e.g. AA,AB,AC,ZA,ZZ,^A,^Z ...etc I guess if the only soloution is initdb and a different locale, I'll have to find a workaround elsewhere. The client applicationsthat rely on this are a pile of junk anyway.
Glyn Astill <glynastill@yahoo.co.uk> writes: > I'd expect the rows starting with the caret to appear either at the start or > end of, rather than in the middle, it appears as if the index ignores them. > > Database locale is Latin1 Latin1 isn't a locale, it's a character set (and an encoding). Your locale is probably something like en_US which generally sorts as you show above. It sounds like you're expecting the C locale sorting which is the ascii order. You can either re-initdb your database with locale set to C, or on recent versions of Postgres you can use "ORDER BY artist USING ~>~" which specifically sorts based on the binary ascii order. If you want the query to use an index you would have to built it with something like: create index idx on foo (artist text_pattern_ops); Note that this ordering won't do anything sane with any accented characters or in latin1 since they'll be sorted by their binary value which comes after all the unaccented characters. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
> Hmm, I understand what you're saying, but how ion earth do I create > a function that reorders the result based on all the different > characters ^ . * etc that could cause this? Write a function that strips out the characters you want to ignore (returning the rest of the given string) and then create an index on that function. (Please note: that's an answer to the question that you asked above, not necessarily a recommendation that you choose this particular collation strategy). -- Korry
> Write a function that strips out the characters you want to > ignore > (returning the rest of the given string) and then create an > index on > that function. > Hmm, thanks for the suggesgion. Although the problem seems to have been that the locale I was using was ignoring those characterswhen they should have been taken into account, so taking them out wouldn't have helped. It's all academic now anyway, I dumped the databases on all the servers, re-initdb'd, reloaded the data and fudged slonyback in.