Thread: Index order

Index order

From
Glyn Astill
Date:
Hi chaps,

I was just wondering if there's any way to tweak the way an an index is ordered on a text field?

I have a table with a varchar field "artist" (see table def below), and if I order by the field "artist" it does not
orderas expected when there are characters such as ^ at the start of the text.  For example, the query "select artist
fromartist order by artist;" orders as follows: 

 BILLY BRAGG
 BILLY COBHAM & ASERE
 ^ BILLY ELLIOT 40 OFFER
 ^ BILLY ELLIOT HARD ROCK CAFE MEAL OFFER
 BILLY ELLIOT THE MUSICAL
 BILLY OCEAN
 BIOHAZARD


I'd expect the rows starting with the caret to appear either at the start or end of, rather than in the middle, it
appearsas if the index ignores them. 

Database locale is Latin1

Am I missing something obvious here?

Thanks
Glyn


CREATE TABLE events.artist
(
  recnum bigint NOT NULL DEFAULT nextval(('"events"."artist_dfseq"'::text)::regclass),
  artist character varying(50) NOT NULL DEFAULT ' '::character varying,
  price_list character varying(4) NOT NULL DEFAULT ' '::character varying,
  CONSTRAINT artist_index01 PRIMARY KEY (artist)
)

ALTER TABLE events.artist
  ADD CONSTRAINT artist_index01 PRIMARY KEY(artist);




Re: Index order

From
Reg Me Please
Date:
Use a functional index!

Il Tuesday 16 September 2008 14:34:44 Glyn Astill ha scritto:
> Hi chaps,
>
> I was just wondering if there's any way to tweak the way an an index is
> ordered on a text field?
>
> I have a table with a varchar field "artist" (see table def below), and if
> I order by the field "artist" it does not order as expected when there are
> characters such as ^ at the start of the text.  For example, the query
> "select artist from artist order by artist;" orders as follows:
>
>  BILLY BRAGG
>  BILLY COBHAM & ASERE
>  ^ BILLY ELLIOT 40 OFFER
>  ^ BILLY ELLIOT HARD ROCK CAFE MEAL OFFER
>  BILLY ELLIOT THE MUSICAL
>  BILLY OCEAN
>  BIOHAZARD
>
>
> I'd expect the rows starting with the caret to appear either at the start
> or end of, rather than in the middle, it appears as if the index ignores
> them.
>
> Database locale is Latin1
>
> Am I missing something obvious here?
>
> Thanks
> Glyn
>
>
> CREATE TABLE events.artist
> (
>   recnum bigint NOT NULL DEFAULT
> nextval(('"events"."artist_dfseq"'::text)::regclass), artist character
> varying(50) NOT NULL DEFAULT ' '::character varying, price_list character
> varying(4) NOT NULL DEFAULT ' '::character varying, CONSTRAINT
> artist_index01 PRIMARY KEY (artist)
> )
>
> ALTER TABLE events.artist
>   ADD CONSTRAINT artist_index01 PRIMARY KEY(artist);



Re: Index order

From
Oleg Bartunov
Date:
On Tue, 16 Sep 2008, Glyn Astill wrote:

> Hi chaps,
>
> I was just wondering if there's any way to tweak the way an an index is ordered on a text field?

Yes, it's called functional index. Write your function, which does
whatever you want with your data and create index (foo(artist))


>
> I have a table with a varchar field "artist" (see table def below), and if I order by the field "artist" it does not
orderas expected when there are characters such as ^ at the start of the text.  For example, the query "select artist
fromartist order by artist;" orders as follows: 
>
> BILLY BRAGG
> BILLY COBHAM & ASERE
> ^ BILLY ELLIOT 40 OFFER
> ^ BILLY ELLIOT HARD ROCK CAFE MEAL OFFER
> BILLY ELLIOT THE MUSICAL
> BILLY OCEAN
> BIOHAZARD
>
>
> I'd expect the rows starting with the caret to appear either at the start or end of, rather than in the middle, it
appearsas if the index ignores them. 
>
> Database locale is Latin1
>
> Am I missing something obvious here?
>
> Thanks
> Glyn
>
>
> CREATE TABLE events.artist
> (
>  recnum bigint NOT NULL DEFAULT nextval(('"events"."artist_dfseq"'::text)::regclass),
>  artist character varying(50) NOT NULL DEFAULT ' '::character varying,
>  price_list character varying(4) NOT NULL DEFAULT ' '::character varying,
>  CONSTRAINT artist_index01 PRIMARY KEY (artist)
> )
>
> ALTER TABLE events.artist
>  ADD CONSTRAINT artist_index01 PRIMARY KEY(artist);
>
>
>
>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

Re: Index order

From
Glyn Astill
Date:
>
> > Hi chaps,
> >
> > I was just wondering if there's any way to tweak
> the way an an index is ordered on a text field?
>
> Yes, it's called functional index. Write your function,
> which does
> whatever you want with your data and create index
> (foo(artist))
>

Hmm, I understand what you're saying, but how ion earth do I create a function that reorders the result based on all
thedifferent characters ^ . * etc that could cause this? 

Also does anyone know why this happens? Why is the caret ignored ibn the index order - surely that just plain wrong?






Re: Index order

From
Tom Lane
Date:
Glyn Astill <glynastill@yahoo.co.uk> writes:
> I was just wondering if there's any way to tweak the way an an index is ordered on a text field?

> I have a table with a varchar field "artist" (see table def below),
and if I order by the field "artist" it does not order as expected when
there are characters such as ^ at the start of the text.

You didn't say what you consider "expected", but I suspect that the
answer is you want to use a different locale setting ... probably C.
Unfortunately there's currently no way to change that short of initdb.

            regards, tom lane

Re: Index order

From
Glyn Astill
Date:
>
> > I have a table with a varchar field "artist"
> (see table def below),
> and if I order by the field "artist" it does not
> order as expected when
> there are characters such as ^ at the start of the text.
>
> You didn't say what you consider "expected",
> but I suspect that the
> answer is you want to use a different locale setting ...
> probably C.
> Unfortunately there's currently no way to change that
> short of initdb.
>

Sorry. I expected the index to order A-Z with anything starting with the carret (or other characters) being either at
theend or at the start, not in the middle.  I know it's naive, but for some reason I expected the index would order any
textin alphabetical order e.g. AA,AB,AC,ZA,ZZ,^A,^Z ...etc 

I guess if the only soloution is initdb and a different locale, I'll have to find a workaround elsewhere.  The client
applicationsthat rely on this are a pile of junk anyway. 





Re: Index order

From
Gregory Stark
Date:
Glyn Astill <glynastill@yahoo.co.uk> writes:

> I'd expect the rows starting with the caret to appear either at the start or
> end of, rather than in the middle, it appears as if the index ignores them.
>
> Database locale is Latin1

Latin1 isn't a locale, it's a character set (and an encoding).

Your locale is probably something like en_US which generally sorts as you show
above. It sounds like you're expecting the C locale sorting which is the ascii
order.

You can either re-initdb your database with locale set to C, or on recent
versions of Postgres you can use "ORDER BY artist USING ~>~" which
specifically sorts based on the binary ascii order.

If you want the query to use an index you would have to built it with
something like:

create index idx on foo (artist text_pattern_ops);

Note that this ordering won't do anything sane with any accented characters or
in latin1 since they'll be sorted by their binary value which comes after all
the unaccented characters.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

Re: Index order

From
Korry Douglas
Date:
> Hmm, I understand what you're saying, but how ion earth do I create
> a function that reorders the result based on all the different
> characters ^ . * etc that could cause this?

Write a function that strips out the characters you want to ignore
(returning the rest of the given string) and then create an index on
that function.

(Please note: that's an answer to the question that you asked above,
not necessarily a recommendation that you choose this particular
collation strategy).


                -- Korry


Re: Index order

From
Glyn Astill
Date:
> Write a function that strips out the characters you want to
> ignore
> (returning the rest of the given string) and then create an
> index on
> that function.
>

Hmm, thanks for the suggesgion. Although the problem seems to have been that the locale I was using was ignoring those
characterswhen they should have been taken into account, so taking them out wouldn't have helped. 

It's all academic now anyway, I dumped the databases on all the servers, re-initdb'd, reloaded the data and fudged
slonyback in.