Thread: case insensitive sorting & searching in oracle 10g

case insensitive sorting & searching in oracle 10g

From
David Garamond
Date:
in oracle 10g, you can issue:

  ALTER SESSION SET NLS_COMP = ansi;
  ALTER SESSION SET NLS_SORT = binary_ci;

do you think this is an elegant solution for case insensitive sorting &
searching? is there interest in seeing this in postgres?

--
dave


Re: case insensitive sorting & searching in oracle 10g

From
Pierre-Frédéric Caillaud
Date:
    create a functional index on lower case value of your column.
    ORDER BY lower case value of your column.


> in oracle 10g, you can issue:
>
>   ALTER SESSION SET NLS_COMP = ansi;
>   ALTER SESSION SET NLS_SORT = binary_ci;
>
> do you think this is an elegant solution for case insensitive sorting &
> searching? is there interest in seeing this in postgres?
>



Re: case insensitive sorting & searching in oracle 10g

From
David Garamond
Date:
- not transparent
- can't automatically make all values fed to SELECT case-converted
- not transparent

Pierre-Frédéric Caillaud wrote:
>
>     create a functional index on lower case value of your column.
>     ORDER BY lower case value of your column.
>
>
>> in oracle 10g, you can issue:
>>
>>   ALTER SESSION SET NLS_COMP = ansi;
>>   ALTER SESSION SET NLS_SORT = binary_ci;
>>
>> do you think this is an elegant solution for case insensitive sorting
>> &  searching? is there interest in seeing this in postgres?

--
dave


Re: case insensitive sorting & searching in oracle 10g

From
Tommi Maekitalo
Date:
Hi,

I don't like the solution. "Select ... order by ..." should be self-contained
and not dependant of some settings. Case-insensitive sort should be specified
in the order-by-clause like "select ... order by lower(a)".


Tommi

Am Donnerstag, 5. August 2004 11:04 schrieb David Garamond:
> in oracle 10g, you can issue:
>
>   ALTER SESSION SET NLS_COMP = ansi;
>   ALTER SESSION SET NLS_SORT = binary_ci;
>
> do you think this is an elegant solution for case insensitive sorting &
> searching? is there interest in seeing this in postgres?

Re: case insensitive sorting & searching in oracle 10g

From
Stephan Szabo
Date:
On Thu, 5 Aug 2004, David Garamond wrote:

> in oracle 10g, you can issue:
>
>   ALTER SESSION SET NLS_COMP = ansi;
>   ALTER SESSION SET NLS_SORT = binary_ci;
>
> do you think this is an elegant solution for case insensitive sorting &
> searching? is there interest in seeing this in postgres?

IMHO, no on both questions.  There's always danger on relying on the
value of session variables in general in that an application must either
set the variable immediately before sending queries that use it (breaking
the transparency) or must be willing to deal with the fact that it might
not be what you expect.  For the second, I don't see how this really does
much that the standard spec collation stuff can't do better and I'd think
that'd be a much better route to go.


Re: case insensitive sorting & searching in oracle 10g

From
David Garamond
Date:
Stephan Szabo wrote:
>>in oracle 10g, you can issue:
>>
>>  ALTER SESSION SET NLS_COMP = ansi;
>>  ALTER SESSION SET NLS_SORT = binary_ci;
>>
>>do you think this is an elegant solution for case insensitive sorting &
>>searching? is there interest in seeing this in postgres?
>
> IMHO, no on both questions.  There's always danger on relying on the
> value of session variables in general in that an application must either
> set the variable immediately before sending queries that use it (breaking
> the transparency) or must be willing to deal with the fact that it might
> not be what you expect.  For the second, I don't see how this really does
> much that the standard spec collation stuff can't do better and I'd think
> that'd be a much better route to go.

Could you point me where in the archives can I read more? I'm having a
bit of trouble finding discussion on this. Thanks.

--
dave

Re: case insensitive sorting & searching in oracle 10g

From
Stephan Szabo
Date:
On Thu, 5 Aug 2004, David Garamond wrote:

> Stephan Szabo wrote:
> >>in oracle 10g, you can issue:
> >>
> >>  ALTER SESSION SET NLS_COMP = ansi;
> >>  ALTER SESSION SET NLS_SORT = binary_ci;
> >>
> >>do you think this is an elegant solution for case insensitive sorting &
> >>searching? is there interest in seeing this in postgres?
> >
> > IMHO, no on both questions.  There's always danger on relying on the
> > value of session variables in general in that an application must either
> > set the variable immediately before sending queries that use it (breaking
> > the transparency) or must be willing to deal with the fact that it might
> > not be what you expect.  For the second, I don't see how this really does
> > much that the standard spec collation stuff can't do better and I'd think
> > that'd be a much better route to go.
>
> Could you point me where in the archives can I read more? I'm having a
> bit of trouble finding discussion on this. Thanks.

I didn't spend too much time looking, but there are a few that look like
they'll touch upon related issues:

http://archives.postgresql.org/pgsql-hackers/2003-11/msg01299.php
http://archives.postgresql.org/pgsql-hackers/2001-11/msg00610.php
http://archives.postgresql.org/pgsql-hackers/2002-11/msg00515.php

And a message where I pulled some text out of the SQL92 draft:
http://archives.postgresql.org/pgsql-general/2003-08/msg00620.php

Re: case insensitive sorting & searching in oracle 10g

From
Pierre-Frédéric Caillaud
Date:
> IMHO, no on both questions.  There's always danger on relying on the
> value of session variables in general in that an application must either

    And what if you use a connection sharing/pooling software ? What happens
with the session vars ?

Re: case insensitive sorting & searching in oracle 10g

From
David Garamond
Date:
Stephan Szabo wrote:
>>Could you point me where in the archives can I read more? I'm having a
>>bit of trouble finding discussion on this. Thanks.
>
> I didn't spend too much time looking, but there are a few that look like
> they'll touch upon related issues:
>
> http://archives.postgresql.org/pgsql-hackers/2003-11/msg01299.php
> http://archives.postgresql.org/pgsql-hackers/2001-11/msg00610.php
> http://archives.postgresql.org/pgsql-hackers/2002-11/msg00515.php

So, as I understand it, the current plan is:

1. charset + encoding will be tagged to each column (as per SQL standard)

2a. individual string values will be tagged with charset+encoding. this
incurs an overhead of 1-2 bytes per value.

or

2b. all string values will be stored in a single charset+encoding (e.g.
unicode + utf8). this will of course upset some people, e.g. japanese.

Is it 1+2a or 1+2b? Recent language implementations/VM like Parrot and
Ruby2 are inclined to 2a, I think.

--
dave

Re: case insensitive sorting & searching in oracle 10g

From
David Garamond
Date:
David Garamond wrote:
> 2a. individual string values will be tagged with charset+encoding. this
> incurs an overhead of 1-2 bytes per value.

forgot to add: this overhead is just for "in-memory" or temporary value
(e.g. when being passed as arguments). in the storage itself, this is
not needed because charset+encoding is recorded in the column definition.

--
dave