Thread: case insensitive sorting & searching in oracle 10g
in oracle 10g, you can issue: ALTER SESSION SET NLS_COMP = ansi; ALTER SESSION SET NLS_SORT = binary_ci; do you think this is an elegant solution for case insensitive sorting & searching? is there interest in seeing this in postgres? -- dave
create a functional index on lower case value of your column. ORDER BY lower case value of your column. > in oracle 10g, you can issue: > > ALTER SESSION SET NLS_COMP = ansi; > ALTER SESSION SET NLS_SORT = binary_ci; > > do you think this is an elegant solution for case insensitive sorting & > searching? is there interest in seeing this in postgres? >
- not transparent - can't automatically make all values fed to SELECT case-converted - not transparent Pierre-Frédéric Caillaud wrote: > > create a functional index on lower case value of your column. > ORDER BY lower case value of your column. > > >> in oracle 10g, you can issue: >> >> ALTER SESSION SET NLS_COMP = ansi; >> ALTER SESSION SET NLS_SORT = binary_ci; >> >> do you think this is an elegant solution for case insensitive sorting >> & searching? is there interest in seeing this in postgres? -- dave
Hi, I don't like the solution. "Select ... order by ..." should be self-contained and not dependant of some settings. Case-insensitive sort should be specified in the order-by-clause like "select ... order by lower(a)". Tommi Am Donnerstag, 5. August 2004 11:04 schrieb David Garamond: > in oracle 10g, you can issue: > > ALTER SESSION SET NLS_COMP = ansi; > ALTER SESSION SET NLS_SORT = binary_ci; > > do you think this is an elegant solution for case insensitive sorting & > searching? is there interest in seeing this in postgres?
On Thu, 5 Aug 2004, David Garamond wrote: > in oracle 10g, you can issue: > > ALTER SESSION SET NLS_COMP = ansi; > ALTER SESSION SET NLS_SORT = binary_ci; > > do you think this is an elegant solution for case insensitive sorting & > searching? is there interest in seeing this in postgres? IMHO, no on both questions. There's always danger on relying on the value of session variables in general in that an application must either set the variable immediately before sending queries that use it (breaking the transparency) or must be willing to deal with the fact that it might not be what you expect. For the second, I don't see how this really does much that the standard spec collation stuff can't do better and I'd think that'd be a much better route to go.
Stephan Szabo wrote: >>in oracle 10g, you can issue: >> >> ALTER SESSION SET NLS_COMP = ansi; >> ALTER SESSION SET NLS_SORT = binary_ci; >> >>do you think this is an elegant solution for case insensitive sorting & >>searching? is there interest in seeing this in postgres? > > IMHO, no on both questions. There's always danger on relying on the > value of session variables in general in that an application must either > set the variable immediately before sending queries that use it (breaking > the transparency) or must be willing to deal with the fact that it might > not be what you expect. For the second, I don't see how this really does > much that the standard spec collation stuff can't do better and I'd think > that'd be a much better route to go. Could you point me where in the archives can I read more? I'm having a bit of trouble finding discussion on this. Thanks. -- dave
On Thu, 5 Aug 2004, David Garamond wrote: > Stephan Szabo wrote: > >>in oracle 10g, you can issue: > >> > >> ALTER SESSION SET NLS_COMP = ansi; > >> ALTER SESSION SET NLS_SORT = binary_ci; > >> > >>do you think this is an elegant solution for case insensitive sorting & > >>searching? is there interest in seeing this in postgres? > > > > IMHO, no on both questions. There's always danger on relying on the > > value of session variables in general in that an application must either > > set the variable immediately before sending queries that use it (breaking > > the transparency) or must be willing to deal with the fact that it might > > not be what you expect. For the second, I don't see how this really does > > much that the standard spec collation stuff can't do better and I'd think > > that'd be a much better route to go. > > Could you point me where in the archives can I read more? I'm having a > bit of trouble finding discussion on this. Thanks. I didn't spend too much time looking, but there are a few that look like they'll touch upon related issues: http://archives.postgresql.org/pgsql-hackers/2003-11/msg01299.php http://archives.postgresql.org/pgsql-hackers/2001-11/msg00610.php http://archives.postgresql.org/pgsql-hackers/2002-11/msg00515.php And a message where I pulled some text out of the SQL92 draft: http://archives.postgresql.org/pgsql-general/2003-08/msg00620.php
> IMHO, no on both questions. There's always danger on relying on the > value of session variables in general in that an application must either And what if you use a connection sharing/pooling software ? What happens with the session vars ?
Stephan Szabo wrote: >>Could you point me where in the archives can I read more? I'm having a >>bit of trouble finding discussion on this. Thanks. > > I didn't spend too much time looking, but there are a few that look like > they'll touch upon related issues: > > http://archives.postgresql.org/pgsql-hackers/2003-11/msg01299.php > http://archives.postgresql.org/pgsql-hackers/2001-11/msg00610.php > http://archives.postgresql.org/pgsql-hackers/2002-11/msg00515.php So, as I understand it, the current plan is: 1. charset + encoding will be tagged to each column (as per SQL standard) 2a. individual string values will be tagged with charset+encoding. this incurs an overhead of 1-2 bytes per value. or 2b. all string values will be stored in a single charset+encoding (e.g. unicode + utf8). this will of course upset some people, e.g. japanese. Is it 1+2a or 1+2b? Recent language implementations/VM like Parrot and Ruby2 are inclined to 2a, I think. -- dave
David Garamond wrote: > 2a. individual string values will be tagged with charset+encoding. this > incurs an overhead of 1-2 bytes per value. forgot to add: this overhead is just for "in-memory" or temporary value (e.g. when being passed as arguments). in the storage itself, this is not needed because charset+encoding is recorded in the column definition. -- dave