Thread: Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'

Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'

From

Nico Grubert

Date:

30 October 2005, 12:15:21

Hi there,

I have a problem when sorting records with:
SELECT * FROM table WHERE name LIKE 'Ö%'

I am running Postgres 8.02 with a database whose character encoding is
UNICODE.

The SQL Query

   SELECT *
     FROM member
     WHERE name LIKE 'O%'
           OR
           name like 'Ö%'
     ORDER BY name


returns this:
  Öhlmann
  Öhmann
  Obenaus
  Ochoa
  O'Donovan
  Oehme
  Oklant
  Oltub
  Oltüch
  Oltutz
  Oltüwer

According to german sorting rules the result is fine except the both
first entries "Öhlmann" and "Öhmann".
Why do appear these records at the beginning of the list?
The proper result should read like this:
  Obenaus
  Ochoa
  O'Donovan
  Oehme
  Öhlmann
  Öhmann
  Oklant
  Oltub
  Oltüch
  Oltutz
  Oltüwer



The same problem accours when using "E" where my result is this:
   Élie de Beaumont
   Eberer
   Ecü
   Edding
   Emmer

The proper result should be:
   Eberer
   Ecü
   Edding
   Élie de Beaumont
   Emmer


Any idea how I can solve this problem?


Thank you very much in advance,
Nico

Re: Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'

From

Tom Lane

Date:

30 October 2005, 12:31:51

Nico Grubert <nicogrubert@arcor.de> writes:
> I have a problem when sorting records with:
> SELECT * FROM table WHERE name LIKE '�%'

> I am running Postgres 8.02 with a database whose character encoding is
> UNICODE.

... but what locale is it using?  (See LC_COLLATE and LC_CTYPE.)

            regards, tom lane

Re: [GENERAL] Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'

From

Nico Grubert

Date:

30 October 2005, 12:36:10

> ... but what locale is it using?  (See LC_COLLATE and LC_CTYPE.)

Can I find out out these settings in "phpPgAdmin"?
Or can I use LC_COLLATE and LC_CTYPE in the SQL Query?

Re: [GENERAL] Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'

From

Nico Grubert

Date:

30 October 2005, 12:38:05

Ah, I found it:

lc_collate:     de_DE@euro
lc_ctype:     de_DE@euro

Re: [GENERAL] Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'

From

Nico Grubert

Date:

31 October 2005, 13:37:59

> I have a problem when sorting records with:
> SELECT * FROM table WHERE name LIKE 'Ö%'
>
> I am running Postgres 8.02 with a database whose character encoding is
> UNICODE.
>
> The SQL Query
>
>   SELECT *
>     FROM member
>     WHERE name LIKE 'O%'
>           OR
>           name like 'Ö%'
>     ORDER BY name
>
>
> returns this:
>  Öhlmann
>  Öhmann
>  Obenaus
>  Ochoa
>  O'Donovan
>  Oehme
>  Oklant
>  Oltub
>  Oltüch
>  Oltutz
>  Oltüwer
>
> According to german sorting rules the result is fine except the both
> first entries "Öhlmann" and "Öhmann".
> Why do appear these records at the beginning of the list?
> The proper result should read like this:
>  Obenaus
>  Ochoa
>  O'Donovan
>  Oehme
>  Öhlmann
>  Öhmann
>  Oklant
>  Oltub
>  Oltüch
>  Oltutz
>  Oltüwer
>
>
>
> The same problem accours when using "E" where my result is this:
>   Élie de Beaumont
>   Eberer
>   Ecü
>   Edding
>   Emmer
>
> The proper result should be:
>   Eberer
>   Ecü
>   Edding
>   Élie de Beaumont
>   Emmer
>
>
> Any idea how I can solve this problem?
>
>
> Thank you very much in advance,
> Nico

To complete the missing information, here are the variables set for the
databases:
add_missing_from     on
archive_command     unset
australian_timezones     off
authentication_timeout     60
bgwriter_delay     200
bgwriter_maxpages     100
bgwriter_percent     1
block_size     8192
check_function_bodies     on
checkpoint_segments     3
checkpoint_timeout     300
checkpoint_warning     30
client_encoding     UNICODE
client_min_messages     notice
commit_delay     0
commit_siblings     5
cpu_index_tuple_cost     0.001
cpu_operator_cost     0.0025
cpu_tuple_cost     0.01
custom_variable_classes     unset
DateStyle     ISO, MDY
db_user_namespace     off
deadlock_timeout     1000
debug_pretty_print     off
debug_print_parse     off
debug_print_plan     off
debug_print_rewritten     off
debug_shared_buffers     0
default_statistics_target     10
default_tablespace     unset
default_transaction_isolation     read committed
default_transaction_read_only     off
default_with_oids     on
effective_cache_size     1000
enable_hashagg     on
enable_hashjoin     on
enable_indexscan     on
enable_mergejoin     on
enable_nestloop     on
enable_seqscan     on
enable_sort     on
enable_tidscan     on
explain_pretty_print     on
extra_float_digits     0
from_collapse_limit     8
fsync     on
geqo     on
geqo_effort     5
geqo_generations     0
geqo_pool_size     0
geqo_selection_bias     2
geqo_threshold     12
integer_datetimes     on
join_collapse_limit     8
lc_collate     de_DE@euro
lc_ctype     de_DE@euro
lc_messages     de_DE@euro
lc_monetary     de_DE@euro
lc_numeric     de_DE@euro
lc_time     de_DE@euro
listen_addresses     localhost
log_connections     off
log_destination     stderr
log_disconnections     off
log_duration     off
log_error_verbosity     default
log_executor_stats     off
log_hostname     off
log_line_prefix     unset
log_min_duration_statement     -1
log_min_error_statement     panic
log_min_messages     notice
log_parser_stats     off
log_planner_stats     off
log_rotation_age     1440
log_rotation_size     10240
log_statement     none
log_statement_stats     off
log_truncate_on_rotation     off
maintenance_work_mem     16384
max_connections     100
max_files_per_process     1000
max_fsm_pages     20000
max_fsm_relations     1000
max_function_args     32
max_identifier_length     63
max_index_keys     32
max_locks_per_transaction     64
max_stack_depth     2048
password_encryption     on
port     5432
pre_auth_delay     0
random_page_cost     4
redirect_stderr     off
regex_flavor     advanced
rendezvous_name     unset
search_path     $user,public
server_encoding     UNICODE
server_version     8.0.2
shared_buffers     1000
silent_mode     off
sql_inheritance     on
ssl     off
statement_timeout     0
stats_block_level     off
stats_command_string     off
stats_reset_on_server_start     on
stats_row_level     off
stats_start_collector     on
superuser_reserved_connections     2
syslog_facility     LOCAL0
syslog_ident     postgres
TimeZone     Europe/Berlin
trace_notify     off
transaction_isolation     read committed
transaction_read_only     off
transform_null_equals     off
unix_socket_group     unset
unix_socket_permissions     511
vacuum_cost_delay     0
vacuum_cost_limit     200
vacuum_cost_page_dirty     20
vacuum_cost_page_hit     1
vacuum_cost_page_miss     10
wal_buffers     8
wal_sync_method     fdatasync
work_mem     1024
zero_damaged_pages     off


Is there any explaination why the result is not sorted properly?

Re: Re: [GENERAL] Sorting problems with SELECT * FROM

From

"Daniel Verite"

Date:

31 October 2005, 16:34:50

    Nico Grubert wrote:

> Ah, I found it:
>
> lc_collate:     de_DE@euro
> lc_ctype:     de_DE@euro

This is an iso-8859-15 locale, isn't it?
If your database encoding is UNICODE, I believe you'd have more success
using an  UTF8 locale, such as de_DE.UTF-8 in your case.

--
 Daniel
 PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org