Re: Shouldn't non-MULTIBYTE backend refuse to start in MB database? - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: Shouldn't non-MULTIBYTE backend refuse to start in MB database?
Date
Msg-id 20010215172508E.t-ishii@sra.co.jp
In response to Re: Shouldn't non-MULTIBYTE backend refuse to start in MB database?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Shouldn't non-MULTIBYTE backend refuse to start in MB database?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
> >> Are these encodings all guaranteed to have the same collation order as
> >> SQL_ASCII?
> 
> > Yes & no. 
> 
> Um, I'm confused ...
> 
> >> If not, we have the same index corruption issues as for LOCALE.
> 
> > If the backend is configured with LOCALE enabled and the database is
> > not configured with LOCALE, we will have a problem. But this will
> > happen with or without MULTIBYTE anyway. Multibyte support has nothing
> > to do with LOCALE support.
> 
> Can a backend configured with MULTIBYTE and running in non-SQL_ASCII
> encoding ever sort strings in non-character-code ordering, even if it
> is in C locale?  I should think that such behavior is highly likely
> for multibyte character sets.

Hmm, I don't think I understand your point, probably because of my
English. Let me explain what I want to say in hex representation
rather than in English :-)

Suppose we have four EUC_JP multibyte strings, each consisting of two
bytes (they are actually the KANJI characters of my name). They would
look like:

0xc0d0
0xb0e6
0xc3a3
0xd7c9

If we sort these strings using strcmp(), we would get:

0xb0e6
0xc0d0
0xc3a3
0xd7c9

This result might not be perfect, but it is reasonable for most cases,
since the code value of each character in EUC_JP was assigned in the
hope that characters could be sorted by their physical (byte) values.
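
As a rough illustration of the ordering above, here is a minimal C
sketch that sorts the same four byte strings with qsort() and
strcmp(). The helper names (cmp_bytewise, sort_bytewise, euc_jp_names)
are made up for this example and are not backend code; it only shows
the plain byte-order comparison, with no locale involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator: compare two C strings with strcmp().
 * strcmp() compares bytes as unsigned char, so 0xb0 sorts before 0xc0. */
static int cmp_bytewise(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/* Hypothetical helper: sort an array of NUL-terminated strings in place
 * by byte value. */
static void sort_bytewise(const char **strs, size_t n)
{
    assert(strs != NULL);
    qsort(strs, n, sizeof strs[0], cmp_bytewise);
}

/* The four two-byte EUC_JP strings from this message, in the order given. */
static const char *euc_jp_names[] = {
    "\xc0\xd0", "\xb0\xe6", "\xc3\xa3", "\xd7\xc9"
};
```

Calling sort_bytewise(euc_jp_names, 4) should leave the array in the
order 0xb0e6, 0xc0d0, 0xc3a3, 0xd7c9, the same as the list above.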

If we are not satisfied with this result for some reason, we could
add an auxiliary "yomigana" field to get the correct order (yomigana
is the pronunciation of KANJI).
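
The auxiliary-field idea could be sketched in C roughly as follows:
each row carries the KANJI value plus a separate reading, and only the
reading is used as the sort key. The struct, the function names and
the sample keys are all hypothetical; in particular "reading-1" and
"reading-2" are placeholders, not real yomigana for these characters.
In a database this would correspond to ordering by the auxiliary
column instead of the KANJI column.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical row: the stored value plus an auxiliary sort key. */
struct name_row {
    const char *kanji;     /* value as stored, e.g. EUC_JP bytes */
    const char *yomigana;  /* reading, used only for ordering */
};

/* qsort() comparator: order rows by the auxiliary key, ignoring kanji. */
static int cmp_by_yomigana(const void *a, const void *b)
{
    const struct name_row *ra = a;
    const struct name_row *rb = b;
    return strcmp(ra->yomigana, rb->yomigana);
}

/* Hypothetical helper: sort rows in place by their yomigana field. */
static void sort_by_yomigana(struct name_row *rows, size_t n)
{
    assert(rows != NULL);
    qsort(rows, n, sizeof rows[0], cmp_by_yomigana);
}

/* Sample rows with placeholder keys (NOT real readings). Byte order of
 * the kanji field would put 0xb0e6 first; the keys reverse that. */
static struct name_row sample_rows[] = {
    { "\xb0\xe6", "reading-2" },
    { "\xc0\xd0", "reading-1" }
};
```

After sort_by_yomigana(sample_rows, 2), the 0xc0d0 row comes first even
though its bytes compare greater, because only the key is compared.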

> If it can, then we mustn't allow a non-MULTIBYTE backend to run in
> such a database, I think.
> 
>             regards, tom lane

Can you explain more about this?
--
Tatsuo Ishii

