Re: handling unconvertible error messages - Mailing list pgsql-hackers

From Victor Wagner
Subject Re: handling unconvertible error messages
Msg-id 20160810095046.4187f74b@fafnir.local.vm
In response to Re: handling unconvertible error messages  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: handling unconvertible error messages  (Vladimir Sitnikov <sitnikov.vladimir@gmail.com>)
List pgsql-hackers
On Wed, 10 Aug 2016 11:08:43 +0900 (Tokyo Standard Time)
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:

> Hello,
> 
> (I've recovered the lost Cc recipients so far)
> 
> At Mon, 8 Aug 2016 12:52:11 +0300, Victor Wagner <vitus@wagner.pp.ru>
> wrote in <20160808125211.1361cc0f@fafnir.local.vm>
> > On Mon, 08 Aug 2016 18:28:57 +0900 (Tokyo Standard Time)
> > Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:  
> > > 
> > > I don't see charset compatibility to be easily detectable,  
> > 
> > In the worst case we can hardcode explicit compatibility table.  
> 
> We could have the language lists compatible with some
> language-bound encodings.  For example, LATIN1 (ISO/IEC 8859-1),
> according to Wikipedia
> (https://en.wikipedia.org/wiki/ISO/IEC_8859-1)
> 
> According to the list, we might have the following compatibility
> list of locales, maybe without region.
> 
> {{"UTF8", "LATIN1"}, "af", "sq", "eu", "da", "en", "fo", "en"}... and
> so.
> 
> The biggest problem for this is at least *I* cannot confirm the
> validity of the list. Both about perfectness of coverage of

I think that people from the localization team can. At least the authors
of a particular translation can tell which encodings support their language.

> ISO639-1 seems to have about 190 languages and most of them are

We don't have 190 message catalog translations in PostgreSQL,
so the encoding problem for messages is quite limited.
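
To make the hardcoded-table idea concrete, here is a rough sketch in
Python. Everything in it is hypothetical: the table name, the function
name, and the entries themselves. The LATIN1 languages are the ones quoted
above, and the KOI8R row is only an illustration. It also assumes that
untranslated (English) messages are plain ASCII and therefore fit any
client encoding. A real table would have to be confirmed by the translators.

```python
# Hypothetical table: encoding name -> languages whose translated messages
# are assumed to be representable in that encoding. Illustrative only.
COMPATIBLE_LANGUAGES = {
    "LATIN1": {"af", "sq", "eu", "da", "en", "fo"},
    "KOI8R": {"ru", "en"},
}


def messages_fit_encoding(locale_name, client_encoding):
    """Return True if messages for locale_name are assumed to fit client_encoding."""
    # Drop the region and codeset parts: "ru_RU.UTF-8" -> "ru".
    language = locale_name.split("_")[0].split(".")[0].lower()
    # Assumption: UTF8 can represent every message catalog.
    if client_encoding == "UTF8":
        return True
    # Assumption: untranslated messages are ASCII and fit everywhere.
    if language in ("en", "c", "posix"):
        return True
    return language in COMPATIBLE_LANGUAGES.get(client_encoding, set())
```

With such a table the server could decide up front whether to send the
translated message or the untranslated one, without attempting a conversion.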

> 
> I suppose that 'fallback' means "have a try then use English if
> failed" so I think it is suitable rather for message, not for
> data, and it doesn't need any a priori information about

Yes, I'm talking about messages, not about encoding conversion for
data. In my experience, data in PostgreSQL is converted in a more or
less predictable way. Maybe it could be improved, but it is possible
to set up the client and server so that they do the right job.

The situation with messages, especially those returned before
session establishment completes (or when it fails), is currently
somewhat worse.
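
The "try, then use English if failed" behaviour for messages could look
roughly like the following Python sketch. It is only an illustration under
stated assumptions: it uses Python's own codecs rather than PostgreSQL's
conversion routines, the function name is made up, and the codec argument is
a Python codec name (for example "latin-1" or "koi8_r"), not a PostgreSQL
encoding name.

```python
def encode_message(translated, english, codec):
    """Encode the translated message, or the English one if that fails.

    Hypothetical helper: `codec` is a Python codec name.
    """
    try:
        return translated.encode(codec)
    except UnicodeEncodeError:
        # The translation cannot be represented in the target encoding.
        # The English text is assumed to be ASCII; anything unexpected is
        # replaced with "?" so the client still gets a readable message.
        return english.encode("ascii", errors="replace")
```

For example, a Russian message encodes fine with "koi8_r", while with
"latin-1" the conversion fails and the English text is returned instead.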

> compatibility. It seems to me that PostgreSQL refuses to ignore

Alas, it does. At least with the example given by Peter Eisentraut at the
start of this thread.

> or conceal conversion errors and return broken or unwanted byte
> sequence for data.  Things are different for error messages, it
> is preferable to be anyhow readable than totally abandoned.
> 
> > I think that for now we can assume that the best effort is already
> > done for the data, and think how to improve situation with
> > messages.  
> 
> Is there any source to know the compatibility for any combination
> of language vs encoding? Maybe we need a ground for the list.
> 
> regards,
> 



