Re: client side syntax error localisation for psql (v1) - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: client side syntax error localisation for psql (v1)
Date
Msg-id Pine.GSO.4.58.0403121345270.19051@elvis
In response to Re: client side syntax error localisation for psql (v1)  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Responses Re: client side syntax error localisation for psql (v1)
List pgsql-hackers
Dear Tatsuo,

> > > 1) a character is not always represented on a terminal proportional to
> > >    the storage size. For example a kanji character in UTF-8 encoding
> > >    has a storage size of 3 bytes while it occupies spaces only twice
> > >    of ASCII characters on a terminal. Same thing can be said to LATIN
> > >    2,3 etc. in UTF-8 perhaps.
> >
> > I thought I dealt with that in the code by calling PQmblen for every char.
> > Am I wrong ?
>
> PQmblen returns the storage size, which is not necessarily same as the
> character width represented in a terminal. For example for a kanji
> character in UTF-8 PQmblen returns 3, but it occupies 2 x ASCII
> character space, not x 3. Isn't that a problem for you?

If I read you correctly, you mean that 1 character may take 3 bytes
of storage in the string, but it is not guaranteed to be 1 character
from the terminal perspective... Argh, that's definitely an issue:-(

I assumed that one character, whatever the encoding, would be 1 character
on the display.

If that is not the case, I think I can put/compute this information in the
translation structures that are used by PQmblen, and implement a
PQmbtermlen function...
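
To make the storage size versus display width distinction concrete, here is
a minimal sketch for UTF-8 only. All names in it (utf8_seq_len, cp_width,
utf8_display_width) are hypothetical and are not part of libpq, and the
width table is a deliberately coarse assumption: it treats a few CJK and
fullwidth ranges as two columns and everything else as one, ignoring
combining characters and other zero-width code points.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte length of a UTF-8 sequence, from its lead byte. */
static int utf8_seq_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;                   /* invalid lead byte: consume one byte */
}

/* Decode a sequence of `len` bytes (already known to be present). */
static uint32_t utf8_decode(const unsigned char *s, int len)
{
    uint32_t cp;
    int i;

    if (len == 1)
        return s[0];
    cp = s[0] & (0xFF >> (len + 1));    /* payload bits of the lead byte */
    for (i = 1; i < len; i++)
        cp = (cp << 6) | (s[i] & 0x3F);
    return cp;
}

/* ASSUMPTION: coarse width table, only an approximation of real terminals. */
static int cp_width(uint32_t cp)
{
    if ((cp >= 0x1100 && cp <= 0x115F) ||   /* Hangul Jamo */
        (cp >= 0x2E80 && cp <= 0x9FFF) ||   /* CJK radicals to ideographs */
        (cp >= 0xAC00 && cp <= 0xD7A3) ||   /* Hangul syllables */
        (cp >= 0xF900 && cp <= 0xFAFF) ||   /* CJK compatibility ideographs */
        (cp >= 0xFF00 && cp <= 0xFF60))     /* fullwidth forms */
        return 2;
    return 1;
}

/* Number of terminal columns a NUL-terminated UTF-8 string would occupy. */
int utf8_display_width(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;
    int width = 0;

    while (*p != '\0')
    {
        int len = utf8_seq_len(*p);
        int avail = 1;

        /* do not read past the terminator on a truncated sequence */
        while (avail < len && p[avail] != '\0')
            avail++;
        if (avail < len)
        {
            width += 1;
            break;
        }
        width += cp_width(utf8_decode(p, len));
        p += len;
    }
    return width;
}
```

With this sketch, a single kanji stored as 3 bytes counts as 2 columns,
while an accented Latin letter stored as 2 bytes counts as 1.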

Maybe you could point me to some source of information about display lengths
of characters depending on the encoding?

> > What I mean by "ASCII compatible" is that spaces, new lines, carriage
> > returns, tabs and NUL (C string termination) are one byte characters.
> > This assumption seemed pretty safe to me.
>
> I think you can do it safely using PQmblen.

Ok, what you describe is basically what I've done with the qidx
computation as suggested by Tom Lane and then later I check that the
encoded length is one to find my special characters.
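
As an illustration only (this is not the code from the patch), the idea can
be sketched as follows: walk the query one character at a time using a
per-character length function, and only treat a byte as a newline when the
character it starts is one byte long. The names mblen_fn,
sample_utf8_mblen, char_pos_to_byte_offset and line_of_char are
hypothetical; mblen_fn stands in for a call such as PQmblen(s, encoding),
and the sample length function assumes UTF-8.

```c
#include <stddef.h>

/* Hypothetical stand-in for PQmblen(s, encoding): bytes in the next char. */
typedef int (*mblen_fn) (const char *s);

/* ASSUMPTION: UTF-8 only; a real client would ask libpq for the length. */
static int sample_utf8_mblen(const char *s)
{
    unsigned char c = (unsigned char) *s;

    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Bytes to advance for the character at s, never past the terminator. */
static int safe_step(const char *s, mblen_fn mblen)
{
    int len = mblen(s);
    int n = 0;

    if (len < 1)
        len = 1;
    while (n < len && s[n] != '\0')
        n++;
    return n;
}

/*
 * Byte offset of the character at 1-based position char_pos,
 * or -1 if the query has fewer characters than that.
 */
int char_pos_to_byte_offset(const char *query, int char_pos, mblen_fn mblen)
{
    int offset = 0;
    int i;

    for (i = 1; i < char_pos; i++)
    {
        if (query[offset] == '\0')
            return -1;
        offset += safe_step(query + offset, mblen);
    }
    return query[offset] == '\0' ? -1 : offset;
}

/*
 * 1-based line number of the character at position char_pos, or -1.
 * A newline only counts when its character is exactly one byte long.
 */
int line_of_char(const char *query, int char_pos, mblen_fn mblen)
{
    int offset = 0;
    int line = 1;
    int i;

    for (i = 1; i < char_pos; i++)
    {
        int step;

        if (query[offset] == '\0')
            return -1;
        step = safe_step(query + offset, mblen);
        if (step == 1 && query[offset] == '\n')
            line++;
        offset += step;
    }
    return query[offset] == '\0' ? -1 : line;
}
```

The one-byte check matters for encodings where a byte with the value of
'\n' could in principle appear inside a multibyte character; in UTF-8 that
cannot happen, so the check is simply a safe default there.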

Thanks for your reply,

-- 
Fabien Coelho - coelho@cri.ensmp.fr

