Thread: BUG #6480: NLS text width problem
The following bug has been logged on the website:

Bug reference:      6480
Logged by:          Sergey Burladyan
Email address:      eshkinkot@gmail.com
PostgreSQL version: 9.1.2
Operating system:   Debian testing
Description:

This code incorrectly calculates the width of translated text if it is a
multibyte string. strlen(ct) vs. UTF-8

src/bin/psql/describe.c:2100

    else
    {
        /* display the list of child tables */
        const char *ct = _("Child tables");

        for (i = 0; i < tuples; i++)
        {
            if (i == 0)
                printfPQExpBuffer(&buf, "%s: %s",
                                  ct, PQgetvalue(result, i, 0));
            else
                printfPQExpBuffer(&buf, "%*s  %s",
                                  (int) strlen(ct), "",
                                  PQgetvalue(result, i, 0));
            if (i < tuples - 1)
                appendPQExpBuffer(&buf, ",");

            printTableAddFooter(&cont, buf.data);
        }
    }
    PQclear(result);
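To illustrate the report above: strlen() counts bytes, while the padding needs terminal columns. The following is only a minimal sketch, not code from psql. `utf8_codepoints` and `strlen_overpadding` are hypothetical helpers, and counting code points matches the display width only for scripts such as Latin or Cyrillic, where each code point occupies one column.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of psql): count UTF-8 code points by
 * skipping continuation bytes, which always have the form 10xxxxxx.
 * Assumes s is valid UTF-8.
 */
size_t
utf8_codepoints(const char *s)
{
    size_t      n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
    {
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/*
 * Number of surplus spaces produced when strlen(label) is used as the
 * pad width instead of the number of code points in the label.
 */
size_t
strlen_overpadding(const char *label)
{
    return strlen(label) - utf8_codepoints(label);
}
```

For "Child tables" both counts are 12, so nothing is over-padded. For the Russian translation "Дочерние таблицы", strlen() returns 31 while the label has 16 code points, so the continuation lines get 15 spaces too many.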
eshkinkot@gmail.com writes:

> The following bug has been logged on the website:
>
> Bug reference:      6480
> Logged by:          Sergey Burladyan
> Email address:      eshkinkot@gmail.com
> PostgreSQL version: 9.1.2
> Operating system:   Debian testing
> Description:
>
> This code incorrectly calculates the width of translated text if it is a
> multibyte string. strlen(ct) vs. UTF-8
>
> src/bin/psql/describe.c:2100

Test case:

create table t ();
create table t_1 () inherits (t);
create table t_2 () inherits (t);
create table d () inherits (t_1, t_2);
\d+ t
\d+ d

                 Table "public.t"
 Column | Type | Modifiers | Storage | Description
--------+------+-----------+---------+-------------
Child tables: t_1,
              t_2
Has OIDs: no

                 Table "public.d"
 Column | Type | Modifiers | Storage | Description
--------+------+-----------+---------+-------------
Inherits: t_1,
          t_2
Has OIDs: no

English, correct indentation:
. . .
Child tables: t_1,
              t_2
. . .
Inherits: t_1,
          t_2

Russian (UTF-8), wrong indentation:

                 Таблица "public.t"
 Колонка | Тип | Модификаторы | Хранилище | Описание
---------+-----+--------------+-----------+----------
Дочерние таблицы: t_1,
                                 t_2
Содержит OID: нет

                 Таблица "public.d"
 Колонка | Тип | Модификаторы | Хранилище | Описание
---------+-----+--------------+-----------+----------
Наследует: t_1,
                    t_2
Содержит OID: нет

-- 
Sergey Burladyan
On Wed, 2012-02-22 at 22:37 +0400, Sergey Burladyan wrote:
> eshkinkot@gmail.com writes:
>
> > The following bug has been logged on the website:
> >
> > Bug reference:      6480
> > Logged by:          Sergey Burladyan
> > Email address:      eshkinkot@gmail.com
> > PostgreSQL version: 9.1.2
> > Operating system:   Debian testing
> > Description:
> >
> > This code incorrectly calculates the width of translated text if it is a
> > multibyte string. strlen(ct) vs. UTF-8
> >
> > src/bin/psql/describe.c:2100

Can you prepare a patch?
Peter Eisentraut <peter_e@gmx.net> writes:

> On Wed, 2012-02-22 at 22:37 +0400, Sergey Burladyan wrote:
> > eshkinkot@gmail.com writes:
> >
> > > The following bug has been logged on the website:
> > >
> > > Bug reference:      6480
> > > Logged by:          Sergey Burladyan
> > > Email address:      eshkinkot@gmail.com
> > > PostgreSQL version: 9.1.2
> > > Operating system:   Debian testing
> > > Description:
> > >
> > > This code incorrectly calculates the width of translated text if it is a
> > > multibyte string. strlen(ct) vs. UTF-8
> > >
> > > src/bin/psql/describe.c:2100
>
> Can you prepare a patch?

Sure. I already sent this patch to pgsql-hackers and added it to the next
commitfest so that it would not get lost:
https://commitfest.postgresql.org/action/patch_view?id=816

Unfortunately, I sent it with Content-Disposition: inline by mistake, and as
a result the web interface split it into two independent parts. Also, that
patch was for 9.1.

To resolve this, I have rebased the patch onto current master (bc97c38) and
am sending it as an attachment. Here it is:
Sergey Burladyan <eshkinkot@gmail.com> writes:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> Can you prepare a patch?

> Sure. I already sent this patch to pgsql-hackers and added it to the next
> commitfest so that it would not get lost:
> https://commitfest.postgresql.org/action/patch_view?id=816

Hmm, this patch makes it obvious that the current incarnation of
pg_wcswidth has never worked. Good thing it's been unused for the same
length of time :-(

> Unfortunately, I sent it with Content-Disposition: inline by mistake, and as
> a result the web interface split it into two independent parts. Also, that
> patch was for 9.1.

I'm a bit nervous about the idea of back-patching this, as if there is
anything wrong with it, it will break code that works perfectly fine for
most people.

Possibly more to the point, it is making assumptions about the behavior
of printf with %*s that I think are unportable. Even granted that libc
is glibc, isn't this pretty much guaranteed to fail if glibc's idea of
the encoding is different from pset.encoding?

I think it'd be better to avoid depending on %*s for the data string
and instead use it (with appropriate adjustment of the calculation)
for the space-separator part of the format. Since that's a constant
empty string, there shouldn't be any possibility of libc doing something
other than what we intend.

			regards, tom lane
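The approach described in the message above can be sketched roughly as follows. This is an illustration under stated assumptions, not the committed patch: `label_columns` is a hypothetical stand-in for a real width function such as pg_wcswidth (it simply counts UTF-8 code points), and `format_list_line` is a hypothetical wrapper using plain snprintf instead of psql's PQExpBuffer routines. The point is that %*s is applied only to a constant empty string, so libc never has to measure translated text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a display-width function: counts UTF-8
 * code points, assuming valid UTF-8 and one column per code point.
 */
int
label_columns(const char *s)
{
    int         n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
    {
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/*
 * Format one line of a "label: item, item" footer list.  The first line
 * carries the label; later lines are indented to line up under the first
 * item.  The width passed to %*s pads an empty string, so the result is
 * exactly that many spaces regardless of libc's idea of the encoding.
 * Returns snprintf's result (bytes that the full line requires).
 */
int
format_list_line(char *out, size_t outsz, const char *label,
                 const char *item, int is_first, int is_last)
{
    const char *sep = is_last ? "" : ",";

    if (is_first)
        return snprintf(out, outsz, "%s: %s%s", label, item, sep);

    return snprintf(out, outsz, "%*s  %s%s",
                    label_columns(label), "", item, sep);
}
```

With the label "Child tables", a continuation line for "t_2" gets 14 leading spaces. With "Дочерние таблицы" it gets 18, which matches the 16 columns of the label plus the two columns of ": ".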
Tom Lane <tgl@sss.pgh.pa.us> writes:

> I think it'd be better to avoid depending on %*s for the data string
> and instead use it (with appropriate adjustment of the calculation)
> for the space-separator part of the format. Since that's a constant
> empty string, there shouldn't be any possibility of libc doing something
> other than what we intend.

Sorry, I'm going on vacation for four days. Can't answer right now...

-- 
Sergey Burladyan
Sergey Burladyan <eshkinkot@gmail.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> I think it'd be better to avoid depending on %*s for the data string
>> and instead use it (with appropriate adjustment of the calculation)
>> for the space-separator part of the format. Since that's a constant
>> empty string, there shouldn't be any possibility of libc doing something
>> other than what we intend.

> Sorry, I'm going on vacation for four days. Can't answer right now...

Ah, nevermind --- I re-read the patch and realized that it was already
doing exactly what I said. Committed, sorry for the noise.

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Ah, nevermind --- I re-read the patch and realized that it was already
> doing exactly what I said. Committed, sorry for the noise.

Great, thank you, Tom!

-- 
Sergey Burladyan