Re: Unicode UTF-8 table formatting for psql text output - Mailing list pgsql-hackers

From Roger Leigh
Subject Re: Unicode UTF-8 table formatting for psql text output
Date
Msg-id 20091026233339.GC11903@codelibre.net
Whole thread Raw
In response to Re: Unicode UTF-8 table formatting for psql text output  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Unicode UTF-8 table formatting for psql text output  (Roger Leigh <rleigh@codelibre.net>)
List pgsql-hackers
On Mon, Oct 26, 2009 at 07:19:24PM -0400, Tom Lane wrote:
> Roger Leigh <rleigh@codelibre.net> writes:
> > On Mon, Oct 26, 2009 at 01:33:19PM -0400, Tom Lane wrote:
> >> Yeah.  We can do what we like with the UTF8 format but I'm considerably
> >> more worried about the aspect of making random changes to the
> >> plain-ASCII output.
>
> > I checked (using strace)
> > gnumeric (via libgda and gnome-database-properties)
> > openoffice (oobase)
>
> Even if that were the entire universe of programs we cared about,
> whether their internal ODBC logic goes through psql isn't really
> the point here.  What I'm worried about is somebody piping the
> text output of psql into another program.
>
> > On a related note, there's something odd with the pager code.
>
> Hm, what platform are you testing that on?

Debian GNU/Linux (unstable)
linux        2.6.30
eglibc       2.10.1
libreadline6 6.0.5
libncurses5  5.7
gcc          4.3.4

This is the trace of the broken write:

16206 write(1, "      Name       \342\224\202  Owner   \342\224"..., 102) = 102
16206 write(1,
"\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342
\224"..., 256) = 256
16206 write(1, "\224\200\342\224\200\342\224\200\342\224\200\342\224\200\342\224\200\n", 18) = 18

I'll attach the whole thing for reference.  What's clear is that the
first write was *exactly* 256 bytes, which is what was requested,
presumably by libc stdio buffering (which shouldn't by itself be a
problem).  Since we use 3-byte UTF-8 and 256/3 is 85 + 1 remainder,
this is where the wierd 85 char forced newline comes from.  Since it
only happens when the terminal window is >85 chars, that's where I'm
assuming some odd termios influence comes from (though it might just
be the source of the window size and be completely innocent).  The
fact that libc did the two separate writes kind of rules out termios
mangling the output post-write().


Regards,
Roger

--
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.

Attachment

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Anonymous Code Blocks as Lambdas?
Next
From: Greg Stark
Date:
Subject: Re: per-tablespace random_page_cost/seq_page_cost