Thread: Re: [NOVICE] encoding problems

Re: [NOVICE] encoding problems

From
Bruce Momjian
Date:
Cliff Nieuwenhuis wrote:
> On Tuesday 11 March 2008 11:41:35 Tom Lane wrote:
> > Cliff Nieuwenhuis <cliff@nieusite.com> writes:
> > > I'm not sure how to ask this question.  I have written a function, and
> > > with PostgreSQL 8.0.13 I can do a "\df+" and see something like this
> > > under Source Code:
> > >     DECLARE
> > >         result text;
> > > ...
> > >
> > > If I create the same function on my computer running PostgreSQL 8.3.0 and
> > > try the \df+ then the Source Code shows:
> > >
> > > \x09DECLARE
> > > \x09\x09result text;
> > > ...
> >
> > That's not an encoding problem, that's an intentional behavioral change
> > in the way that psql formats strings for display.
> >
> > I guess it's a bit annoying if you were hoping that tabs would be useful
> > for pretty-printing purposes.  Should we reconsider what's done with a
> > tab in mbprint.c?
> >
> >             regards, tom lane
>
> My vote would be to go back to the old way, or at least have that as an option
> of some sort.  I use command-line psql all the time -- to me, psql offers the
> same advantages as using a command-line interface for other work. I find the
> extra characters really get in the way.

Yes, I think our psql display of tabs needs improving too.  Our current
behavior is to output a tab as the escape sequence \x09:

    test=> SELECT E'\011';
     ?column?
    ----------
     \x09
    (1 row)

    test=> CREATE FUNCTION xx() RETURNS text AS E'
    test'> SELECT   ''a''::text
    test'> WHERE    1 = 1'
    test-> LANGUAGE SQL;
    CREATE FUNCTION

    test=> SELECT prosrc FROM pg_proc WHERE proname = 'xx';
           prosrc
    ---------------------

     SELECT\x09'a'::text
     WHERE\x091 = 1
    (1 row)

    test=> \x
    Expanded display is on.

    test=> \df+ xx
    List of functions
    -[ RECORD 1 ]-------+--------------------
    Schema              | public
    Name                | xx
    Result data type    | text
    Argument data types |
    Volatility          | volatile
    Owner               | postgres
    Language            | sql
    Source code         |
                        : SELECT\x09'a'::text
                        : WHERE\x091 = 1
    Description         |

I have implemented the following patch which outputs tab as a tab.  It
also assumes a tab has a width of 4, which is its average width:

    test=> SELECT E'\011';
     ?column?
    ----------

    (1 row)

    test=> SELECT prosrc FROM pg_proc WHERE proname = 'xx';
           prosrc
    ---------------------

     SELECT 'a'::text
     WHERE  1 = 1
    (1 row)

    test=> \x
    Expanded display is on.

    test=> \df+ xx
    List of functions
    -[ RECORD 1 ]-------+--------------------
    Schema              | public
    Name                | xx
    Result data type    | text
    Argument data types |
    Volatility          | volatile
    Owner               | postgres
    Language            | sql
    Source code         |
                        : SELECT    'a'::text
                        : WHERE     1 = 1
    Description         |

The only downside I see for this patch is that we are never sure of the
display width of tab because we don't know what tab stop we are at.
With '\x09' we knew exactly how wide it was.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/bin/psql/mbprint.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/psql/mbprint.c,v
retrieving revision 1.30
diff -c -c -r1.30 mbprint.c
*** src/bin/psql/mbprint.c    16 Apr 2008 18:18:00 -0000    1.30
--- src/bin/psql/mbprint.c    7 May 2008 15:18:25 -0000
***************
*** 315,320 ****
--- 315,330 ----
                  linewidth += 2;
                  ptr += 2;
              }
+             else if (*pwcs == '\t')        /* Tab */
+             {
+                 strcpy((char *) ptr, "\t");
+                 /*
+                  *    We don't know what tab stop we are on, so assuming 8-space
+                  *    tabs, the average width of a tab is 4.
+                  */
+                 linewidth += 4;
+                 ptr += 1;
+             }
              else if (w < 0)        /* Other control char */
              {
                  sprintf((char *) ptr, "\\x%02X", *pwcs);

Re: [NOVICE] encoding problems

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> I have implemented the following patch which outputs tab as a tab.  It
> also assumes a tab has a width of 4, which is its average width:

That pretty much completely sucks; it will undo all the hard work we've
put into nice formatting of the output, because seven times out of eight
this assumption is wrong.

An actually acceptable solution would involve emitting the correct
number of spaces depending on how much we've put out so far.

            regards, tom lane
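
The "seven times out of eight" figure can be checked with a little
arithmetic.  The following is only a sketch, assuming 8-column terminal
tab stops and 0-based column numbers; tab_advance() and wrong_guesses()
are hypothetical helper names and are not part of psql.

```c
#include <assert.h>

#define TAB_STOP 8              /* assumed terminal tab-stop spacing */

/*
 * Number of columns a hard tab advances when the cursor is at 0-based
 * column "col".  (Hypothetical helper, for illustration only.)
 */
static int
tab_advance(int col)
{
    assert(col >= 0);
    return TAB_STOP - (col % TAB_STOP);
}

/*
 * Count the offsets within one tab-stop period at which a fixed
 * guess for the tab width is wrong.
 */
static int
wrong_guesses(int guess)
{
    int         col;
    int         wrong = 0;

    for (col = 0; col < TAB_STOP; col++)
    {
        if (tab_advance(col) != guess)
            wrong++;
    }
    return wrong;
}
```

Under these assumptions the advance runs from 8 columns (at offset 0)
down to 1 column (at offset 7), so wrong_guesses(4) returns 7: a fixed
width of 4 is right only when the tab happens to start at offset 4.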

Re: [NOVICE] encoding problems

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > I have implemented the following patch which outputs tab as a tab.  It
> > also assumes a tab has a width of 4, which is its average width:
>
> That pretty much completely sucks; it will undo all the hard work we've
> put into nice formatting of the output, because seven times out of eight
> this assumption is wrong.
>
> An actually acceptable solution would involve emitting the correct
> number of spaces depending on how much we've put out so far.

Even if we knew the column position at output time, when we are doing
aligned column width computations, we don't know the width of the
previous columns so we would have no way to know how far the tab would
extend in the current column.

The only other idea I have is to output four spaces rather than '\x09'
for a tab.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [NOVICE] encoding problems

From
Alvaro Herrera
Date:
Bruce Momjian wrote:

> Even if we knew the column position at output time, when we are doing
> aligned column width computations, we don't know the width of the
> previous columns so we would have no way to know how far the tab would
> extend in the current column.

If you start counting every line from the start of the current column,
it will align correctly regardless of the previous columns.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: [NOVICE] encoding problems

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Bruce Momjian wrote:
>
> > Even if we knew the column position at output time, when we are doing
> > aligned column width computations, we don't know the width of the
> > previous columns so we would have no way to know how far the tab would
> > extend in the current column.
>
> If you start counting every line from the start of the current column,
> it will align correctly regardless of the previous columns.

At this stage you don't know the width of previous columns because you
don't know if a very wide value is coming in a later row, so there is no
way to output the width of the cell with a tab you are looking at now.

Unless I am misunderstanding you.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [NOVICE] encoding problems

From
Alvaro Herrera
Date:
Bruce Momjian wrote:
> Alvaro Herrera wrote:

> > If you start counting every line from the start of the current column,
> > it will align correctly regardless of the previous columns.
>
> At this stage you don't know the width of previous columns because you
> don't know if a very wide value is coming in a later row, so there is no
> way to output the width of the cell with a tab you are looking at now.
>
> Unless I am misunderstanding you.

Surely psql computes the width of all cells before printing anything.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: [NOVICE] encoding problems

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Bruce Momjian wrote:
> > Alvaro Herrera wrote:
>
> > > If you start counting every line from the start of the current column,
> > > it will align correctly regardless of the previous columns.
> >
> > At this stage you don't know the width of previous columns because you
> > don't know if a very wide value is coming in a later row, so there is no
> > way to output the width of the cell with a tab you are looking at now.
> >
> > Unless I am misunderstanding you.
>
> Surely psql computes the width of all cells before printing anything.

It does, but if you have a value that has a tab, how do you know what
tab stop you are on because you don't know the final width of the
previous columns at that time, so there is no way to know the width of
that cell.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [NOVICE] encoding problems

From
Alvaro Herrera
Date:
Bruce Momjian wrote:
> Alvaro Herrera wrote:

> > Surely psql computes the width of all cells before printing anything.
>
> It does, but if you have a value that has a tab, how do you know what
> tab stop you are on because you don't know the final width of the
> previous columns at that time, so there is no way to know the width of
> that cell.

My point is that you don't need to align the tabstops with the start of
the line, but with the start of the _column_.  So the width of the
previous column doesn't matter.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
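
The difference between line-relative and column-relative tab stops can
be sketched as follows.  This is an illustration only, assuming 8-column
tab stops and single-byte characters; width_with_hard_tabs() and
width_column_relative() are hypothetical names, not psql functions.

```c
#include <assert.h>

#define TAB_STOP 8              /* assumed tab-stop spacing */

/*
 * Display width of "cell" when its first character lands on 0-based
 * terminal column "start" and the terminal expands the tabs itself.
 */
static int
width_with_hard_tabs(const char *cell, int start)
{
    int         pos = start;

    assert(start >= 0);
    for (; *cell != '\0'; cell++)
    {
        if (*cell == '\t')
            pos += TAB_STOP - (pos % TAB_STOP);     /* jump to next stop */
        else
            pos++;
    }
    return pos - start;
}

/*
 * Width when tab stops are counted from the start of the cell.  This
 * is the same computation with the starting column pinned to zero, so
 * the result cannot depend on anything printed to the left.
 */
static int
width_column_relative(const char *cell)
{
    return width_with_hard_tabs(cell, 0);
}
```

For the cell "WHERE\t1 = 1", the hard-tab width is 13 if the cell starts
at column 0 but 18 if it starts at column 3, whereas the column-relative
width is 13 wherever the cell ends up.  That is why the width of the
previous columns stops mattering once spaces are emitted in place of the
tab.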

Re: [NOVICE] encoding problems

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Bruce Momjian wrote:
> > Alvaro Herrera wrote:
>
> > > Surely psql computes the width of all cells before printing anything.
> >
> > It does, but if you have a value that has a tab, how do you know what
> > tab stop you are on because you don't know the final width of the
> > previous columns at that time, so there is no way to know the width of
> > that cell.
>
> My point is that you don't need to align the tabstops with the start of
> the line, but with the start of the _column_.  So the width of the
> previous column doesn't matter.

OK, so I am not really using tabs in the output, but outputting the
proper number of spaces to make it look like a tab?  That works.  Let me
try it.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [NOVICE] encoding problems

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Bruce Momjian wrote:
> > Alvaro Herrera wrote:
>
> > > Surely psql computes the width of all cells before printing anything.
> >
> > It does, but if you have a value that has a tab, how do you know what
> > tab stop you are on because you don't know the final width of the
> > previous columns at that time, so there is no way to know the width of
> > that cell.
>
> My point is that you don't need to align the tabstops with the start of
> the line, but with the start of the _column_.  So the width of the
> previous column doesn't matter.

Alvaro, using spaces instead of the terminal hard tabs was a very good
idea.  The output is now:

    test=> \x
    Expanded display is on.

    test=> \df+ xx
    List of functions
    -[ RECORD 1 ]-------+--------------------
    Schema              | public
    Name                | xx
    Result data type    | text
    Argument data types |
    Volatility          | volatile
    Owner               | postgres
    Language            | sql
    Source code         | SELECT  'a'::text
                        : WHERE   1 = 1
    Description         |


Patch attached.  It substitutes spaces for the tab.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/bin/psql/mbprint.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/psql/mbprint.c,v
retrieving revision 1.30
diff -c -c -r1.30 mbprint.c
*** src/bin/psql/mbprint.c    16 Apr 2008 18:18:00 -0000    1.30
--- src/bin/psql/mbprint.c    7 May 2008 20:27:39 -0000
***************
*** 315,320 ****
--- 315,328 ----
                  linewidth += 2;
                  ptr += 2;
              }
+             else if (*pwcs == '\t')        /* Tab */
+             {
+                 do
+                 {
+                     *ptr++ = ' ';
+                     linewidth++;
+                 } while (linewidth % 8 != 0);
+             }
              else if (w < 0)        /* Other control char */
              {
                  sprintf((char *) ptr, "\\x%02X", *pwcs);
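
For readers who want to try the idea outside psql, here is a standalone
sketch of the same do/while loop.  It is not the mbprint.c code:
expand_tabs() is a hypothetical function that assumes single-byte
characters and 8-column tab stops, counts width from the start of the
string, and adds a buffer-size check that the patch does not show.

```c
#include <assert.h>
#include <stddef.h>

#define TAB_STOP 8              /* assumed tab-stop spacing */

/*
 * Copy "src" into "dst", replacing each tab with enough spaces to reach
 * the next multiple of TAB_STOP.  Output is truncated to fit "dstsize"
 * bytes including the terminating NUL.  Returns the number of bytes
 * written, not counting the NUL.
 */
static size_t
expand_tabs(const char *src, char *dst, size_t dstsize)
{
    size_t      width = 0;

    assert(dstsize > 0);
    while (*src != '\0' && width + 1 < dstsize)
    {
        if (*src == '\t')
        {
            /* As in the patch: a tab always yields at least one space. */
            do
            {
                dst[width++] = ' ';
            } while (width % TAB_STOP != 0 && width + 1 < dstsize);
        }
        else
            dst[width++] = *src;
        src++;
    }
    dst[width] = '\0';
    return width;
}
```

With a large enough buffer, "SELECT\t'a'::text" becomes "SELECT" plus
two spaces plus "'a'::text", and "WHERE\t1 = 1" becomes "WHERE" plus
three spaces plus "1 = 1", matching the Source code lines in the output
shown earlier in this message.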

Re: [NOVICE] encoding problems

From
Bruce Momjian
Date:
Applied.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> Alvaro Herrera wrote:
> > Bruce Momjian wrote:
> > > Alvaro Herrera wrote:
> >
> > > > Surely psql computes the width of all cells before printing anything.
> > >
> > > It does, but if you have a value that has a tab, how do you know what
> > > tab stop you are on because you don't know the final width of the
> > > previous columns at that time, so there is no way to know the width of
> > > that cell.
> >
> > My point is that you don't need to align the tabstops with the start of
> > the line, but with the start of the _column_.  So the width of the
> > previous column doesn't matter.
>
> Alvaro, using spaces instead of the terminal hard tabs was a very good
> idea.  The output is now:
>
>     test=> \x
>     Expanded display is on.
>
>     test=> \df+ xx
>     List of functions
>     -[ RECORD 1 ]-------+--------------------
>     Schema              | public
>     Name                | xx
>     Result data type    | text
>     Argument data types |
>     Volatility          | volatile
>     Owner               | postgres
>     Language            | sql
>     Source code         | SELECT  'a'::text
>                         : WHERE   1 = 1
>     Description         |
>
>
> Patch attached.  It substitutes spaces for the tab.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [NOVICE] encoding problems

From
"Guillaume Smet"
Date:
On Thu, May 8, 2008 at 9:11 PM, Bruce Momjian <bruce@momjian.us> wrote:
>
> Applied.

As I mentioned it before, is there any chance for this fix to be
backported to 8.3 branch? IMHO it's a usability regression.

Thanks.

--
Guillaume

Re: [NOVICE] encoding problems

From
Bruce Momjian
Date:
Guillaume Smet wrote:
> On Thu, May 8, 2008 at 9:11 PM, Bruce Momjian <bruce@momjian.us> wrote:
> >
> > Applied.
>
> As I mentioned it before, is there any chance for this fix to be
> backported to 8.3 branch? IMHO it's a usability regression.

No, we don't change behaviors in back branches unless we get lots of
complaints, and we haven't in this case.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [NOVICE] encoding problems

From
"Guillaume Smet"
Date:
On Fri, May 9, 2008 at 4:38 AM, Bruce Momjian <bruce@momjian.us> wrote:
> No, we don't change behaviors in back branches unless we get lots of
> complaints, and we haven't in this case.

I suspect it's annoying for a lot of people, just not annoying enough
to make them complain about it.

I understand your point of view but I really think it's more a
regression fix than a behavior change.

That said, if nobody is following me on this one, I'll live with it -
it's just annoying, not blocking.

Thanks for fixing it anyway.

--
Guillaume

Re: [NOVICE] encoding problems

From
Bruce Momjian
Date:
Guillaume Smet wrote:
> On Fri, May 9, 2008 at 4:38 AM, Bruce Momjian <bruce@momjian.us> wrote:
> > No, we don't change behaviors in back branches unless we get lots of
> > complaints, and we haven't in this case.
>
> I suspect it's annoying for a lot of people, just not annoying enough
> to make them complain about it.
>
> I understand your point of view but I really think it's more a
> regression fix than a behavior change.
>
> That said, if nobody is following me on this one, I'll live with it -
> it's just annoying, not blocking.

If I can get other hackers to say we should backpatch we can consider
it.

Let me add, though, that the patch as coded does not restore the 8.2
behavior but improves on it, so it is hard to argue that a minor release
should make 8.3 better than 8.2.  As you can see, 8.3.X behavior would
no longer match 8.3, and that might be worse than keeping it constant
until 8.4.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [NOVICE] encoding problems

From
Alvaro Herrera
Date:
Bruce Momjian wrote:
> Guillaume Smet wrote:
> > On Thu, May 8, 2008 at 9:11 PM, Bruce Momjian <bruce@momjian.us> wrote:

> > As I mentioned it before, is there any chance for this fix to be
> > backported to 8.3 branch? IMHO it's a usability regression.
>
> No, we don't change behaviors in back branches unless we get lots of
> complaints, and we haven't in this case.

complaints++

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: [HACKERS] [NOVICE] encoding problems

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Guillaume Smet wrote:
>> I understand your point of view but I really think it's more a
>> regression fix than a behavior change.

> If I can get other hackers to say we should backpatch we can consider
> it.

Well, 8.3 is already different from 8.2, and a lot of people will see
this particular aspect of it as a regression.  I'm okay with
backpatching to 8.3 ... though the patch needed rather more testing
than you gave it.

            regards, tom lane

Re: [HACKERS] [NOVICE] encoding problems

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Guillaume Smet wrote:
> >> I understand your point of view but I really think it's more a
> >> regression fix than a behavior change.
>
> > If I can get other hackers to say we should backpatch we can consider
> > it.
>
> Well, 8.3 is already different from 8.2, and a lot of people will see
> this particular aspect of it as a regression.  I'm okay with
> backpatching to 8.3 ... though the patch needed rather more testing
> than you gave it.

OK, so Alvaro and Tom want this backpatched.  However, it isn't going to
match 8.2 behavior --- is that OK?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [HACKERS] [NOVICE] encoding problems

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> Well, 8.3 is already different from 8.2, and a lot of people will see
>> this particular aspect of it as a regression.  I'm okay with
>> backpatching to 8.3 ... though the patch needed rather more testing
>> than you gave it.

> OK, so Alvaro and Tom want this backpatched.  However, it isn't going to
> match 8.2 behavior --- is that OK?

Huh?  8.3 is already hugely different from 8.2 because of the newline
formatting changes.

            regards, tom lane

Re: [NOVICE] encoding problems

From
Cliff Nieuwenhuis
Date:
On Fri, 9 May 2008 08:38:01 -0400
Alvaro Herrera <alvherre@commandprompt.com> wrote:

> Bruce Momjian wrote:
> > Guillaume Smet wrote:
> > > On Thu, May 8, 2008 at 9:11 PM, Bruce Momjian <bruce@momjian.us>
> > > wrote:
>
> > > As I mentioned it before, is there any chance for this fix to be
> > > backported to 8.3 branch? IMHO it's a usability regression.
> >
> > No, we don't change behaviors in back branches unless we get lots of
> > complaints, and we haven't in this case.
>
> complaints++
>

I suppose this is a "Me Too" post, but Bruce Momjian invites it.  You
folks take this to a level way beyond me, but I can tell you that the
idea of using spaces instead of the terminal hard tabs would solve my
problem -- I'd prefer _any_ choice of whitespace over seeing "\x09" on
the terminal.

--

Cliff Nieuwenhuis

"As long as the error messages keep changing we're making progress."

Re: [NOVICE] encoding problems

From
Bruce Momjian
Date:
Cliff Nieuwenhuis wrote:
> On Fri, 9 May 2008 08:38:01 -0400
> Alvaro Herrera <alvherre@commandprompt.com> wrote:
>
> > Bruce Momjian wrote:
> > > Guillaume Smet wrote:
> > > > On Thu, May 8, 2008 at 9:11 PM, Bruce Momjian <bruce@momjian.us>
> > > > wrote:
> >
> > > > As I mentioned it before, is there any chance for this fix to be
> > > > backported to 8.3 branch? IMHO it's a usability regression.
> > >
> > > No, we don't change behaviors in back branches unless we get lots of
> > > complaints, and we haven't in this case.
> >
> > complaints++
> >
>
> I suppose this is a "Me Too" post, but Bruce Momjian invites it.  You
> folks take this to a level way beyond me, but I can tell you that the
> idea of using spaces instead of the terminal hard tabs would solve my
> problem -- I'd prefer _any_ choice of whitespace over seeing "\x09" on
> the terminal.

FYI, this was corrected in 8.3.3.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +