Thread: Problem with press release

From
Bruce Momjian
Date:
I saw a problem with the most recent press release --- some spaces were
encoded as non-breaking spaces (NBSP):

    http://en.wikipedia.org/wiki/Non-breaking_space

    In ISO/IEC 8859, NBSP is 0xA0.

They showed up as either:

    ????????http://www.postgresql.org/docs/8.3/static/release.html

or

    <A0><A0><A0><A0><A0><A0><A0><A0>http://www.postgresql.org/docs

I think future press releases should be straight ASCII.  Josh thinks
there is something wrong with my email software.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Problem with press release

From
Josh Berkus
Date:
Bruce,

>     <A0><A0><A0><A0><A0><A0><A0><A0>http://www.postgresql.org/docs
>
> I think future press releases should be straight ASCII.  Josh thinks
> there is something wrong with my email software.

I'm relatively certain that's the case.  Not one single person other than you
has reported this problem, not over the 2 years you've persisted in bringing
this up.  Please either obtain some corroborating evidence, or stop bringing
it up.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco

Re: Problem with press release

From
Alvaro Herrera
Date:
Josh Berkus wrote:
> Bruce,
>
> >     <A0><A0><A0><A0><A0><A0><A0><A0>http://www.postgresql.org/docs
> >
> > I think future press releases should be straight ASCII.  Josh thinks
> > there is something wrong with my email software.
>
> I'm relatively certain that's the case.  Not one single person other than you
> has reported this problem, not over the 2 years you've persisted in bringing
> this up.  Please either obtain some corroborating evidence, or stop bringing
> it up.

I'm the only one reporting Evolution emitting annoying and seemingly
random 0xfeff byte pairs though, and that doesn't make it any less
broken.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Problem with press release

From
Gregory Stark
Date:
"Josh Berkus" <josh@agliodbs.com> writes:

>>     <A0><A0><A0><A0><A0><A0><A0><A0>http://www.postgresql.org/docs
>>
>> I think future press releases should be straight ASCII.  Josh thinks
>> there is something wrong with my email software.

Or both

> I'm relatively certain that's the case.  Not one single person other than you
> has reported this problem, not over the 2 years you've persisted in bringing
> this up.  Please either obtain some corroborating evidence, or stop bringing
> it up.

Is there some reason our English press releases need any non-ascii characters?

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

Re: Problem with press release

From
Josh Berkus
Date:
Greg,

> Is there some reason our English press releases need any non-ascii
> characters?

I'm disputing Bruce's report that there *are* any.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco

Re: Problem with press release

From
Andrew Sullivan
Date:
On Sun, Jun 15, 2008 at 07:25:00PM -0700, Josh Berkus wrote:
>
> I'm disputing Bruce's report that there *are* any.

Well, this is what mutt thinks your MIME encoding is on the message I
got:

text/plain, 8bit, iso-8859-1

Probably the issue is not just that it's not ASCII, but that it's not
UTF-8 either.

A

--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

Re: Problem with press release

From
Bruce Momjian
Date:
Josh Berkus wrote:
> Bruce,
>
> >     <A0><A0><A0><A0><A0><A0><A0><A0>http://www.postgresql.org/docs
> >
> > I think future press releases should be straight ASCII.  Josh thinks
> > there is something wrong with my email software.
>
> I'm relatively certain that's the case.  Not one single person other than you
> has reported this problem, not over the 2 years you've persisted in bringing
> this up.  Please either obtain some corroborating evidence, or stop bringing
> it up.

OK, I pulled the announce mbox file from:

    http://archives.postgresql.org/pgsql-announce/mbox/pgsql-announce.2008-06.gz

and am attaching the MIME-encoded email that shows the non-ASCII
characters, specifically:

    =A0=A0=A0=A0=A0=A0=A0=A0http://www.postgresql.org/docs/8.3/static/release.h=

In fact I have no idea why the email itself has to be MIME-encoded.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
To: pgsql-announce@postgresql.org
Subject: PostgreSQL 8.3.3, 8.2.9 etc. Update Release
Date: Thu, 12 Jun 2008 09:20:40 -0700
User-Agent: KMail/1.8.2
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
Message-Id: <200806120920.40297.josh@postgresql.org>
X-Virus-Scanned: Maia Mailguard 1.0.1
X-Archive-Number: 200806/12
X-Sequence-Number: 1351

Updates for all maintained versions of PostgreSQL are available today: 8.3.=
3,=20
8.2.9, 8.1.13, 8.0.17 and 7.4.21. =A0These releases fix more than two dozen=
=20
minor issues reported and patched over the last few months. =A0All PostgreS=
QL=20
users should plan to update at their earliest convenience. Users of UTF-8=20
databases on Windows and people in affected time zones, in particular, shou=
ld=20
upgrade as soon as possible.

The issues fixed include a crash caused by encoding mismatch on Windows,=20
possible crash when decompressing corrupted data, non-optimization of some=
=20
parameterized queries, new time zone updates, SIGTERM-caused memory=20
corruption, runaway LWLocks with GIN indexes, and several more. =A0Read the=
=20
release notes to see if any of the issues affect you.

As with other minor releases, users are not required to dump and reload the=
ir=20
database in order to apply this update release; you may simply upgrade the=
=20
PostgreSQL binaries. =A0Users skipping more than one update may need to che=
ck=20
the release notes for extra, post-update steps. =A0As previously announced,=
=20
only versions 8.2.9 and 8.3.3 of the Windows binaries are being released, a=
s=20
we no longer support 8.0 and 8.1 on Windows.

Release Notes:
=A0=A0=A0=A0=A0=A0=A0=A0http://www.postgresql.org/docs/8.3/static/release.h=
tml
Source Code
=A0=A0=A0=A0=A0=A0=A0=A0http://www.postgresql.org/ftp/source
Binaries
=A0=A0=A0=A0=A0=A0=A0=A0http://www.postgresql.org/ftp/binary

Please note: we "skipped" a minor release number due to an issue found with=
=20
the 8.3.2 etc. release bundles, which were never announced but were availab=
le=20
via FTP for a few days.  If for some reason you downloaded versions 8.3.2,=
=20
8.2.8, 8.1.12, 8.0.16 or 7.4.20, please replace them with the new update=20
immediately.

=2D-=20
PostgreSQL Global Development Group


Re: Problem with press release

From
Alvaro Herrera
Date:
Josh Berkus wrote:
> Greg,
>
> > Is there some reason our English press releases need any non-ascii
> > characters?
>
> I'm disputing Bruce's report that there *are* any.

The message is encoded in quoted-printable and has a lot of A0 bytes in
it.
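
A minimal sketch of how to check this, assuming Python 3 and its standard
`quopri` module. The sample is two lines copied from the quoted-printable body
of the announcement; the variable names are only illustrative.

```python
import quopri

# Lines copied from the quoted-printable body of the announcement.
qp_sample = (
    b"Release Notes:\n"
    b"=A0=A0=A0=A0=A0=A0=A0=A0http://www.postgresql.org/docs/8.3/static/release.h=\n"
    b"tml\n"
)

# Decode quoted-printable back into the raw 8-bit body.
raw = quopri.decodestring(qp_sample)

# Count the 0xA0 bytes (NBSP in ISO-8859-1) and check for 7-bit cleanliness.
nbsp_count = raw.count(b"\xa0")
is_ascii = all(byte < 0x80 for byte in raw)

print("0xA0 bytes:", nbsp_count)
print("pure ASCII:", is_ascii)
```

The trailing `=` on the URL line is a soft line break, so the decoded URL ends
in `release.html` again.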

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Problem with press release

From
Markus Schaber
Date:
Hi, Alvaro,

Alvaro Herrera <alvherre@commandprompt.com> wrote:

> I'm the only one reporting Evolution emitting annoying and seemingly
> random 0xfeff byte pairs though, and that doesn't make it any less
> broken.

That looks like a byte order mark in utf16 encodings to me.
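
A small sketch, assuming Python 3, of what U+FEFF looks like in a few
encodings. `strip_stray_boms` is a hypothetical helper name, not something
Evolution or any mail library provides.

```python
import codecs

BOM = "\ufeff"  # U+FEFF, the byte order mark code point

# The same code point serialized in three encodings.
utf16_be = BOM.encode("utf-16-be")  # bytes FE FF
utf16_le = BOM.encode("utf-16-le")  # bytes FF FE
utf8 = BOM.encode("utf-8")          # bytes EF BB BF


def strip_stray_boms(text: str) -> str:
    """Remove every U+FEFF from already-decoded text (hypothetical helper)."""
    return text.replace(BOM, "")


print(utf16_be.hex(), utf16_le.hex(), utf8.hex())
print(strip_stray_boms("Hello\ufeff world"))
```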


Regards,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org

Re: Problem with press release

From
Alvaro Herrera
Date:
Markus Schaber wrote:
> Hi, Alvaro,
>
> Alvaro Herrera <alvherre@commandprompt.com> wrote:
>
> > I'm the only one reporting Evolution emitting annoying and seemingly
> > random 0xfeff byte pairs though, and that doesn't make it any less
> > broken.
>
> That looks like a byte order mark in utf16 encodings to me.

Yes.  Why is it in the middle of an email?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Problem with press release

From
Robert Treat
Date:
On Sunday 15 June 2008 22:25:00 Josh Berkus wrote:
> Greg,
>
> > Is there some reason our English press releases need any non-ascii
> > characters?
>
> I'm disputing Bruce's report that there *are* any.
>

See the evidence of the problem in the other follow-up emails, but I wanted to
chime in that I can see the extra characters if I do a "view source" on the
message in KMail. This shows the raw text (headers and all) that comes with the
message. Normally my email client formats all that away to give me a
nicer-looking email (I think I use the fancy headers setting), so I think the
problem does exist, but most people aren't aware of it because most mail
clients hide it from you.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

Re: Problem with press release

From
Andrew Sullivan
Date:
On Mon, Jun 16, 2008 at 06:25:48AM -0400, Bruce Momjian wrote:
> In fact I have no idea why the email itself has to be MIME-encoded.

Many MUAs encode all mail as MIME.  You attach the message as an
rfc822 message.  Apart from strange people who submit directly to
sendmail out of mh that they edited by hand in Emacs, it's a normal
thing to do.  That's completely irrelevant to whether it needs to be
encoded as 8859-1, which is the problem.  (Note that if it were
encoded as UTF-8, it would probably end up transmitted as 7 bit ASCII
instead.)

A
--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

Re: Problem with press release

From
Josh Berkus
Date:
Andrew Sullivan wrote:
> On Sun, Jun 15, 2008 at 07:25:00PM -0700, Josh Berkus wrote:
>> I'm disputing Bruce's report that there *are* any.
>
> Well, this is what mutt thinks your MIME encoding is on the message I
> got:
>
> text/plain, 8bit, iso-8859-1
>
> Probably the issue is not just that it's not ASCII, but that it's not
> UTF-8 either.

Interesting.  Ok, time for me to take it up with the KDE guys.  It's
probably cut-and-paste from a cvs terminal session into Kmail that's
fouling things.

--Josh

Re: Problem with press release

From
Andrew Sullivan
Date:
On Mon, Jun 16, 2008 at 10:40:01AM -0700, Josh Berkus wrote:

> Interesting.  Ok, time for me to take it up with the KDE guys.  It's
> probably cut-and-paste from a cvs terminal session into Kmail that's
> fouling things.

Or your locale.  Try starting in UTF-8 instead, and see if it helps.
(Remember, the bottom bits of UTF-8 are ASCII, so you don't have the
same problem.)
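
A short sketch of that point, assuming Python 3: pure ASCII text produces the
same bytes whether it is labelled ASCII or UTF-8, while a non-breaking space
needs one 8-bit byte in ISO-8859-1 and two bytes in UTF-8.

```python
ascii_text = "Source Code: http://www.postgresql.org/ftp/source"

# ASCII is a strict subset of UTF-8, so the byte sequences are identical.
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

NBSP = "\u00a0"  # NO-BREAK SPACE
nbsp_latin1 = NBSP.encode("iso-8859-1")  # single byte A0
nbsp_utf8 = NBSP.encode("utf-8")         # two bytes C2 A0

print(same_bytes, nbsp_latin1.hex(), nbsp_utf8.hex())
```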

A


--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

Re: Problem with press release

From
Jan de Visser
Date:
On Monday 16 June 2008 13:40:01 Josh Berkus wrote:
> Andrew Sullivan wrote:
> > On Sun, Jun 15, 2008 at 07:25:00PM -0700, Josh Berkus wrote:
> >> I'm disputing Bruce's report that there *are* any.
> >
> > Well, this is what mutt thinks your MIME encoding is on the message I
> > got:
> >
> > text/plain, 8bit, iso-8859-1
> >
> > Probably the issue is not just that it's not ASCII, but that it's not
> > UTF-8 either.
>
> Interesting.  Ok, time for me to take it up with the KDE guys.  It's
> probably cut-and-paste from a cvs terminal session into Kmail that's
> fouling things.

Does (in the composer) Edit->Clean Spaces help?

>
> --Josh

Re: Problem with press release

From
Chander Ganesan
Date:
Josh Berkus wrote:
> Andrew Sullivan wrote:
>> On Sun, Jun 15, 2008 at 07:25:00PM -0700, Josh Berkus wrote:
>>> I'm disputing Bruce's report that there *are* any.
>>
>> Well, this is what mutt thinks your MIME encoding is on the message I
>> got:
>>
>> text/plain, 8bit, iso-8859-1
>>
>> Probably the issue is not just that it's not ASCII, but that it's not
>> UTF-8 either.
>
> Interesting.  Ok, time for me to take it up with the KDE guys.  It's
> probably cut-and-paste from a cvs terminal session into Kmail that's
> fouling things.

ISO-8859-1 is basically ASCII; it simply adds additional characters
after 127 ...  Most mail clients use it, or Windows-1252 (essentially
the same) today.

The fact that it's 8bit requires that it be quoted printable encoded,
since QP is required for sending all the 8 bit characters in 7 bit
encoding.  AFAIK, iso-8859-1 is the standard character set used by most
email clients that send ascii messages.

The =A0 character is a "non breaking space", it's a space that prevents
a line-wrap from occurring.  My guess is that Bruce's email client
somehow doesn't understand the non breaking space and as a result
displays just an <A0> in its place.  I would think this is a problem
with his mail client, not the message itself, in that it doesn't
understand iso-8859-1 ...
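
A hypothetical sketch, in Python 3, of how a viewer that only handles 7-bit
ASCII could produce the two renderings reported earlier in the thread. This is
a guess at the mechanism, not the actual behaviour of any particular mail
client, and both function names are made up for illustration.

```python
def render_hex(raw: bytes) -> str:
    """Show bytes above 0x7F as <XX> hex escapes (assumed viewer behaviour)."""
    return "".join(chr(b) if b < 0x80 else f"<{b:02X}>" for b in raw)


def render_question(raw: bytes) -> str:
    """Show bytes above 0x7F as '?' (another assumed viewer behaviour)."""
    return "".join(chr(b) if b < 0x80 else "?" for b in raw)


# Eight ISO-8859-1 NBSP bytes followed by the start of the URL.
line = b"\xa0" * 8 + b"http://www.postgresql.org/docs"

print(render_hex(line))
print(render_question(line))
```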

chander

Re: Problem with press release

From
Josh Berkus
Date:
All,

Well, I figured out the Kmail issue.  Kmail defaults to ASCII when composing a
new message, but for some reason flips into iso-8859-1 when I do
cut-and-paste.  That's easy enough to fix now that I know what's going on.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco

Re: Problem with press release

From
"Joshua D. Drake"
Date:
Chander Ganesan wrote:
> Josh Berkus wrote:
>> Andrew Sullivan wrote:
>>> On Sun, Jun 15, 2008 at 07:25:00PM -0700, Josh Berkus wrote:
>>>> I'm disputing Bruce's report that there *are* any.

> The =A0 character is a "non breaking space", it's a space that prevents
> a line-wrap from occurring.  My guess is that Bruce's email client
> somehow doesn't understand the non breaking space and as a result
> displays just an <A0> in its place.  I would think this is a problem
> with his mail client, not the message itself, in that it doesn't
> understand iso-8859-1 ...

Bruce, are you still using Elm?

Joshua D. Drake

Re: Problem with press release

From
Chander Ganesan
Date:
Bruce Momjian wrote:
> Josh Berkus wrote:
>
>> Bruce,
>>
>>
>>>     <A0><A0><A0><A0><A0><A0><A0><A0>http://www.postgresql.org/docs
>>>
>>> I think future press releases should be straight ASCII.  Josh thinks
>>> there is something wrong with my email software.
>>>
>> I'm relatively certain that's the case.  Not one single person other than you
>> has reported this problem, not over the 2 years you've persisted in bringing
>> this up.  Please either obtain some corroborating evidence, or stop bringing
>> it up.
>>
>
> OK, I pulled the announce mbox file from:
>
>     http://archives.postgresql.org/pgsql-announce/mbox/pgsql-announce.2008-06.gz
>
> and am attaching the MIME-encoded email that shows the non-ASCII
> characters, specifically:
>
>     =A0=A0=A0=A0=A0=A0=A0=A0http://www.postgresql.org/docs/8.3/static/release.h=
>
> In fact I have no idea why the email itself has to be MIME-encoded.
>
I think I do (looking at this).  The =A0 sequences (non-breaking spaces) were
added to prevent the mail client from wrapping the line prior to
displaying the URL.  This would be an issue for mail clients that do
line wrapping.  The '=' at the end of the line is a quoted-printable soft
line break, which indicates that the line is continued on the next line.

Without this, the spaces would probably be displayed on one line, and
the URL wouldn't be indented on the next.

--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
919-463-0999/877-258-8987
http://www.otg-nc.com


Re: Problem with press release

From
Andrew Sullivan
Date:
On Tue, Jun 17, 2008 at 10:40:47AM -0400, Chander Ganesan wrote:

> ISO-8859-1 is basically ASCII; it simply adds additional characters
> after 127 ...  Most mail clients use it, or Windows-1252 (essentially the
> same) today.

Sort of.  The ISO 8859 series of encodings was intended to be an
extension of ASCII to allow 8-bit communications for different
linguistic communities.  This turns out not to work that well, because
you have to signal which one of the encodings you're using at the
beginning, because (for instance) the IANA-labelled ISO-8859-1 and
ISO-8859-2 are not the same, but use the same bits.
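
A brief sketch of the "same bits, different characters" problem, assuming
Python 3 and its built-in codecs. The two byte values are just examples.

```python
# The same 8-bit values decode to different characters depending on the label.
samples = bytes([0xA1, 0xE6])

as_latin1 = samples.decode("iso-8859-1")
as_latin2 = samples.decode("iso-8859-2")

for byte, c1, c2 in zip(samples, as_latin1, as_latin2):
    print(f"0x{byte:02X}: ISO-8859-1 U+{ord(c1):04X}, ISO-8859-2 U+{ord(c2):04X}")

# 0xA0 happens to be NO-BREAK SPACE in both parts of the series.
nbsp_in_both = b"\xa0".decode("iso-8859-1") == b"\xa0".decode("iso-8859-2")
```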

> The fact that it's 8bit requires that it be quoted printable encoded, since
> QP is required for sending all the 8 bit characters in 7 bit encoding.

Not quite.  First, extended SMTP allows 8 bit data transfer in the
message body.  (See RFC 1652 and RFC 2821, among others, for more on
this.)  The practical upshot is usually QP, though.  Second, if you
used only US-ASCII characters, RFC 2046 explicitly encourages you to
mark the content as US-ASCII.  So the problem arises because the
encoded bits are non-breaking spaces, which won't fit in US-ASCII.
More on this below.

> AFAIK, iso-8859-1 is the standard character set used by most email clients
> that send ascii messages.

I sure hope not.  First, at least European English speakers can't use
just ISO-8859-1 any more, since it doesn't have EURO SIGN (€)
available.  Second, there is no reason to prefer ISO-8859-1 over
Unicode now, because the UTF-8 libraries are widely available and the
character set is more comprehensive.
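
A quick check of the EURO SIGN point, assuming Python 3. `can_encode` is a
hypothetical helper written for this example.

```python
EURO = "\u20ac"  # EURO SIGN


def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


for charset in ("us-ascii", "iso-8859-1", "iso-8859-15", "utf-8"):
    print(charset, can_encode(EURO, charset))
```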

> <A0> in its place.  I would think this is a problem with his mail client,
> not the message itself, in that it doesn't understand iso-8859-1 ...

It's a problem with the message itself, in that there's no need at all
to include a non-breaking space in there.  If you just want to make
the text flow the right way, RFC 3676 provides the Format=Flowed
approach.  No need for anything other than ASCII at all.
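
One way to keep an announcement in straight ASCII before sending, sketched in
Python 3. `to_plain_ascii` is a hypothetical helper; it only handles the
non-breaking space discussed in this thread and rejects anything else outside
ASCII rather than guessing a replacement.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE


def to_plain_ascii(text: str) -> str:
    """Replace NBSP with an ordinary space, then insist on pure ASCII."""
    cleaned = text.replace(NBSP, " ")
    # Raises UnicodeEncodeError if any other non-ASCII character remains.
    cleaned.encode("ascii")
    return cleaned


draft = "Release Notes:\n" + NBSP * 8 + "http://www.postgresql.org/docs/8.3/static/release.html\n"
print(to_plain_ascii(draft))
```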

A

--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

Re: Problem with press release

From
Bruce Momjian
Date:
Joshua D. Drake wrote:
> Chander Ganesan wrote:
> > Josh Berkus wrote:
> >> Andrew Sullivan wrote:
> >>> On Sun, Jun 15, 2008 at 07:25:00PM -0700, Josh Berkus wrote:
> >>>> I'm disputing Bruce's report that there *are* any.
>
> > The =A0 character is a "non breaking space", it's a space that prevents
> > a line-wrap from occurring.  My guess is that Bruce's email client
> > somehow doesn't understand the non breaking space and as a result
> > displays just an <A0> in its place.  I would think this is a problem
> > with his mail client, not the message itself, in that it doesn't
> > understand iso-8859-1 ...
>
> Bruce are you still using Elm?

Yep.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +