Thread: Problem with press release
I saw a problem with the most recent press release --- some spaces were encoded as non-breakable spaces in the press release: http://en.wikipedia.org/wiki/Non-breaking_space In ISO/IEC 8859, NBSP is 0xA0. They showed up as either: ????????http://www.postgresql.org/docs/8.3/static/release.html or <A0><A0><A0><A0><A0><A0><A0><A0>http://www.postgresql.org/docs I think future press releases should be straight ASCII. Josh thinks there is something wrong with my email software. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce, > <A0><A0><A0><A0><A0><A0><A0><A0>http://www.postgresql.org/docs > > I think future press releases should be straight ASCII. Josh thinks > there is something wrong with my email software. I'm relatively certain that's the case. Not one single person other than you has reported this problem, not over the 2 years you've persisted in bringing this up. Please either obtain some corroborating evidence, or stop bringing it up. -- Josh Berkus PostgreSQL @ Sun San Francisco
Josh Berkus wrote: > Bruce, > > > <A0><A0><A0><A0><A0><A0><A0><A0>http://www.postgresql.org/docs > > > > I think future press releases should be straight ASCII. Josh thinks > > there is something wrong with my email software. > > I'm relatively certain that's the case. Not one single person other than you > has reported this problem, not over the 2 years you've persisted in bringing > this up. Please either obtain some corroborating evidence, or stop bringing > it up. I'm the only one reporting Evolution emitting annoying and seemingly random 0xfeff byte pairs though, and that doesn't make it any less broken. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
"Josh Berkus" <josh@agliodbs.com> writes: >> <A0><A0><A0><A0><A0><A0><A0><A0>http://www.postgresql.org/docs >> >> I think future press releases should be straight ASCII. Josh thinks >> there is something wrong with my email software. Or both > I'm relatively certain that's the case. Not one single person other than you > has reported this problem, not over the 2 years you've persisted in bringing > this up. Please either obtain some corroborating evidence, or stop bringing > it up. Is there some reason our English press releases need any non-ascii characters? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
Greg, > Is there some reason our English press releases need any non-ascii > characters? I'm disputing Bruce's report that there *are* any. -- Josh Berkus PostgreSQL @ Sun San Francisco
On Sun, Jun 15, 2008 at 07:25:00PM -0700, Josh Berkus wrote: > > I'm disputing Bruce's report that there *are* any. Well, this is what mutt thinks your MIME encoding is on the message I got: text/plain, 8bit, iso-8859-1 Probably the issue is not just that it's not ASCII, but that it's not UTF-8 either. A -- Andrew Sullivan ajs@commandprompt.com +1 503 667 4564 x104 http://www.commandprompt.com/
Josh Berkus wrote: > Bruce, > > > <A0><A0><A0><A0><A0><A0><A0><A0>http://www.postgresql.org/docs > > > > I think future press releases should be straight ASCII. Josh thinks > > there is something wrong with my email software. > > I'm relatively certain that's the case. Not one single person other than you > has reported this problem, not over the 2 years you've persisted in bringing > this up. Please either obtain some corroborating evidence, or stop bringing > it up. OK, I pulled the announce mbox file from: http://archives.postgresql.org/pgsql-announce/mbox/pgsql-announce.2008-06.gz and am attaching the MIME-encoded email that shows the non-ASCII characters, specifically: =A0=A0=A0=A0=A0=A0=A0=A0http://www.postgresql.org/docs/8.3/static/release.h= In fact I have no idea why the email itself has to be MIME-encoded. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + To: pgsql-announce@postgresql.org Subject: PostgreSQL 8.3.3, 8.2.9 etc. Update Release Date: Thu, 12 Jun 2008 09:20:40 -0700 User-Agent: KMail/1.8.2 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Message-Id: <200806120920.40297.josh@postgresql.org> X-Virus-Scanned: Maia Mailguard 1.0.1 X-Archive-Number: 200806/12 X-Sequence-Number: 1351 Updates for all maintained versions of PostgreSQL are available today: 8.3.= 3,=20 8.2.9, 8.1.13, 8.0.17 and 7.4.21. =A0These releases fix more than two dozen= =20 minor issues reported and patched over the last few months. =A0All PostgreS= QL=20 users should plan to update at their earliest convenience. Users of UTF-8=20 databases on Windows and people in affected time zones, in particular, shou= ld=20 upgrade as soon as possible. 
The issues fixed include a crash caused by encoding mismatch on Windows,=20 possible crash when decompressing corrupted data, non-optimization of some= =20 parameterized queries, new time zone updates, SIGTERM-caused memory=20 corruption, runaway LWLocks with GIN indexes, and several more. =A0Read the= =20 release notes to see if any of the issues affect you. As with other minor releases, users are not required to dump and reload the= ir=20 database in order to apply this update release; you may simply upgrade the= =20 PostgreSQL binaries. =A0Users skipping more than one update may need to che= ck=20 the release notes for extra, post-update steps. =A0As previously announced,= =20 only versions 8.2.9 and 8.3.3 of the Windows binaries are being released, a= s=20 we no longer support 8.0 and 8.1 on Windows. Release Notes: =A0=A0=A0=A0=A0=A0=A0=A0http://www.postgresql.org/docs/8.3/static/release.h= tml Source Code =A0=A0=A0=A0=A0=A0=A0=A0http://www.postgresql.org/ftp/source Binaries =A0=A0=A0=A0=A0=A0=A0=A0http://www.postgresql.org/ftp/binary Please note: we "skipped" a minor release number due to an issue found with= =20 the 8.3.2 etc. release bundles, which were never announced but were availab= le=20 via FTP for a few days. If for some reason you downloaded versions 8.3.2,= =20 8.2.8, 8.1.12, 8.0.16 or 7.4.20, please replace them with the new update=20 immediately. =2D-=20 PostgreSQL Global Development Group
Josh Berkus wrote: > Greg, > > > Is there some reason our English press releases need any non-ascii > > characters? > > I'm disputing Bruce's report that there *are* any. The message is encoded in quoted-printable and has a lot of A0 bytes on it. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
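The A0 bytes mentioned above can be checked without relying on any particular mail client. A minimal sketch in Python, assuming the standard-library email module; the `raw` string is a shortened stand-in for the archived announcement (same headers, only the "Release Notes" lines of the body), not the full message:

```python
import email

# Shortened stand-in for the archived announcement (assumption: only the
# relevant headers and the "Release Notes" lines are reproduced here).
raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="iso-8859-1"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Release Notes:\n"
    "=A0=A0=A0=A0=A0=A0=A0=A0http://www.postgresql.org/docs/8.3/static/release.h=\n"
    "tml\n"
)

msg = email.message_from_string(raw)
charset = msg.get_content_charset()         # charset declared in Content-Type
body_bytes = msg.get_payload(decode=True)   # undoes the quoted-printable layer
text = body_bytes.decode(charset)

# Count the non-breaking spaces (U+00A0, byte 0xA0 in ISO-8859-1).
nbsp_count = text.count("\u00a0")

# One possible cleanup: swap each NBSP for an ordinary ASCII space.
ascii_text = text.replace("\u00a0", " ")

print(charset, nbsp_count, ascii_text.isascii())
```

Run against the real mbox entry instead of `raw`, the same steps would show whether the 0xA0 bytes are in the message as sent or are introduced by the reader's software.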
Hi, Alvaro, Alvaro Herrera <alvherre@commandprompt.com> wrote: > I'm the only one reporting Evolution emitting annoying and seemingly > random 0xfeff byte pairs though, and that doesn't make it any less > broken. That looks like a byte order mark in utf16 encodings to me. Regards, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in Europe! www.ffii.org www.nosoftwarepatents.org
Markus Schaber wrote: > Hi, Alvaro, > > Alvaro Herrera <alvherre@commandprompt.com> wrote: > > > I'm the only one reporting Evolution emitting annoying and seemingly > > random 0xfeff byte pairs though, and that doesn't make it any less > > broken. > > That looks like a byte order mark in utf16 encodings to me. Yes. Why is it in the middle of an email? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
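For reference, the byte pair in question is what U+FEFF looks like in big-endian UTF-16. A small Python sketch; the helper name `strip_stray_boms` is hypothetical, and whether Evolution really emits these as misplaced byte order marks is an assumption, not something the thread establishes:

```python
import codecs

bom = "\ufeff"

# U+FEFF serialised in the common Unicode encodings.
utf16_be = bom.encode("utf-16-be")   # the 0xFE 0xFF pair
utf16_le = bom.encode("utf-16-le")   # same character, bytes swapped
utf8 = bom.encode("utf-8")           # three bytes in UTF-8


def strip_stray_boms(text: str) -> str:
    """Remove every U+FEFF from text (hypothetical cleanup helper)."""
    return text.replace("\ufeff", "")


sample = "random\ufeff byte\ufeff pairs"
cleaned = strip_stray_boms(sample)
print(utf16_be, utf16_le, utf8, cleaned)
```

U+FEFF only acts as a byte order mark at the very start of a stream; anywhere else it is an invisible zero-width character, which would fit the "seemingly random" description.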
On Sunday 15 June 2008 22:25:00 Josh Berkus wrote: > Greg, > > > Is there some reason our English press releases need any non-ascii > > characters? > > I'm disputing Bruce's report that there *are* any. > See the evidence of the problem in other followup emails, but wanted to chime in that I can see the extra characters if I do a "view source" on the message in kmail. This shows the raw text (includes headers and all that) that comes with the message, but normally my email client formats all that away to give me a nicer looking email (I think I use the fancy headers setting), so I think the problem does exist, but most people aren't aware of it because most mail clients hide it away from you. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
On Mon, Jun 16, 2008 at 06:25:48AM -0400, Bruce Momjian wrote: > In fact I have no idea why the email itself has to be MIME-encoded. Many MUAs encode all mail as MIME. You attach the message as an rfc822 message. Apart from strange people who submit directly to sendmail out of mh that they edited by hand in Emacs, it's a normal thing to do. That's completely irrelevant to whether it needs to be encoded as 8859-1, which is the problem. (Note that if it were encoded as UTF-8, it would probably end up transmitted as 7 bit ASCII instead.) A -- Andrew Sullivan ajs@commandprompt.com +1 503 667 4564 x104 http://www.commandprompt.com/
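The parenthetical about UTF-8 and 7 bit transfer can be illustrated with one mail library. This is a sketch using Python's standard email package, chosen purely for illustration; it says nothing about what KMail or any other MUA does, and the body text is an abbreviated stand-in for the announcement:

```python
from email.message import EmailMessage

# Body containing only US-ASCII characters (ordinary spaces, no NBSP).
body = (
    "Release Notes:\n"
    "        http://www.postgresql.org/docs/8.3/static/release.html\n"
)

msg = EmailMessage()
msg.set_content(body)  # the library labels text as UTF-8 by default

charset = msg.get_content_charset()
transfer_encoding = str(msg["Content-Transfer-Encoding"])

print(charset, transfer_encoding)
```

Because every character fits in seven bits, this library picks a 7bit transfer encoding even though the declared charset is UTF-8, so no quoted-printable escapes appear on the wire.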
Andrew Sullivan wrote: > On Sun, Jun 15, 2008 at 07:25:00PM -0700, Josh Berkus wrote: >> I'm disputing Bruce's report that there *are* any. > > Well, this is what mutt thinks your MIME encoding is on the message I > got: > > text/plain, 8bit, iso-8859-1 > > Probably the issue is not just that it's not ASCII, but that it's not > UTF-8 either. Interesting. Ok, time for me to take it up with the KDE guys. It's probably cut-and-paste from a cvs terminal session into Kmail that's fouling things. --Josh
On Mon, Jun 16, 2008 at 10:40:01AM -0700, Josh Berkus wrote: > Interesting. Ok, time for me to take it up with the KDE guys. It's > probably cut-and-paste from a cvs terminal session into Kmail that's > fouling things. Or your locale. Try starting in UTF-8 instead, and see if it helps. (Remember, the bottom bits of UTF-8 are ASCII, so you don't have the same problem.) A -- Andrew Sullivan ajs@commandprompt.com +1 503 667 4564 x104 http://www.commandprompt.com/
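The point about the low end of UTF-8 being ASCII can be checked byte for byte. A minimal Python sketch; the sample string is made up for illustration:

```python
# Pure-ASCII text is byte-identical in ASCII, ISO-8859-1 and UTF-8.
plain = "Release Notes: http://www.postgresql.org/docs"
same_bytes = (
    plain.encode("ascii") == plain.encode("utf-8") == plain.encode("iso-8859-1")
)

# The non-breaking space is where the encodings diverge.
nbsp = "\u00a0"
latin1_bytes = nbsp.encode("iso-8859-1")   # single byte 0xA0
utf8_bytes = nbsp.encode("utf-8")          # two bytes 0xC2 0xA0

# A lone 0xA0 is not valid UTF-8, so a strict UTF-8 reader rejects it
# rather than silently showing a different character.
try:
    b"\xa0".decode("utf-8")
    lone_a0_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_a0_is_valid_utf8 = False

print(same_bytes, latin1_bytes, utf8_bytes, lone_a0_is_valid_utf8)
```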
On Monday 16 June 2008 13:40:01 Josh Berkus wrote: > Andrew Sullivan wrote: > > On Sun, Jun 15, 2008 at 07:25:00PM -0700, Josh Berkus wrote: > >> I'm disputing Bruce's report that there *are* any. > > > > Well, this is what mutt thinks your MIME encoding is on the message I > > got: > > > > text/plain, 8bit, iso-8859-1 > > > > Probably the issue is not just that it's not ASCII, but that it's not > > UTF-8 either. > > Interesting. Ok, time for me to take it up with the KDE guys. It's > probably cut-and-paste from a cvs terminal session into Kmail that's > fouling things. Does (in the composer) Edit->Clean Spaces help? > > --Josh
Josh Berkus wrote: > Andrew Sullivan wrote: >> On Sun, Jun 15, 2008 at 07:25:00PM -0700, Josh Berkus wrote: >>> I'm disputing Bruce's report that there *are* any. >> >> Well, this is what mutt thinks your MIME encoding is on the message I >> got: >> >> text/plain, 8bit, iso-8859-1 >> >> Probably the issue is not just that it's not ASCII, but that it's not >> UTF-8 either. > > Interesting. Ok, time for me to take it up with the KDE guys. It's > probably cut-and-paste from a cvs terminal session into Kmail that's > fouling things. ISO-8859-1 is basically ASCII, it simply adds additional characters after 127 ... Most mail clients use it, or Windows-1252 (essentially the same) today. The fact that it's 8bit requires that it be quoted-printable encoded, since QP is required for sending all the 8 bit characters in 7 bit encoding. AFAIK, iso-8859-1 is the standard character set used by most email clients that send ascii messages. The =A0 character is a "non breaking space", it's a space that prevents a line-wrap from occurring. My guess is that Bruce's email client somehow doesn't understand the non breaking space and as a result displays just an <A0> in its place. I would think this is a problem with his mail client, not the message itself, in that it doesn't understand iso-8859-1 ... chander
All, Well, I figured out the Kmail issue. Kmail defaults to ASCII when composing a new message, but for some reason flips into iso-8859-1 when I do cut-and-paste. That's easy enough to fix now that I know what's going on. -- Josh Berkus PostgreSQL @ Sun San Francisco
Chander Ganesan wrote: > Josh Berkus wrote: >> Andrew Sullivan wrote: >>> On Sun, Jun 15, 2008 at 07:25:00PM -0700, Josh Berkus wrote: >>>> I'm disputing Bruce's report that there *are* any. > The =A0 character is a "non breaking space", it's a space that prevents > a line-wrap from occurring. My guess is that Bruce's email client > somehow doesn't understand the non breaking space and as a result > displays just an <A0> in its place. I would think this is a problem > with his mail client, not the message itself, in that it doesn't > understand iso-8859-1 ... Bruce are you still using Elm? Joshua D. Drake
Bruce Momjian wrote: > Josh Berkus wrote: > >> Bruce, >> >> >>> <A0><A0><A0><A0><A0><A0><A0><A0>http://www.postgresql.org/docs >>> >>> I think future press releases should be straight ASCII. Josh thinks >>> there is something wrong with my email software. >>> >> I'm relatively certain that's the case. Not one single person other than you >> has reported this problem, not over the 2 years you've persisted in bringing >> this up. Please either obtain some corroborating evidence, or stop bringing >> it up. >> > > OK, I pulled the announce mbox file from: > > http://archives.postgresql.org/pgsql-announce/mbox/pgsql-announce.2008-06.gz > > and am attaching the MIME-encoded email that shows the non-ASCII > characters, specifically: > > =A0=A0=A0=A0=A0=A0=A0=A0http://www.postgresql.org/docs/8.3/static/release.h= > > In fact I have no idea why the email itself has to be MIME-encoded. > I think I do (looking at this). The =A0 (non breaking spaces) were added to prevent the mail client from wrapping the line prior to displaying the URL. This would be an issue for mail clients that do line wrapping. The '=' at the end of the line is a continuation character, which indicates that the line is continued on the next line. Without this, the spaces would probably be displayed on one line, and the URL wouldn't be indented on the next. -- Chander Ganesan Open Technology Group, Inc. One Copley Parkway, Suite 210 Morrisville, NC 27560 919-463-0999/877-258-8987 http://www.otg-nc.com
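The role of the trailing '=' can be seen by decoding a fragment of the announcement. A minimal sketch with Python's standard quopri module; the two byte strings are excerpts copied from the attached message, trimmed for illustration:

```python
import quopri

# "=" at end of line is a soft line break: the decoder joins the two
# physical lines, so the URL comes back in one piece.
encoded_url = (
    b"=A0=A0=A0=A0=A0=A0=A0=A0"
    b"http://www.postgresql.org/docs/8.3/static/release.h=\n"
    b"tml\n"
)
decoded_url = quopri.decodestring(encoded_url)

# "=20" is an encoded space; here it protects trailing whitespace before
# a real (hard) line break.
encoded_versions = b"8.3.=\n3,=20\n8.2.9"
decoded_versions = quopri.decodestring(encoded_versions)

print(decoded_url)
print(decoded_versions)
```

The decoded URL line still begins with eight 0xA0 bytes, so the soft line break only explains the split "release.h" / "tml", not the non-breaking spaces themselves.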
On Tue, Jun 17, 2008 at 10:40:47AM -0400, Chander Ganesan wrote: > ISO-8859-1 is basically ASCII, it simply adds additional characters > after 127 ... Most mail clients use it, or Windows-1252 (essentially the > same) today. Sort of. The ISO 8859 series of encodings was intended to be an extension of ASCII to allow 8-bit communications for different linguistic communities. This turns out not to work that well, because you have to signal which one of the encodings you're using at the beginning, because (for instance) the IANA-labelled ISO-8859-1 and ISO-8859-2 are not the same, but use the same bits. > The fact that it's 8bit requires that it be quoted printable encoded, since > QP is required for sending all the 8 bit characters in 7 bit encoding. Not quite. First, extended SMTP allows 8 bit data transfer in the message body. (See RFC 1652 and RFC 2821, among others, for more on this.) The practical upshot is usually QP, though. Second, if you used only US-ASCII characters, RFC 2046 explicitly encourages you to mark the content as US-ASCII. So the problem arises because the encoded bits are non-breaking spaces, which won't fit in US-ASCII. More on this below. > AFAIK, iso-8859-1 is the standard character set used by most email clients > that send ascii messages. I sure hope not. First, at least European English speakers can't use just ISO-8859-1 any more, since it doesn't have EURO SIGN (€) available. Second, there is no reason to prefer ISO-8859-1 over Unicode now, because the UTF-8 libraries are widely available and the character set is more comprehensive. > <A0> in its place. I would think this is a problem with his mail client, > not the message itself, in that it doesn't understand iso-8859-1 ... It's a problem with the message itself, in that there's no need at all to include a non-breaking space in there. If you just want to make the text flow the right way, RFC 3676 provides the Format=Flowed approach. No need for anything other than ASCII at all.
A -- Andrew Sullivan ajs@commandprompt.com +1 503 667 4564 x104 http://www.commandprompt.com/
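Two of the points above (the same bits meaning different things in different ISO 8859 parts, and the missing euro sign) are easy to reproduce. A short Python sketch; the byte 0xA3 is an arbitrary example chosen for illustration:

```python
# The same byte decodes to different characters depending on which
# ISO 8859 part the message declares.
byte = b"\xa3"
as_latin1 = byte.decode("iso-8859-1")   # POUND SIGN
as_latin2 = byte.decode("iso-8859-2")   # LATIN CAPITAL LETTER L WITH STROKE

# EURO SIGN has no code point in ISO-8859-1, so encoding it fails there...
euro = "\u20ac"
try:
    euro.encode("iso-8859-1")
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False

# ...while UTF-8 can represent it.
euro_utf8 = euro.encode("utf-8")

print(as_latin1, as_latin2, euro_fits_latin1, euro_utf8)
```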
Joshua D. Drake wrote: > Chander Ganesan wrote: > > Josh Berkus wrote: > >> Andrew Sullivan wrote: > >>> On Sun, Jun 15, 2008 at 07:25:00PM -0700, Josh Berkus wrote: > >>>> I'm disputing Bruce's report that there *are* any. > > > The =A0 character is a "non breaking space", it's a space that prevents > > a line-wrap from occurring. My guess is that Bruce's email client > > somehow doesn't understand the non breaking space and as a result > > displays just an <A0> in its place. I would think this is a problem > > with his mail client, not the message itself, in that it doesn't > > understand iso-8859-1 ... > > Bruce are you still using Elm? Yep. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +