Thread: Encoding of src/timezone/tznames/Europe.txt

From: Christoph Berg

Is there any reason why src/timezone/tznames/Europe.txt is encoded in
latin1 and not utf-8?

The offending lines are these timezones:

MESZ     7200 D  # Mitteleuropäische Sommerzeit (German)
                 #     (attested in IANA comments though not their code)

MEZ      3600    # Mitteleuropäische Zeit (German)
                 #     (attested in IANA comments though not their code)

It's not important for anything, just general sanity. (Spotted by
Debian's package checker, lintian.)

Christoph
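
For readers who want to reproduce this kind of finding without lintian, the following is a minimal sketch in Python. It lists the lines of a byte string that contain non-ASCII bytes and reports whether each such line is at least valid UTF-8. The function name `find_non_ascii_lines` and the sample data are assumptions made for illustration; this is not how lintian itself performs the check.

```python
def find_non_ascii_lines(data: bytes):
    """Return (line_number, raw_line, is_valid_utf8) for every line
    that contains at least one byte outside the ASCII range."""
    hits = []
    for number, line in enumerate(data.splitlines(), start=1):
        if any(byte >= 0x80 for byte in line):
            try:
                line.decode("utf-8")
                valid_utf8 = True
            except UnicodeDecodeError:
                valid_utf8 = False
            hits.append((number, line, valid_utf8))
    return hits


# Hypothetical sample: the same entry encoded two different ways.
entry = "MEZ      3600    # Mitteleuropäische Zeit (German)\n"
for label, raw in (("latin-1", entry.encode("latin-1")),
                   ("utf-8", entry.encode("utf-8"))):
    for number, line, valid_utf8 in find_non_ascii_lines(raw):
        print(label, "line", number, "valid UTF-8:", valid_utf8)
```

To check a real file, read it in binary mode (for example `open(path, "rb").read()`) and pass the bytes to the function. A line that has non-ASCII bytes but is not valid UTF-8 is probably in a legacy 8-bit encoding such as latin1, though the sketch cannot tell which one.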



Re: Encoding of src/timezone/tznames/Europe.txt

From: Tom Lane

Christoph Berg <myon@debian.org> writes:
> Is there any reason why src/timezone/tznames/Europe.txt is encoded in
> latin1 and not utf-8?

> The offending lines are these timezones:

> MESZ     7200 D  # Mitteleuropäische Sommerzeit (German)
>                  #     (attested in IANA comments though not their code)

> MEZ      3600    # Mitteleuropäische Zeit (German)
>                  #     (attested in IANA comments though not their code)

> It's not important for anything, just general sanity. (Spotted by
> Debian's package checker, lintian.)

Hm.  TBH, my first reaction is "let's lose the accents".  I agree that
it's not great to be installing files that are encoded in latin1, but
it might not be great to be installing files that are encoded in utf8
either.  Aren't we better off insisting that these files be plain ascii?

I notice that the copies of these lines in src/timezone/tznames/Default
seem to be ascii-ified already.  Haven't traced the git history,
but I bet somebody fixed Default without noticing the other copy.

            regards, tom lane



Re: Encoding of src/timezone/tznames/Europe.txt

From: Christoph Berg

Re: Tom Lane
> > MESZ     7200 D  # Mitteleuropäische Sommerzeit (German)
> >                  #     (attested in IANA comments though not their code)
> 
> > It's not important for anything, just general sanity. (Spotted by
> > Debian's package checker, lintian.)
> 
> Hm.  TBH, my first reaction is "let's lose the accents".

Or that, yes. (The correct German transliteration is
"Mitteleuropaeische" with 'ae'.)

Christoph
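
The transliteration mentioned above can be sketched in a few lines of Python. The mapping table and the name `transliterate_german` are assumptions for illustration only; the thread does not say which tool, if any, was used to edit the file.

```python
# Conventional ASCII replacements for German umlauts and sharp s.
GERMAN_TO_ASCII = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})


def transliterate_german(text: str) -> str:
    """Replace German umlauts and sharp s with their ASCII spellings."""
    return text.translate(GERMAN_TO_ASCII)


print(transliterate_german("Mitteleuropäische Sommerzeit"))
```

This differs from simply stripping the diaeresis, which would give "Mitteleuropaische" instead of the 'ae' spelling.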



Re: Encoding of src/timezone/tznames/Europe.txt

From: Michael Paquier

On Thu, Jul 16, 2020 at 09:46:03PM +0200, Christoph Berg wrote:
> Or that, yes. (The correct German transliteration is
> "Mitteleuropaeische" with 'ae'.)

tznames/Europe.txt has been iso-latin-1-unix (per Emacs'
buffer-file-coding-system) since its introduction in d8b5c95, and
tznames/Default has been ASCII since that same point.  +1 for switching
everything to ASCII and giving up on the accents.
--
Michael


Re: Encoding of src/timezone/tznames/Europe.txt

From: Tom Lane

Michael Paquier <michael@paquier.xyz> writes:
> On Thu, Jul 16, 2020 at 09:46:03PM +0200, Christoph Berg wrote:
>> Or that, yes. (The correct German transliteration is
>> "Mitteleuropaeische" with 'ae'.)

> tznames/Europe.txt has been iso-latin-1-unix (per Emacs'
> buffer-file-coding-system) since its introduction in d8b5c95, and
> tznames/Default has been ASCII since that same point.  +1 for switching
> everything to ASCII and giving up on the accents.

Done that way.  I also checked for other discrepancies between
tznames/Default and the other files, and found a few more trivialities.

            regards, tom lane
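
A cross-check of the kind described above could look roughly like the following Python sketch. It is not the procedure actually used in the thread. It assumes the entry layout shown in the quoted lines (abbreviation first, then the definition and an optional `#` comment), skips pure comment lines and lines starting with `@`, and ignores comment continuation lines. The names `parse_tznames` and `diff_entries` and the sample text are hypothetical.

```python
def parse_tznames(text: str) -> dict:
    """Map each abbreviation to the rest of its line, whitespace-normalized."""
    entries = {}
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blanks, pure comments, and '@' directive lines.
        if not stripped or stripped[0] in "#@":
            continue
        parts = stripped.split(None, 1)
        rest = " ".join(parts[1].split()) if len(parts) > 1 else ""
        entries[parts[0]] = rest
    return entries


def diff_entries(default: dict, other: dict) -> dict:
    """Return abbreviations present in both mappings whose text differs."""
    return {
        abbrev: (default[abbrev], other[abbrev])
        for abbrev in sorted(default.keys() & other.keys())
        if default[abbrev] != other[abbrev]
    }


# Hypothetical sample contents, not the real files.
default_text = "MEZ      3600    # Mitteleuropaeische Zeit (German)\n"
europe_text = "MEZ      3600    # Mitteleuropäische Zeit (German)\n"

for abbrev, (ours, theirs) in diff_entries(parse_tznames(default_text),
                                           parse_tznames(europe_text)).items():
    print(abbrev, "|", ours, "|", theirs)
```

Because the comment text is part of the comparison, the sketch flags differences that affect only comments, such as the accented spelling discussed in this thread.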



Re: Encoding of src/timezone/tznames/Europe.txt

From: Christoph Berg

Re: Tom Lane
> Done that way.  I also checked for other discrepancies between
> tznames/Default and the other files, and found a few more trivialities.

Thanks!

Christoph



Re: Encoding of src/timezone/tznames/Europe.txt

From: Michael Paquier

On Fri, Jul 17, 2020 at 07:24:28PM +0200, Christoph Berg wrote:
> Re: Tom Lane
>> Done that way.  I also checked for other discrepancies between
>> tznames/Default and the other files, and found a few more trivialities.
>
> Thanks!

+1.
--
Michael
