Thread: BUG #6066: Bad string in German translation causes segfault (user-triggerable)

BUG #6066: Bad string in German translation causes segfault (user-triggerable)

From
"Christoph Berg"
Date:
The following bug has been logged online:

Bug reference:      6066
Logged by:          Christoph Berg
Email address:      cb@df7cb.de
PostgreSQL version: 9.1, 9.0, 8.4
Operating system:   any
Description:        Bad string in German translation causes segfault (user-triggerable)
Details:

In German locale, the following statement causes vsnprintf() to segfault when
printing the hint:

SELECT TO_DATE('30.12.2011', 'YYYYMMDD') AS datum;
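
A loose analogue of the failure, sketched in Python rather than C: the translated hint contains a %s placeholder, but the caller passes no argument for it. Python checks the argument count and raises an error; C's vsnprintf() has no such check and reads an argument that was never supplied, which is how the crash can arise. The format_hint helper is hypothetical and is not part of PostgreSQL.

```python
# Illustration only: Python's %-formatting stands in for C's vsnprintf().
GOOD_HINT = ("Wenn die Quellzeichenkette keine feste Breite hat, "
             "versuchen Sie den Modifikator »FM«.")
BAD_HINT = ("Wenn die Quellzeichenkette keine feste Breite hat, "
            "versuchen Sie den Modifikator »%s«.")


def format_hint(template, args=()):
    """Format a hint template with the arguments the caller actually passed."""
    try:
        return template % tuple(args)
    except TypeError as exc:
        # Python notices the missing argument; C would read invalid memory instead.
        return "format error: " + str(exc)


if __name__ == "__main__":
    print(format_hint(GOOD_HINT))  # no placeholder, no argument needed
    print(format_hint(BAD_HINT))   # placeholder without a matching argument
```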

Fix tested for 8.4:

$ diff -c src/backend/po/de.po.orig src/backend/po/de.po
*** src/backend/po/de.po.orig    2011-06-17 10:06:41.000000000 +0200
--- src/backend/po/de.po    2011-06-17 10:06:48.000000000 +0200
***************
*** 12318,12324 ****
  "If your source string is not fixed-width, try using the \"FM\" modifier."
  msgstr ""
  "Wenn die Quellzeichenkette keine feste Breite hat, versuchen Sie den "
! "Modifikator »%s«."

  #: utils/adt/formatting.c:1886 utils/adt/formatting.c:1899
  #: utils/adt/formatting.c:2029
--- 12318,12324 ----
  "If your source string is not fixed-width, try using the \"FM\" modifier."
  msgstr ""
  "Wenn die Quellzeichenkette keine feste Breite hat, versuchen Sie den "
! "Modifikator »FM«."

  #: utils/adt/formatting.c:1886 utils/adt/formatting.c:1899
  #: utils/adt/formatting.c:2029
--On 17 June 2011 08:18:03 +0000 Christoph Berg <cb@df7cb.de> wrote:

> In German locale, the following statement causes vsnprintf() to segfault when
> printing the hint:
>
> SELECT TO_DATE('30.12.2011', 'YYYYMMDD') AS datum;
>
> Fix tested for 8.4:

Additionally, this seems to be the case for 9.0, 9.1 and current -HEAD, too.

--
Thanks

    Bernd

Re: BUG #6066: Bad string in German translation causes segfault (user-triggerable)

From
Heikki Linnakangas
Date:
On 17.06.2011 11:22, Bernd Helmle wrote:
> --On 17 June 2011 08:18:03 +0000 Christoph Berg <cb@df7cb.de> wrote:
>
>> In German locale, the following statement causes vsnprintf() to segfault
>> when printing the hint:
>>
>> SELECT TO_DATE('30.12.2011', 'YYYYMMDD') AS datum;
>>
>> Fix tested for 8.4:
>
> Additionally, this seems to be the case for 9.0, 9.1 and current -HEAD,
> too.

So, this is a case where the untranslated string doesn't have a %s in
it, but the translated one does. We should have a way to check those
automatically. In fact, I'm surprised if someone somewhere hasn't
already written such a script, as gettext is used very widely. Anyone
want to research/write a script?
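
One possible shape for such a script, as a minimal sketch: it reads PO text, pairs each msgid with its msgstr, and reports entries whose printf-style conversions differ. The parser is deliberately simplified (no plural forms, no msgctxt), the regular expression covers only common conversions, and the second sample entry is a made-up example. The function names are hypothetical.

```python
import re

# Common printf-style conversions (%s, %d, %-10s, %lu, %1$s, ...); "%%" is a literal percent.
SPEC_RE = re.compile(
    r"%(?:\d+\$)?[-+ #0]*(?:\d+|\*)?(?:\.(?:\d+|\*))?"
    r"(?:hh|h|ll|l|z|j|t|L)?[diouxXeEfgGcspm%]"
)


def format_specs(text):
    """Return the sorted conversion characters used in text, ignoring '%%'."""
    specs = [m.group(0)[-1] for m in SPEC_RE.finditer(text)]
    return sorted(s for s in specs if s != "%")


def parse_po(text):
    """Yield (msgid, msgstr) pairs; entries are assumed to be separated by blank lines."""
    for block in re.split(r"\n\s*\n", text):
        fields = {"msgid": "", "msgstr": ""}
        current = None
        for line in block.splitlines():
            line = line.strip()
            for key in fields:
                if line.startswith(key + " "):
                    current = key
                    line = line[len(key):].strip()
            if current and line.startswith('"') and line.endswith('"'):
                fields[current] += line[1:-1]  # escapes are left as written
        if fields["msgid"]:
            yield fields["msgid"], fields["msgstr"]


def find_mismatches(po_text):
    """Return the (msgid, msgstr) pairs whose translation uses different conversions."""
    return [
        (msgid, msgstr)
        for msgid, msgstr in parse_po(po_text)
        if msgstr and format_specs(msgid) != format_specs(msgstr)
    ]


SAMPLE = r'''
msgid ""
"If your source string is not fixed-width, try using the \"FM\" modifier."
msgstr ""
"Wenn die Quellzeichenkette keine feste Breite hat, versuchen Sie den "
"Modifikator »%s«."

msgid "invalid value \"%s\" for \"%s\""
msgstr "ungültiger Wert »%s« für »%s«"
'''

if __name__ == "__main__":
    for msgid, msgstr in find_mismatches(SAMPLE):
        print("mismatch:", msgid, "->", msgstr)
```

Comparing sorted conversion characters tolerates translations that reorder arguments, but it can raise false alarms on ordinary text containing a percent sign followed by a letter.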

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
Re: Heikki Linnakangas 2011-06-17 <4DFB137E.4040404@enterprisedb.com>
> So, this is a case where the untranslated string doesn't have a %s
> in it, but the translated one does. We should have a way to check
> those automatically. In fact, I'm surprised if someone somewhere
> hasn't already written such a script, as gettext is used very
> widely. Anyone want to research/write a script?

Actually, msgfmt can do that itself with -c. This can be set in
Makefile.global:

$ grep MSGFMT src/Makefile.global
MSGFMT  = msgfmt -c

Unfortunately that doesn't help in this case, as the bad string isn't
tagged as "#, c-format", but still gets used as such. This seems to be
the case for many errhint() strings. Maybe xgettext should be taught
to treat all errhint() et al arguments as c-strings.
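
A simplified model of the gap described above, assuming a checker that validates placeholders only for entries carrying the c-format flag. This is a sketch of the behaviour as described in this message, not msgfmt's actual implementation; check_entry and the dictionary layout are hypothetical.

```python
import re

SPEC_RE = re.compile(r"%[sdiu]")  # deliberately tiny subset of printf conversions


def count_specs(text):
    """Count the placeholders from the tiny subset above."""
    return len(SPEC_RE.findall(text))


def check_entry(entry):
    """Validate placeholders only when the entry is flagged as c-format.

    Returns None when the entry is skipped, otherwise True (match) or False (mismatch).
    """
    if "c-format" not in entry["flags"]:
        return None  # not flagged: the placeholder check never runs
    return count_specs(entry["msgid"]) == count_specs(entry["msgstr"])


HINT = {
    "flags": [],  # no "#, c-format" marker on the entry
    "msgid": 'If your source string is not fixed-width, try using the "FM" modifier.',
    "msgstr": ("Wenn die Quellzeichenkette keine feste Breite hat, "
               "versuchen Sie den Modifikator »%s«."),
}

if __name__ == "__main__":
    print(check_entry(HINT))                             # entry is skipped
    print(check_entry({**HINT, "flags": ["c-format"]}))  # mismatch is detected
```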

Christoph
--
cb@df7cb.de | http://www.df7cb.de/

Re: BUG #6066: [PATCH] Mark more strings as c-format

From
Christoph Berg
Date:
Re: To pgsql-bugs@postgresql.org 2011-06-17 <20110617091114.GC4130@msgid.df7cb.de>
> Unfortunately that doesn't help in this case, as the bad string isn't
> tagged as "#, c-format", but still gets used as such. This seems to be
> the case for many errhint() strings. Maybe xgettext should be taught
> to treat all errhint() et al arguments as c-strings.

Here's a patch to implement that, with backend/nls.mk updated.

msgfmt -c is already available in the "maintainer-check-po" target.
I'd assume this was called at least once during the release process.


diff --git a/src/backend/nls.mk b/src/backend/nls.mk
index 1894569..3c3f8ed 100644
*** a/src/backend/nls.mk
--- b/src/backend/nls.mk
*************** GETTEXT_TRIGGERS:= _ errmsg errmsg_plura
*** 6,11 ****
--- 6,15 ----
      errdetail_plural:1,2 errhint errcontext \
      GUC_check_errmsg GUC_check_errdetail GUC_check_errhint \
      write_stderr yyerror parser_yyerror
+ GETTEXT_FLAGS   := errmsg:1:c-format errmsg_plural:1:c-format \
+     errmsg_plural:2:c-format errhint:1:c-format errcontext:1:c-format \
+     GUC_check_errmsg:1:c-format GUC_check_errdetail:1:c-format \
+     GUC_check_errhint:1:c-format write_stderr:1:c-format
  
  gettext-files: distprep
      find $(srcdir)/ $(srcdir)/../port/ -name '*.c' -print >$@
diff --git a/src/nls-global.mk b/src/nls-global.mk
index 32b3c0f..3aa598f 100644
*** a/src/nls-global.mk
--- b/src/nls-global.mk
***************
*** 12,17 ****
--- 12,20 ----
  # GETTEXT_FILES        -- list of source files that contain message strings
  # GETTEXT_TRIGGERS    -- (optional) list of functions that contain
  #                          translatable strings
+ # GETTEXT_FLAGS        -- (optional) list of gettext --flag arguments to mark
+ #                          function arguments that contain C format strings
+ #                          (functions must be listed in TRIGGERS and FLAGS)
  #
  # That's all, the rest is done here, if --enable-nls was specified.
  #
*************** all-po: $(MO_FILES)
*** 48,54 ****
  ifeq ($(word 1,$(GETTEXT_FILES)),+)
  po/$(CATALOG_NAME).pot: $(word 2, $(GETTEXT_FILES)) $(MAKEFILE_LIST)
  ifdef XGETTEXT
!     $(XGETTEXT) -D $(srcdir) -n $(addprefix -k, $(GETTEXT_TRIGGERS)) -f $<
  else
      @echo "You don't have 'xgettext'."; exit 1
  endif
--- 51,57 ----
  ifeq ($(word 1,$(GETTEXT_FILES)),+)
  po/$(CATALOG_NAME).pot: $(word 2, $(GETTEXT_FILES)) $(MAKEFILE_LIST)
  ifdef XGETTEXT
!     $(XGETTEXT) -D $(srcdir) -n $(addprefix -k, $(GETTEXT_TRIGGERS)) $(addprefix --flag=, $(GETTEXT_FLAGS)) -f $<
  else
      @echo "You don't have 'xgettext'."; exit 1
  endif
*************** po/$(CATALOG_NAME).pot: $(GETTEXT_FILES)
*** 57,63 ****
  # Change to srcdir explicitly, don't rely on $^.  That way we get
  # consistent #: file references in the po files.
  ifdef XGETTEXT
!     $(XGETTEXT) -D $(srcdir) -n $(addprefix -k, $(GETTEXT_TRIGGERS)) $(GETTEXT_FILES)
  else
      @echo "You don't have 'xgettext'."; exit 1
  endif
--- 60,66 ----
  # Change to srcdir explicitly, don't rely on $^.  That way we get
  # consistent #: file references in the po files.
  ifdef XGETTEXT
!     $(XGETTEXT) -D $(srcdir) -n $(addprefix -k, $(GETTEXT_TRIGGERS)) $(addprefix --flag=, $(GETTEXT_FLAGS)) $(GETTEXT_FILES)
  else
      @echo "You don't have 'xgettext'."; exit 1
  endif
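
To make the effect of the two addprefix calls in the patch concrete, here is a rough Python sketch of how the variables expand into xgettext options. The addprefix and xgettext_args helpers are hypothetical stand-ins for GNU make's expansion, and the trigger and flag lists are an abbreviated subset chosen for illustration, not the full lists from nls.mk.

```python
def addprefix(prefix, words):
    """Rough equivalent of GNU make's $(addprefix prefix, words)."""
    return [prefix + word for word in words.split()]


def xgettext_args(triggers, flags):
    """Build the keyword and flag options that the patched rule passes to xgettext."""
    return addprefix("-k", triggers) + addprefix("--flag=", flags)


# Abbreviated subset for illustration only.
GETTEXT_TRIGGERS = "errhint errcontext write_stderr"
GETTEXT_FLAGS = "errhint:1:c-format errcontext:1:c-format write_stderr:1:c-format"

if __name__ == "__main__":
    for arg in xgettext_args(GETTEXT_TRIGGERS, GETTEXT_FLAGS):
        print(arg)
```

Each --flag=function:argnum:c-format option asks xgettext to treat that argument of the named function as a C format string, so the extracted entry is tagged "#, c-format" even when the msgid itself contains no placeholder.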


Christoph
-- 
cb@df7cb.de | http://www.df7cb.de/

Re: BUG #6066: [PATCH] Mark more strings as c-format

From
Alvaro Herrera
Date:
Excerpts from Christoph Berg's message of Fri Jun 17 07:10:34 -0400 2011:
> Re: To pgsql-bugs@postgresql.org 2011-06-17 <20110617091114.GC4130@msgid.df7cb.de>
> > Unfortunately that doesn't help in this case, as the bad string isn't
> > tagged as "#, c-format", but still gets used as such. This seems to be
> > the case for many errhint() strings. Maybe xgettext should be taught
> > to treat all errhint() et al arguments as c-strings.
>
> Here's a patch to implement that, with backend/nls.mk updated.
>
> msgfmt -c is already available in the "maintainer-check-po" target.
> I'd assume this was called at least once during the release process.

Yeah, msgfmt -c is called pretty frequently on the translation service
http://babel.postgresql.org.

Thanks for the patch, I'll have a look at integrating it.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Fri, Jun 17, 2011 at 08:18:03AM +0000, Christoph Berg wrote:
> $ diff -c src/backend/po/de.po.orig src/backend/po/de.po
> *** src/backend/po/de.po.orig    2011-06-17 10:06:41.000000000 +0200
> --- src/backend/po/de.po    2011-06-17 10:06:48.000000000 +0200
> ***************
> *** 12318,12324 ****
>   "If your source string is not fixed-width, try using the \"FM\" modifier."
>   msgstr ""
>   "Wenn die Quellzeichenkette keine feste Breite hat, versuchen Sie den "
> ! "Modifikator »%s«."
>
>   #: utils/adt/formatting.c:1886 utils/adt/formatting.c:1899
>   #: utils/adt/formatting.c:2029
> --- 12318,12324 ----
>   "If your source string is not fixed-width, try using the \"FM\" modifier."
>   msgstr ""
>   "Wenn die Quellzeichenkette keine feste Breite hat, versuchen Sie den "
> ! "Modifikator »FM«."
>
>   #: utils/adt/formatting.c:1886 utils/adt/formatting.c:1899
>   #: utils/adt/formatting.c:2029

Applied, thanks for the patch.

Michael
--
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at googlemail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
On Fri, 2011-06-17 at 10:22 +0200, Bernd Helmle wrote:
>
> --On 17 June 2011 08:18:03 +0000 Christoph Berg <cb@df7cb.de> wrote:
>
> > In German locale, the following statement causes vsnprintf() to segfault when
> > printing the hint:
> >
> > SELECT TO_DATE('30.12.2011', 'YYYYMMDD') AS datum;
> >
> > Fix tested for 8.4:
>
> Additionally, this seems to be the case for 9.0, 9.1 and current -HEAD, too.

Fix committed to 8.4, 9.0, 9.1 translation repositories.

Re: BUG #6066: [PATCH] Mark more strings as c-format

From
Peter Eisentraut
Date:
On Fri, 2011-06-17 at 13:10 +0200, Christoph Berg wrote:
> Re: To pgsql-bugs@postgresql.org 2011-06-17 <20110617091114.GC4130@msgid.df7cb.de>
> > Unfortunately that doesn't help in this case, as the bad string isn't
> > tagged as "#, c-format", but still gets used as such. This seems to be
> > the case for many errhint() strings. Maybe xgettext should be taught
> > to treat all errhint() et al arguments as c-strings.
>
> Here's a patch to implement that, with backend/nls.mk updated.

I have committed a patch based on that, with the other nls.mk filled in
as well.  Thanks for the idea.