BUG #6066: Bad string in German translation causes segfault (user-triggerable)

Started by Christoph Berg almost 15 years ago, 9 messages, bugs
#1 Christoph Berg
myon@debian.org

The following bug has been logged online:

Bug reference: 6066
Logged by: Christoph Berg
Email address: cb@df7cb.de
PostgreSQL version: 9.1, 9.0, 8.4
Operating system: any
Description: Bad string in German translation causes segfault
(user-triggerable)
Details:

In German locale, the following statement causes vsnprintf() to segfault when
printing the hint:

SELECT TO_DATE('30.12.2011', 'YYYYMMDD') AS datum;

Fix tested for 8.4:

$ diff -c src/backend/po/de.po.orig src/backend/po/de.po
*** src/backend/po/de.po.orig	2011-06-17 10:06:41.000000000 +0200
--- src/backend/po/de.po	2011-06-17 10:06:48.000000000 +0200
***************
*** 12318,12324 ****
  "If your source string is not fixed-width, try using the \"FM\" modifier."
  msgstr ""
  "Wenn die Quellzeichenkette keine feste Breite hat, versuchen Sie den "
! "Modifikator »%s«."
  #: utils/adt/formatting.c:1886 utils/adt/formatting.c:1899
  #: utils/adt/formatting.c:2029
--- 12318,12324 ----
  "If your source string is not fixed-width, try using the \"FM\" modifier."
  msgstr ""
  "Wenn die Quellzeichenkette keine feste Breite hat, versuchen Sie den "
! "Modifikator »FM«."

#: utils/adt/formatting.c:1886 utils/adt/formatting.c:1899
#: utils/adt/formatting.c:2029

#2 Bernd Helmle
mailings@oopsware.de
In reply to: Christoph Berg (#1)
Re: BUG #6066: Bad string in German translation causes segfault (user-triggerable)

--On 17 June 2011 08:18:03 +0000 Christoph Berg <cb@df7cb.de> wrote:

In German locale, the following statement causes vsnprintf() to segfault when
printing the hint:

SELECT TO_DATE('30.12.2011', 'YYYYMMDD') AS datum;

Fix tested for 8.4:

Additionally, this seems to be the case for 9.0, 9.1 and current -HEAD, too.

--
Thanks

Bernd

#3 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Bernd Helmle (#2)
Re: BUG #6066: Bad string in German translation causes segfault (user-triggerable)

On 17.06.2011 11:22, Bernd Helmle wrote:

--On 17 June 2011 08:18:03 +0000 Christoph Berg <cb@df7cb.de> wrote:

In German locale, the following statement causes vsnprintf() to segfault when
printing the hint:

SELECT TO_DATE('30.12.2011', 'YYYYMMDD') AS datum;

Fix tested for 8.4:

Additionally, this seems to be the case for 9.0, 9.1 and current -HEAD,
too.

So, this is a case where the untranslated string doesn't have a %s in
it, but the translated one does. We should have a way to check those
automatically. In fact, I'm surprised if someone somewhere hasn't
already written such a script, as gettext is used very widely. Anyone
want to research/write a script?
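
A minimal sketch of the kind of check asked for here, assuming a simplified printf grammar: it collects the conversion characters in the untranslated and translated strings and reports a mismatch. The names `conversions` and `format_mismatch` are hypothetical, and the comparison ignores argument order, so it is weaker than a full format-string check.

```python
import re

# Simplified printf directive: optional positional argument (1$), flags,
# width, precision, length modifier, then the conversion character.
# "%%" is a literal percent sign and carries no conversion.
_DIRECTIVE = re.compile(
    r"%(?:%|(?:\d+\$)?[-+ #0']*(?:\*|\d+)?(?:\.(?:\*|\d+))?"
    r"(?:hh|h|ll|l|j|z|t|L)?([diouxXeEfFgGaAcspnm]))"
)


def conversions(text):
    """Return the sorted conversion characters found in text."""
    return sorted(m.group(1) for m in _DIRECTIVE.finditer(text) if m.group(1))


def format_mismatch(msgid, msgstr):
    """True if a non-empty translation uses different conversions than msgid."""
    if not msgstr:
        return False  # untranslated entries fall back to msgid
    return conversions(msgid) != conversions(msgstr)
```

Applied to the entry from this report, the original msgstr (ending in »%s«) would be flagged, while the corrected one (ending in »FM«) would pass.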

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#4 Christoph Berg
myon@debian.org
In reply to: Heikki Linnakangas (#3)
Re: BUG #6066: Bad string in German translation causes segfault (user-triggerable)

Re: Heikki Linnakangas 2011-06-17 <4DFB137E.4040404@enterprisedb.com>

So, this is a case where the untranslated string doesn't have a %s
in it, but the translated one does. We should have a way to check
those automatically. In fact, I'm surprised if someone somewhere
hasn't already written such a script, as gettext is used very
widely. Anyone want to research/write a script?

Actually, msgfmt can do that itself with -c. This can be set in
Makefile.global:

$ grep MSGFMT src/Makefile.global
MSGFMT = msgfmt -c

Unfortunately that doesn't help in this case, as the bad string isn't
tagged as "#, c-format", but still gets used as such. This seems to be
the case for many errhint() strings. Maybe xgettext should be taught
to treat all errhint() et al arguments as c-strings.
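
To illustrate the gap described above, here is a rough sketch that lists PO entries which are not flagged "c-format" but whose translation still contains something that looks like a printf directive. These are the entries msgfmt -c would not check. The parser is deliberately minimal (singular entries only; msgctxt, msgid_plural and msgstr[n] are ignored), and the names `parse_po` and `unchecked_directives` are hypothetical.

```python
import re

_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')       # one quoted PO string
_DIRECTIVE = re.compile(r"%[-+#0-9.$*']*[A-Za-z]")  # rough printf directive


def parse_po(text):
    """Yield (flags, msgid, msgstr) for each singular entry in PO text."""
    for block in re.split(r"\n\s*\n", text.strip()):
        flags, fields, current = set(), {"msgid": "", "msgstr": ""}, None
        for line in block.splitlines():
            line = line.strip()
            if line.startswith("#,"):
                # Flag comment, e.g. "#, c-format" or "#, fuzzy, c-format"
                flags.update(f.strip() for f in line[2:].split(","))
            elif line.startswith("#"):
                continue  # other comments (source references, notes)
            else:
                keyword = line.split(" ", 1)[0]
                if keyword in fields:
                    current = keyword
                elif not line.startswith('"'):
                    current = None  # keyword this sketch does not handle
                if current:
                    fields[current] += "".join(_QUOTED.findall(line))
        yield flags, fields["msgid"], fields["msgstr"]


def unchecked_directives(po_text):
    """Return (msgid, msgstr) pairs that have a directive but no c-format flag."""
    return [
        (msgid, msgstr)
        for flags, msgid, msgstr in parse_po(po_text)
        if "c-format" not in flags
        and _DIRECTIVE.search(msgstr.replace("%%", ""))
    ]
```

Calling `unchecked_directives(open("src/backend/po/de.po", encoding="utf-8").read())` on the unpatched catalog would be expected to include the »%s« entry from this report, assuming the file is UTF-8 encoded.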

Christoph
--
cb@df7cb.de | http://www.df7cb.de/

#5 Christoph Berg
myon@debian.org
In reply to: Christoph Berg (#4)
Re: BUG #6066: [PATCH] Mark more strings as c-format

Re: To pgsql-bugs@postgresql.org 2011-06-17 <20110617091114.GC4130@msgid.df7cb.de>

Unfortunately that doesn't help in this case, as the bad string isn't
tagged as "#, c-format", but still gets used as such. This seems to be
the case for many errhint() strings. Maybe xgettext should be taught
to treat all errhint() et al arguments as c-strings.

Here's a patch to implement that, with backend/nls.mk updated.

msgfmt -c is already available in the "maintainer-check-po" target.
I'd assume this was called at least once during the release process.

diff --git a/src/backend/nls.mk b/src/backend/nls.mk
index 1894569..3c3f8ed 100644
*** a/src/backend/nls.mk
--- b/src/backend/nls.mk
*************** GETTEXT_TRIGGERS:= _ errmsg errmsg_plura
*** 6,11 ****
--- 6,15 ----
      errdetail_plural:1,2 errhint errcontext \
      GUC_check_errmsg GUC_check_errdetail GUC_check_errhint \
      write_stderr yyerror parser_yyerror
+ GETTEXT_FLAGS   := errmsg:1:c-format errmsg_plural:1:c-format \
+     errmsg_plural:2:c-format errhint:1:c-format errcontext:1:c-format \
+     GUC_check_errmsg:1:c-format GUC_check_errdetail:1:c-format \
+     GUC_check_errhint:1:c-format write_stderr:1:c-format
  gettext-files: distprep
  	find $(srcdir)/ $(srcdir)/../port/ -name '*.c' -print >$@
diff --git a/src/nls-global.mk b/src/nls-global.mk
index 32b3c0f..3aa598f 100644
*** a/src/nls-global.mk
--- b/src/nls-global.mk
***************
*** 12,17 ****
--- 12,20 ----
  # GETTEXT_FILES		-- list of source files that contain message strings
  # GETTEXT_TRIGGERS	-- (optional) list of functions that contain
  #                          translatable strings
+ # GETTEXT_FLAGS		-- (optional) list of gettext --flag arguments to mark
+ #                          function arguments that contain C format strings
+ #                          (functions must be listed in TRIGGERS and FLAGS)
  #
  # That's all, the rest is done here, if --enable-nls was specified.
  #
*************** all-po: $(MO_FILES)
*** 48,54 ****
  ifeq ($(word 1,$(GETTEXT_FILES)),+)
  po/$(CATALOG_NAME).pot: $(word 2, $(GETTEXT_FILES)) $(MAKEFILE_LIST)
  ifdef XGETTEXT
! 	$(XGETTEXT) -D $(srcdir) -n $(addprefix -k, $(GETTEXT_TRIGGERS)) -f $<
  else
  	@echo "You don't have 'xgettext'."; exit 1
  endif
--- 51,57 ----
  ifeq ($(word 1,$(GETTEXT_FILES)),+)
  po/$(CATALOG_NAME).pot: $(word 2, $(GETTEXT_FILES)) $(MAKEFILE_LIST)
  ifdef XGETTEXT
! 	$(XGETTEXT) -D $(srcdir) -n $(addprefix -k, $(GETTEXT_TRIGGERS)) $(addprefix --flag=, $(GETTEXT_FLAGS)) -f $<
  else
  	@echo "You don't have 'xgettext'."; exit 1
  endif
*************** po/$(CATALOG_NAME).pot: $(GETTEXT_FILES)
*** 57,63 ****
  # Change to srcdir explicitly, don't rely on $^.  That way we get
  # consistent #: file references in the po files.
  ifdef XGETTEXT
! 	$(XGETTEXT) -D $(srcdir) -n $(addprefix -k, $(GETTEXT_TRIGGERS)) $(GETTEXT_FILES)
  else
  	@echo "You don't have 'xgettext'."; exit 1
  endif
--- 60,66 ----
  # Change to srcdir explicitly, don't rely on $^.  That way we get
  # consistent #: file references in the po files.
  ifdef XGETTEXT
! 	$(XGETTEXT) -D $(srcdir) -n $(addprefix -k, $(GETTEXT_TRIGGERS)) $(addprefix --flag=, $(GETTEXT_FLAGS)) $(GETTEXT_FILES)
  else
  	@echo "You don't have 'xgettext'."; exit 1
  endif

Christoph
--
cb@df7cb.de | http://www.df7cb.de/
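
For readers less familiar with make, the patched rule expands to an xgettext call with one -k option per trigger and one --flag option per GETTEXT_FLAGS entry. The sketch below mirrors that expansion in Python. The function name `xgettext_args` is hypothetical, and plain "xgettext" stands in for whatever $(XGETTEXT) is configured to be.

```python
def xgettext_args(srcdir, triggers, flags, files):
    """Build the argument list the patched make rule would pass to xgettext."""
    args = ["xgettext", "-D", srcdir, "-n"]
    # $(addprefix -k, $(GETTEXT_TRIGGERS)): functions whose arguments get extracted
    args += ["-k" + trigger for trigger in triggers]
    # $(addprefix --flag=, $(GETTEXT_FLAGS)): mark those arguments as C format strings
    args += ["--flag=" + flag for flag in flags]
    args += files
    return args


if __name__ == "__main__":
    print(" ".join(xgettext_args(
        "src/backend",                        # assumed value of $(srcdir)
        ["errmsg", "errhint"],                # subset of GETTEXT_TRIGGERS
        ["errmsg:1:c-format", "errhint:1:c-format"],
        ["utils/adt/formatting.c"],
    )))
```

With --flag=errhint:1:c-format in place, xgettext tags the extracted hint as "#, c-format", which is what lets msgfmt -c compare the directives in msgid and msgstr.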

#6 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Christoph Berg (#5)
Re: BUG #6066: [PATCH] Mark more strings as c-format

Excerpts from Christoph Berg's message of Fri Jun 17 07:10:34 -0400 2011:

Re: To pgsql-bugs@postgresql.org 2011-06-17 <20110617091114.GC4130@msgid.df7cb.de>

Unfortunately that doesn't help in this case, as the bad string isn't
tagged as "#, c-format", but still gets used as such. This seems to be
the case for many errhint() strings. Maybe xgettext should be taught
to treat all errhint() et al arguments as c-strings.

Here's a patch to implement that, with backend/nls.mk updated.

msgfmt -c is already available in the "maintainer-check-po" target.
I'd assume this was called at least once during the release process.

Yeah, msgfmt -c is called pretty frequently on the translation service
http://babel.postgresql.org.

Thanks for the patch, I'll have a look at integrating it.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#7 Michael Meskes
meskes@postgresql.org
In reply to: Christoph Berg (#1)
Re: BUG #6066: Bad string in German translation causes segfault (user-triggerable)

On Fri, Jun 17, 2011 at 08:18:03AM +0000, Christoph Berg wrote:

$ diff -c src/backend/po/de.po.orig src/backend/po/de.po
*** src/backend/po/de.po.orig	2011-06-17 10:06:41.000000000 +0200
--- src/backend/po/de.po	2011-06-17 10:06:48.000000000 +0200
***************
*** 12318,12324 ****
"If your source string is not fixed-width, try using the \"FM\" modifier."
msgstr ""
"Wenn die Quellzeichenkette keine feste Breite hat, versuchen Sie den "
! "Modifikator »%s«."
#: utils/adt/formatting.c:1886 utils/adt/formatting.c:1899
#: utils/adt/formatting.c:2029
--- 12318,12324 ----
"If your source string is not fixed-width, try using the \"FM\" modifier."
msgstr ""
"Wenn die Quellzeichenkette keine feste Breite hat, versuchen Sie den "
! "Modifikator »FM«."

#: utils/adt/formatting.c:1886 utils/adt/formatting.c:1899
#: utils/adt/formatting.c:2029

Applied, thanks for the patch.

Michael
--
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at googlemail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL

#8 Peter Eisentraut
peter_e@gmx.net
In reply to: Bernd Helmle (#2)
Re: BUG #6066: Bad string in German translation causes segfault (user-triggerable)

On Fri, 2011-06-17 at 10:22 +0200, Bernd Helmle wrote:

--On 17 June 2011 08:18:03 +0000 Christoph Berg <cb@df7cb.de> wrote:

In German locale, the following statement causes vsnprintf() to segfault when
printing the hint:

SELECT TO_DATE('30.12.2011', 'YYYYMMDD') AS datum;

Fix tested for 8.4:

Additionally, this seems to be the case for 9.0, 9.1 and current -HEAD, too.

Fix committed to 8.4, 9.0, 9.1 translation repositories.

#9 Peter Eisentraut
peter_e@gmx.net
In reply to: Christoph Berg (#5)
Re: BUG #6066: [PATCH] Mark more strings as c-format

On Fri, 2011-06-17 at 13:10 +0200, Christoph Berg wrote:

Re: To pgsql-bugs@postgresql.org 2011-06-17 <20110617091114.GC4130@msgid.df7cb.de>

Unfortunately that doesn't help in this case, as the bad string isn't
tagged as "#, c-format", but still gets used as such. This seems to be
the case for many errhint() strings. Maybe xgettext should be taught
to treat all errhint() et al arguments as c-strings.

Here's a patch to implement that, with backend/nls.mk updated.

I have committed a patch based on that, with the other nls.mk files filled
in as well. Thanks for the idea.