BUG #15511: Drop table error "invalid argument"
The following bug has been logged on the website:
Bug reference: 15511
Logged by: Ertuğrul Kahveci
Email address: ertugrul9090@gmail.com
PostgreSQL version: 11.1
Operating system: windows 10
Description:
Hello;
Drop table error "HATA: vsnprintf failed: Invalid argument", error
code:"XX000" , PostgreSQL version..: "11.1"
Thanks,
Hi,
On 2018-11-17 20:32:27 +0000, PG Bug reporting form wrote:
The following bug has been logged on the website:
Bug reference: 15511
Logged by: Ertuğrul Kahveci
Email address: ertugrul9090@gmail.com
PostgreSQL version: 11.1
Operating system: windows 10
Description:Hello;
Drop table error "HATA: vsnprintf failed: Invalid argument", error
code:"XX000" , PostgreSQL version..: "11.1"
Thanks,
This is not enough information for us to do anything abou this
report. What precisely lead to this error being reported? Can you give
us enough details to reproduce?
Do you get the same error if you do SET lc_messages = 'C'; before you
drop the table?
Greetings,
Andres Freund
=?utf-8?q?PG_Bug_reporting_form?= <noreply@postgresql.org> writes:
Drop table error "HATA: vsnprintf failed: Invalid argument", error
code:"XX000" , PostgreSQL version..: "11.1"
Hmm. Looking at snprintf.c, EINVAL could only be returned for an
incorrect translated message (i.e. wrong use of %n$ notation),
so it seems somebody fat-fingered a translation. We can infer
that you're using the Turkish message set, but that doesn't help
much to narrow down where the mistake is. What do you see if you
do "set lc_messages = 'C'" and then repeat the failing command?
regards, tom lane
I wrote:
=?utf-8?q?PG_Bug_reporting_form?= <noreply@postgresql.org> writes:
Drop table error "HATA: vsnprintf failed: Invalid argument", error
code:"XX000" , PostgreSQL version..: "11.1"
Hmm. Looking at snprintf.c, EINVAL could only be returned for an
incorrect translated message (i.e. wrong use of %n$ notation),
so it seems somebody fat-fingered a translation. We can infer
that you're using the Turkish message set, but that doesn't help
much to narrow down where the mistake is. What do you see if you
do "set lc_messages = 'C'" and then repeat the failing command?
I poked at that to the extent of testing every string in 11.1's tr.po
against our implementation of snprintf, and what I find is this:
#. translator: second %s is, e.g., "table %s"
#: catalog/objectaddress.c:2694
#, c-format
msgid "column %s of %s"
msgstr "%2$s'nin %1$ sütunu"
This msgstr is legal according to POSIX, so it's not surprising that
GNU msgfmt doesn't complain about it; but our version of snprintf()
does, because space is not a valid flag character according to it.
So the bit "%1$ s" is valid to msgfmt but not to us.
Presumably, what failed for you was something along the lines of
regression=# create table foo (f1 int);
CREATE TABLE
regression=# create table bar (f1 foo);
CREATE TABLE
regression=# drop table foo;
ERROR: cannot drop table foo because other objects depend on it
DETAIL: column f1 of table bar depends on type foo
HINT: Use DROP ... CASCADE to drop the dependent objects too.
Conclusions:
(1) I don't know Turkish, but it seems clear that this is a typo
and the string ought to be
msgstr "%2$s'nin %1$s sütunu"
(2) It seems like a bad idea that pvsnprintf(), which must have
reported this message, knows full well that it's dealing with a
broken format string and yet doesn't print out that format string.
It would have taken much less work to find this problem if it had.
(3) I'm quite unwilling to try to make snprintf.c accept absolutely
everything that's in the POSIX spec, especially seeing that this
particular omission caught a mistake.
(4) However, that leaves us with a translation problem, because msgfmt
doesn't detect some things that we'll fail on at runtime. What shall
we do about that?
regards, tom lane
Greetings,
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
(4) However, that leaves us with a translation problem, because msgfmt
doesn't detect some things that we'll fail on at runtime. What shall
we do about that?
I realize this is pretty pie-in-the-sky, but maybe we should have a
buildfarm member that's actually running the regression suite with every
translation we support..? Of course, then we should also have our
regression tests actually causing every error message to be hit at least
once or we'd still possibly miss things.
A shorter path, though one with less additional benefits, would
presumably be to find a way to make msgfmt fail on things that we'll
fail on..? Or otherwise detect such cases..?
Thanks!
Stephen
Stephen Frost <sfrost@snowman.net> writes:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
(4) However, that leaves us with a translation problem, because msgfmt
doesn't detect some things that we'll fail on at runtime. What shall
we do about that?
A shorter path, though one with less additional benefits, would
presumably be to find a way to make msgfmt fail on things that we'll
fail on..? Or otherwise detect such cases..?
What I did to locate the broken translation in the first place was
to make a heavily lobotomized version of snprintf.c which would detect
EINVAL problems in a supplied format string, without trying to actually
fetch any values or emit any output. Then I applied it to every string
in tr.po. This was a complete kluge only suitable for one-shot use,
but you could imagine making something a bit more production-ready
and then inserting it into the translation pipeline somewhere. It'd
likely be worth applying it to the original strings as well as the
translated ones, in case we have any seldom-executed code in which
the original string is bad itself.
I noted while looking at the problem that it's somewhat accidental
that snprintf.c reports EINVAL rather than just producing bogus
output for this case. If we were planning to go this direction,
I'd be inclined to make it report EINVAL for any situation where
it hits an unrecognized format code --- ie put an error into the
default cases in the switches there.
That's just handwaving though, as I have little idea what the
"translation pipeline" looks like.
regards, tom lane
On 2018-Nov-17, Tom Lane wrote:
I poked at that to the extent of testing every string in 11.1's tr.po
against our implementation of snprintf, and what I find is this:#. translator: second %s is, e.g., "table %s"
#: catalog/objectaddress.c:2694
#, c-format
msgid "column %s of %s"
msgstr "%2$s'nin %1$ s�tunu"
I fixed this string in tr.po. At least it'll be correct in 11.2.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
On 2018-Nov-17, Tom Lane wrote:
I poked at that to the extent of testing every string in 11.1's tr.po
against our implementation of snprintf, and what I find is this:
msgid "column %s of %s"
msgstr "%2$s'nin %1$ sütunu"
I fixed this string in tr.po. At least it'll be correct in 11.2.
Thanks! I'm also just about to push code changes for the easy parts
of what I recommended earlier (have psprintf report the bogus format
string, and make snprintf.c throw EINVAL for more error cases).
Is anybody interested in the idea of making our own validator for
the .po files? I might be willing to code it up, if I knew what
its API ought to be, but I don't know the workflow in that area.
regards, tom lane
On 2018-Dec-06, Tom Lane wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
On 2018-Nov-17, Tom Lane wrote:
I poked at that to the extent of testing every string in 11.1's tr.po
against our implementation of snprintf, and what I find is this:
msgid "column %s of %s"
msgstr "%2$s'nin %1$ s�tunu"I fixed this string in tr.po. At least it'll be correct in 11.2.
Thanks! I'm also just about to push code changes for the easy parts
of what I recommended earlier (have psprintf report the bogus format
string, and make snprintf.c throw EINVAL for more error cases).
Great.
Is anybody interested in the idea of making our own validator for
the .po files? I might be willing to code it up, if I knew what
its API ought to be, but I don't know the workflow in that area.
I think it's definitely worthwhile to avoid introducing bogus files in
the Pg repo.
The work is done by these scripts:
https://git.postgresql.org/gitweb/?p=pgtranslation/admin.git;a=tree;h=refs/heads/master;hb=refs/heads/master
cp-po does some validation while copying the file from the pgtranslation
repo back to Postgres, but I think that's the wrong time (at release
time). The good one I think is wwwtools/pg-make-po, which runs a few
times daily and produces the error file that ends up in the status table
in babel.postgresql.org. We'd want something that produces output that
can be appended to the $outdir/$catalogname-$lang.po.err file (lines
103ff of that script).
Peter is the authority on this, of course.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
On 2018-Dec-06, Tom Lane wrote:
Is anybody interested in the idea of making our own validator for
the .po files? I might be willing to code it up, if I knew what
its API ought to be, but I don't know the workflow in that area.
I think it's definitely worthwhile to avoid introducing bogus files in
the Pg repo.
The work is done by these scripts:
https://git.postgresql.org/gitweb/?p=pgtranslation/admin.git;a=tree;h=refs/heads/master;hb=refs/heads/master
cp-po does some validation while copying the file from the pgtranslation
repo back to Postgres, but I think that's the wrong time (at release
time). The good one I think is wwwtools/pg-make-po, which runs a few
times daily and produces the error file that ends up in the status table
in babel.postgresql.org. We'd want something that produces output that
can be appended to the $outdir/$catalogname-$lang.po.err file (lines
103ff of that script).
Hm. So that's all Perl code...
In principle, we could write some Perl code that exactly matches what
snprintf.c thinks is valid input, but I think that keeping it in sync
would be a nightmare. The concept I had in mind was to make a variant
version of snprintf.c that just validates a format string, and can be
compared to snprintf.c by diff'ing. (Or, perhaps, sprinkle snprintf.c
with #ifdefs so that compiling it with the right -D flag produces what
we want; though that might look too ugly.) If you don't mind adding
a C compiler to the list of dependencies for pg-make-po, we could imagine
having it compile up such a program at startup and then apply it to
each catalog.
Peter is the authority on this, of course.
Yup. Peter, any comments?
regards, tom lane
On 2018-Dec-06, Tom Lane wrote:
In principle, we could write some Perl code that exactly matches what
snprintf.c thinks is valid input, but I think that keeping it in sync
would be a nightmare. The concept I had in mind was to make a variant
version of snprintf.c that just validates a format string, and can be
compared to snprintf.c by diff'ing. (Or, perhaps, sprinkle snprintf.c
with #ifdefs so that compiling it with the right -D flag produces what
we want; though that might look too ugly.) If you don't mind adding
a C compiler to the list of dependencies for pg-make-po, we could imagine
having it compile up such a program at startup and then apply it to
each catalog.
I don't follow. Why don't we just compile snprintf.c as-is and another
.c file with a function that invokes vsnprintf on each translated string
on a .po file and prints an error if vsnprintf returns EINVAL?
This code runs completely under our control, and we can install whatever
tools are needed. We don't need a C compiler today, but installing one
is trivial. Also, we already have postgres source trees for each PG
version available.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
On 2018-Dec-06, Tom Lane wrote:
In principle, we could write some Perl code that exactly matches what
snprintf.c thinks is valid input, but I think that keeping it in sync
would be a nightmare. The concept I had in mind was to make a variant
version of snprintf.c that just validates a format string, and can be
compared to snprintf.c by diff'ing. (Or, perhaps, sprinkle snprintf.c
with #ifdefs so that compiling it with the right -D flag produces what
we want; though that might look too ugly.) If you don't mind adding
a C compiler to the list of dependencies for pg-make-po, we could imagine
having it compile up such a program at startup and then apply it to
each catalog.
I don't follow. Why don't we just compile snprintf.c as-is and another
.c file with a function that invokes vsnprintf on each translated string
on a .po file and prints an error if vsnprintf returns EINVAL?
Because figuring out what additional arguments go with a particular
format string is pretty hard, and we can't just not do that, or
snprintf will likely dump core. We need a lobotomized version that
doesn't attempt to fetch any of the variable parameters.
This code runs completely under our control, and we can install whatever
tools are needed. We don't need a C compiler today, but installing one
is trivial. Also, we already have postgres source trees for each PG
version available.
OK. I'd visualize only adding the test code to HEAD, and pulling it from
there to use for all back branches.
regards, tom lane
... so my preliminary result is that there's a *boatload* of "invalid"
format strings in our translation files, all of them apparently with
the extra-space disease. See attached.
While it looks like every one of these is a typo, I'm beginning to
have second thoughts about the premise here. Presumably, the reason
we've not noticed this issue before is that almost all non-Windows
platforms were using libc's printf, which is going to implement the
full set of POSIX-specified flags including space. As of HEAD that
is no longer the case, because we use snprintf.c everywhere. While
it might not be a problem for our own code if snprintf.c is a couple
of flags short of full POSIX, I'm worrying that third-party code
might be unhappy if *printf in the PG environment behaves oddly.
So maybe the best compromise is to upgrade snprintf.c. I think it'd
only take a few more lines to implement the space flag per spec.
The other flags that we aren't implementing are '#' ("alternate form")
and "'" (thousands grouping). For those, I'd be a bit inclined to accept
them but treat 'em as no-ops. On the other hand, that still leaves
some daylight between us and POSIX, in the form of stuff like "long
double" support, and I still don't want to go near that.
Thoughts?
For the archives' sake, I attach the code I used for this.
I'm now thinking maybe we wouldn't commit it, though.
regards, tom lane
On 07/12/2018 00:38, Tom Lane wrote:
... so my preliminary result is that there's a *boatload* of "invalid"
format strings in our translation files, all of them apparently with
the extra-space disease. See attached.
A bit of follow-up: I have committed fixes for all the erroneous uses
of space "flags" in translated format strings. These will then be in
the next minor releases.
I also looked into the gettext/msgfmt source code. We know that it does
check for format string compatibility, but I found that it only checks
the general type, not the formatting flags. I suppose this is not
unreasonable, since a translation might want to space or align something
differently. So we still don't have an easy way to check for this in
the future.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services