Trouble with error message encoding

Started by Darko Prenosil over 22 years ago, 5 messages

darko.prenosil@finteh.hr

over 22 years ago

I have encoding problems with translated error messages (7.4beta1).
When the database encoding is SQL_ASCII, all messages arrive at the client
correctly, respecting CLIENT_ENCODING. But if I create the database WITH
ENCODING='unicode' or WITH ENCODING='latin2', messages are displayed
correctly only when CLIENT_ENCODING is the same as the database encoding.
I checked, and 7.3 behaves the same way. Is this a known problem, or
am I doing something wrong?

Regards !

Peter Eisentraut

peter_e@gmx.net

over 22 years ago

In reply to: Darko Prenosil (#1)

Re: Trouble with error message encoding

Darko Prenosil writes:

I have encoding problems with translated error messages (7.4beta1).
When the database encoding is SQL_ASCII, all messages arrive at the client
correctly, respecting CLIENT_ENCODING. But if I create the database WITH
ENCODING='unicode' or WITH ENCODING='latin2', messages are displayed
correctly only when CLIENT_ENCODING is the same as the database encoding.
I checked, and 7.3 behaves the same way. Is this a known problem, or
am I doing something wrong?

In general, the server encoding is S, the client encoding is C, and the
messages are stored (in the source, or in the PO files) in encoding M.
When the server sends a message to the client, it tries to convert a
string of encoding M, thinking it is in encoding S, to encoding C. So,
yes, there is a problem, but it's not easy to fix.
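
To make the mismatch concrete, here is a minimal sketch. It is not PostgreSQL code. It assumes M = UTF-8, C = UTF-8, and uses Latin-1 as a stand-in for S because its byte values equal Unicode code points, so no mapping table is needed. The helper name latin1_to_utf8 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: convert a NUL-terminated Latin-1 string to UTF-8.
 * dst must have room for 2 * strlen(src) + 1 bytes.
 * Returns the number of bytes written, not counting the terminator.
 */
static size_t latin1_to_utf8(const unsigned char *src, unsigned char *dst)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    for (; *src; src++) {
        if (*src < 0x80) {
            /* ASCII passes through unchanged */
            dst[n++] = *src;
        } else {
            /* 0x80..0xFF become a two-byte UTF-8 sequence */
            dst[n++] = (unsigned char)(0xC0 | (*src >> 6));
            dst[n++] = (unsigned char)(0x80 | (*src & 0x3F));
        }
    }
    dst[n] = '\0';
    return n;
}
```

If the catalog holds a two-byte UTF-8 character such as C4 8D and the server runs it through an S-to-C conversion like this one, each byte is treated as a separate Latin-1 character and re-encoded. The client then receives the four bytes C3 84 C2 8D instead of the original two.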

--
Peter Eisentraut peter_e@gmx.net

Darko Prenosil

Darko.Prenosil@finteh.hr

over 22 years ago

In reply to: Peter Eisentraut (#2)

Re: Trouble with error message encoding

----- Original Message -----
From: "Peter Eisentraut" <peter_e@gmx.net>
To: "Darko Prenosil" <darko.prenosil@finteh.hr>
Cc: <pgsql-hackers@postgresql.org>
Sent: Wednesday, September 10, 2003 7:20 PM
Subject: Re: [HACKERS] Trouble with error message encoding

Darko Prenosil writes:

I have encoding problems with translated error messages (7.4beta1).
When the database encoding is SQL_ASCII, all messages arrive at the client
correctly, respecting CLIENT_ENCODING. But if I create the database WITH
ENCODING='unicode' or WITH ENCODING='latin2', messages are displayed
correctly only when CLIENT_ENCODING is the same as the database encoding.
I checked, and 7.3 behaves the same way. Is this a known problem, or
am I doing something wrong?

In general, the server encoding is S, the client encoding is C, and the
messages are stored (in the source, or in the PO files) in encoding M.
When the server sends a message to the client, it tries to convert a
string of encoding M, thinking it is in encoding S, to encoding C. So,
yes, there is a problem, but it's not easy to fix.

I found a quick and, I believe, dirty solution for this problem, so I need
the hackers' opinion.
Here is the idea: it is hard to find out which encoding the mo file uses,
but we can force gettext to serve a known encoding, for example utf8.
After that we can always convert from unicode to the client encoding.

In /src/backend/main/main.c:

#ifdef ENABLE_NLS
    bindtextdomain("postgres", LOCALEDIR);
    bind_textdomain_codeset("postgres", "utf8");
    textdomain("postgres");
#endif

In /src/backend/utils/error/elog.c:

#define EVALUATE_MESSAGE(targetfield, appendval) \
{ \
    char *fmtbuf; \
    StringInfoData buf; \
    /* Internationalize the error format string */ \
    fmt = gettext(fmt); \
    fmt = pg_server_to_client((unsigned char*)fmt, strlen(fmt)); \
    ....
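
The gettext half of this idea can be tried outside the backend. The following is a minimal sketch under stated assumptions, not the patch itself: the helper names init_utf8_catalog and lookup_utf8 and the SKETCH_LOCALEDIR path are hypothetical, while bindtextdomain, bind_textdomain_codeset, textdomain and gettext are the same libintl calls used above. Unlike the snippet above, it checks the return values.

```c
#include <assert.h>
#include <libintl.h>
#include <stddef.h>

#define SKETCH_DOMAIN    "postgres"
#define SKETCH_LOCALEDIR "/usr/share/locale"   /* assumed install path */

/*
 * Bind the message domain and ask gettext to return every translation
 * in UTF-8, whatever encoding the catalog file was written in.
 * Returns 0 on success, -1 if any libintl call fails.
 */
static int init_utf8_catalog(void)
{
    if (bindtextdomain(SKETCH_DOMAIN, SKETCH_LOCALEDIR) == NULL)
        return -1;
    if (bind_textdomain_codeset(SKETCH_DOMAIN, "UTF-8") == NULL)
        return -1;
    if (textdomain(SKETCH_DOMAIN) == NULL)
        return -1;
    return 0;
}

/*
 * Look up a message. After init_utf8_catalog() the result is UTF-8,
 * so the next step is a UTF-8 to client-encoding conversion.
 */
static const char *lookup_utf8(const char *msgid)
{
    assert(msgid != NULL);
    return gettext(msgid);
}
```

When no catalog is found for the current locale, gettext hands back the untranslated msgid, so the sketch can be exercised without any PO files. On systems where gettext is not part of the C library, linking with -lintl may be needed.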

Of course this works only for backend messages, but that was enough for
testing.
I did a quick test on a database created with 'latin2' and got correctly
encoded messages for latin2, unicode and sql_ascii client encodings.
I realize that this way the message is converted twice, by gettext and by
pg_server_to_client, but after all we want as few error messages as
possible :-)
Sorry if the whole idea is stupid, but I could not resist.

Regards !

Peter Eisentraut

peter_e@gmx.net

over 22 years ago

In reply to: Darko Prenosil (#3)

Re: Trouble with error message encoding

Darko Prenosil writes:

Here is the idea: it is hard to find out which encoding the mo file uses,
but we can force gettext to serve a known encoding, for example utf8.
After that we can always convert from unicode to the client encoding.

Hmm, I've never heard of bind_textdomain_codeset(). How portable is it?

--
Peter Eisentraut peter_e@gmx.net

Darko Prenosil

darko.prenosil@finteh.hr

over 22 years ago

In reply to: Peter Eisentraut (#4)

Re: Trouble with error message encoding

On Thursday 11 September 2003 20:13, Peter Eisentraut wrote:

Darko Prenosil writes:

Here is the idea: it is hard to find out which encoding the mo file uses,
but we can force gettext to serve a known encoding, for example utf8.
After that we can always convert from unicode to the client encoding.

Hmm, I've never heard of bind_textdomain_codeset(). How portable is it?

I sent a message yesterday, but it looks like it did not make it through.

See: http://www.gnu.org/manual/gettext/
According to that documentation, it is a standard part of GNU gettext.
A few Gnome applications use it; I saw that on mailing lists.
I also found it in the UNIX gettext package documentation.
I do not know about other platforms.
Sorry if you already got the previous message, but I do not see it on the list.

P.S. I messed up that line in elog.c, because the conversion should go from
a utf8 source, but you understand the idea.
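
To illustrate the corrected direction, here is a minimal sketch of a conversion that starts from UTF-8 rather than from the server encoding. It is not the backend's conversion routine. The function name utf8_to_latin1 is hypothetical, and Latin-1 is assumed as the client encoding only because it needs no mapping table; characters outside Latin-1 (including Latin-2 letters such as U+010D) are replaced with '?'.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a UTF-8 to client-encoding conversion,
 * with Latin-1 as the client encoding.
 * dst must have room for strlen(src) + 1 bytes.
 * Returns the number of bytes written, not counting the terminator.
 */
static size_t utf8_to_latin1(const unsigned char *src, unsigned char *dst)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    while (*src) {
        if (*src < 0x80) {
            /* ASCII passes through unchanged */
            dst[n++] = *src++;
        } else if ((*src & 0xE0) == 0xC0 && (src[1] & 0xC0) == 0x80) {
            /* two-byte sequence: decode the code point */
            unsigned cp = ((unsigned)(*src & 0x1F) << 6) | (src[1] & 0x3F);

            /* keep it only if it is a valid Latin-1 character */
            dst[n++] = (cp >= 0x80 && cp <= 0xFF) ? (unsigned char)cp : '?';
            src += 2;
        } else {
            /* longer or malformed sequence: emit '?' and resynchronize */
            dst[n++] = '?';
            src++;
            while ((*src & 0xC0) == 0x80)
                src++;
        }
    }
    dst[n] = '\0';
    return n;
}
```

In the scheme proposed in this thread, a function of this shape would receive the string returned by gettext, which is known to be UTF-8 once the codeset has been forced, instead of a string assumed to be in the database encoding.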

Regards