The server's LC_CTYPE locale

Started by Ben-Nes Michael, almost 20 years ago, 6 messages, general
#1 Ben-Nes Michael
miki@canaan.co.il

Hello

I got the following error when the query string was one of the Hebrew
chars:

SELECT upper('ש');
ERROR: invalid multibyte character for locale
HINT: The server's LC_CTYPE locale is probably incompatible with the
database encoding.

After a few minutes, while gathering info, I stopped getting the previous
error and started to get:

#SELECT lower('ש');
ERROR: invalid UTF-8 byte sequence detected near byte 0xf9

# SELECT upper('ש');
ERROR: invalid UTF-8 byte sequence detected near byte 0xf9

#SELECT version();
PostgreSQL 8.1.3 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.0.3
(Debian 4.0.3-1)

#show lc_ctype ;
he_IL.utf8

#SHOW SERVER_ENCODING;
UTF8

Any ideas what the problem is?

--

--------------------------------------------------
Michael Ben-Nes - Internet Consultant and Director.
http://www.epoch.co.il - weaving the Net.
Cellular: 054-4848113
--------------------------------------------------

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ben-Nes Michael (#1)
Re: The server's LC_CTYPE locale

Michael Ben-Nes <miki@canaan.co.il> writes:

> I got the following error when the query string was one of the Hebrew
> chars:
>
> SELECT upper('ש');
> ERROR: invalid multibyte character for locale
> HINT: The server's LC_CTYPE locale is probably incompatible with the
> database encoding.

Hmph. I can't reproduce that here (using Fedora 4's version of he_IL.utf8
anyway). I assume your client_encoding was also UTF8? The troublesome
character came through in your email as \327\251 (D7 A9) ... is that
what you were actually entering? The reference to F9 in the other error
message makes me think the character got munged somewhere in the email
chain ...

regards, tom lane

#3 Ben-Nes Michael
miki@canaan.co.il
In reply to: Tom Lane (#2)
Re: The server's LC_CTYPE locale

Tom Lane wrote:

> Michael Ben-Nes <miki@canaan.co.il> writes:
>
>> I got the following error when the query string was one of the Hebrew
>> chars:
>>
>> SELECT upper('ש');
>> ERROR: invalid multibyte character for locale
>> HINT: The server's LC_CTYPE locale is probably incompatible with the
>> database encoding.
>
> Hmph. I can't reproduce that here (using Fedora 4's version of he_IL.utf8
> anyway). I assume your client_encoding was also UTF8? The troublesome
> character came through in your email as \327\251 (D7 A9) ... is that
> what you were actually entering? The reference to F9 in the other error
> message makes me think the character got munged somewhere in the email
> chain ...

The client encoding is UTF8.

Strangely I no longer get the second error:
ERROR: invalid UTF-8 byte sequence detected near byte 0xf9

The first error has returned:
# SELECT lower('ש');
ERROR: invalid multibyte character for locale
HINT: The server's LC_CTYPE locale is probably incompatible with the
database encoding.

The character that I sent is:
[ש‎] U+05E9 &#1513; HEBREW LETTER SHIN

I'm out of ideas. What else should I check?

> regards, tom lane

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ben-Nes Michael (#3)
Re: The server's LC_CTYPE locale

Michael Ben-Nes <miki@canaan.co.il> writes:

> The character that I sent is:
> [ש‎] U+05E9 &#1513; HEBREW LETTER SHIN

Well, that does work out to D7 A9 in UTF8, if I'm doing the arithmetic
correctly.
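
For readers who want to check that arithmetic, here is a minimal Python sketch. The helper name utf8_two_byte is hypothetical and only covers the two-byte range of UTF-8 (U+0080 to U+07FF), which is where U+05E9 falls; the result is compared against Python's built-in encoder.

```python
def utf8_two_byte(code_point: int) -> bytes:
    """Encode a code point in the two-byte UTF-8 range (U+0080..U+07FF).

    Hypothetical helper for illustration; real code should call str.encode.
    """
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("code point is outside the two-byte UTF-8 range")
    lead = 0xC0 | (code_point >> 6)     # 110xxxxx: the top 5 payload bits
    trail = 0x80 | (code_point & 0x3F)  # 10xxxxxx: the low 6 payload bits
    return bytes([lead, trail])


shin = "\u05e9"  # HEBREW LETTER SHIN
manual = utf8_two_byte(ord(shin))

# The hand-rolled result agrees with Python's own UTF-8 encoder.
assert manual == shin.encode("utf-8")
print(manual.hex(" "))  # d7 a9
```

The octal escapes \327\251 quoted earlier in the thread are the same two bytes, D7 A9.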

I can't replicate any problem in either 8.1.4 or HEAD. It's possible
that this is a bug that's been fixed since 8.1.3, but I don't recall
any change in that area. I think more likely the difference is between
the he_IL.utf8 locale definitions in Fedora 4 and Debian. Perhaps you
should check for available updates to the locale.

regards, tom lane

#5 Ben-Nes Michael
miki@canaan.co.il
In reply to: Ben-Nes Michael (#1)
Re: The server's LC_CTYPE locale

For the record:

These are the entries in my locale.gen:

# cat /etc/locale.gen.old
en_US ISO-8859-1
he_IL UTF-8
he_IL ISO-8859-8

I found that removing "he_IL ISO-8859-8" fixed the problem.

Why? I have no idea (maybe some collision because of the duplicate he_IL?).

Cheers

#6 Martijn van Oosterhout
kleptog@svana.org
In reply to: Ben-Nes Michael (#5)
Re: The server's LC_CTYPE locale

On Tue, Sep 05, 2006 at 02:56:21PM +0300, Michael Ben-Nes wrote:

> For the record:
>
> These are the entries in my locale.gen:
>
> # cat /etc/locale.gen.old
> en_US ISO-8859-1
> he_IL UTF-8
> he_IL ISO-8859-8

Yeah, that's wrong. The first column is the identifier, so the last
entry should be something like:

he_IL.ISO-8859-8 ISO-8859-8
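
A quick way to spot this kind of clash is to group the entries by their first column. The Python sketch below assumes the simple two-column "identifier charset" layout shown in this thread and that "#" starts a comment; the function name find_duplicate_identifiers is hypothetical, and the sketch says nothing about which of the clashing entries is the correct one.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_identifiers(locale_gen_text: str) -> Dict[str, List[str]]:
    """Return identifiers listed more than once, with the charsets given for each."""
    charsets_by_id: Dict[str, List[str]] = defaultdict(list)
    for raw_line in locale_gen_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        if len(fields) != 2:
            continue  # not an "identifier charset" pair; ignored in this sketch
        identifier, charset = fields
        charsets_by_id[identifier].append(charset)
    return {name: sets for name, sets in charsets_by_id.items() if len(sets) > 1}


# The three entries from the locale.gen quoted in this thread.
sample = """\
en_US ISO-8859-1
he_IL UTF-8
he_IL ISO-8859-8
"""
print(find_duplicate_identifiers(sample))  # {'he_IL': ['UTF-8', 'ISO-8859-8']}
```

With the last entry renamed to he_IL.ISO-8859-8, the function returns an empty dict for the same input.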

> Why? I have no idea (maybe some collision because of the duplicate he_IL?).

You can't do that.
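
One observation that fits a collision between the two he_IL entries: the byte 0xf9 from the second error message is how HEBREW LETTER SHIN is encoded in ISO-8859-8, and that byte is not valid UTF-8 on its own. The Python sketch below only demonstrates that fact about the two encodings; it is not a trace of what the server did, and the helper is_valid_utf8 is a hypothetical name.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


shin = "\u05e9"  # HEBREW LETTER SHIN

as_utf8 = shin.encode("utf-8")
as_iso_8859_8 = shin.encode("iso-8859-8")

print(as_utf8.hex(" "))              # d7 a9
print(as_iso_8859_8.hex())           # f9
print(is_valid_utf8(as_utf8))        # True
print(is_valid_utf8(as_iso_8859_8))  # False
```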

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

From each according to his ability. To each according to his ability to litigate.