PostgreSQL
Hello, all!
I have a good question for the PostgreSQL FAQ.
How do I use string functions (like UPPER()/LOWER()) on non-Latin strings?
Why doesn't the UPPER() function work with my UNICODE PostgreSQL database, which contains non-Latin characters (like Cyrillic)?
How do I make a case-insensitive search on a text field that contains non-Latin characters?
Thanks for your answers!
Best regards
Eugeny
I confirm this behaviour: Cyrillic words are not changed by the lower()/upper()
functions, nor caught by ilike.
I am using:
=> SELECT version();
version
---------------------------------------------------------------
PostgreSQL 7.2.2 on i686-pc-linux-gnu, compiled by GCC 2.95.2
(1 row)
Nothing special was done during database creation (no encoding selected).
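The symptom reported here can be mimicked outside the database. What follows is only an illustrative Python sketch, not PostgreSQL code: the helper names are hypothetical, and it assumes the stored text is UTF-8. It contrasts a case-insensitive match that folds case one byte at a time with one that decodes to characters first.

```python
# Hypothetical helpers; an analogy for the reported behaviour, not PostgreSQL internals.

def bytewise_ci_contains(haystack: bytes, needle: bytes) -> bool:
    """Case-insensitive substring test that folds only ASCII letters,
    as a byte-at-a-time routine would."""
    # bytes.lower() changes ASCII A-Z only; bytes >= 0x80 are left alone.
    return needle.lower() in haystack.lower()


def charwise_ci_contains(haystack: bytes, needle: bytes,
                         encoding: str = "utf-8") -> bool:
    """Decode to characters first, then fold case per character."""
    return needle.decode(encoding).casefold() in haystack.decode(encoding).casefold()


stored = "Москва".encode("utf-8")    # value as stored, capitalised
pattern = "москва".encode("utf-8")   # search term, lower case

print(bytewise_ci_contains(stored, pattern))        # False
print(charwise_ci_contains(stored, pattern))        # True
print(bytewise_ci_contains(b"Moscow", b"moscow"))   # True
```

The Latin search succeeds either way, which would explain why the problem only shows up with non-Latin data.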
Not sure. I thought it would work.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Mon, 11 Aug 2003, Bruce Momjian wrote:
Not sure. I thought it would work.
No, it doesn't work. Several people have already complained about bad
Unicode support. I recall Tatsuo commenting on some piece of code.
I have a little page, http://www.sai.msu.su/~megera/postgres/utf8.html,
about my experience with UTF8 and Cyrillic.
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Well, I have no mention of this problem in the TODO list, so I would
like to get a good description of why it isn't working.
Looking at the code, I see upper() is defined in oracle_compat.c (you
would think it would be more standard), and it calls toupper(), so it
probably works on single-byte encodings, but not multi-byte ones. Is
this correct? Is there a way to do multi-byte toupper? Perhaps
converting to wide characters and calling towupper()?
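The distinction being guessed at can be sketched roughly as follows. This is a Python analogy, not the code in oracle_compat.c: upper_bytewise and upper_charwise are hypothetical names, the byte-wise version handles only ASCII (as toupper() would in the "C" locale), and decoding to a Python string stands in for the conversion to wide characters.

```python
# Hypothetical stand-ins for a toupper() loop and a towupper() loop.

def upper_bytewise(data: bytes) -> bytes:
    """Uppercase one byte at a time. Only ASCII a-z are mapped, so every
    byte of a multi-byte UTF-8 sequence passes through unchanged."""
    return bytes(b - 32 if 0x61 <= b <= 0x7A else b for b in data)


def upper_charwise(data: bytes, encoding: str = "utf-8") -> bytes:
    """Decode to characters (the 'wide character' step), uppercase each
    character, then encode back to the original encoding."""
    return data.decode(encoding).upper().encode(encoding)


text = "привет"
utf8 = text.encode("utf-8")

# One byte per letter in KOI8-R, two bytes per letter in UTF-8.
print(len(text.encode("koi8_r")), len(utf8))                 # 6 12

print(upper_bytewise(b"hello"))                              # b'HELLO'
print(upper_bytewise(utf8) == utf8)                          # True: unchanged
print(upper_charwise(utf8) == "ПРИВЕТ".encode("utf-8"))      # True
```

In a single-byte encoding such as KOI8-R each letter is one byte, so a per-byte routine with a suitable locale table has a chance of working; in UTF-8 no single byte is a whole Cyrillic letter, so the conversion step is needed first.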
--
Bruce Momjian | http://candle.pha.pa.us
I think if Postgres were completely UTF8 compatible, and in its default configuration, we'd do a lot better against 'the others' and take more of Oracle's market.
Added to TODO:
* Fix upper()/lower() to work for multibyte encodings
--
Bruce Momjian | http://candle.pha.pa.us