ilike and utf-8

Started by Raphael Bauduin almost 20 years ago · 7 messages · general
#1 Raphael Bauduin
rblists@gmail.com

Hi,

Does the ilike operator work correctly with Cyrillic text stored in a
UTF-8 encoded database?
A user of http://myowndb.com (a web database) who has text in Cyrillic
reports that his searches are not case-insensitive, although I use the
ilike operator in the code. It works perfectly for my own data (which
is not in Cyrillic).

Thanks

Raph

#2 Martijn van Oosterhout
kleptog@svana.org
In reply to: Raphael Bauduin (#1)
Re: ilike and utf-8

On Fri, Apr 14, 2006 at 03:16:01PM +0200, Raphael Bauduin wrote:

Hi,

Does the ilike operator work fine with cyrillic text put in a UTF-8
encoded database?
I've had remarks of a user (of http://myowndb.com, a web database)
with text in cyrillic that his searches are not case insensitive,
although I use the ilike operator in the code. And it works perfectly
for my data (that are not in cyrillic).

UTF-8 support for case comparison is operating-system dependent. Which
systems are we comparing here?

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

#3 Tomi NA
hefest@gmail.com
In reply to: Martijn van Oosterhout (#2)
Re: ilike and utf-8

On 4/14/06, Martijn van Oosterhout <kleptog@svana.org> wrote:

On Fri, Apr 14, 2006 at 03:16:01PM +0200, Raphael Bauduin wrote:

Hi,

Does the ilike operator work fine with cyrillic text put in a UTF-8
encoded database?
I've had remarks of a user (of http://myowndb.com, a web database)
with text in cyrillic that his searches are not case insensitive,
although I use the ilike operator in the code. And it works perfectly
for my data (that are not in cyrillic).

UTF-8 support for case comparison is operating-system dependent. Which
systems are we comparing here?

I'd like to know the same thing. I'm using GNU/Linux and ISO-8859-2 (when
UTF-8 isn't an option).

Tomislav

#4 Raphael Bauduin
rblists@gmail.com
In reply to: Martijn van Oosterhout (#2)
Re: ilike and utf-8

It's Debian GNU/Linux, with a self-compiled PostgreSQL 8.1.3.

Raph

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Raphael Bauduin (#1)
Re: ilike and utf-8

"Raphael Bauduin" <rblists@gmail.com> writes:

Does the ilike operator work fine with cyrillic text put in a UTF-8
encoded database?

If you've initdb'd in an appropriate locale (probably named something
like ru_RU.utf8) then it should work. I wouldn't expect a random
non-Russian locale to necessarily know about Cyrillic case conversions,
however.

Martijn's nearby comment about OS dependency really boils down to the
fact that different OSes may have different definitions for similarly
named locales. We need to know what locale you're using (try "SHOW
LC_CTYPE") as well as the OS.

regards, tom lane
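
To illustrate the point about locales, here is a minimal Python sketch of why a case-insensitive match can work for Latin text yet fail for Cyrillic. It is not PostgreSQL's implementation: the function names are hypothetical, and ASCII-only lowercasing is only an assumed stand-in for a locale that has no Cyrillic case mappings.

```python
# Sketch only: hypothetical helpers, not PostgreSQL code.

def ascii_lower(text: str) -> str:
    """Lowercase A-Z only; every other character is left unchanged.

    Assumed stand-in for a locale that knows no Cyrillic case mappings.
    """
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in text
    )


def unicode_lower(text: str) -> str:
    """Lowercase using Unicode case mappings, which cover Cyrillic."""
    return text.lower()


def case_insensitive_equal(a: str, b: str, lower) -> bool:
    """Compare two strings after folding both with the given function."""
    return lower(a) == lower(b)


latin = ("Hello", "HELLO")
cyrillic = ("Привет", "ПРИВЕТ")

results = {
    "latin_ascii": case_insensitive_equal(*latin, ascii_lower),
    "cyrillic_ascii": case_insensitive_equal(*cyrillic, ascii_lower),
    "cyrillic_unicode": case_insensitive_equal(*cyrillic, unicode_lower),
}

for name, matched in results.items():
    print(name, matched)
```

With the ASCII-only folding the Latin pair matches and the Cyrillic pair does not, which mirrors the symptom reported in #1. With Unicode-aware folding the Cyrillic pair matches as well.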

#6 Raphael Bauduin
rblists@gmail.com
In reply to: Tom Lane (#5)
Re: ilike and utf-8

On 4/14/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Raphael Bauduin" <rblists@gmail.com> writes:

Does the ilike operator work fine with cyrillic text put in a UTF-8
encoded database?

If you've initdb'd in an appropriate locale (probably named something
like ru_RU.utf8) then it should work. I wouldn't expect a random
non-Russian locale to necessarily know about Cyrillic case conversions,
however.

The problem is that the system serves content for several locales at
the same time, so I can't set the locale at the environment level.
Maybe I should add a per-user setting so that each user can choose
which locale to use.

Thanks for the help!

Raph

#7 SunWuKung
Balazs.Klein@t-online.hu
In reply to: Raphael Bauduin (#6)
Re: ilike and utf-8

I have a similar problem, which I raised in the thread linked below, but
I don't have a solution yet.
I received several ideas, but so far none that actually works for me.
You may want to try the function posted in that thread.
It didn't work for me, but maybe it will for you. Please let me know
if it does, as I am still looking for an answer.

http://groups.google.com/group/pgsql.general/browse_thread/thread/20aed89ab0e19e3d/4771fb1be397afea#4771fb1be397afea

Regards,

Balázs