Problem while ordering Turkish chars

Started by Devrim GUNDUZover 22 years ago2 messages
#1Devrim GUNDUZ
devrim@gunduz.org

Hi,

I'm experiencing some problems while ordering Turkish characters.

Here is that I mean:

Let's say we have two records:

Onder <- There are two dots on O. (ASCII CODE: 153)
Ozan

In Turkish alphabet, O (dotless) comes before O (with dots). So, Ozan
should be listed before Onder, but it's listed after Onder; since
PostgreSQL thinks that those letters are the same...

To begin with:
operdb=# SELECT version();
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2
20020903 (Red Hat Linux 8.0 3.2-7)

(installed from rpm)

Here is an simple example: Lines 1, 10,11,12 have O with dots as the first
letter.
==========
SELECT oper_uzun_adi FROM operler ORDER BY oper_uzun_adi;
...

Ömer KORKMAZ
Oner BINBAS
Onur BAŞ
Onur TURHAN
Osman Selçuk SARIOĞLU
Osman TOPAN
Ozan GÜLEN
Ozan SARI
Ozge YAZICIOGLU
Özgür ÇAKICI
Özgür ÖZDEMİR
Özlem BAL
...
==========

But if I apply a select query and use LIKE, PostgreSQL gives me correct
solutions:

operdb=# SELECT oper_uzun_adi FROM operler WHERE oper_uzun_adi ILIKE 'O%';
oper_uzun_adi
------------------------------------------
Osman TOPAN
Oner BINBAS
Ozge YAZICIOGLU
Onur BAŞ
Onur TURHAN
Ozan GÜLEN
Osman Selçuk SARIOĞLU
Ozan SARI
(8 rows)

Now for O with dots:

operdb=# SELECT oper_uzun_adi FROM operler WHERE oper_uzun_adi ILIKE 'Ö%';
oper_uzun_adi
------------------------------------------
Özgür ÇAKICI
Özlem BAL
Ömer KORKMAZ
Özgür ÖZDEMİR

So, LIKE understands that they are different chars; but if I simply order
them; I do not get the correct result.

Here is the locale setting:

# Locale settings
#
# (initialized by initdb -- may be changed)
LC_MESSAGES = 'en_US'
LC_MONETARY = 'en_US'
LC_NUMERIC = 'en_US'
LC_TIME = 'en_US'

If I change it to tr_TR, nothing changes.

Could anyone help me?

Best regards,
--
Devrim GUNDUZ
devrim@gunduz.org devrim.gunduz@linux.org.tr
http://www.gunduz.org

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Devrim GUNDUZ (#1)
Re: Problem while ordering Turkish chars

Devrim GUNDUZ <devrim@gunduz.org> writes:

Here is the locale setting:
LC_MESSAGES = 'en_US'
LC_MONETARY = 'en_US'
LC_NUMERIC = 'en_US'
LC_TIME = 'en_US'

If I change it to tr_TR, nothing changes.

LC_COLLATE is what determines sort order, and you can't change it on the
fly; it's determined at initdb time. You'll need to re-initdb in tr_TR
locale. (See pg_controldata printout.)

regards, tom lane