cyrillic and sort order (ORDER BY)

Started by Mihail Marinovover 25 years ago7 messagesgeneral
Jump to latest
#1Mihail Marinov
liahim@bcci.bg

Hello,

My problem is that ORDER BY does not alphabetically order strings in
proper way. Worst - it does sometimes work right but I can't find out
exactly why. But after a while it stops until I rebuild and
reconfigure.

I have read *everything* I managed to find in the archives and the
docs. I am using RedHat 6.2, 2.2.14; PostgreSQL 7.0.2 (at the start),
now (hoping it would have magically save me from this nightmare)
7.0.3. Built with --enable-locale and --enable-multibyte=WIN (does
this mean more than the default encoding choice for createdb?; I have
used KOI8 with the same "success"). I have LC_ALL set to ru_RU.KOI8
and a database with encoding KOI8. The postmaster is started with
these locale settings. I use charset.conf with the shipped recoding
tables. All my clients are win. I hope I haven't missed something
important. Can someone, please, help me with a little. I have spent
huge amounts of time on this and all get is an erratic behaviour
which drives me crazy. I would have preferred if it was not working
at all.

Thank you for your time.

#2Peter Eisentraut
peter_e@gmx.net
In reply to: Mihail Marinov (#1)
Re: cyrillic and sort order (ORDER BY)

Mihail Marinov writes:

My problem is that ORDER BY does not alphabetically order strings in
proper way. Worst - it does sometimes work right but I can't find out
exactly why. But after a while it stops until I rebuild and
reconfigure.

You must have the same collation locale setting (LC_COLLATE, or LC_ALL as
default) when running initdb and when running postmaster. Changing the
setting in between will result in random and incorrect results and
database corruption.

--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#2)
Re: cyrillic and sort order (ORDER BY)

Peter Eisentraut <peter_e@gmx.net> writes:

Mihail Marinov writes:

My problem is that ORDER BY does not alphabetically order strings in
proper way. Worst - it does sometimes work right but I can't find out
exactly why. But after a while it stops until I rebuild and
reconfigure.

You must have the same collation locale setting (LC_COLLATE, or LC_ALL as
default) when running initdb and when running postmaster. Changing the
setting in between will result in random and incorrect results and
database corruption.

To enlarge a little more: the most common way to get burnt by this is
to have different LC_xxx environment variables when starting the
postmaster from a boot script as you do when running initdb or starting
the postmaster from an interactive shell. Best to explicitly set the
desired locale in the script that fires up the postmaster.

Postgres 7.1 solves this problem by locking down a database's locale at
initdb time. Subsequent postmaster runs will adopt the LC_COLLATE value
that was prevalent when initdb ran, no matter what their environment is.

regards, tom lane

#4Mihail Marinov
liahim@bcci.bg
In reply to: Tom Lane (#3)
Re[2]: cyrillic and sort order (ORDER BY)

Thanx for the help. Actually it has happened the way you have
described. ORDER BY works ok now. It all would have been great if I
didn't find out that LOWER and UPPER don't do what they are supposed
to do on cyrillic characters. But that is LC_COLLATE issue I am
dealing with now.

#5Radoslaw Stachowiak
radek@alter.pl
In reply to: Tom Lane (#3)
Re: cyrillic and sort order (ORDER BY)

*** Tom Lane <tgl@sss.pgh.pa.us> [Tuesday, 02.January.2001, 18:38 -0500]:

To enlarge a little more: the most common way to get burnt by this is
to have different LC_xxx environment variables when starting the
postmaster from a boot script as you do when running initdb or starting
the postmaster from an interactive shell. Best to explicitly set the
desired locale in the script that fires up the postmaster.

Postgres 7.1 solves this problem by locking down a database's locale at
initdb time. Subsequent postmaster runs will adopt the LC_COLLATE value
that was prevalent when initdb ran, no matter what their environment is.

Could You write what is suggested path to change encoding from SQLASCII to
another (ISO88592) when I'd upgrade to 7.1 from 7.03 ?

I know have SQLASCII, but I'll need to change it to sth more suitable.

--
radoslaw.stachowiak.........................................http://alter.pl/

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Radoslaw Stachowiak (#5)
Re: cyrillic and sort order (ORDER BY)

Radoslaw Stachowiak <radek@alter.pl> writes:

Could You write what is suggested path to change encoding from SQLASCII to
another (ISO88592) when I'd upgrade to 7.1 from 7.03 ?
I know have SQLASCII, but I'll need to change it to sth more suitable.

Should be sufficient to make sure you have the LOCALE environment
variables set correctly when you run initdb for 7.1.

regards, tom lane

#7Mihail Marinov
liahim@bcci.bg
In reply to: Radoslaw Stachowiak (#5)
Re[2]: cyrillic and sort order (ORDER BY)

RS> *** Tom Lane <tgl@sss.pgh.pa.us> [Tuesday, 02.January.2001, 18:38 -0500]:

To enlarge a little more: the most common way to get burnt by this is
to have different LC_xxx environment variables when starting the
postmaster from a boot script as you do when running initdb or starting
the postmaster from an interactive shell. Best to explicitly set the
desired locale in the script that fires up the postmaster.

Postgres 7.1 solves this problem by locking down a database's locale at
initdb time. Subsequent postmaster runs will adopt the LC_COLLATE value
that was prevalent when initdb ran, no matter what their environment is.

RS> Could You write what is suggested path to change encoding from SQLASCII to
RS> another (ISO88592) when I'd upgrade to 7.1 from 7.03 ?

RS> I know have SQLASCII, but I'll need to change it to sth more suitable.

Wouldn't it be enough to dump the database out, recreate a new one with the
required encoding, than dump it in ?