getting 'order by' working with unicode locale? ICU?

Started by Palle Girgensohn about 21 years ago, 6 messages
#1Palle Girgensohn
girgen@pingpong.net

Hi!

I'm using PostgreSQL on FreeBSD, and would like to get "order by" to work
with unicode. The OS does not have collation implemented for unicode (UTF-8)
locales. Some FreeBSD people point me towards IBM's ICU kit.

How much effort would be required to get PostgreSQL to sort properly,
mainly using the sv_SE.UTF-8 locale? (So the problem is not *that* hard; I
don't need to sort Chinese [yet] :) What needs to be done to get
PostgreSQL to use ICU (or some other working mechanism)?

Thanks,
Palle

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Palle Girgensohn (#1)
Re: getting 'order by' working with unicode locale? ICU?

Palle Girgensohn <girgen@pingpong.net> writes:

I'm using PostgreSQL on FreeBSD, and would like to get "order by" to work
with unicode.

What makes you think it doesn't? Use the right locale and you're set.

regards, tom lane

#3Palle Girgensohn
girgen@pingpong.net
In reply to: Tom Lane (#2)
Re: getting 'order by' working with unicode locale? ICU?

--On Wednesday, December 15, 2004 23.21.13 -0500 Tom Lane <tgl@sss.pgh.pa.us>
wrote:

Palle Girgensohn <girgen@pingpong.net> writes:

I'm using PostgreSQL on FreeBSD, and would like to get "order by" to
work with unicode.

What makes you think it doesn't? Use the right locale and you're set.

Not on FreeBSD, since collation is not implemented in unicode locales. One
way would be to implement it in the OS, of course...

/Palle

#4Peter Eisentraut
peter_e@gmx.net
In reply to: Palle Girgensohn (#3)
Re: getting 'order by' working with unicode locale? ICU?

Palle Girgensohn wrote:

Not on FreeBSD, since collation is not implemented in unicode
locales. One way would be to implement it in the OS, of course...

Try taking the locale definition files from another system and use
localedef to build locale files for your local system. The localedef
source files are supposed to be portable.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#5Palle Girgensohn
girgen@pingpong.net
In reply to: Peter Eisentraut (#4)
Re: getting 'order by' working with unicode locale? ICU?

--On Thursday, December 16, 2004 09.20.50 +0100 Peter Eisentraut
<peter_e@gmx.net> wrote:

Palle Girgensohn wrote:

Not on FreeBSD, since collation is not implemented in unicode
locales. One way would be to implement it in the OS, of course...

Try taking the locale definition files from another system and use
localedef to build locale files for your local system. The localedef
source files are supposed to be portable.

As far as I understand, there is no code in FreeBSD to specify the
collating order for multibyte locales. Would it be easier to fix the OS or
hack ICU into PostgreSQL?

A bit off topic: I'm still dreaming of a way to get "order by" working with
different locales for the same database (different clients getting
different collation depending on their locale choice). Now this is
hardcoded at initdb time. Is there any way this could work, ever, in
PostgreSQL, or will I have to sort client side?

Regards,
Palle

#6Hannu Krosing
hannu@tm.ee
In reply to: Palle Girgensohn (#5)
Re: getting 'order by' working with unicode locale? ICU?

On a fine day (Saturday, 18 December 2004, 02:41+0100), Palle Girgensohn
wrote:

--On Thursday, December 16, 2004 09.20.50 +0100 Peter Eisentraut
<peter_e@gmx.net> wrote:

Palle Girgensohn wrote:

Not on FreeBSD, since collation is not implemented in unicode
locales. One way would be to implement it in the OS, of course...

Try taking the locale definition files from another system and use
localedef to build locale files for your local system. The localedef
source files are supposed to be portable.

As far as I understand, there is no code in FreeBSD to specify the
collating order for multibyte locales. Would it be easier to fix the OS or
hack ICU into PostgreSQL?

A bit off topic: I'm still dreaming of a way to get "order by" working with
different locales for the same database (different clients getting
different collation depending on their locale choice). Now this is
hardcoded at initdb time. Is there any way this could work, ever, in
PostgreSQL, or will I have to sort client side?

I guess you can write a function that returns something client-specific
and sort on that.

select weirdnames
from namelist
order by localesort(weirdnames, 'SE');

You can even build an index on localesort(weirdnames, 'SE') to speed
things up for some queries.
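
A minimal sketch of what such an index could look like. It assumes that
localesort(text, text) already exists as a user-defined function (it is
hypothetical, not something PostgreSQL ships) and that it is declared
IMMUTABLE, which PostgreSQL requires for functions used in an index
expression. The index name is made up for illustration.

```sql
-- Assumption: localesort(text, text) is a hypothetical user-defined,
-- IMMUTABLE function returning a locale-specific sort key.
CREATE INDEX namelist_localesort_se_idx
    ON namelist (localesort(weirdnames, 'SE'));

-- A query that sorts on exactly the same expression can be satisfied
-- by scanning the index in order instead of sorting the whole table.
SELECT weirdnames
FROM namelist
ORDER BY localesort(weirdnames, 'SE');
```

Each locale you want to support would need its own index of this kind, so
this probably only pays off for a small, fixed set of locales.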

And yes, I think using ICU is the right way to do it ;)

------------------
Hannu