How can I add a new language localization(locale) support

Started by Ameen - Etemady · almost 22 years ago · 3 messages · general
#1 Ameen - Etemady
eetemadi@ce.sharif.edu

hello,
I want to add locale support for my language (Farsi) to this database.
I found that PostgreSQL has good support for Unicode, and that Unicode
supports the characters of my language.
But some changes to ordering and sorting are necessary.
The documentation says there is a collation order function with which we can
define a custom collation order for alphabetic sorting.
Where can I find it? Which document can help me? Where is this function?
Is there any document that can help me implement full localization
support for my language?

with thanks and regards.

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ameen - Etemady (#1)
Re: How can I add a new language localization(locale) support

eetemadi@ce.sharif.edu writes:

Is there any document that can help me implement full localization
support for my language?

What you need to fix your ordering issues is a new locale definition
that sorts the way you want. The Postgres docs won't help you with
that; you'll need to go find some documentation about how libc works
with locales.
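
To illustrate what "a locale definition that sorts the way you want" means in practice, here is a minimal Python sketch that sorts strings through the C library's collation rules. It is only an illustration, not how Postgres itself is implemented. The helper name sort_with_locale is hypothetical, and a locale name such as "fa_IR.UTF-8" is an assumption: it only works if that locale is installed on the system and defines the ordering you need.

```python
import locale

def sort_with_locale(words, locale_name):
    """Sort words with the collation rules of the named libc locale.

    Hypothetical helper for illustration. Raises locale.Error if the
    locale is not installed on this system.
    """
    # Remember the current LC_COLLATE setting so it can be restored.
    previous = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # strxfrm turns each string into a key whose plain comparison
        # matches the locale's collation order.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)

if __name__ == "__main__":
    # The "C" locale always exists; replace it with an installed locale
    # (for example "fa_IR.UTF-8", if available) to see other orderings.
    print(sort_with_locale(["b", "a", "c"], "C"))
```

If the installed locale does not sort the way you want, the ordering has to be changed in the locale definition itself, which is the libc-level work described above.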

If you want to develop a set of localized Postgres error messages,
we do have some documentation about how to do that --- see the "Native
Language Support" chapter in the "Internals" section.

regards, tom lane

#3 joseph speigle
joe@hoveymotorcars.com
In reply to: Ameen - Etemady (#1)
Re: How can I add a new language localization(locale) support

joe speigle

hello,
I am interested in this issue too, and have looked briefly at how Postgres internationalizes its applications. I have been asking for help with Korean, although Korean already has all of its problems solved. I had to ask in a Korean forum:
http://database.sarang.net/?inc=read&aid=5368&criteria=pgsql&subcrit=qna&id=&limit=20&keyword=&page=1
The locale interacts in some way with the internationalization code,
a point I do not yet understand. The internationalization code is scattered throughout the source code.
"hackers" et al. say that changing the way it works, that is, making the encoding a per-column property instead of a fixed, database-wide setting, would require a complete database rewrite.
AFAIK clients (e.g. libpq) can set the encoding in the connect string; the server then converts between that encoding and whatever encoding the database is set to. If you look in your library directory, there are all kinds of *.so files which are used for such conversions. If one for your encoding doesn't exist, you should analyze the existing conversions. I think the charsets are used, but you will have to provide yours from somewhere else. The conversion code is always encoding-specific: it depends on the ranges of values your language's atomic units take and on the rules of the encoding.
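
As a rough picture of what such a conversion does, the following Python sketch transcodes bytes from one encoding to another using only the standard library. It is not the Postgres conversion code; the function name transcode is hypothetical, and "utf-8" and "cp1256" are Python codec names chosen here only as an example pair that covers Farsi letters.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes from the source encoding to the target encoding.

    Hypothetical helper for illustration. Raises UnicodeDecodeError if
    the input is not valid in the source encoding, and UnicodeEncodeError
    if a character has no representation in the target encoding.
    """
    text = data.decode(source)   # bytes -> abstract characters
    return text.encode(target)   # abstract characters -> bytes

if __name__ == "__main__":
    word = "\u0633\u0644\u0627\u0645"          # a four-letter Farsi word
    as_utf8 = word.encode("utf-8")
    as_cp1256 = transcode(as_utf8, "utf-8", "cp1256")
    # The same four letters occupy a different number of bytes in each encoding.
    print(len(as_utf8), len(as_cp1256))
```

The failure case matters as much as the success case: a character that the target encoding cannot represent has to be rejected, which is why each conversion is tied to the specific pair of encodings.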
I am unsure whether indexing is possible with database-wide encodings. I raised that question on the Korean forum, but received no really good answer. If you have some time, so do I; I would like to write a small source file that extracts the column information from the tuple to see which encoding it is stored in at each point, to check whether my guess above is right.
Can somebody tell me whether, if the data were stored in Unicode and the client encoding set to UTF8 or UNICODE, it would be possible to write a datatype as a C function to allow comparisons and indexing of the character types, and whether this has been attempted?
In any case, I would like to write that test module first.
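
To make the idea of a custom comparison on Unicode text concrete, here is a small Python sketch of a sort key for Farsi. It is not a Postgres datatype or C function, only an illustration of the ordering logic such a function would need. The names PERSIAN_ALPHABET, persian_key and persian_sorted are hypothetical, and the alphabet is a simplified assumption: 32 base letters, with no handling of diacritics or variant letter forms.

```python
# Simplified Persian alphabet in dictionary order (assumption: base
# letters only, no diacritics and no variant forms).
PERSIAN_ALPHABET = "ابپتثجچحخدذرزژسشصضطظعغفقکگلمنوهی"

# Rank of each letter within the alphabet string.
_RANK = {ch: i for i, ch in enumerate(PERSIAN_ALPHABET)}

def persian_key(word):
    """Hypothetical sort key: known letters by alphabet rank,
    any other character after them, ordered by code point."""
    return [(0, _RANK[ch]) if ch in _RANK else (1, ord(ch)) for ch in word]

def persian_sorted(words):
    """Sort words by the simplified Persian alphabet order."""
    return sorted(words, key=persian_key)

if __name__ == "__main__":
    # Words starting with PEH (U+067E), TEH (U+062A) and BEH (U+0628).
    words = ["\u067e\u062f\u0631", "\u062a\u0627\u0628", "\u0628\u0627\u062f"]
    print(sorted(words))          # plain code point order
    print(persian_sorted(words))  # alphabet order
```

Plain code point order places PEH after TEH because PEH was assigned a higher code point, while the alphabet places it directly after BEH. That gap is the kind of "ordering and sorting change" the original question asks about.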
--
joe speigle
www.sirfsup.com