unicode upper/lower functions

Started by John Hansen almost 21 years ago, 8 messages
#1 John Hansen
john@geeknet.com.au
1 attachment(s)

Hi list,

Attached for your perusal are Unicode versions of upper/lower that work
independently of locale, except for the following languages:

Turkish, Azeri, and Lithuanian.
In total, 15 locale-specific cases are not covered.
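
As an illustration of the kind of locale-specific case meant here, a minimal Python sketch. It is not the attached patch: Python's built-in str.upper() is used only as a stand-in for locale-independent Unicode case mapping, and tr_upper is a hypothetical helper written for this example.

```python
# str.upper() applies Unicode's default case mappings and ignores the
# process locale, so it stands in for a locale-independent upper().

def tr_upper(s: str) -> str:
    """Hypothetical helper: uppercase using the Turkish/Azeri dotted-i rule."""
    # In Turkish and Azeri, 'i' uppercases to U+0130 (capital I with dot above)
    # and dotless U+0131 uppercases to plain 'I'.
    return s.replace("i", "\u0130").replace("\u0131", "I").upper()

print("istanbul".upper())    # default mapping: plain capital I
print(tr_upper("istanbul"))  # Turkish rule: capital I with dot above
```

The default mapping is what most languages expect; the dotted/dotless i pair is one example of why Turkish and Azeri need language-aware handling on top of it.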

--
John Hansen <john@geeknet.com.au>
GeekNET

Attachments:

collate.tar.gz (application/x-compressed-tar)
#2 John Hansen
john@oztralis.com.au
In reply to: John Hansen (#1)
Re: unicode upper/lower functions

uhmm,...

Forgot to change the copyright.

Please accept this under the same terms as PostgreSQL itself.

... John

#3 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: John Hansen (#1)
Re: unicode upper/lower functions

This has been saved for the 8.1 release:

http://momjian.postgresql.org/cgi-bin/pgpatches2

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#4 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: John Hansen (#1)
Re: unicode upper/lower functions

I think we have decided to use the ICU library to implement multiple locales.

#5 John Hansen
john@geeknet.com.au
In reply to: Bruce Momjian (#4)
Re: unicode upper/lower functions

Yes,

Thank you! :)

.. John

-----Original Message-----
From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
Sent: Tuesday, June 07, 2005 10:07 AM
To: John Hansen
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] unicode upper/lower functions

I think we have decided to use the ICU library to implement
multiple locales.

#6 John Hansen
john@geeknet.com.au
In reply to: John Hansen (#5)
Re: unicode upper/lower functions

... Except, it was never decided whether the 'C' locale optimisations were
going to be removed if/when implementing ICU.

Though I think the conclusion was a postgresql.conf parameter to
enable/disable the optimisations.
Either way, this code is now obsolete.

... John

#7 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: John Hansen (#6)
Re: unicode upper/lower functions

John Hansen wrote:

> ... Except, it was never decided whether the 'C' locale optimisations were
> going to be removed if/when implementing ICU.

Uh, why would we remove it? Oh, meaning if the locale is C we bypass
locale lookups? I think we will have to see what performance we have
with things.

> Though I think the conclusion was a postgresql.conf parameter to
> enable/disable the optimisations.
> Either way, this code is now obsolete.

Thanks.

#8 John Hansen
john@geeknet.com.au
In reply to: Bruce Momjian (#7)
Re: unicode upper/lower functions

Bruce Momjian wrote:

> John Hansen wrote:
>
> > ... Except, it was never decided whether the 'C' locale optimisations were
> > going to be removed if/when implementing ICU.
>
> Uh, why would we remove it? Oh, meaning if the locale is C
> we bypass locale lookups? I think we will have to see what
> performance we have with things.

Uhh, not quite: if the locale is 'C', the current assumption is 7-bit ASCII
for upper/lower/initcap.
ICU is capable of properly doing upper/lower/initcap except for the
cases described in this (obsolete) patch.
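
To make the 7-bit ASCII assumption concrete, here is a rough Python sketch. It is only an illustration, not PostgreSQL's actual code: ascii_upper is a hypothetical helper that mimics an ASCII-only upper(), and Python's str.upper() stands in for a Unicode-aware implementation such as ICU's.

```python
def ascii_upper(s: str) -> str:
    """Hypothetical helper: uppercase only a-z, leaving every other character alone."""
    # 'a'..'z' and 'A'..'Z' are 32 code points apart in ASCII.
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)

word = "caf\u00e9"          # "cafe" with an acute accent on the final e
print(ascii_upper(word))    # the accented letter is left in lower case
print(word.upper())         # Unicode-aware mapping uppercases it as well
```

Under the ASCII-only rule, any character outside a-z passes through unchanged, which is fast but wrong for most non-English text.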
