another multibyte question

Started by Joe Conway over 23 years ago, 2 messages
#1 Joe Conway
mail@joeconway.com

Do any of the encodings with a maximum encoding length > 1 have a constant
character size (e.g. Unicode)? If so, how hard would it be to add
another member to pg_wchar_tbl, say:

bool mblen_is_const; /* all chars = max bytes this charset */

Then those character sets could gain back much of the same speed
advantage as single-byte character sets when it comes to string processing.

Joe
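
To make the proposal concrete, here is a minimal sketch of how such a flag
might be used when counting characters. It is not PostgreSQL source:
sketch_wchar_tbl is a simplified, hypothetical stand-in for an entry of
pg_wchar_tbl, and sketch_utf8, sketch_fixed4 and sketch_char_count are names
invented for illustration. Only mblen_is_const comes from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one entry of pg_wchar_tbl. */
typedef struct
{
    int  (*mblen) (const unsigned char *s); /* byte length of the char at s */
    int  maxmblen;                          /* max bytes per character */
    bool mblen_is_const;                    /* proposed: all chars = maxmblen bytes */
} sketch_wchar_tbl;

/* Variable-width example: UTF-8 character length from its lead byte. */
int
sketch_utf8_mblen(const unsigned char *s)
{
    if ((*s & 0x80) == 0x00) return 1;
    if ((*s & 0xe0) == 0xc0) return 2;
    if ((*s & 0xf0) == 0xe0) return 3;
    if ((*s & 0xf8) == 0xf0) return 4;
    return 1;                   /* invalid lead byte: count it as one byte */
}

/* Fixed-width example (assumed 4 bytes per character, as in UCS-4). */
int
sketch_fixed4_mblen(const unsigned char *s)
{
    (void) s;
    return 4;
}

const sketch_wchar_tbl sketch_utf8   = { sketch_utf8_mblen, 4, false };
const sketch_wchar_tbl sketch_fixed4 = { sketch_fixed4_mblen, 4, true };

/* Count the characters stored in the first len bytes of s. */
int
sketch_char_count(const sketch_wchar_tbl *enc, const unsigned char *s, int len)
{
    int nchars = 0;

    assert(enc != NULL && s != NULL && len >= 0);

    /* Fast path: single-byte or constant-width encodings need no scan. */
    if (enc->maxmblen == 1 || enc->mblen_is_const)
        return len / enc->maxmblen;

    /* Slow path: step through the string one character at a time. */
    while (len > 0)
    {
        int l = enc->mblen(s);

        if (l > len)
            l = len;            /* truncated trailing character */
        s += l;
        len -= l;
        nchars++;
    }
    return nchars;
}
```

With the flag set, the count becomes a single division instead of a
byte-by-byte scan, which is where the hoped-for speedup would come from.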

#2 Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Joe Conway (#1)
Re: another multibyte question

> Do any of the encodings with a maximum encoding length > 1 have a constant
> character size (e.g. Unicode)? If so, how hard would it be to add
> another member to pg_wchar_tbl, say:
>
> bool mblen_is_const; /* all chars = max bytes this charset */
>
> Then those character sets could gain back much of the same speed
> advantage as single-byte character sets when it comes to string processing.

Sounds like an interesting idea, but none of the encodings PostgreSQL
currently supports has a fixed-length character size. UCS-2 and UCS-4 are
such encodings, but we do not support them.
--
Tatsuo Ishii