ascii() for utf8
Does PostgreSQL have a function like ascii() that will
return the Unicode code point value for a utf8 character?
(And symmetrically, the same question for chr(), of course.)
I didn't find anything in the docs, so I think the answer
is no, which leads me to ask... Why not? (Hard to believe
there is no need without concluding that either ascii() is
not needed, or utf8 text is little used.)
Are there technical problems in implementing such a
function? Has anyone else already done this (i.e., is
there somewhere I could get it from)?
Is there some other non-obvious way to get the code point value
for a utf8 character?
I think I could use plperl or plpython for this but
this seems like an awful lot of overhead for such a
basic task.
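For illustration, here is a minimal Python sketch of what such a pair of functions would compute, assuming the input holds a single character encoded as UTF-8. The names utf8_codepoint and utf8_chr are hypothetical and are not PostgreSQL functions; the same logic could in principle form the body of a plpython function, but that wiring is not shown here.

```python
def utf8_codepoint(raw: bytes) -> int:
    """Return the Unicode code point of the first character in UTF-8 bytes."""
    text = raw.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    if not text:
        raise ValueError("empty input")
    return ord(text[0])


def utf8_chr(codepoint: int) -> bytes:
    """Return the UTF-8 byte sequence for a Unicode code point."""
    return chr(codepoint).encode("utf-8")


if __name__ == "__main__":
    euro = b"\xe2\x82\xac"  # UTF-8 bytes of the euro sign, U+20AC
    print(hex(utf8_codepoint(euro)))
    print(utf8_chr(0x20AC) == euro)
```

For single-byte ASCII input the result matches what a plain ascii() would return, so the two behaviours only differ for multi-byte characters.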
Moving to -hackers.
On Jul 27, 2007, at 1:22 PM, Stuart wrote:
Does PostgreSQL have a function like ascii() that will
return the Unicode code point value for a utf8 character?
(And symmetrically, the same question for chr(), of course.)
I didn't find anything in the docs, so I think the answer
is no, which leads me to ask... Why not? (Hard to believe
there is no need without concluding that either ascii() is
not needed, or utf8 text is little used.)
Are there technical problems in implementing such a
function? Has anyone else already done this (i.e., is
there somewhere I could get it from)?
Is there some other non-obvious way to get the code point value
for a utf8 character?
I think I could use plperl or plpython for this but
this seems like an awful lot of overhead for such a
basic task.
I suspect that this is just a matter of no one scratching the itch. I
suspect a patch would be accepted, or you could possibly put
something on pgFoundry. I'd set it up so that ascii() and chr() act
according to the appropriate locale setting (I'm not sure which one
would be appropriate).
--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Decibel! wrote:
Moving to -hackers.
On Jul 27, 2007, at 1:22 PM, Stuart wrote:
Does PostgreSQL have a function like ascii() that will
return the Unicode code point value for a utf8 character?
(And symmetrically, the same question for chr(), of course.)
I suspect that this is just a matter of no one scratching the itch. I
suspect a patch would be accepted, or you could possibly put something on
pgFoundry.
Nay; there were some discussions about this not long ago, and I think
one conclusion you could draw from them is that many people want these
functions in the backend.
I'd set it up so that ascii() and chr() act according to the
appropriate locale setting (I'm not sure which one would be appropriate).
I don't see why any of them would react to the locale, but they surely
must honor client encoding.
--
Alvaro Herrera http://www.PlanetPostgreSQL.org/
"I dream about dreams about dreams", sang the nightingale
under the pale moon (Sandman)
From: Alvaro Herrera
Decibel! wrote:
Moving to -hackers.
On Jul 27, 2007, at 1:22 PM, Stuart wrote:
Does PostgreSQL have a function like ascii() that will
return the Unicode code point value for a utf8 character?
(And symmetrically, the same question for chr(), of course.)
I suspect that this is just a matter of no one scratching the itch. I
suspect a patch would be accepted, or you could possibly put something on
pgFoundry.
Nay; there were some discussions about this not long ago, and I think
one conclusion you could draw from them is that many people want these
functions in the backend.
That would certainly be my preference. I will be distributing an
application, the database part of which may (not sure yet) require
this function, to multiple platforms including Windows, and (though
I have never done it) I anticipate that will be significantly harder
if I have to worry about the recipient compiling an external function
or making sure a DLL goes in the right place, gets updated, etc.
I'd set it up so that ascii() and chr() act according to the
appropriate locale setting (I'm not sure which one would be appropriate).
I don't see why any of them would react to the locale, but they surely
must honor client encoding.
Wouldn't this be the database encoding? (I have been using
strictly utf-8 and admit I am pretty fuzzy on encoding issues.)
If one had written an external function, how much more effort
would it be to make it acceptable for inclusion in the backend?
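To illustrate the encoding question above: a character's code point does not depend on the encoding, while the bytes that represent it do. The Python sketch below uses Python's codec names "utf-8" and "latin-1" as stand-ins for a database or client encoding. That mapping, and the helper name codepoint_in, are assumptions made for illustration only and say nothing about how the backend would implement this.

```python
def codepoint_in(raw: bytes, encoding: str) -> int:
    """Decode one character from `raw` using `encoding`; return its code point."""
    text = raw.decode(encoding)
    if len(text) != 1:
        raise ValueError("expected exactly one character")
    return ord(text)


if __name__ == "__main__":
    ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE
    for enc in ("utf-8", "latin-1"):
        raw = ch.encode(enc)
        # The bytes differ per encoding; the code point does not.
        print(enc, raw.hex(), codepoint_in(raw, enc))
```

Decoding with the wrong encoding gives a different character or an error, which is why a function of this kind has to know which encoding its input bytes are in.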
This has been saved for the 8.4 release:
http://momjian.postgresql.org/cgi-bin/pgpatches_hold
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Actually, I am working on this as part of the fixes for invalid encoding
stuff, as recently discussed.
cheers
andrew
Andrew Dunstan wrote:
Actually, I am working on this as part of the fixes for invalid encoding
stuff, as recently discussed.
OK, I have moved the item into the 8.3 queue.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +