Re: Postgresql 8.1: plperl code works with LATIN1, fail
In an 8.1.6 UTF-8 database this example returns false; in 8.2.1 it
returns true. See the following commit message and the related bug
report regarding PL/Perl and UTF-8:
http://archives.postgresql.org/pgsql-committers/2006-10/msg00277.php
http://archives.postgresql.org/pgsql-bugs/2006-10/msg00077.php
If you can't upgrade to 8.2 then you might be able to work around
the problem by creating the function as plperlu and adding 'use utf8;'.
--
Michael Fuhr
Hello Michael!
As far as I know, 'use utf8;' normally just tells Perl that the source code
is written in UTF-8 and nothing more.
For converting data to and from UTF-8, the Encode module is usually used.
Or is this different for plperlu?
Greetings,
Matthias
On Mon, Jan 29, 2007 at 01:34:47PM +0100, Matthias.Pitzl@izb.de wrote:
>> If you can't upgrade to 8.2 then you might be able to work around
>> the problem by creating the function as plperlu and adding 'use utf8;'.
> As far as I know, 'use utf8;' normally just tells Perl that the source code
> is written in UTF-8 and nothing more.
The string literals in the PL/Perl function body are UTF-8 but Perl
isn't treating them as such. Isn't "use utf8" or "use encoding 'utf8'"
the way to tell Perl to do so? The perluniintro manual page says this:
Only one case remains where an explicit "use utf8" is needed: if
your Perl script itself is encoded in UTF-8, you can use UTF-8
in your identifier names, and in string and regular expression
literals, by saying "use utf8".
Isn't that the situation here? The PL/Perl function body is a
string encoded in the database's encoding, which in this case is
UTF-8.
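
To illustrate what the perluniintro passage describes, here is a minimal
sketch in plain Perl (not PL/Perl). It assumes the script file itself is
saved as UTF-8; the variable names are made up for the example. The same
literal is compiled once without and once with the utf8 pragma, which is
lexically scoped:

```perl
use strict;
use warnings;

# Assumption: this file is saved in UTF-8, so the literal below is the
# two bytes 0xC3 0xA4 on disk.

# Without the pragma, Perl reads the literal as two separate bytes.
my $len_without = do { no utf8; length("ä") };

# With the pragma, the same two bytes are read as one character (U+00E4).
my $len_with = do { use utf8; length("ä") };

print "without use utf8: $len_without\n";   # 2
print "with use utf8:    $len_with\n";      # 1
```

Whether the pragma has the same effect inside a plperlu function body on
8.1 would still need to be tested there.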
> For converting data to and from UTF-8, the Encode module is usually used.
> Or is this different for plperlu?
Isn't the Encode module used for doing explicit conversions? I think
the goal is not to have to do so, i.e., to have PL/Perl treat string
literals as UTF-8 if the database encoding is UTF-8. PostgreSQL 8.2
does so but earlier versions don't.
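
For comparison, an explicit conversion with Encode might look like the
following sketch. It is plain Perl rather than PL/Perl, and the variable
names are only illustrative. The bytes are written as escapes so the
result does not depend on how the file is saved:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# 0xC3 0xA4 is the UTF-8 encoding of U+00E4 (a with diaeresis).
my $bytes = "\xC3\xA4";

# As a byte string it has length 2.
my $bytes_len = length($bytes);

# decode() turns the UTF-8 bytes into a Perl character string of length 1.
my $chars     = decode('UTF-8', $bytes);
my $chars_len = length($chars);

# encode() converts the character string back to the original bytes.
my $round_trip_ok = encode('UTF-8', $chars) eq "\xC3\xA4" ? 1 : 0;

print "bytes: $bytes_len, characters: $chars_len, round trip: $round_trip_ok\n";
```

This is the kind of per-string bookkeeping that having PL/Perl treat
strings as UTF-8 automatically would avoid.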
--
Michael Fuhr
"Michael" == Michael Fuhr <mike@fuhr.org> writes:
Michael> Isn't that the situation here? The PL/Perl function body is a
Michael> string encoded in the database's encoding, which in this case is
Michael> UTF-8.
If that's always the case, then the embedded Perl interpreter should
be started in that mode, perhaps by adding "-Mutf8" to the arg list
of the embedded interpreter.
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!