utf8 encoding problem with plperlu
The following short function illustrates a problem I'm having with the
plperlu module.
CREATE OR REPLACE FUNCTION
doublezero ()
RETURNS VOID
AS $$
use Encode qw/encode decode/;
$pass = "double00";
elog( INFO, "$pass" );
$mspass = encode( 'UTF-16LE', qq("$pass") );
elog( INFO, "$mspass" );
$$ LANGUAGE plperlu
STRICT;
# select * from doublezero();
INFO: double00
CONTEXT: PL/Perl function "doublezero"
ERROR: invalid byte sequence for encoding "UTF8": 0x00 at line 8, <DATA>
line 558.
CONTEXT: PL/Perl function "doublezero"
I don't understand this. I need to pass $mspass to Active Directory, and
the encoding is exactly as it should be, which is to say, it works for
strings that don't include two consecutive zeros. Is this a bug?
-R-
On 07/15/2015 07:14 AM, Ronald Peterson wrote:
[...]
I am not a Perl user, but the question that came to mind is-
Does this:
$mspass = encode( 'UTF-16LE', qq("$pass") )
work in Perl outside of plperlu?
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Ronald Peterson wrote:
# select * from doublezero();
INFO: double00
CONTEXT: PL/Perl function "doublezero"
ERROR: invalid byte sequence for encoding "UTF8": 0x00 at line 8, <DATA>
line 558.
CONTEXT: PL/Perl function "doublezero"
I don't understand this. I need to pass $mspass to Active Directory, and
the encoding is exactly as it should be, which is to say, it works for
strings that don't include two consecutive zeros. Is this a bug?
When replacing the literal "double00" with "foobar" in your function,
the same error occurs for me:
test=# select doublezero();
INFO: foobar
CONTEXT: PL/Perl function "doublezero"
ERROR: invalid byte sequence for encoding "UTF8": 0x00 at line 6.
CONTEXT: fonction PL/Perl « doublezero »
Anyway it's not clear what you expect. PG doesn't support UTF-16,
and even if it did, it wouldn't accept such strings when the current
encoding is UTF-8.
If Active Directory wants UTF-16LE, you have to do that conversion, but
don't pass the result back to postgres in this format.
Best regards,
--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org
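A minimal standalone sketch of the point above, assuming only the core Encode module and run outside the server: any ASCII password converted to UTF-16LE contains 0x00 octets, whether or not it has zeros in it, which is why "foobar" fails the same way as "double00". The variable names mirror the function in the original post; `$nul_count` is introduced here for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Same construction as in the posted function: the password wrapped in
# literal double quotes, then converted to UTF-16LE octets.
my $pass   = 'foobar';
my $mspass = encode('UTF-16LE', qq("$pass"));

# Count the 0x00 octets. Each ASCII character becomes two octets in
# UTF-16LE, and the second one is always 0x00.
my $nul_count = ($mspass =~ tr/\x00//);

printf "%d octets, %d of them 0x00\n", length($mspass), $nul_count;
# prints "16 octets, 8 of them 0x00"
```

The octets are fine as input to something that expects UTF-16LE, but they are not valid text in a UTF8 database, so they cannot be handed to the server as a string.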
That's interesting. What I'm really doing, instead of the second elog
statement, is this:
$ret = $ldap->modify( $dn,
    replace => {
        unicodePwd => $mspass
    } );
This does work for strings that don't contain consecutive zeroes. I'm not
really passing the string to PostgreSQL, but to Net::LDAP, but it must hit
PostgreSQL anyway? Active Directory requires this encoding, so I'm not
sure what to do here.
On Wed, Jul 15, 2015 at 11:57 AM, Daniel Verite <daniel@manitou-mail.org>
wrote:
[...]
-R-
2015-07-15 20:20 GMT+02:00 Ronald Peterson <ron@hub.yellowbank.com>:
That's interesting. What I'm really doing, instead of the second elog
statement, is this:
$ret = $ldap->modify( $dn,
    replace => {
        unicodePwd => $mspass
    } );
This does work for strings that don't contain consecutive zeroes. I'm not
really passing the string to PostgreSQL, but to Net::LDAP, but it must hit
PostgreSQL anyway? Active Directory requires this encoding, so I'm not
sure what to do here.
I had some issues when I used Perl libraries with UTF strings: some
require the UTF flag on the string, some do not, and Postgres did not
set this flag correctly.
http://blog.endpoint.com/2014/02/dbdpg-utf-8-perl-postgresql.html
Maybe you have a similar issue on the server side.
Pavel
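For readers unfamiliar with the flag mentioned here, a small sketch of how it can be inspected, assuming only core Perl and the Encode module. It is not a diagnosis of the problem in this thread; the sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Decoding UTF-8 octets yields a character string; because it contains
# a non-ASCII character, Perl stores it with the UTF8 flag on.
my $chars = decode('UTF-8', "caf\xc3\xa9");

# Encoding always yields an octet string with the UTF8 flag off.
my $octets = encode('UTF-16LE', $chars);

printf "chars:  length %d, flag %s\n",
    length($chars),  utf8::is_utf8($chars)  ? 'on' : 'off';
printf "octets: length %d, flag %s\n",
    length($octets), utf8::is_utf8($octets) ? 'on' : 'off';
```

A library that expects octets (as an LDAP password attribute presumably does) should be given the second kind of value, not the first.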
Thanks Pavel, this looks promising. I didn't know about the Data::Peek
module - that might help me figure out what is going on.
On Wed, Jul 15, 2015 at 2:28 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:
[...]
--
-R-
Ronald Peterson <ron@hub.yellowbank.com> writes:
This does work for strings that don't contain consecutive zeroes. I'm not
really passing the string to PostgreSQL, but to Net::LDAP, but it must hit
PostgreSQL anyway? Active Directory requires this encoding, so I'm not
sure what to do here.
Hm, well, the concrete example you showed involved passing the string to
elog(), which definitely will complain if what it's fed isn't legal data
according to the database encoding; as would any other attempt to push
data into the Postgres server environment. I don't see why operations
that are strictly within Perl would have a problem, though.
regards, tom lane
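One way to keep a debugging message while staying within what elog() accepts is to log a hex rendering of the octets rather than the octets themselves. The following is a sketch under that assumption; `octets_to_hex` is a hypothetical helper, not part of PL/Perl, and the code runs as plain Perl so the elog() call appears only as a comment.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render arbitrary octets as lowercase hex text,
# which is plain ASCII and therefore valid in a UTF8 database.
sub octets_to_hex {
    my ($octets) = @_;
    return unpack('H*', $octets);
}

# Quoted password converted to UTF-16LE, as in the original function.
my $mspass    = encode('UTF-16LE', qq("a0"));
my $printable = octets_to_hex($mspass);

print "$printable\n";    # prints "2200610030002200"

# Inside a plperlu function the equivalent would be:
#   elog(INFO, octets_to_hex($mspass));
```

The raw `$mspass` can still be passed to Net::LDAP unchanged; only the value given to elog() needs to be valid in the database encoding.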
Still trying to figure this out, still confused, but like most frustrating
programming problems, I think I may be looking in the wrong place for the
source of this error. Perhaps.
On Wed, Jul 15, 2015 at 11:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
[...]
--
-R-