Bytea PL/Perl transform
Hi,
PostgreSQL passes bytea arguments to PL/Perl functions as hexadecimal strings, which is not only inconvenient, but also memory and time consuming.
So I decided to propose a simple transform extension to pass bytea as native Perl octet strings.
Please find the patch attached.
Regards,
Ivan Panchenko
Attachments:
bytea_plperl.patch (text/x-diff, +265 -3)
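A sketch of what the transform changes for a function author. Without it, a bytea argument arrives as text in the current bytea_output format (hex by default, e.g. '\x0123'), and the function has to decode it, here with PL/Perl's built-in decode_bytea(). The second function assumes the bytea_plperl extension from the attached patch is installed. Both function names are made up for illustration.

```sql
CREATE EXTENSION IF NOT EXISTS plperl;

-- Without a transform: $_[0] is the text form of the value.
CREATE FUNCTION bytea_len_text(bytea) RETURNS integer
LANGUAGE plperl
AS $$
    my $raw = decode_bytea($_[0]);   # PL/Perl utility: text form -> octets
    return length($raw);
$$;

-- Assumption: the extension proposed in this thread has been built and installed.
CREATE EXTENSION bytea_plperl;

-- With the transform: $_[0] is already a native Perl octet string.
CREATE FUNCTION bytea_len_native(bytea) RETURNS integer
LANGUAGE plperl
TRANSFORM FOR TYPE bytea
AS $$
    return length($_[0]);
$$;

SELECT bytea_len_text('\x0123'::bytea), bytea_len_native('\x0123'::bytea);
```

Both calls should report the same length. The difference is that the first one receives and decodes a string roughly twice the size of the data.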
On 18 Mar 2023, at 23:25, Иван Панченко <wao@mail.ru> wrote:
Please find the patch attached.
Thanks for the patch, I recommend registering this in the currently open
Commitfest to make sure it's kept track of:
https://commitfest.postgresql.org/43/
--
Daniel Gustafsson
Wednesday, 22 March 2023, 12:45 +03:00 from Daniel Gustafsson <daniel@yesql.se>:
Thanks for the patch, I recommend registering this in the currently open
Commitfest to make sure it's kept track of:
Thanks, done:
https://commitfest.postgresql.org/43/4252/
--
Ivan
So I decided to propose a simple transform extension to pass bytea as
native Perl octet strings.
Quick review, mostly housekeeping things:
* Needs a rebase, minor failure on Mkvcbuild.pm
* Code needs standardized formatting, esp. bytea_plperl.c
* Needs to be meson-i-fied (i.e. add a "meson.build" file)
* Do all of these transforms need to be their own contrib modules? So much
duplicated code across contrib/*_plperl already (and *plpython too for that
matter) ...
Cheers,
Greg
On 2023-06-22 Th 16:56, Greg Sabino Mullane wrote:
* Do all of these transforms need to be their own contrib modules? So
much duplicated code across contrib/*_plperl already (and *plpython
too for that matter) ...
Yeah, that's a bit of a mess. Not sure what we can do about it now.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes:
Yeah, that's a bit of a mess. Not sure what we can do about it now.
Would it be possible to move the functions and other objects to a new
combined extension, and make the existing ones depend on that?
I see ALTER EXTENSION has both ADD and DROP subcommands which don't
affect the object itself, only the extension membership. The challenge
would be getting the ordering right between the upgrade/install scripts
dropping the objects from the existing extension and adding them to the
new extension.
- ilmari
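A rough sketch of the membership shuffle described above, using jsonb_plperl as the example. The combined extension name plperl_transforms is invented for illustration, and the function signatures are assumed to match what jsonb_plperl's install script creates. ALTER EXTENSION ... DROP only detaches an object from the extension without dropping it, and an object can be a member of only one extension at a time, which is why the ordering matters.

```sql
-- Step 1: detach the objects from the old extension (they keep existing).
ALTER EXTENSION jsonb_plperl DROP TRANSFORM FOR jsonb LANGUAGE plperl;
ALTER EXTENSION jsonb_plperl DROP FUNCTION jsonb_to_plperl(internal);
ALTER EXTENSION jsonb_plperl DROP FUNCTION plperl_to_jsonb(internal);

-- Step 2: attach the same objects to the combined extension.
-- "plperl_transforms" is a hypothetical name, not an existing extension.
ALTER EXTENSION plperl_transforms ADD FUNCTION jsonb_to_plperl(internal);
ALTER EXTENSION plperl_transforms ADD FUNCTION plperl_to_jsonb(internal);
ALTER EXTENSION plperl_transforms ADD TRANSFORM FOR jsonb LANGUAGE plperl;
```

Note that this changes packaging metadata only: the C functions would still point at the jsonb_plperl shared library unless they are also redefined.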
Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> writes:
Would it be possible to move the functions and other objects to a new
combined extension, and make the existing ones depend on that?
Perhaps another way could be to accept that the packaging is what it
is, but look for ways to share the repetitive source code. The .so's
wouldn't get any smaller, but they're not that big anyway.
regards, tom lane
On 22.06.23 22:56, Greg Sabino Mullane wrote:
* Do all of these transforms need to be their own contrib modules? So
much duplicated code across contrib/*_plperl already (and *plpython too
for that matter) ...
The reason the first transform modules were separate extensions is that
they interfaced between one extension (plpython, plperl) and another
extension (ltree, hstore), so it wasn't clear where to put them without
creating an additional dependency for one of them.
If the transform deals with a built-in type, then they should just be
added to the respective pl extension directly.
Thursday, 6 July 2023, 14:48 +03:00 from Peter Eisentraut <peter@eisentraut.org>:
If the transform deals with a built-in type, then they should just be
added to the respective pl extension directly.
Looks reasonable.
The new extension bytea_plperl can be easily moved into plperl now, but what should we do with the existing ones, namely jsonb_plperl and bool_plperl?
If we leave them where they are, it would be hard to explain why some transforms are inside plperl while other ones live separately. If we move them into plperl also, wouldn’t it break some compatibility?
Ivan Panchenko <wao@mail.ru> writes:
The new extension bytea_plperl can be easily moved into plperl now, but what should we do with the existing ones, namely jsonb_plperl and bool_plperl?
If we leave them where they are, it would be hard to explain why some transforms are inside plperl while other ones live separately. If we move them into plperl also, wouldn’t it break some compatibility?
It's kind of a mess, indeed. But I think we could make plperl 1.1
contain the additional transforms and just tell people they have
to drop the obsolete extensions before they upgrade to 1.1.
Fortunately, it doesn't look like functions using a transform
have any hard dependency on the transform, so "drop extension
jsonb_plperl" followed by "alter extension plperl update" should
work without cascading to all your plperl functions.
Having said that, we'd still be in the position of having to
explain why some transforms are packaged with plperl and others
not. The distinction between built-in and contrib types might
be obvious to us hackers, but I bet a lot of users see it as
pretty artificial. So maybe the existing packaging design is
fine and we should just look for a way to reduce the code
duplication.
regards, tom lane
Friday, 14 July 2023, 23:27 +03:00 from Tom Lane <tgl@sss.pgh.pa.us>:
So maybe the existing packaging design is
fine and we should just look for a way to reduce the code
duplication.
The code duplication between different transforms is not in C code, but mostly in copy-pasted Makefile, *.control, *.sql and meson.build files. These files could be generated from some universal templates. But, keeping in mind Windows compatibility and the presence of three build systems, this hardly looks like a simplification.
Probably at the present time it would be better to keep the existing code duplication until plperl 1.1.
Nevertheless, dealing with code duplication is a wider task than the bytea transform, so let me suggest keeping this extension in its present form. If we decide what to do with the duplication, it would be another patch.
The mesonified and rebased version of the transform patch is attached.
Regards, Ivan
Attachments:
bytea-plperl-mesonified.patch (text/x-diff, +278 -3)
On Fri, 21 Jul 2023 at 02:59, Ivan Panchenko <wao@mail.ru> wrote:
The mesonified and rebased version of the transform patch is attached.
The patch needs to be rebased as these changes are not required anymore:
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 9e05eb91b1..ec0a3f8097 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -43,7 +43,7 @@ my $contrib_extralibs = { 'libpq_pipeline' => ['ws2_32.lib'] };
my $contrib_extraincludes = {};
my $contrib_extrasource = {};
my @contrib_excludes = (
- 'bool_plperl', 'commit_ts',
+ 'bool_plperl', 'bytea_plperl', 'commit_ts',
'hstore_plperl', 'hstore_plpython',
'intagg', 'jsonb_plperl',
'jsonb_plpython', 'ltree_plpython',
@@ -791,6 +791,9 @@ sub mkvcbuild
my $bool_plperl = AddTransformModule(
'bool_plperl', 'contrib/bool_plperl',
'plperl', 'src/pl/plperl');
+ my $bytea_plperl = AddTransformModule(
+ 'bytea_plperl', 'contrib/bytea_plperl',
+ 'plperl', 'src/pl/plperl');
Regards,
Vignesh
Hi
On Sat, 6 Jan 2024 at 16:51, vignesh C <vignesh21@gmail.com> wrote:
The patch needs to be rebased as these changes are not required anymore:
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
I am checking this patch, it looks good. All tests passed. I am sending a
cleaned patch.
I did minor formatting cleanup.
I inserted Perl reference support, as hstore_plperl and jsonb_plperl do.
+	/* Dereference references recursively. */
+	while (SvROK(in))
+		in = SvRV(in);
Regards
Pavel
Attachments:
0001-supports-bytea-transformation-for-plperl.patch (text/x-patch, +328 -3)
Pavel Stehule <pavel.stehule@gmail.com> writes:
I inserted Perl reference support, as hstore_plperl and jsonb_plperl do.
+	/* Dereference references recursively. */
+	while (SvROK(in))
+		in = SvRV(in);
That code in hstore_plperl and jsonb_plperl is only relevant because they
deal with non-scalar values (hashes for hstore, and also arrays for
json) which must be passed as references. The recursive nature of the
dereferencing is questionable, and masked the bug fixed by commit
1731e3741cbbf8e0b4481665d7d523bc55117f63.
bytea_plperl only deals with scalars (specifically strings), so should
not concern itself with references. In fact, this code breaks returning
objects with overloaded stringification, for example:
CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
TRANSFORM FOR TYPE bytea
AS $$
package StringOverload { use overload '""' => sub { "stuff" }; }
return bless {}, "StringOverload";
$$;
This makes the server crash with an assertion failure from Perl because
SvPVbyte() was passed a non-scalar value:
postgres: ilmari regression_bytea_plperl [local] SELECT: sv.c:2865: Perl_sv_2pv_flags:
Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV && SvTYPE(sv) != SVt_PVFM' failed.
If I remove the dereferencing loop it succeeds:
SELECT encode(plperlu_overload(), 'escape') AS string;
string
--------
stuff
(1 row)
Attached is a v2 patch which removes the dereferencing and includes the
above example as a test.
- ilmari
Attachments:
v2-0001-Add-bytea-transformation-for-plperl.patch (text/x-diff, +368 -3)
On Tue, 30 Jan 2024 at 16:43, Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> wrote:
Attached is a v2 patch which removes the dereferencing and includes the
above example as a test.
But without the dereference it returns a bad value.
Maybe there should be a check so references cannot be returned? It is probably
not safe to pass pointers between Perl and Postgres.
Pavel Stehule <pavel.stehule@gmail.com> writes:
But without the dereference it returns a bad value.
Where exactly does it return a bad value? The existing tests pass, and
the one I included shows that it does the right thing in that case too.
If you pass it an unblessed reference it returns the stringified version
of that, as expected.
CREATE FUNCTION plperl_reference() RETURNS bytea LANGUAGE plperl
TRANSFORM FOR TYPE bytea
AS $$ return []; $$;
SELECT encode(plperl_reference(), 'escape') string;
string
-----------------------
ARRAY(0x559a3109f0a8)
(1 row)
This would also crash if the dereferencing loop was left in place.
Maybe there should be a check so references cannot be returned? It is probably
not safe to pass pointers between Perl and Postgres.
There's no reason to ban references, that would break every Perl
programmer's expectations. And there are no pointers being passed,
SvPVbyte() returns the stringified form of whatever's passed in, which
is well-behaved for both blessed and unblessed references.
If we really want to be strict, we should at least allow references to
objects that overload stringification, as they are explicitly designed
to be well-behaved as strings. But that would be a lot of extra code
for very little benefit over just letting Perl stringify everything.
- ilmari
On Tue, 30 Jan 2024 at 17:18, Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> wrote:
Where exactly does it return a bad value? The existing tests pass, and
the one I included shows that it does the right thing in that case too.
If you pass it an unblessed reference it returns the stringified version
of that, as expected.
ugly test code
(2024-01-30 13:44:28) postgres=# CREATE or replace FUNCTION
perl_inverse_bytes(bytea) RETURNS bytea
TRANSFORM FOR TYPE bytea
AS $$ my $bytes = pack 'H*', '0123'; my $ref = \$bytes;
return $ref;
$$ LANGUAGE plperlu;
CREATE FUNCTION
(2024-01-30 13:44:33) postgres=# select perl_inverse_bytes(''), ' '::bytea;
┌──────────────────────────────────────┬───────┐
│ perl_inverse_bytes │ bytea │
╞══════════════════════════════════════╪═══════╡
│ \x5343414c41522830783130656134333829 │ \x20 │
└──────────────────────────────────────┴───────┘
(1 row)
expected
(2024-01-30 13:46:58) postgres=# select perl_inverse_bytes(''), ' '::bytea;
┌────────────────────┬───────┐
│ perl_inverse_bytes │ bytea │
╞════════════════════╪═══════╡
│ \x0123 │ \x20 │
└────────────────────┴───────┘
(1 row)
Pavel Stehule <pavel.stehule@gmail.com> writes:
ugly test code
(2024-01-30 13:44:28) postgres=# CREATE or replace FUNCTION
perl_inverse_bytes(bytea) RETURNS bytea
TRANSFORM FOR TYPE bytea
AS $$ my $bytes = pack 'H*', '0123'; my $ref = \$bytes;
You are returning a reference, not a string.
return $ref;
$$ LANGUAGE plperlu;
CREATE FUNCTION
(2024-01-30 13:44:33) postgres=# select perl_inverse_bytes(''), ' '::bytea;
┌──────────────────────────────────────┬───────┐
│ perl_inverse_bytes │ bytea │
╞══════════════════════════════════════╪═══════╡
│ \x5343414c41522830783130656134333829 │ \x20 │
└──────────────────────────────────────┴───────┘
(1 row)
~=# select encode('\x5343414c41522830783130656134333829', 'escape');
┌───────────────────┐
│ encode │
├───────────────────┤
│ SCALAR(0x10ea438) │
└───────────────────┘
This is how Perl stringifies references in the absence of overloading.
Return the byte string directly from your function and it will do the
right thing:
CREATE FUNCTION plperlu_bytes() RETURNS bytea LANGUAGE plperlu
TRANSFORM FOR TYPE bytea
AS $$ return pack 'H*', '0123'; $$;
SELECT plperlu_bytes();
plperlu_bytes
---------------
\x0123
(1 row)
- ilmari
On Tue, 30 Jan 2024 at 17:46, Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> wrote:
You are returning a reference, not a string.
I know, but shouldn't an error be raised in this case?
Pavel Stehule <pavel.stehule@gmail.com> writes:
I know, but shouldn't an error be raised in this case?
I don't think so, as I explained in my previous reply:
There's no reason to ban references, that would break every Perl
programmer's expectations.
To elaborate on this: when a function is defined to return a string
(which bytea effectively is, as far as Perl is concerned), I as a Perl
programmer would expect PL/Perl to just stringify whatever value I
returned, according to the usual Perl rules.
I also said:
If we really want to be strict, we should at least allow references to
objects that overload stringification, as they are explicitly designed
to be well-behaved as strings. But that would be a lot of extra code
for very little benefit over just letting Perl stringify everything.
By "a lot of code", I mean everything `string_amg`-related in the
amagic_applies() function
(https://github.com/Perl/perl5/blob/v5.38.0/gv.c#L3401-L3545). We can't
just call it: it's only available since Perl 5.38 (released last year),
and we support Perl versions all the way back to 5.14.
- ilmari
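For comparison, a function author who wants the strict behaviour can approximate it at the Perl level, where the overload module provides overload::Method() to ask whether a value has a '""' handler. The sketch below is not part of the patch. It assumes the bytea transform for plperlu is installed, the function name is made up, and it ignores objects that stringify only through fallback to other overloaded conversions.

```sql
CREATE FUNCTION plperlu_strict_bytes() RETURNS bytea LANGUAGE plperlu
TRANSFORM FOR TYPE bytea
AS $$
    use overload ();    # load the module without importing anything

    my $bytes = pack 'H*', '0123';
    my $value = \$bytes;    # a plain reference, as in the test case above

    # Reject references unless the referent's class overloads '""'.
    if (ref($value) && !overload::Method($value, '""')) {
        elog(ERROR, 'refusing to return a reference without overloaded stringification');
    }
    return $value;
$$;

SELECT plperlu_strict_bytes();  -- should raise the error above
```

Doing the same check inside the transform's C code is what would require the amagic-related logic mentioned above.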