Initial refactoring of plperl.c - rebased [PATCH]
I've attached an update of my previous refactoring of plperl.c.
It's been rebased over the current (git) HEAD and has a few
very minor additions.
Background:
I've started work on the enhancements to plperl I outlined on pg-general
(in the "Wishlist of PL/Perl Enhancements for 8.5" thread).
I have a working implementation of those changes, plus some performance
enhancements, that I'm now re-working into a clean set of tested and
polished patches.
This patch is a first step that doesn't add any extra functionality.
It refactors the internals to make adding the extra functionality
easier (and more clearly visible).
Changes in this patch:
- Changed MULTIPLICITY check from runtime to compiletime.
No loads the large Config module.
- Changed plperl_init_interp() to return new interp
and not alter the global interp_state
- Moved plperl_safe_init() call into check_interp().
- Removed plperl_safe_init_done state variable
as interp_state now covers that role.
- Changed plperl_create_sub() to take a plperl_proc_desc argument.
- Simplified return value handling in plperl_create_sub.
- Added a test for the effect of the utf8fix function.
- Changed perl.com link in the docs to perl.org and tweaked
wording to clarify that require, not use, is what's blocked.
- Moved perl code in large multi-line C string literal macros
out to plc_*.pl files.
- Added a test2macro.pl utility to convert the plc_*.pl files to
macros in a perlchunks.h file which is #included
Additions since previous verion:
- Replaced calls to SvPV(val, PL_na) with SvPV_nolen(val)
- Simplifed plperl_safe_init() slightly
- Removed trailing whitespace from new plc_*.pl files.
I'd appreciate any feedback on the patch.
Tim.
Attachments:
master-plperl-refactor3.patchtext/x-patch; charset=us-asciiDownload+363-231
Tim,
Since there's a commitfest on right now, meaningful feedback on your
patch could be delayed. Just so you know.
--Josh Berkus
On Thu, Dec 03, 2009 at 03:47:21PM -0800, Josh Berkus wrote:
Tim,
Since there's a commitfest on right now, meaningful feedback on your
patch could be delayed. Just so you know.
Understood. Thanks Josh.
Tim.
Tim Bunce wrote:
I've attached an update of my previous refactoring of plperl.c.
It's been rebased over the current (git) HEAD and has a few
very minor additions.
[snip]
+ -- Test compilation of unicode regex + -- + CREATE OR REPLACE FUNCTION perl_unicode_regex(text) RETURNS INTEGER AS $$ + # see http://rt.perl.org/rt3/Ticket/Display.html?id=47576 + return ($_[0] =~ /\x{263A}|happy/i) ? 1 : 0; # unicode smiley + $$ LANGUAGE plperl;
This test is failing on my setup at least when the target db is not UTF8
encoded.
Maybe that's a bug we need to fix?
cheers
andrew
On Fri, Dec 25, 2009 at 12:54:13PM -0500, Andrew Dunstan wrote:
Tim Bunce wrote:
I've attached an update of my previous refactoring of plperl.c.
It's been rebased over the current (git) HEAD and has a few
very minor additions.[snip]
+ -- Test compilation of unicode regex + -- + CREATE OR REPLACE FUNCTION perl_unicode_regex(text) RETURNS INTEGER AS $$ + # see http://rt.perl.org/rt3/Ticket/Display.html?id=47576 + return ($_[0] =~ /\x{263A}|happy/i) ? 1 : 0; # unicode smiley + $$ LANGUAGE plperl;This test is failing on my setup at least when the target db is not UTF8
encoded.Maybe that's a bug we need to fix?
Yes. I believe the test is highlighting an existing problem: that plperl
function in non-PG_UTF8 databases can't use regular expressions that
require unicode character meta-data.
Either the (GetDatabaseEncoding() == PG_UTF8) test in plperl_safe_init()
should be removed, so the utf8fix function is always called, or the
test should be removed (or hacked to only apply to PG_UTF8 databases).
Tim.
p.s. There may be other problems using unicode in non-PG_UTF8 databases,
but I believe this patch doesn't change the behaviour for better or worse.
Tim Bunce wrote:
On Fri, Dec 25, 2009 at 12:54:13PM -0500, Andrew Dunstan wrote:
Tim Bunce wrote:
I've attached an update of my previous refactoring of plperl.c.
It's been rebased over the current (git) HEAD and has a few
very minor additions.[snip]
+ -- Test compilation of unicode regex + -- + CREATE OR REPLACE FUNCTION perl_unicode_regex(text) RETURNS INTEGER AS $$ + # see http://rt.perl.org/rt3/Ticket/Display.html?id=47576 + return ($_[0] =~ /\x{263A}|happy/i) ? 1 : 0; # unicode smiley + $$ LANGUAGE plperl;This test is failing on my setup at least when the target db is not UTF8
encoded.Maybe that's a bug we need to fix?
Yes. I believe the test is highlighting an existing problem: that plperl
function in non-PG_UTF8 databases can't use regular expressions that
require unicode character meta-data.Either the (GetDatabaseEncoding() == PG_UTF8) test in plperl_safe_init()
should be removed, so the utf8fix function is always called, or the
test should be removed (or hacked to only apply to PG_UTF8 databases).
I tried forcing the test, but it doesn't seem to work, possibly because
in the case that the db is not utf8 we aren't forcing argument strings
to UTF8 :-(
I think we might need to remove the test from the patch.
p.s. There may be other problems using unicode in non-PG_UTF8 databases,
but I believe this patch doesn't change the behaviour for better or worse.
Right.
cheers
andrew
Andrew Dunstan wrote:
Yes. I believe the test is highlighting an existing problem: that plperl
function in non-PG_UTF8 databases can't use regular expressions that
require unicode character meta-data.Either the (GetDatabaseEncoding() == PG_UTF8) test in plperl_safe_init()
should be removed, so the utf8fix function is always called, or the
test should be removed (or hacked to only apply to PG_UTF8 databases).I tried forcing the test, but it doesn't seem to work, possibly
because in the case that the db is not utf8 we aren't forcing argument
strings to UTF8 :-(I think we might need to remove the test from the patch.
I have not been able to come up with a fix for this - the whole thing
seems very fragile. I'm going to commit what remains of this patch, but
not add the extra regression test. I'll add a TODO to allow plperl to do
utf8 operations in non-utf8 databases.
cheers
andrew
On Mon, Jan 04, 2010 at 06:38:03PM -0500, Andrew Dunstan wrote:
Andrew Dunstan wrote:
Yes. I believe the test is highlighting an existing problem: that plperl
function in non-PG_UTF8 databases can't use regular expressions that
require unicode character meta-data.Either the (GetDatabaseEncoding() == PG_UTF8) test in plperl_safe_init()
should be removed, so the utf8fix function is always called, or the
test should be removed (or hacked to only apply to PG_UTF8 databases).I tried forcing the test, but it doesn't seem to work, possibly
because in the case that the db is not utf8 we aren't forcing
argument strings to UTF8 :-(I think we might need to remove the test from the patch.
I have not been able to come up with a fix for this - the whole
thing seems very fragile. I'm going to commit what remains of this
patch, but not add the extra regression test. I'll add a TODO to
allow plperl to do utf8 operations in non-utf8 databases.
I see you've not commited it yet, so to help out I've attached
a new diff, over the current CVS head, with two minor changes:
- Removed the test, as noted above.
- Optimized pg_verifymbstr calls to avoid unneeded strlen()s.
This should apply cleanly to cvs, saving you the need to resolve the
conflicts caused by the recent pg_verifymbstr patch.
I'll add it to the commitfest once it reaches the archives.
Tim.
Attachments:
master-a3-plperl-refactor2.patchtext/x-patch; charset=us-asciiDownload+359-246
Tim Bunce wrote:
I see you've not commited it yet, so to help out I've attached
a new diff, over the current CVS head, with two minor changes:- Removed the test, as noted above.
- Optimized pg_verifymbstr calls to avoid unneeded strlen()s.
I have committed this with minor edits. That should give us a clean base
for the features patch(es).
cheers
andrew
On fre, 2010-01-08 at 12:46 +0000, Tim Bunce wrote:
*** 45,50 **** --- 45,55 ----include $(top_srcdir)/src/Makefile.shlib
+ plperl.o: perlchunks.h + + perlchunks.h: plc_*.pl + $(PERL) text2macro.pl --strip='^(\#.*|\s*)$$' plc_*.pl > perlchunks.htmp + mv perlchunks.htmp perlchunks.hall: all-lib
What's the reason for the temp file here?
On Fri, Jan 08, 2010 at 09:47:01PM -0500, Andrew Dunstan wrote:
Tim Bunce wrote:
I see you've not commited it yet, so to help out I've attached
a new diff, over the current CVS head, with two minor changes:- Removed the test, as noted above.
- Optimized pg_verifymbstr calls to avoid unneeded strlen()s.I have committed this with minor edits. That should give us a clean
base for the features patch(es).
Wonderful. Many thanks Andrew.
Tim.
On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote:
On fre, 2010-01-08 at 12:46 +0000, Tim Bunce wrote:
*** 45,50 **** --- 45,55 ----include $(top_srcdir)/src/Makefile.shlib
+ plperl.o: perlchunks.h + + perlchunks.h: plc_*.pl + $(PERL) text2macro.pl --strip='^(\#.*|\s*)$$' plc_*.pl > perlchunks.htmp + mv perlchunks.htmp perlchunks.hall: all-lib
What's the reason for the temp file here?
Defensive. If the text2macro.pl program fails/dies then you'd be left
with a broken output file with a newer timestamp, so the next make
wouldn't rerun text2macro.pl.
Tim.
p.s. In the makefile for perl we use a little utility called mv_if_diff
instead of a plain mv so that any downstream dependencies only get
rebuilt if the contents have changed.
On Sat, Jan 09, 2010 at 11:49:22PM +0000, Tim Bunce wrote:
On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote:
On fre, 2010-01-08 at 12:46 +0000, Tim Bunce wrote:
*** 45,50 **** --- 45,55 ----include $(top_srcdir)/src/Makefile.shlib
+ plperl.o: perlchunks.h + + perlchunks.h: plc_*.pl + $(PERL) text2macro.pl --strip='^(\#.*|\s*)$$' plc_*.pl > perlchunks.htmp + mv perlchunks.htmp perlchunks.hall: all-lib
What's the reason for the temp file here?
Defensive. If the text2macro.pl program fails/dies then you'd be left
with a broken output file with a newer timestamp, so the next make
wouldn't rerun text2macro.pl.
An alternative would be for text2macro.pl to write to a temp file and
rename at the end.
Tim.
On sön, 2010-01-10 at 00:03 +0000, Tim Bunce wrote:
On Sat, Jan 09, 2010 at 11:49:22PM +0000, Tim Bunce wrote:
On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote:
On fre, 2010-01-08 at 12:46 +0000, Tim Bunce wrote:
*** 45,50 **** --- 45,55 ----include $(top_srcdir)/src/Makefile.shlib
+ plperl.o: perlchunks.h + + perlchunks.h: plc_*.pl + $(PERL) text2macro.pl --strip='^(\#.*|\s*)$$' plc_*.pl > perlchunks.htmp + mv perlchunks.htmp perlchunks.hall: all-lib
What's the reason for the temp file here?
Defensive. If the text2macro.pl program fails/dies then you'd be left
with a broken output file with a newer timestamp, so the next make
wouldn't rerun text2macro.pl.An alternative would be for text2macro.pl to write to a temp file and
rename at the end.
Sounds better. I think any program should be written such that it
doesn't produce an output file at all if it cannot produce a correct
output file. So use a temp file or a trap or something like that. The
makefile should not have to clean up after everyone.
Tim Bunce <Tim.Bunce@pobox.com> writes:
On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote:
What's the reason for the temp file here?
Defensive. If the text2macro.pl program fails/dies then you'd be left
with a broken output file with a newer timestamp, so the next make
wouldn't rerun text2macro.pl.
Doesn't make forcibly remove the target file if the command fails?
[ rummages in manual... ] It does if you specify .DELETE_ON_ERROR,
which we do (in Makefile.global); so this bit of complication is
a waste.
regards, tom lane
On Sun, Jan 10, 2010 at 01:16:13AM -0500, Tom Lane wrote:
Tim Bunce <Tim.Bunce@pobox.com> writes:
On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote:
What's the reason for the temp file here?
Defensive. If the text2macro.pl program fails/dies then you'd be left
with a broken output file with a newer timestamp, so the next make
wouldn't rerun text2macro.pl.Doesn't make forcibly remove the target file if the command fails?
[ rummages in manual... ] It does if you specify .DELETE_ON_ERROR,
which we do (in Makefile.global); so this bit of complication is
a waste.
Okay.
Andrew, perhaps you could apply the attached to fix that. (Or I could
bundle it into one of the split out plperl feature patches.)
Tim.
Attachments:
master-fixtext2macro.patchtext/x-patch; charset=us-asciiDownload+5-5
Tim Bunce <Tim.Bunce@pobox.com> writes:
On Sun, Jan 10, 2010 at 01:16:13AM -0500, Tom Lane wrote:
Doesn't make forcibly remove the target file if the command fails?
Andrew, perhaps you could apply the attached to fix that. (Or I could
bundle it into one of the split out plperl feature patches.)
Applied.
regards, tom lane