Initial refactoring of plperl.c - rebased [PATCH]

Started by Tim Bunceover 16 years ago17 messageshackers
Jump to latest
#1Tim Bunce
Tim.Bunce@pobox.com

I've attached an update of my previous refactoring of plperl.c.
It's been rebased over the current (git) HEAD and has a few
very minor additions.

Background:

I've started work on the enhancements to plperl I outlined on pg-general
(in the "Wishlist of PL/Perl Enhancements for 8.5" thread).
I have a working implementation of those changes, plus some performance
enhancements, that I'm now re-working into a clean set of tested and
polished patches.

This patch is a first step that doesn't add any extra functionality.
It refactors the internals to make adding the extra functionality
easier (and more clearly visible).

Changes in this patch:

- Changed MULTIPLICITY check from runtime to compiletime.
No loads the large Config module.
- Changed plperl_init_interp() to return new interp
and not alter the global interp_state
- Moved plperl_safe_init() call into check_interp().
- Removed plperl_safe_init_done state variable
as interp_state now covers that role.
- Changed plperl_create_sub() to take a plperl_proc_desc argument.
- Simplified return value handling in plperl_create_sub.
- Added a test for the effect of the utf8fix function.
- Changed perl.com link in the docs to perl.org and tweaked
wording to clarify that require, not use, is what's blocked.
- Moved perl code in large multi-line C string literal macros
out to plc_*.pl files.
- Added a test2macro.pl utility to convert the plc_*.pl files to
macros in a perlchunks.h file which is #included
Additions since previous verion:
- Replaced calls to SvPV(val, PL_na) with SvPV_nolen(val)
- Simplifed plperl_safe_init() slightly
- Removed trailing whitespace from new plc_*.pl files.

I'd appreciate any feedback on the patch.

Tim.

Attachments:

master-plperl-refactor3.patchtext/x-patch; charset=us-asciiDownload+363-231
#2Josh Berkus
josh@agliodbs.com
In reply to: Tim Bunce (#1)
Re: Initial refactoring of plperl.c - rebased [PATCH]

Tim,

Since there's a commitfest on right now, meaningful feedback on your
patch could be delayed. Just so you know.

--Josh Berkus

#3Tim Bunce
Tim.Bunce@pobox.com
In reply to: Josh Berkus (#2)
Re: Initial refactoring of plperl.c - rebased [PATCH]

On Thu, Dec 03, 2009 at 03:47:21PM -0800, Josh Berkus wrote:

Tim,

Since there's a commitfest on right now, meaningful feedback on your
patch could be delayed. Just so you know.

Understood. Thanks Josh.

Tim.

#4Andrew Dunstan
andrew@dunslane.net
In reply to: Tim Bunce (#1)
Re: Initial refactoring of plperl.c - rebased [PATCH]

Tim Bunce wrote:

I've attached an update of my previous refactoring of plperl.c.
It's been rebased over the current (git) HEAD and has a few
very minor additions.

[snip]

+ -- Test compilation of unicode regex
+ --
+ CREATE OR REPLACE FUNCTION perl_unicode_regex(text) RETURNS INTEGER AS $$
+ # see http://rt.perl.org/rt3/Ticket/Display.html?id=47576
+ return ($_[0] =~ /\x{263A}|happy/i) ? 1 : 0; # unicode smiley
+ $$ LANGUAGE plperl;

This test is failing on my setup at least when the target db is not UTF8
encoded.

Maybe that's a bug we need to fix?

cheers

andrew

#5Tim Bunce
Tim.Bunce@pobox.com
In reply to: Andrew Dunstan (#4)
Re: Initial refactoring of plperl.c - rebased [PATCH]

On Fri, Dec 25, 2009 at 12:54:13PM -0500, Andrew Dunstan wrote:

Tim Bunce wrote:

I've attached an update of my previous refactoring of plperl.c.
It's been rebased over the current (git) HEAD and has a few
very minor additions.

[snip]

+ -- Test compilation of unicode regex
+ --
+ CREATE OR REPLACE FUNCTION perl_unicode_regex(text) RETURNS INTEGER AS $$
+ # see http://rt.perl.org/rt3/Ticket/Display.html?id=47576
+ return ($_[0] =~ /\x{263A}|happy/i) ? 1 : 0; # unicode smiley
+ $$ LANGUAGE plperl;

This test is failing on my setup at least when the target db is not UTF8
encoded.

Maybe that's a bug we need to fix?

Yes. I believe the test is highlighting an existing problem: that plperl
function in non-PG_UTF8 databases can't use regular expressions that
require unicode character meta-data.

Either the (GetDatabaseEncoding() == PG_UTF8) test in plperl_safe_init()
should be removed, so the utf8fix function is always called, or the
test should be removed (or hacked to only apply to PG_UTF8 databases).

Tim.

p.s. There may be other problems using unicode in non-PG_UTF8 databases,
but I believe this patch doesn't change the behaviour for better or worse.

#6Andrew Dunstan
andrew@dunslane.net
In reply to: Tim Bunce (#5)
Re: Initial refactoring of plperl.c - rebased [PATCH]

Tim Bunce wrote:

On Fri, Dec 25, 2009 at 12:54:13PM -0500, Andrew Dunstan wrote:

Tim Bunce wrote:

I've attached an update of my previous refactoring of plperl.c.
It's been rebased over the current (git) HEAD and has a few
very minor additions.

[snip]

+ -- Test compilation of unicode regex
+ --
+ CREATE OR REPLACE FUNCTION perl_unicode_regex(text) RETURNS INTEGER AS $$
+ # see http://rt.perl.org/rt3/Ticket/Display.html?id=47576
+ return ($_[0] =~ /\x{263A}|happy/i) ? 1 : 0; # unicode smiley
+ $$ LANGUAGE plperl;

This test is failing on my setup at least when the target db is not UTF8
encoded.

Maybe that's a bug we need to fix?

Yes. I believe the test is highlighting an existing problem: that plperl
function in non-PG_UTF8 databases can't use regular expressions that
require unicode character meta-data.

Either the (GetDatabaseEncoding() == PG_UTF8) test in plperl_safe_init()
should be removed, so the utf8fix function is always called, or the
test should be removed (or hacked to only apply to PG_UTF8 databases).

I tried forcing the test, but it doesn't seem to work, possibly because
in the case that the db is not utf8 we aren't forcing argument strings
to UTF8 :-(

I think we might need to remove the test from the patch.

p.s. There may be other problems using unicode in non-PG_UTF8 databases,
but I believe this patch doesn't change the behaviour for better or worse.

Right.

cheers

andrew

#7Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#6)
Re: Initial refactoring of plperl.c - rebased [PATCH]

Andrew Dunstan wrote:

Yes. I believe the test is highlighting an existing problem: that plperl
function in non-PG_UTF8 databases can't use regular expressions that
require unicode character meta-data.

Either the (GetDatabaseEncoding() == PG_UTF8) test in plperl_safe_init()
should be removed, so the utf8fix function is always called, or the
test should be removed (or hacked to only apply to PG_UTF8 databases).

I tried forcing the test, but it doesn't seem to work, possibly
because in the case that the db is not utf8 we aren't forcing argument
strings to UTF8 :-(

I think we might need to remove the test from the patch.

I have not been able to come up with a fix for this - the whole thing
seems very fragile. I'm going to commit what remains of this patch, but
not add the extra regression test. I'll add a TODO to allow plperl to do
utf8 operations in non-utf8 databases.

cheers

andrew

#8Tim Bunce
Tim.Bunce@pobox.com
In reply to: Andrew Dunstan (#7)
Initial refactoring of plperl.c - updated

On Mon, Jan 04, 2010 at 06:38:03PM -0500, Andrew Dunstan wrote:

Andrew Dunstan wrote:

Yes. I believe the test is highlighting an existing problem: that plperl
function in non-PG_UTF8 databases can't use regular expressions that
require unicode character meta-data.

Either the (GetDatabaseEncoding() == PG_UTF8) test in plperl_safe_init()
should be removed, so the utf8fix function is always called, or the
test should be removed (or hacked to only apply to PG_UTF8 databases).

I tried forcing the test, but it doesn't seem to work, possibly
because in the case that the db is not utf8 we aren't forcing
argument strings to UTF8 :-(

I think we might need to remove the test from the patch.

I have not been able to come up with a fix for this - the whole
thing seems very fragile. I'm going to commit what remains of this
patch, but not add the extra regression test. I'll add a TODO to
allow plperl to do utf8 operations in non-utf8 databases.

I see you've not commited it yet, so to help out I've attached
a new diff, over the current CVS head, with two minor changes:

- Removed the test, as noted above.
- Optimized pg_verifymbstr calls to avoid unneeded strlen()s.

This should apply cleanly to cvs, saving you the need to resolve the
conflicts caused by the recent pg_verifymbstr patch.
I'll add it to the commitfest once it reaches the archives.

Tim.

Attachments:

master-a3-plperl-refactor2.patchtext/x-patch; charset=us-asciiDownload+359-246
#9Andrew Dunstan
andrew@dunslane.net
In reply to: Tim Bunce (#8)
Re: Initial refactoring of plperl.c - updated

Tim Bunce wrote:

I see you've not commited it yet, so to help out I've attached
a new diff, over the current CVS head, with two minor changes:

- Removed the test, as noted above.
- Optimized pg_verifymbstr calls to avoid unneeded strlen()s.

I have committed this with minor edits. That should give us a clean base
for the features patch(es).

cheers

andrew

#10Peter Eisentraut
peter_e@gmx.net
In reply to: Tim Bunce (#8)
Re: Initial refactoring of plperl.c - updated

On fre, 2010-01-08 at 12:46 +0000, Tim Bunce wrote:

*** 45,50 ****
--- 45,55 ----

include $(top_srcdir)/src/Makefile.shlib

+ plperl.o: perlchunks.h
+ 
+ perlchunks.h: plc_*.pl
+       $(PERL) text2macro.pl --strip='^(\#.*|\s*)$$' plc_*.pl >
perlchunks.htmp
+       mv perlchunks.htmp perlchunks.h

all: all-lib

What's the reason for the temp file here?

#11Tim Bunce
Tim.Bunce@pobox.com
In reply to: Andrew Dunstan (#9)
Re: Initial refactoring of plperl.c - updated

On Fri, Jan 08, 2010 at 09:47:01PM -0500, Andrew Dunstan wrote:

Tim Bunce wrote:

I see you've not commited it yet, so to help out I've attached
a new diff, over the current CVS head, with two minor changes:

- Removed the test, as noted above.
- Optimized pg_verifymbstr calls to avoid unneeded strlen()s.

I have committed this with minor edits. That should give us a clean
base for the features patch(es).

Wonderful. Many thanks Andrew.

Tim.

#12Tim Bunce
Tim.Bunce@pobox.com
In reply to: Peter Eisentraut (#10)
Re: Initial refactoring of plperl.c - updated

On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote:

On fre, 2010-01-08 at 12:46 +0000, Tim Bunce wrote:

*** 45,50 ****
--- 45,55 ----

include $(top_srcdir)/src/Makefile.shlib

+ plperl.o: perlchunks.h
+ 
+ perlchunks.h: plc_*.pl
+       $(PERL) text2macro.pl --strip='^(\#.*|\s*)$$' plc_*.pl >
perlchunks.htmp
+       mv perlchunks.htmp perlchunks.h

all: all-lib

What's the reason for the temp file here?

Defensive. If the text2macro.pl program fails/dies then you'd be left
with a broken output file with a newer timestamp, so the next make
wouldn't rerun text2macro.pl.

Tim.

p.s. In the makefile for perl we use a little utility called mv_if_diff
instead of a plain mv so that any downstream dependencies only get
rebuilt if the contents have changed.

#13Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tim Bunce (#12)
Re: Initial refactoring of plperl.c - updated

On Sat, Jan 09, 2010 at 11:49:22PM +0000, Tim Bunce wrote:

On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote:

On fre, 2010-01-08 at 12:46 +0000, Tim Bunce wrote:

*** 45,50 ****
--- 45,55 ----

include $(top_srcdir)/src/Makefile.shlib

+ plperl.o: perlchunks.h
+ 
+ perlchunks.h: plc_*.pl
+       $(PERL) text2macro.pl --strip='^(\#.*|\s*)$$' plc_*.pl >
perlchunks.htmp
+       mv perlchunks.htmp perlchunks.h

all: all-lib

What's the reason for the temp file here?

Defensive. If the text2macro.pl program fails/dies then you'd be left
with a broken output file with a newer timestamp, so the next make
wouldn't rerun text2macro.pl.

An alternative would be for text2macro.pl to write to a temp file and
rename at the end.

Tim.

#14Peter Eisentraut
peter_e@gmx.net
In reply to: Tim Bunce (#13)
Re: Initial refactoring of plperl.c - updated

On sön, 2010-01-10 at 00:03 +0000, Tim Bunce wrote:

On Sat, Jan 09, 2010 at 11:49:22PM +0000, Tim Bunce wrote:

On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote:

On fre, 2010-01-08 at 12:46 +0000, Tim Bunce wrote:

*** 45,50 ****
--- 45,55 ----

include $(top_srcdir)/src/Makefile.shlib

+ plperl.o: perlchunks.h
+ 
+ perlchunks.h: plc_*.pl
+       $(PERL) text2macro.pl --strip='^(\#.*|\s*)$$' plc_*.pl >
perlchunks.htmp
+       mv perlchunks.htmp perlchunks.h

all: all-lib

What's the reason for the temp file here?

Defensive. If the text2macro.pl program fails/dies then you'd be left
with a broken output file with a newer timestamp, so the next make
wouldn't rerun text2macro.pl.

An alternative would be for text2macro.pl to write to a temp file and
rename at the end.

Sounds better. I think any program should be written such that it
doesn't produce an output file at all if it cannot produce a correct
output file. So use a temp file or a trap or something like that. The
makefile should not have to clean up after everyone.

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim Bunce (#12)
Re: Initial refactoring of plperl.c - updated

Tim Bunce <Tim.Bunce@pobox.com> writes:

On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote:

What's the reason for the temp file here?

Defensive. If the text2macro.pl program fails/dies then you'd be left
with a broken output file with a newer timestamp, so the next make
wouldn't rerun text2macro.pl.

Doesn't make forcibly remove the target file if the command fails?

[ rummages in manual... ] It does if you specify .DELETE_ON_ERROR,
which we do (in Makefile.global); so this bit of complication is
a waste.

regards, tom lane

#16Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tom Lane (#15)
Re: Initial refactoring of plperl.c - updated

On Sun, Jan 10, 2010 at 01:16:13AM -0500, Tom Lane wrote:

Tim Bunce <Tim.Bunce@pobox.com> writes:

On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote:

What's the reason for the temp file here?

Defensive. If the text2macro.pl program fails/dies then you'd be left
with a broken output file with a newer timestamp, so the next make
wouldn't rerun text2macro.pl.

Doesn't make forcibly remove the target file if the command fails?

[ rummages in manual... ] It does if you specify .DELETE_ON_ERROR,
which we do (in Makefile.global); so this bit of complication is
a waste.

Okay.

Andrew, perhaps you could apply the attached to fix that. (Or I could
bundle it into one of the split out plperl feature patches.)

Tim.

Attachments:

master-fixtext2macro.patchtext/x-patch; charset=us-asciiDownload+5-5
#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim Bunce (#16)
Re: Initial refactoring of plperl.c - updated

Tim Bunce <Tim.Bunce@pobox.com> writes:

On Sun, Jan 10, 2010 at 01:16:13AM -0500, Tom Lane wrote:

Doesn't make forcibly remove the target file if the command fails?

Andrew, perhaps you could apply the attached to fix that. (Or I could
bundle it into one of the split out plperl feature patches.)

Applied.

regards, tom lane