First feature patch for plperl - draft [PATCH]

Started by Tim Bunceabout 16 years ago29 messages
#1Tim Bunce
Tim.Bunce@pobox.com
1 attachment(s)

Building on my earlier plperl refactoring patch, here's a draft of my
first plperl feature patch.

Significant changes in this patch:

- New GUC plperl.on_perl_init='...perl...' for admin use.
- New GUC plperl.on_trusted_init='...perl...' for plperl user use.
- New GUC plperl.on_untrusted_init='...perl...' for plperlu user use.
- END blocks now run at backend exit (fixes bug #5066).
- Stored procedure subs are now given names ($name__$oid).
- More error checking and reporting.
- Warnings no longer have an extra newline in the NOTICE text.
- Various minor optimizations like pre-growing data structures.

I'm working on adding tests and documentation now, meanwhile I'd very
much appreciate any feedback on the patch.

Tim.

p.s. Once this patch is complete I plan to work on patches that:
- add quote_literal and quote_identifier functions in C.
- generalize the Safe setup code to enable more control.
- formalize namespace usage, moving things out of main::
- add a way to perform inter-sub calling (at least for simple cases).
- possibly rewrite _plperl_to_pg_array in C.

Attachments:

master-plperl-feature1.patchtext/x-patch; charset=us-asciiDownload
diff --git a/src/pl/plperl/GNUmakefile b/src/pl/plperl/GNUmakefile
index 8989b14..5a9ad2f 100644
*** a/src/pl/plperl/GNUmakefile
--- b/src/pl/plperl/GNUmakefile
*************** include $(top_srcdir)/src/Makefile.shlib
*** 48,54 ****
  plperl.o: perlchunks.h
  
  perlchunks.h: plc_*.pl
! 	$(PERL) text2macro.pl --strip='^(\#.*|\s*)$$' plc_*.pl > perlchunks.htmp
  	mv perlchunks.htmp perlchunks.h
  
  all: all-lib
--- 48,54 ----
  plperl.o: perlchunks.h
  
  perlchunks.h: plc_*.pl
! 	$(PERL) text2macro.pl --strip='^\s*(\#.*|)$$' plc_*.pl > perlchunks.htmp
  	mv perlchunks.htmp perlchunks.h
  
  all: all-lib
diff --git a/src/pl/plperl/expected/plperl_elog.out b/src/pl/plperl/expected/plperl_elog.out
index 1791d3c..89497e3 100644
*** a/src/pl/plperl/expected/plperl_elog.out
--- b/src/pl/plperl/expected/plperl_elog.out
*************** create or replace function perl_warn(tex
*** 21,27 ****
  $$;
  select perl_warn('implicit elog via warn');
  NOTICE:  implicit elog via warn at line 4.
- 
  CONTEXT:  PL/Perl function "perl_warn"
   perl_warn 
  -----------
--- 21,26 ----
diff --git a/src/pl/plperl/plc_perlboot.pl b/src/pl/plperl/plc_perlboot.pl
index d2d5518..b9c6878 100644
*** a/src/pl/plperl/plc_perlboot.pl
--- b/src/pl/plperl/plc_perlboot.pl
***************
*** 1,8 ****
--- 1,12 ----
  SPI::bootstrap();
+ 
+ use strict;
+ use warnings;
  use vars qw(%_SHARED);
  
  sub ::plperl_warn {
  	(my $msg = shift) =~ s/\(eval \d+\) //g;
+ 	chomp $msg;
  	&elog(&NOTICE, $msg);
  }
  $SIG{__WARN__} = \&::plperl_warn;
*************** sub ::plperl_die {
*** 13,28 ****
  }
  $SIG{__DIE__} = \&::plperl_die;
  
! sub ::mkunsafefunc {
! 	my $ret = eval(qq[ sub { $_[0] $_[1] } ]);
! 	$@ =~ s/\(eval \d+\) //g if $@;
! 	return $ret;
! }
  
! use strict;
  
! sub ::mk_strict_unsafefunc {
! 	my $ret = eval(qq[ sub { use strict; $_[0] $_[1] } ]);
  	$@ =~ s/\(eval \d+\) //g if $@;
  	return $ret;
  }
--- 17,44 ----
  }
  $SIG{__DIE__} = \&::plperl_die;
  
! sub ::mkfuncsrc {
! 	my ($name, $imports, $prolog, $src) = @_;
  
! 	my $BEGIN = join "\n", map {
! 		my $names = $imports->{$_} || [];
! 		"$_->import(qw(@$names));"
! 	} keys %$imports;
! 	$BEGIN &&= "BEGIN { $BEGIN }";
  
! 	$name =~ s/\\/\\\\/g;
! 	$name =~ s/::|'/_/g; # avoid package delimiters
! 
! 	my $funcsrc;
! 	$funcsrc .= qq[ undef *{'$name'}; *{'$name'} = sub { $BEGIN $prolog $src } ];
! 	#warn "plperl mkfuncsrc: $funcsrc\n";
! 	return $funcsrc;
! }
! 
! # see also mksafefunc() in plc_safe_ok.pl
! sub ::mkunsafefunc {
! 	no strict; # default to no strict for the eval
! 	my $ret = eval(::mkfuncsrc(@_));
  	$@ =~ s/\(eval \d+\) //g if $@;
  	return $ret;
  }
*************** sub ::_plperl_to_pg_array {
*** 39,46 ****
      }
      elsif (defined($elem)) {
        my $str = qq($elem);
!       $str =~ s/([\"\\])/\\$1/g;
!       $res .= qq(\"$str\");
      }
      else {
        $res .= 'NULL' ;
--- 55,62 ----
      }
      elsif (defined($elem)) {
        my $str = qq($elem);
!       $str =~ s/(["\\])/\\$1/g;
!       $res .= qq("$str");
      }
      else {
        $res .= 'NULL' ;
diff --git a/src/pl/plperl/plc_safe_bad.pl b/src/pl/plperl/plc_safe_bad.pl
index 838ccc6..da47341 100644
*** a/src/pl/plperl/plc_safe_bad.pl
--- b/src/pl/plperl/plc_safe_bad.pl
***************
*** 1,15 ****
! use vars qw($PLContainer);
! 
! $PLContainer = new Safe('PLPerl');
! $PLContainer->permit_only(':default');
! $PLContainer->share(qw[&elog &ERROR]);
  
  my $msg = 'trusted Perl functions disabled - please upgrade Perl Safe module to version 2.09 or later';
- sub ::mksafefunc {
-   return $PLContainer->reval(qq[sub { elog(ERROR,'$msg') }]);
- }
  
! sub ::mk_strict_safefunc {
!   return $PLContainer->reval(qq[sub { elog(ERROR,'$msg') }]);
  }
- 
--- 1,13 ----
! # Minimal version of plc_safe_ok.pl
! # Executed if Safe is too old or doesn't load for any reason
  
  my $msg = 'trusted Perl functions disabled - please upgrade Perl Safe module to version 2.09 or later';
  
! sub mksafefunc {
! 	my ($name, $pragma, $prolog, $src) = @_;
! 	# replace $src with code to generate an error
! 	$src = qq{ ::elog(::ERROR,"$msg\n") };
! 	my $ret = eval(::mkfuncsrc($name, $pragma, '', $src));
! 	$@ =~ s/\(eval \d+\) //g if $@;
! 	return $ret;
  }
diff --git a/src/pl/plperl/plc_safe_ok.pl b/src/pl/plperl/plc_safe_ok.pl
index 73c5573..cc8f433 100644
*** a/src/pl/plperl/plc_safe_ok.pl
--- b/src/pl/plperl/plc_safe_ok.pl
***************
*** 1,3 ****
--- 1,5 ----
+ use strict;
+ use warnings;
  use vars qw($PLContainer);
  
  $PLContainer = new Safe('PLPerl');
*************** $PLContainer->share(qw[&elog &return_nex
*** 17,33 ****
  # notice. It is quite safe, as caller is informational only, and in any case
  # we only enable it while we load the 'strict' module.
  $PLContainer->permit(qw[require caller]);
! $PLContainer->reval('use strict;');
  $PLContainer->deny(qw[require caller]);
  
! sub ::mksafefunc {
! 	my $ret = $PLContainer->reval(qq[sub { $_[0] $_[1] }]);
  	$@ =~ s/\(eval \d+\) //g if $@;
  	return $ret;
  }
  
! sub ::mk_strict_safefunc {
! 	my $ret = $PLContainer->reval(qq[sub { BEGIN { strict->import(); } $_[0] $_[1] }]);
! 	$@ =~ s/\(eval \d+\) //g if $@;
! 	return $ret;
  }
--- 19,34 ----
  # notice. It is quite safe, as caller is informational only, and in any case
  # we only enable it while we load the 'strict' module.
  $PLContainer->permit(qw[require caller]);
! $PLContainer->reval('require strict;') or die $@;
  $PLContainer->deny(qw[require caller]);
  
! # called directly for plperl.on_trusted_init
! sub ::safe_eval {
! 	my $ret = $PLContainer->reval(shift);
  	$@ =~ s/\(eval \d+\) //g if $@;
  	return $ret;
  }
  
! sub ::mksafefunc {
! 	return ::safe_eval(::mkfuncsrc(@_));
  }
diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c
index f919f04..812d8ae 100644
*** a/src/pl/plperl/plperl.c
--- b/src/pl/plperl/plperl.c
*************** static HTAB *plperl_proc_hash = NULL;
*** 137,142 ****
--- 137,145 ----
  static HTAB *plperl_query_hash = NULL;
  
  static bool plperl_use_strict = false;
+ static char *plperl_on_perl_init = NULL;
+ static char *plperl_on_trusted_init = NULL;
+ static char *plperl_on_untrusted_init = NULL;
  
  /* this is saved and restored by plperl_call_handler */
  static plperl_call_data *current_call_data = NULL;
*************** Datum		plperl_inline_handler(PG_FUNCTION
*** 149,155 ****
  Datum		plperl_validator(PG_FUNCTION_ARGS);
  void		_PG_init(void);
  
! static PerlInterpreter *plperl_init_interp(void);
  
  static Datum plperl_func_handler(PG_FUNCTION_ARGS);
  static Datum plperl_trigger_handler(PG_FUNCTION_ARGS);
--- 152,160 ----
  Datum		plperl_validator(PG_FUNCTION_ARGS);
  void		_PG_init(void);
  
! static PerlInterpreter *plperl_create_interp(void);
! static void plperl_destroy_interp(PerlInterpreter **);
! static void plperl_fini(void);
  
  static Datum plperl_func_handler(PG_FUNCTION_ARGS);
  static Datum plperl_trigger_handler(PG_FUNCTION_ARGS);
*************** static plperl_proc_desc *compile_plperl_
*** 159,173 ****
  static SV  *plperl_hash_from_tuple(HeapTuple tuple, TupleDesc tupdesc);
  static void plperl_init_shared_libs(pTHX);
  static void plperl_safe_init(void);
  static HV  *plperl_spi_execute_fetch_result(SPITupleTable *, int, int);
  static SV  *newSVstring(const char *str);
  static SV **hv_store_string(HV *hv, const char *key, SV *val);
  static SV **hv_fetch_string(HV *hv, const char *key);
! static void plperl_create_sub(plperl_proc_desc *desc, char *s);
  static SV  *plperl_call_perl_func(plperl_proc_desc *desc, FunctionCallInfo fcinfo);
  static void plperl_compile_callback(void *arg);
  static void plperl_exec_callback(void *arg);
  static void plperl_inline_callback(void *arg);
  
  /*
   * This routine is a crock, and so is everyplace that calls it.  The problem
--- 164,180 ----
  static SV  *plperl_hash_from_tuple(HeapTuple tuple, TupleDesc tupdesc);
  static void plperl_init_shared_libs(pTHX);
  static void plperl_safe_init(void);
+ static SV  *plperl_eval_pv(const char *src, const char *errfmt);
  static HV  *plperl_spi_execute_fetch_result(SPITupleTable *, int, int);
  static SV  *newSVstring(const char *str);
  static SV **hv_store_string(HV *hv, const char *key, SV *val);
  static SV **hv_fetch_string(HV *hv, const char *key);
! static void plperl_create_sub(plperl_proc_desc *desc, char *s, Oid fn_oid);
  static SV  *plperl_call_perl_func(plperl_proc_desc *desc, FunctionCallInfo fcinfo);
  static void plperl_compile_callback(void *arg);
  static void plperl_exec_callback(void *arg);
  static void plperl_inline_callback(void *arg);
+ static char *strip_trailing_ws(const char *msg);
  
  /*
   * This routine is a crock, and so is everyplace that calls it.  The problem
*************** _PG_init(void)
*** 212,217 ****
--- 219,248 ----
  							 PGC_USERSET, 0,
  							 NULL, NULL);
  
+ 	DefineCustomStringVariable("plperl.on_perl_init",
+ 							gettext_noop("Perl code to execute when interpreter is initialized."),
+ 							NULL,
+ 							&plperl_on_perl_init,
+ 							NULL,
+ 							PGC_SUSET, 0,
+ 							NULL, NULL);
+ 
+ 	DefineCustomStringVariable("plperl.on_trusted_init",
+ 							gettext_noop("Perl code to execute when plperl is initialized for user."),
+ 							NULL,
+ 							&plperl_on_trusted_init,
+ 							NULL,
+ 							PGC_USERSET, 0,
+ 							NULL, NULL);
+ 
+ 	DefineCustomStringVariable("plperl.on_untrusted_init",
+ 							gettext_noop("Perl code to execute when plperlu is initialized for user."),
+ 							NULL,
+ 							&plperl_on_untrusted_init,
+ 							NULL,
+ 							PGC_USERSET, 0,
+ 							NULL, NULL);
+ 
  	EmitWarningsOnPlaceholders("plperl");
  
  	MemSet(&hash_ctl, 0, sizeof(hash_ctl));
*************** _PG_init(void)
*** 230,241 ****
  									&hash_ctl,
  									HASH_ELEM);
  
! 	plperl_held_interp = plperl_init_interp();
  	interp_state = INTERP_HELD;
  
  	inited = true;
  }
  
  #define SAFE_MODULE \
  	"require Safe; $Safe::VERSION"
  
--- 261,288 ----
  									&hash_ctl,
  									HASH_ELEM);
  
! 	plperl_held_interp = plperl_create_interp();
  	interp_state = INTERP_HELD;
  
+ 	atexit(plperl_fini);
+ 
  	inited = true;
  }
  
+ 
+ /*
+  * Cleanup perl interpreters, including running END blocks.
+  * Does not fully undo the actions of _PG_init() nor make it callable again.
+  */
+ static void
+ plperl_fini(void)
+ {
+ 	plperl_destroy_interp(&plperl_trusted_interp);
+ 	plperl_destroy_interp(&plperl_untrusted_interp);
+ 	plperl_destroy_interp(&plperl_held_interp);
+ }
+ 
+ 
  #define SAFE_MODULE \
  	"require Safe; $Safe::VERSION"
  
*************** _PG_init(void)
*** 246,259 ****
   * assign that interpreter if it is available to either the trusted or
   * untrusted interpreter. If it has already been assigned, and we need to
   * create the other interpreter, we do that if we can, or error out.
-  * We detect if it is safe to run two interpreters during the setup of the
-  * dummy interpreter.
   */
  
  
  static void
  check_interp(bool trusted)
  {
  	if (interp_state == INTERP_HELD)
  	{
  		if (trusted)
--- 293,325 ----
   * assign that interpreter if it is available to either the trusted or
   * untrusted interpreter. If it has already been assigned, and we need to
   * create the other interpreter, we do that if we can, or error out.
   */
  
  
  static void
  check_interp(bool trusted)
  {
+ 	/*
+ 	 * handle simple cases
+ 	 */
+ 	if (interp_state == INTERP_BOTH ||
+ 		( trusted && interp_state == INTERP_TRUSTED) ||
+ 		(!trusted && interp_state == INTERP_UNTRUSTED))
+ 	{
+ 		if (trusted_context != trusted)
+ 		{
+ 			if (trusted)
+ 				PERL_SET_CONTEXT(plperl_trusted_interp);
+ 			else
+ 				PERL_SET_CONTEXT(plperl_untrusted_interp);
+ 			trusted_context = trusted;
+ 		}
+ 		return;
+ 	}
+ 
+ 	/*
+ 	 * adopt held interp if free, else create new one if possible
+ 	 */
  	if (interp_state == INTERP_HELD)
  	{
  		if (trusted)
*************** check_interp(bool trusted)
*** 266,307 ****
  			plperl_untrusted_interp = plperl_held_interp;
  			interp_state = INTERP_UNTRUSTED;
  		}
- 		plperl_held_interp = NULL;
- 		trusted_context = trusted;
- 		if (trusted) /* done last to avoid recursion */
- 			plperl_safe_init();
- 	}
- 	else if (interp_state == INTERP_BOTH ||
- 			 (trusted && interp_state == INTERP_TRUSTED) ||
- 			 (!trusted && interp_state == INTERP_UNTRUSTED))
- 	{
- 		if (trusted_context != trusted)
- 		{
- 			if (trusted)
- 				PERL_SET_CONTEXT(plperl_trusted_interp);
- 			else
- 				PERL_SET_CONTEXT(plperl_untrusted_interp);
- 			trusted_context = trusted;
- 		}
  	}
  	else
  	{
  #ifdef MULTIPLICITY
! 		PerlInterpreter *plperl = plperl_init_interp();
  		if (trusted)
  			plperl_trusted_interp = plperl;
  		else
  			plperl_untrusted_interp = plperl;
- 		plperl_held_interp = NULL;
- 		trusted_context = trusted;
  		interp_state = INTERP_BOTH;
- 		if (trusted) /* done last to avoid recursion */
- 			plperl_safe_init();
  #else
  		elog(ERROR,
  			 "cannot allocate second Perl interpreter on this platform");
  #endif
  	}
  }
  
  /*
--- 332,369 ----
  			plperl_untrusted_interp = plperl_held_interp;
  			interp_state = INTERP_UNTRUSTED;
  		}
  	}
  	else
  	{
  #ifdef MULTIPLICITY
! 		PerlInterpreter *plperl = plperl_create_interp();
  		if (trusted)
  			plperl_trusted_interp = plperl;
  		else
  			plperl_untrusted_interp = plperl;
  		interp_state = INTERP_BOTH;
  #else
  		elog(ERROR,
  			 "cannot allocate second Perl interpreter on this platform");
  #endif
  	}
+ 	plperl_held_interp = NULL;
+ 	trusted_context = trusted;
+ 
+ 	/*
+ 	 * initialization - done last to ensure a clean state
+ 	 * (and thereby avoid recursion via plperl_safe_init)
+ 	 */
+ 	if (trusted)
+ 		plperl_safe_init();
+ 	else
+ 	{
+ 		if (plperl_on_untrusted_init && *plperl_on_untrusted_init)
+ 		{
+ 			plperl_eval_pv(plperl_on_untrusted_init,
+ 				"Error executing plperl.on_untrusted_init: %s");
+ 		}
+ 	}
  }
  
  /*
*************** restore_context(bool old_context)
*** 321,336 ****
  }
  
  static PerlInterpreter *
! plperl_init_interp(void)
  {
  	PerlInterpreter *plperl;
  	static int perl_sys_init_done;
  
! 	static char *embedding[3] = {
  		"", "-e", PLC_PERLBOOT
  	};
  	int			nargs = 3;
  
  #ifdef WIN32
  
  	/*
--- 383,408 ----
  }
  
  static PerlInterpreter *
! plperl_create_interp(void)
  {
  	PerlInterpreter *plperl;
  	static int perl_sys_init_done;
  
! 	/*
! 	 * The perl interpreter configuration can be altered via the environment variables
! 	 * like PERL5LIB, PERL5OPT, PERL_UNICODE etc., documented in the perlrun documentation.
! 	 */
! 	static char *embedding[3+2] = {
  		"", "-e", PLC_PERLBOOT
  	};
  	int			nargs = 3;
  
+ 	if (plperl_on_perl_init)
+ 	{
+ 		embedding[nargs++] = "-e";
+ 		embedding[nargs++] = plperl_on_perl_init;
+ 	}
+ 
  #ifdef WIN32
  
  	/*
*************** plperl_init_interp(void)
*** 399,404 ****
--- 471,478 ----
  
  	PERL_SET_CONTEXT(plperl);
  	perl_construct(plperl);
+ 	PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
+ 
  	perl_parse(plperl, plperl_init_shared_libs,
  			   nargs, embedding, NULL);
  	perl_run(plperl);
*************** plperl_init_interp(void)
*** 449,459 ****
  
  
  static void
  plperl_safe_init(void)
  {
  	SV		   *safe_version_sv;
  
! 	safe_version_sv = eval_pv(SAFE_MODULE, FALSE);	/* TRUE = croak if failure */
  
  	/*
  	 * We actually want to reject Safe version < 2.09, but it's risky to
--- 523,545 ----
  
  
  static void
+ plperl_destroy_interp(PerlInterpreter **interp)
+ {
+ 	if (interp && *interp)
+ 	{
+ 		perl_destruct(*interp);
+ 		perl_free(*interp);
+ 		*interp = NULL;
+ 	}
+ }
+ 
+ 
+ static void
  plperl_safe_init(void)
  {
  	SV		   *safe_version_sv;
  
! 	safe_version_sv = plperl_eval_pv(SAFE_MODULE, "%s");
  
  	/*
  	 * We actually want to reject Safe version < 2.09, but it's risky to
*************** plperl_safe_init(void)
*** 463,473 ****
  	if (SvNV(safe_version_sv) < 2.0899)
  	{
  		/* not safe, so disallow all trusted funcs */
! 		eval_pv(PLC_SAFE_BAD, FALSE);
  	}
  	else
  	{
! 		eval_pv(PLC_SAFE_OK, FALSE);
  		if (GetDatabaseEncoding() == PG_UTF8)
  		{
  			/*
--- 549,560 ----
  	if (SvNV(safe_version_sv) < 2.0899)
  	{
  		/* not safe, so disallow all trusted funcs */
! 		plperl_eval_pv(PLC_SAFE_BAD, "Error initializing stub plperl: %s");
  	}
  	else
  	{
! 		plperl_eval_pv(PLC_SAFE_OK, "Error initializing plperl: %s");
! 
  		if (GetDatabaseEncoding() == PG_UTF8)
  		{
  			/*
*************** plperl_safe_init(void)
*** 488,494 ****
  
  			/* compile the function */
  			plperl_create_sub(&desc,
! 					"return shift =~ /\\xa9/i ? 'true' : 'false' ;");
  
  			/* set up to call the function with a single text argument 'a' */
  			fcinfo.arg[0] = CStringGetTextDatum("a");
--- 575,581 ----
  
  			/* compile the function */
  			plperl_create_sub(&desc,
! 					"return shift =~ /\\xa9/i ? 'true' : 'false' ;", 0);
  
  			/* set up to call the function with a single text argument 'a' */
  			fcinfo.arg[0] = CStringGetTextDatum("a");
*************** plperl_safe_init(void)
*** 497,506 ****
--- 584,626 ----
  			/* and make the call */
  			(void) plperl_call_perl_func(&desc, &fcinfo);
  		}
+ 
+ 		if (plperl_on_trusted_init && *plperl_on_trusted_init)
+ 		{
+ 			dSP;
+ 
+ 			PUSHMARK(SP);
+ 			XPUSHs(sv_2mortal(newSVstring(plperl_on_trusted_init)));
+ 			PUTBACK;
+ 
+ 			call_pv("::safe_eval", G_VOID);
+ 
+ 			if (SvTRUE(ERRSV))
+ 			{
+ 				elog(ERROR, "Error executing plperl.on_trusted_init: %s",
+ 					strip_trailing_ws(SvPV_nolen(ERRSV)));
+ 			}
+ 		}
  	}
  }
  
  /*
+  * wrapper for eval_pv that calls elog on error
+  */
+ static SV *
+ plperl_eval_pv(const char *src, const char *errfmt)
+ {
+ 	SV		   *sv;
+ 
+ 	sv = eval_pv(src, (errfmt) ? FALSE : TRUE); /* croak if error and errfmt is NULL */
+ 	if (SvTRUE(ERRSV))
+ 	{
+ 		elog(ERROR, errfmt, strip_trailing_ws(SvPV_nolen(ERRSV)));
+ 	}
+ 	return sv;
+ }
+ 
+ /*
   * Perl likes to put a newline after its error messages; clean up such
   */
  static char *
*************** plperl_convert_to_pg_array(SV *src)
*** 557,563 ****
  {
  	SV		   *rv;
  	int			count;
- 
  	dSP;
  
  	PUSHMARK(SP);
--- 677,682 ----
*************** plperl_trigger_build_args(FunctionCallIn
*** 594,599 ****
--- 713,719 ----
  	HV		   *hv;
  
  	hv = newHV();
+ 	hv_ksplit(hv, 12); /* pre-grow the hash */
  
  	tdata = (TriggerData *) fcinfo->context;
  	tupdesc = tdata->tg_relation->rd_att;
*************** plperl_trigger_build_args(FunctionCallIn
*** 648,653 ****
--- 768,774 ----
  	{
  		AV		   *av = newAV();
  
+ 		av_extend(av, tdata->tg_trigger->tgnargs);
  		for (i = 0; i < tdata->tg_trigger->tgnargs; i++)
  			av_push(av, newSVstring(tdata->tg_trigger->tgargs[i]));
  		hv_store_string(hv, "args", newRV_noinc((SV *) av));
*************** plperl_inline_handler(PG_FUNCTION_ARGS)
*** 870,876 ****
  
  		check_interp(desc.lanpltrusted);
  
! 		plperl_create_sub(&desc, codeblock->source_text);
  
  		if (!desc.reference)	/* can this happen? */
  			elog(ERROR, "could not create internal procedure for anonymous code block");
--- 991,997 ----
  
  		check_interp(desc.lanpltrusted);
  
! 		plperl_create_sub(&desc, codeblock->source_text, 0);
  
  		if (!desc.reference)	/* can this happen? */
  			elog(ERROR, "could not create internal procedure for anonymous code block");
*************** plperl_validator(PG_FUNCTION_ARGS)
*** 975,997 ****
  
  
  /*
!  * Uses mksafefunc/mkunsafefunc to create an anonymous sub whose text is
!  * supplied in s, and returns a reference to the closure.
   */
  static void
! plperl_create_sub(plperl_proc_desc *prodesc, char *s)
  {
  	dSP;
  	bool        trusted = prodesc->lanpltrusted;
! 	SV		   *subref;
! 	int			count;
! 	char	   *compile_sub;
  
  	ENTER;
  	SAVETMPS;
  	PUSHMARK(SP);
! 	XPUSHs(sv_2mortal(newSVstring("our $_TD; local $_TD=$_[0]; shift;")));
! 	XPUSHs(sv_2mortal(newSVstring(s)));
  	PUTBACK;
  
  	/*
--- 1096,1128 ----
  
  
  /*
!  * Uses mksafefunc/mkunsafefunc to create a subroutine whose text is
!  * supplied in s, and returns a reference to it
   */
  static void
! plperl_create_sub(plperl_proc_desc *prodesc, char *s, Oid fn_oid)
  {
  	dSP;
  	bool        trusted = prodesc->lanpltrusted;
! 	char        subname[NAMEDATALEN+40];
! 	HV         *pragma_hv = newHV();
! 	SV         *subref = NULL;
! 	int         count;
! 	char       *compile_sub;
! 
! 	sprintf(subname, "%s__%u", prodesc->proname, fn_oid);
! 
! 	if (plperl_use_strict)
! 		hv_store_string(pragma_hv, "strict", (SV*)newAV());
  
  	ENTER;
  	SAVETMPS;
  	PUSHMARK(SP);
! 	EXTEND(SP,4);
! 	PUSHs(sv_2mortal(newSVstring(subname)));
! 	PUSHs(sv_2mortal(newRV_noinc((SV*)pragma_hv)));
! 	PUSHs(sv_2mortal(newSVstring("our $_TD; local $_TD=shift;")));
! 	PUSHs(sv_2mortal(newSVstring(s)));
  	PUTBACK;
  
  	/*
*************** plperl_create_sub(plperl_proc_desc *prod
*** 999,1055 ****
  	 * errors properly.  Perhaps it's because there's another level of eval
  	 * inside mksafefunc?
  	 */
! 
! 	if (trusted && plperl_use_strict)
! 		compile_sub = "::mk_strict_safefunc";
! 	else if (plperl_use_strict)
! 		compile_sub = "::mk_strict_unsafefunc";
! 	else if (trusted)
! 		compile_sub = "::mksafefunc";
! 	else
! 		compile_sub = "::mkunsafefunc";
! 
  	count = perl_call_pv(compile_sub, G_SCALAR | G_EVAL | G_KEEPERR);
  	SPAGAIN;
  
! 	if (count != 1)
! 	{
! 		PUTBACK;
! 		FREETMPS;
! 		LEAVE;
! 		elog(ERROR, "didn't get a return item from mksafefunc");
  	}
  
! 	subref = POPs;
  
  	if (SvTRUE(ERRSV))
  	{
- 		PUTBACK;
- 		FREETMPS;
- 		LEAVE;
  		ereport(ERROR,
  				(errcode(ERRCODE_SYNTAX_ERROR),
  				 errmsg("%s", strip_trailing_ws(SvPV_nolen(ERRSV)))));
  	}
  
! 	if (!SvROK(subref) || SvTYPE(SvRV(subref)) != SVt_PVCV)
  	{
! 		PUTBACK;
! 		FREETMPS;
! 		LEAVE;
! 		elog(ERROR, "didn't get a code ref");
  	}
  
- 	/*
- 	 * need to make a copy of the return, it comes off the stack as a
- 	 * temporary.
- 	 */
  	prodesc->reference = newSVsv(subref);
  
- 	PUTBACK;
- 	FREETMPS;
- 	LEAVE;
- 
  	return;
  }
  
--- 1130,1165 ----
  	 * errors properly.  Perhaps it's because there's another level of eval
  	 * inside mksafefunc?
  	 */
! 	compile_sub = (trusted) ? "::mksafefunc" : "::mkunsafefunc";
  	count = perl_call_pv(compile_sub, G_SCALAR | G_EVAL | G_KEEPERR);
  	SPAGAIN;
  
! 	if (count == 1) {
! 		GV *sub_glob = (GV*)POPs;
! 		if (sub_glob && SvTYPE(sub_glob) == SVt_PVGV)
! 			subref = newRV((SV*)GvCVu((GV*)sub_glob));
  	}
  
! 	PUTBACK;
! 	FREETMPS;
! 	LEAVE;
  
  	if (SvTRUE(ERRSV))
  	{
  		ereport(ERROR,
  				(errcode(ERRCODE_SYNTAX_ERROR),
  				 errmsg("%s", strip_trailing_ws(SvPV_nolen(ERRSV)))));
  	}
  
! 	if (!subref)
  	{
! 		ereport(ERROR,
! 				(errcode(ERRCODE_INTERNAL_ERROR),
! 				 errmsg("didn't get a GLOB from compiling %s via %s", prodesc->proname, compile_sub)));
  	}
  
  	prodesc->reference = newSVsv(subref);
  
  	return;
  }
  
*************** plperl_call_perl_func(plperl_proc_desc *
*** 1089,1101 ****
  	SAVETMPS;
  
  	PUSHMARK(SP);
  
! 	XPUSHs(&PL_sv_undef);		/* no trigger data */
  
  	for (i = 0; i < desc->nargs; i++)
  	{
  		if (fcinfo->argnull[i])
! 			XPUSHs(&PL_sv_undef);
  		else if (desc->arg_is_rowtype[i])
  		{
  			HeapTupleHeader td;
--- 1199,1212 ----
  	SAVETMPS;
  
  	PUSHMARK(SP);
+ 	EXTEND(sp, 1 + desc->nargs);
  
! 	PUSHs(&PL_sv_undef);		/* no trigger data */
  
  	for (i = 0; i < desc->nargs; i++)
  	{
  		if (fcinfo->argnull[i])
! 			PUSHs(&PL_sv_undef);
  		else if (desc->arg_is_rowtype[i])
  		{
  			HeapTupleHeader td;
*************** plperl_call_perl_func(plperl_proc_desc *
*** 1115,1121 ****
  			tmptup.t_data = td;
  
  			hashref = plperl_hash_from_tuple(&tmptup, tupdesc);
! 			XPUSHs(sv_2mortal(hashref));
  			ReleaseTupleDesc(tupdesc);
  		}
  		else
--- 1226,1232 ----
  			tmptup.t_data = td;
  
  			hashref = plperl_hash_from_tuple(&tmptup, tupdesc);
! 			PUSHs(sv_2mortal(hashref));
  			ReleaseTupleDesc(tupdesc);
  		}
  		else
*************** plperl_call_perl_func(plperl_proc_desc *
*** 1125,1131 ****
  			tmp = OutputFunctionCall(&(desc->arg_out_func[i]),
  									 fcinfo->arg[i]);
  			sv = newSVstring(tmp);
! 			XPUSHs(sv_2mortal(sv));
  			pfree(tmp);
  		}
  	}
--- 1236,1242 ----
  			tmp = OutputFunctionCall(&(desc->arg_out_func[i]),
  									 fcinfo->arg[i]);
  			sv = newSVstring(tmp);
! 			PUSHs(sv_2mortal(sv));
  			pfree(tmp);
  		}
  	}
*************** compile_plperl_function(Oid fn_oid, bool
*** 1732,1738 ****
  
  		check_interp(prodesc->lanpltrusted);
  
! 		plperl_create_sub(prodesc, proc_source);
  
  		restore_context(oldcontext);
  
--- 1843,1849 ----
  
  		check_interp(prodesc->lanpltrusted);
  
! 		plperl_create_sub(prodesc, proc_source, fn_oid);
  
  		restore_context(oldcontext);
  
*************** plperl_hash_from_tuple(HeapTuple tuple, 
*** 1768,1773 ****
--- 1879,1885 ----
  	int			i;
  
  	hv = newHV();
+ 	hv_ksplit(hv, tupdesc->natts); /* pre-grow the hash */
  
  	for (i = 0; i < tupdesc->natts; i++)
  	{
*************** plperl_spi_execute_fetch_result(SPITuple
*** 1895,1900 ****
--- 2007,2013 ----
  		int			i;
  
  		rows = newAV();
+ 		av_extend(rows, processed);
  		for (i = 0; i < processed; i++)
  		{
  			row = plperl_hash_from_tuple(tuptable->vals[i], tuptable->tupdesc);
#2David E. Wheeler
david@kineticode.com
In reply to: Tim Bunce (#1)
Re: First feature patch for plperl - draft [PATCH]

On Dec 3, 2009, at 3:30 PM, Tim Bunce wrote:

- New GUC plperl.on_perl_init='...perl...' for admin use.
- New GUC plperl.on_trusted_init='...perl...' for plperl user use.
- New GUC plperl.on_untrusted_init='...perl...' for plperlu user use.

Since there is no documentation yet, how do these work, exactly? Or should I just wait for the docs?

- END blocks now run at backend exit (fixes bug #5066).
- Stored procedure subs are now given names ($name__$oid).
- More error checking and reporting.
- Warnings no longer have an extra newline in the NOTICE text.
- Various minor optimizations like pre-growing data structures.

Nice.

I'm working on adding tests and documentation now, meanwhile I'd very
much appreciate any feedback on the patch.

Tim.

p.s. Once this patch is complete I plan to work on patches that:
- add quote_literal and quote_identifier functions in C.

I expect you can just use the C versions in PostgreSQL. They're in utils/builtins.h, along with quote_nullable(), which might also be useful to add.

- generalize the Safe setup code to enable more control.
- formalize namespace usage, moving things out of main::

Nice.

- add a way to perform inter-sub calling (at least for simple cases).
- possibly rewrite _plperl_to_pg_array in C.

Sounds great, Tim. I'm not really qualified to say anything about the C code, but I'd be happy to try it out once there are docs.

Best,

David

#3Tim Bunce
Tim.Bunce@pobox.com
In reply to: David E. Wheeler (#2)
Re: First feature patch for plperl - draft [PATCH]

On Thu, Dec 03, 2009 at 04:53:47PM -0800, David E. Wheeler wrote:

On Dec 3, 2009, at 3:30 PM, Tim Bunce wrote:

- New GUC plperl.on_perl_init='...perl...' for admin use.
- New GUC plperl.on_trusted_init='...perl...' for plperl user use.
- New GUC plperl.on_untrusted_init='...perl...' for plperlu user use.

Since there is no documentation yet, how do these work, exactly? Or should I just wait for the docs?

The perl code in plperl.on_perl_init gets eval'd as soon as an
interpreter is created. That could be at server startup if
shared_preload_libraries is used. plperl.on_perl_init can only be set by
an admin (PGC_SUSET).

The perl code in plperl.on_trusted_init gets eval'd when an interpreter
is initialized into trusted mode, e.g., used for the plperl language.
The perl code is eval'd inside the Safe compartment.
plperl.on_trusted_init can be set by users but it's only useful if set
before the plperl interpreter is first used.

plperl.on_untrusted_init acts like plperl.on_trusted_init but for
plperlu code.

So, if all three were set then, before any perl stored procedure or DO
block is executed, the interpreter would have executed either
on_perl_init and then on_trusted_init (for plperl), or on_perl_init and
then on_untrusted_init (for plperlu).

- END blocks now run at backend exit (fixes bug #5066).
- Stored procedure subs are now given names ($name__$oid).
- More error checking and reporting.
- Warnings no longer have an extra newline in the NOTICE text.
- Various minor optimizations like pre-growing data structures.

Nice.

Thanks.

I'm working on adding tests and documentation now, meanwhile I'd very
much appreciate any feedback on the patch.

Tim.

p.s. Once this patch is complete I plan to work on patches that:
- add quote_literal and quote_identifier functions in C.

I expect you can just use the C versions in PostgreSQL. They're in utils/builtins.h,

That's my plan. (I've been discussing this and other issues with Andrew
Dunstan via IM.)

along with quote_nullable(), which might also be useful to add.

I was planning to build that behaviour into quote_literal since it fits
naturally into perl's idea of undef and mirrors DBI's quote() method.
So:
quote_literal(undef) => "NULL"
quote_literal('foo') => "'foo'"

- generalize the Safe setup code to enable more control.

Specifically control what gets loaded into the Compartment, what gets
shared with it (e.g. sharing *a & *b as a workaround for the sort bug),
and what class to use for Safe (to enable deeper changes if desired via
subclassing). Naturally all this is only possible for admin (via
plperl.on_perl_init).

- formalize namespace usage, moving things out of main::

Nice.

- add a way to perform inter-sub calling (at least for simple cases).

My current plan here is to use an SP::AUTOLOAD to handle loading and
dispatching. So calling SP::some_random_procedure(...) will trigger
SP::AUTOLOAD to try to resolve "some_random_procedure" to a particular
stored procedure. There are three tricky parts: handling polymorphism (at
least "well enough"), making autoloading of stored procedures work
inside Safe, making it fast. I think I have reasonable approaches for
those but I won't know for sure till I work on it.

- possibly rewrite _plperl_to_pg_array in C.

Sounds great, Tim. I'm not really qualified to say anything about the
C code, but I'd be happy to try it out once there are docs.

Great. Thanks David.

Tim.

#4Jeff
threshar@torgo.978.org
In reply to: Tim Bunce (#3)
Re: First feature patch for plperl - draft [PATCH]

On Dec 4, 2009, at 6:18 AM, Tim Bunce wrote:

- generalize the Safe setup code to enable more control.

Is there any possible way to enable "use strict;" for plperl (trusted)
modules?
I would love to have that feature. Sure does help cut down on bugs and
makes things nicer.

--
Jeff Trout <jeff@jefftrout.com>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff (#4)
Re: First feature patch for plperl - draft [PATCH]

Jeff <threshar@threshar.is-a-geek.com> writes:

Is there any possible way to enable "use strict;" for plperl (trusted)
modules?

The plperl manual shows a way to do it using some weird syntax or
other. It'd sure be nice to be able to use the regular syntax though.

regards, tom lane

#6David E. Wheeler
david@kineticode.com
In reply to: Tim Bunce (#3)
Re: First feature patch for plperl - draft [PATCH]

On Dec 4, 2009, at 3:18 AM, Tim Bunce wrote:

The perl code in plperl.on_perl_init gets eval'd as soon as an
interpreter is created. That could be at server startup if
shared_preload_libraries is used. plperl.on_perl_init can only be set by
an admin (PGC_SUSET).

Are multiline GUCs allowed in the postgresql.conf file?

The perl code in plperl.on_trusted_init gets eval'd when an interpreter
is initialized into trusted mode, e.g., used for the plperl language.
The perl code is eval'd inside the Safe compartment.
plperl.on_trusted_init can be set by users but it's only useful if set
before the plperl interpreter is first used.

So immediately after connecting would be the place to make sure you do it, IOW.

plperl.on_untrusted_init acts like plperl.on_trusted_init but for
plperlu code.

So, if all three were set then, before any perl stored procedure or DO
block is executed, the interpreter would have executed either
on_perl_init and then on_trusted_init (for plperl), or on_perl_init and
then on_untrusted_init (for plperlu).

Awesome, thanks! This is really a great feature.

along with quote_nullable(), which might also be useful to add.

I was planning to build that behaviour into quote_literal since it fits
naturally into perl's idea of undef and mirrors DBI's quote() method.
So:
quote_literal(undef) => "NULL"
quote_literal('foo') => "'foo'"

Is there an existing `quote_literal()` in PL/Perl? If so, you might not want to change its behavior.

- generalize the Safe setup code to enable more control.

Specifically control what gets loaded into the Compartment, what gets
shared with it (e.g. sharing *a & *b as a workaround for the sort bug),
and what class to use for Safe (to enable deeper changes if desired via
subclassing). Naturally all this is only possible for admin (via
plperl.on_perl_init).

Sounds good.

- formalize namespace usage, moving things out of main::

Nice.

- add a way to perform inter-sub calling (at least for simple cases).

My current plan here is to use an SP::AUTOLOAD to handle loading and
dispatching. So calling SP::some_random_procedure(...) will trigger
SP::AUTOLOAD to try to resolve "some_random_procedure" to a particular
stored procedure. There are three tricky parts: handling polymorphism (at
least "well enough"), making autoloading of stored procedures work
inside Safe, making it fast. I think I have reasonable approaches for
those but I won't know for sure till I work on it.

I'm wondering if there might be some way to use some sort of attributes to identify data types passed to a PL/Perl function called from another PL/Perl function. Maybe some other functions that identify types, in the case of ambiguities?

foo(int(1), text('bar'));

? Kind of ugly, but perhaps only to be used if there are ambiguities? Not sure it's a great idea, mind. Just thinking out loud (so to speak).

Best,

David

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#6)
Re: First feature patch for plperl - draft [PATCH]

"David E. Wheeler" <david@kineticode.com> writes:

On Dec 4, 2009, at 3:18 AM, Tim Bunce wrote:

The perl code in plperl.on_perl_init gets eval'd as soon as an
interpreter is created. That could be at server startup if
shared_preload_libraries is used. plperl.on_perl_init can only be set by
an admin (PGC_SUSET).

Are multiline GUCs allowed in the postgresql.conf file?

I don't think so. In any case this seems like an extreme abuse of the
concept of a GUC, as well as being a solution in search of a problem,
as well as being something that should absolutely not ever happen inside
the postmaster process for both reliability and security reasons.
I vote a big no on this.

regards, tom lane

#8David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#7)
Re: First feature patch for plperl - draft [PATCH]

On Dec 4, 2009, at 10:36 AM, Tom Lane wrote:

Are multiline GUCs allowed in the postgresql.conf file?

I don't think so. In any case this seems like an extreme abuse of the
concept of a GUC, as well as being a solution in search of a problem,
as well as being something that should absolutely not ever happen inside
the postmaster process for both reliability and security reasons.
I vote a big no on this.

That's fine. It's relatively simple for an admin to create a Perl module that does everything she wants, call it PGInit or something, and then just make the GUC:

plperl.on_perl_init = 'use PGInit;'

Best,

David

#9Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#5)
Re: First feature patch for plperl - draft [PATCH]

Tom Lane wrote:

Jeff <threshar@threshar.is-a-geek.com> writes:

Is there any possible way to enable "use strict;" for plperl (trusted)
modules?

The plperl manual shows a way to do it using some weird syntax or
other. It'd sure be nice to be able to use the regular syntax though.

As is documented, all you have to do is have:

custom_variable_classes = 'plperl'
plperl.use_strict = 'true'

in your config. You only need to put the documented BEGIN block in your
function body if you want to do use strict mode on a case by case basis.

We can't allow an unrestricted "use strict;" in plperl functions because
it invokes an operation (require) that Safe.pm rightly regards as unsafe.

cheers

andrew

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#8)
Re: First feature patch for plperl - draft [PATCH]

"David E. Wheeler" <david@kineticode.com> writes:

On Dec 4, 2009, at 10:36 AM, Tom Lane wrote:

I vote a big no on this.

That's fine. It's relatively simple for an admin to create a Perl module that does everything she wants, call it PGInit or something, and then just make the GUC:

plperl.on_perl_init = 'use PGInit;'

No, you missed the point: I'm objecting to having any such thing as
plperl.on_perl_init, full stop.

Aside from the points I already made, it's not even well defined.
What is to happen if the admin changes the value when the system
is already up?

regards, tom lane

#11Jeff
threshar@torgo.978.org
In reply to: Andrew Dunstan (#9)
Re: First feature patch for plperl - draft [PATCH]

On Dec 4, 2009, at 1:44 PM, Andrew Dunstan wrote:

As is documented, all you have to do is have:

custom_variable_classes = 'plperl'
plperl.use_strict = 'true'

in your config. You only need to put the documented BEGIN block in
your function body if you want to do use strict mode on a case by
case basis.

We can't allow an unrestricted "use strict;" in plperl functions
because it invokes an operation (require) that Safe.pm rightly
regards as unsafe.

Yeah, saw that in the manual in the plperl functions & arguments page
(at the bottom).
I think my confusion came up because I'd read the trust/untrusted
thing which removes the ability to use use/require.

Maybe a blurb or moving that chunk of doc to the trusted/untrusted
page might make that tidbit easier to find?

--
Jeff Trout <jeff@jefftrout.com>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/

#12Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#10)
Re: First feature patch for plperl - draft [PATCH]

On Fri, Dec 4, 2009 at 1:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"David E. Wheeler" <david@kineticode.com> writes:

On Dec 4, 2009, at 10:36 AM, Tom Lane wrote:

I vote a big no on this.

That's fine. It's relatively simple for an admin to create a Perl module that does everything she wants, call it PGInit or something, and then just make the GUC:

    plperl.on_perl_init = 'use PGInit;'

No, you missed the point: I'm objecting to having any such thing as
plperl.on_perl_init, full stop.

Aside from the points I already made, it's not even well defined.
What is to happen if the admin changes the value when the system
is already up?

So, do we look for another way to provide the functionality besides
having a GUC, or is the functionality itself bad?

...Robert

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#12)
Re: First feature patch for plperl - draft [PATCH]

Robert Haas <robertmhaas@gmail.com> writes:

So, do we look for another way to provide the functionality besides
having a GUC, or is the functionality itself bad?

I don't think we want random Perl code running inside the postmaster,
no matter what the API to cause it is. I might hold my nose for "on
load" code if it can only run in backends, though I still say that
it's a badly designed concept because of the uncertainty about who
will run what when. Shlib load time is not an event that ought to be
user-visible.

regards, tom lane

#14David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#10)
Re: First feature patch for plperl - draft [PATCH]

On Dec 4, 2009, at 10:51 AM, Tom Lane wrote:

plperl.on_perl_init = 'use PGInit;'

No, you missed the point: I'm objecting to having any such thing as
plperl.on_perl_init, full stop.

Aside from the points I already made, it's not even well defined.
What is to happen if the admin changes the value when the system
is already up?

Nothing. Hence the "init".

Best,

David

#15David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#13)
Re: First feature patch for plperl - draft [PATCH]

On Dec 4, 2009, at 11:05 AM, Tom Lane wrote:

So, do we look for another way to provide the functionality besides
having a GUC, or is the functionality itself bad?

I don't think we want random Perl code running inside the postmaster,
no matter what the API to cause it is. I might hold my nose for "on
load" code if it can only run in backends, though I still say that
it's a badly designed concept because of the uncertainty about who
will run what when. Shlib load time is not an event that ought to be
user-visible.

So only the child processes would be allowed to load the code? That could make connections even slower if there's a lot of Perl code to be added, though that's also the issue we have today. I guess I could live with that, though I'd rather have such code shared across processes.

If it's a badly designed concept, do you have any ideas that are less bad?

Best,

David

#16Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#13)
Re: First feature patch for plperl - draft [PATCH]

On Fri, Dec 4, 2009 at 2:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So, do we look for another way to provide the functionality besides
having a GUC, or is the functionality itself bad?

I don't think we want random Perl code running inside the postmaster,
no matter what the API to cause it is.  I might hold my nose for "on
load" code if it can only run in backends, though I still say that
it's a badly designed concept because of the uncertainty about who
will run what when.  Shlib load time is not an event that ought to be
user-visible.

I agree that the uncertainty is not a wonderful thing, but e.g. Apache
has the same problem with mod_perl, and you just deal with it. I
choose to deal with it by doing "apachectl graceful" every time I
change the source code; or you can install Perl modules that check
whether the mod-times on the other modules you've loaded have changed
and reload them if so. In practice, being able to pre-load the Perl
libraries you're going to want to execute is absolutely essential if
you don't want performance to be in the toilet. My code base is so
large now that it takes 3 or 4 seconds for Apache to pull it all in on
my crappy dev box, but it's blazingly fast once it's up and running.
Having that be something that happens on the production server only
once a week or once a month when I roll out a new release rather than
any more frequently is really important.

...Robert

#17Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tom Lane (#5)
Re: First feature patch for plperl - draft [PATCH]

On Fri, Dec 04, 2009 at 11:01:42AM -0500, Tom Lane wrote:

Jeff <threshar@threshar.is-a-geek.com> writes:

Is there any possible way to enable "use strict;" for plperl (trusted)
modules?

The plperl manual shows a way to do it using some weird syntax or
other. It'd sure be nice to be able to use the regular syntax though.

Finding a solution is definitely on my list. I've spent a little time
exploring this already but haven't found a simple solution yet.

The neatest would have been overriding &CORE::GLOBAL::require but sadly
the Safe/Opcode mechanism takes priority over that and forbids compiling
code that does a use/require.

I may end up re-enabling the require opcode but redirecting it to run
some C code in plperl.c (the same 'opcode redirection' technique used by
my NYTProf profiler). That C code would only need to throw an exception
if the module hasn't been loaded already.

Tim.

#18Alvaro Herrera
alvherre@commandprompt.com
In reply to: David E. Wheeler (#15)
Re: First feature patch for plperl - draft [PATCH]

David E. Wheeler escribi�:

If it's a badly designed concept, do you have any ideas that are less bad?

I'm not sure that we want to duplicate this idea today, but in pltcl
there's a pltcl_modules table that is scanned on interpreter init and
loads user-defined code.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#19Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tom Lane (#13)
Re: First feature patch for plperl - draft [PATCH]

On Fri, Dec 04, 2009 at 02:05:28PM -0500, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So, do we look for another way to provide the functionality besides
having a GUC, or is the functionality itself bad?

I don't think we want random Perl code running inside the postmaster,
no matter what the API to cause it is. I might hold my nose for "on
load" code if it can only run in backends, though I still say that
it's a badly designed concept because of the uncertainty about who
will run what when.

Robert's comparison with mod_perl is very apt. Preloading code gives
dramatic performance gains in production situations where there's a
significant codebase and connections are frequent.

The docs for plperl.on_perl_init could include a section relating to
it's use with shared_preload_libraries. That could document any issues
and caveats you feel are important.

Tim.

#20Dimitri Fontaine
dfontaine@hi-media.com
In reply to: Tim Bunce (#19)
Re: First feature patch for plperl - draft [PATCH]

Le 4 déc. 2009 à 20:40, Tim Bunce a écrit :

Robert's comparison with mod_perl is very apt. Preloading code gives
dramatic performance gains in production situations where there's a
significant codebase and connections are frequent.

How far do you go with using a connection pooler such as pgbouncer?

--
dim

#21David E. Wheeler
david@kineticode.com
In reply to: Tim Bunce (#19)
Re: First feature patch for plperl - draft [PATCH]

On Dec 4, 2009, at 11:40 AM, Tim Bunce wrote:

Robert's comparison with mod_perl is very apt. Preloading code gives
dramatic performance gains in production situations where there's a
significant codebase and connections are frequent.

The docs for plperl.on_perl_init could include a section relating to
it's use with shared_preload_libraries. That could document any issues
and caveats you feel are important.

+1

Tom, what's your objection to Shlib load time being user-visible?

Best,

David

#22Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#13)
Re: First feature patch for plperl - draft [PATCH]

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So, do we look for another way to provide the functionality besides
having a GUC, or is the functionality itself bad?

I don't think we want random Perl code running inside the postmaster,
no matter what the API to cause it is. I might hold my nose for "on
load" code if it can only run in backends, though I still say that
it's a badly designed concept because of the uncertainty about who
will run what when. Shlib load time is not an event that ought to be
user-visible.

But you can load an arbitrary shared lib inside the postmaster and it
can do what it likes, so I'm not clear that your caution is actually
saving us from much.

cheers

andrew

#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#21)
Re: First feature patch for plperl - draft [PATCH]

"David E. Wheeler" <david@kineticode.com> writes:

Tom, what's your objection to Shlib load time being user-visible?

It's not really designed to be user-visible. Let me give you just
two examples:

* We call a plperl function for the first time in a session, causing
plperl.so to be loaded. Later the transaction fails and is rolled
back. If loading plperl.so caused some user-visible things to happen,
should those be rolled back? If so, how do we get perl to play along?
If not, how do we get postgres to play along?

* We call a plperl function for the first time in a session, causing
plperl.so to be loaded. This happens in the context of a superuser
calling a non-superuser security definer function, or perhaps vice
versa. Whose permissions apply to whatever the on_load code tries
to do? (Hint: every answer is wrong.)

That doesn't even begin to cover the problems with allowing any of
this to happen inside the postmaster. Recall that the postmaster
does not have any database access. Furthermore, it is a very long
established reliability principle around here that the postmaster
process should do as little as possible, because every thing that it
does creates another opportunity to have a nonrecoverable failure.
The postmaster can recover if a child crashes, but the other way
round, not so much.

regards, tom lane

#24Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tom Lane (#23)
Re: First feature patch for plperl - draft [PATCH]

On Sat, Dec 05, 2009 at 01:21:22AM -0500, Tom Lane wrote:

"David E. Wheeler" <david@kineticode.com> writes:

Tom, what's your objection to Shlib load time being user-visible?

It's not really designed to be user-visible. Let me give you just
two examples:

* We call a plperl function for the first time in a session, causing
plperl.so to be loaded. Later the transaction fails and is rolled
back. If loading plperl.so caused some user-visible things to happen,
should those be rolled back?

No. Establishing initial state, no matter how that's triggered, is not
part of a transaction.

* We call a plperl function for the first time in a session, causing
plperl.so to be loaded. This happens in the context of a superuser
calling a non-superuser security definer function, or perhaps vice
versa. Whose permissions apply to whatever the on_load code tries
to do? (Hint: every answer is wrong.)

I'll modify the patch to disable the SPI functions during
initialization (both on_perl_init and on_(un)trusted_init).

Would that address your concerns?

That doesn't even begin to cover the problems with allowing any of
this to happen inside the postmaster. Recall that the postmaster
does not have any database access. Furthermore, it is a very long
established reliability principle around here that the postmaster
process should do as little as possible, because every thing that it
does creates another opportunity to have a nonrecoverable failure.
The postmaster can recover if a child crashes, but the other way
round, not so much.

I hope the combination of disabling the SPI functions during
initialization, and documenting the risks of combining on_perl_init and
shared_preload_libraries, is sufficient.

Tim.

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim Bunce (#24)
Re: First feature patch for plperl - draft [PATCH]

Tim Bunce <Tim.Bunce@pobox.com> writes:

I'll modify the patch to disable the SPI functions during
initialization (both on_perl_init and on_(un)trusted_init).

Yeah, in the shower this morning I was thinking that not loading
SPI till after the on_init code runs would alleviate the concerns
about transactionality and permissions --- that would ensure that
whatever on_init does affects only the Perl world and not the database
world.

However, we're not out of the woods yet. In a trusted interpreter
(plperl not plperlu), is the on_init code executed before we lock down
the interpreter with Safe? I would think it has to be since the main
point AFAICS is to let you preload code via "use". But then what is
left of the security guarantees of plperl? I can hardly imagine DBAs
wanting to vet a few thousand lines of random Perl code to see if it
contains anything that could be subverted. For example, the ability
to scribble on database files (like say pg_hba.conf) would almost surely
be easy to come by.

If you're willing to also confine the feature to plperlu, then maybe
the risk level could be decreased from insane to merely unreasonable.

regards, tom lane

#26Andrew Dunstan
andrew@dunslane.net
In reply to: Tim Bunce (#24)
Re: First feature patch for plperl - draft [PATCH]

Tim Bunce wrote:

That doesn't even begin to cover the problems with allowing any of
this to happen inside the postmaster. Recall that the postmaster
does not have any database access. Furthermore, it is a very long
established reliability principle around here that the postmaster
process should do as little as possible, because every thing that it
does creates another opportunity to have a nonrecoverable failure.
The postmaster can recover if a child crashes, but the other way
round, not so much.

I hope the combination of disabling the SPI functions during
initialization, and documenting the risks of combining on_perl_init and
shared_preload_libraries, is sufficient.

We already do a lot during library load - plperl's _PG_init() calls
plperl_init_interp() which sets up an interpreter, runs the boot code,
loads the Dynaloader and bootstraps the SPI module.

Pre-loading perl libraries in forking servers has well known benefits,
as Robert Haas noted.

We're not talking about touching the database at all.

If we turn Tim's proposal down, I suspect someone will create a fork of
plperl that allows it anyway - it's not like it needs anything changed
elsewhere in the backend - it would be a drop-in replacement, pretty much.

Here's a concrete example of something I was working on just yesterday,
where it would be useful. One of my clients has a Postgres based
application that needs to talk to a number of foreign databases, mostly
SQLServer. In some cases it pulls data from them, in this new case we
are pushing lots of data at arbitrary times into SQLServer, using
plperlu with DBI/DBD::Sybase. We would probably get a significant
performance gain if we could have DBI and DBD::Sybase preloaded. The
application does use connection pooling, but every so often a function
call will take significantly longer because it occurs in a new backend
that is having to reload the libraries.

I think if we do this the on_perl_init setting should probably be
PGC_POSTMASTER, which would remove any issue about it changing
underneath us.

cheers

andrew

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#26)
Re: First feature patch for plperl - draft [PATCH]

Andrew Dunstan <andrew@dunslane.net> writes:

If we turn Tim's proposal down, I suspect someone will create a fork of
plperl that allows it anyway - it's not like it needs anything changed
elsewhere in the backend - it would be a drop-in replacement, pretty much.

The question is not about whether we think it's useful; the question
is about whether it's safe.

I think if we do this the on_perl_init setting should probably be
PGC_POSTMASTER, which would remove any issue about it changing
underneath us.

Yes, if the main intended usage is in combination with preloading perl
at postmaster start, it would be pointless to imagine that PGC_SIGHUP
is useful anyway.

regards, tom lane

#28Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tom Lane (#25)
Re: First feature patch for plperl - draft [PATCH]

On Sat, Dec 05, 2009 at 11:41:36AM -0500, Tom Lane wrote:

Tim Bunce <Tim.Bunce@pobox.com> writes:

I'll modify the patch to disable the SPI functions during
initialization (both on_perl_init and on_(un)trusted_init).

Yeah, in the shower this morning I was thinking that not loading
SPI till after the on_init code runs would alleviate the concerns
about transactionality and permissions --- that would ensure that
whatever on_init does affects only the Perl world and not the database
world.

However, we're not out of the woods yet. In a trusted interpreter
(plperl not plperlu), is the on_init code executed before we lock down
the interpreter with Safe?

The on_perl_init code (PGC_SUSET) is run before Safe is loaded.

The on_trusted_init code (PGC_USERSET) is run inside Safe.

I would think it has to be since the main point AFAICS is to let you
preload code via "use".

The main use case being targeted at the moment for on_trusted_init
is setting values in %_SHARED, perhaps to enable debugging.

Inside Safe you'll only be able to 'use' modules that have already been
loaded inside Safe. In my draft patch that's currently just strict and
warnings.

(I am also adding an interface to enable DBAs to configure what gets
loaded into the Safe compartment and what gets shared with it.
That'll be the way extra modules can be used by plperl.
It'll be used via on_perl_init so be controlled via the DBA.)

I can hardly imagine DBAs wanting to vet a few thousand lines of
random Perl code to see if it contains anything that could be
subverted. For example, the ability to scribble on database files
(like say pg_hba.conf) would almost surely be easy to come by.

It's surely better to give the DBA that option than to remove the choice
entirely.

If you're willing to also confine the feature to plperlu, then maybe
the risk level could be decreased from insane to merely unreasonable.

I believe I can arrange for the SPI functions to be disabled during
on_*_init for both plperl and plperlu. Hopefully then the default risk
level will be better than unreasonable :)

Tim.

#29Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#23)
Re: First feature patch for plperl - draft [PATCH]

Tom Lane escribi�:

"David E. Wheeler" <david@kineticode.com> writes:

Tom, what's your objection to Shlib load time being user-visible?

It's not really designed to be user-visible. Let me give you just
two examples:

* We call a plperl function for the first time in a session, causing
plperl.so to be loaded. Later the transaction fails and is rolled
back.

I don't think there's any way for this to work sanely unless the library
has been loaded previously. What about allowing those settings only if
plperl is specified in shared_preload_libraries?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support