Add on_perl_init and proper destruction to plperl [PATCH]

Started by Tim Bunce almost 16 years ago, 44 messages
#1 Tim Bunce
Tim.Bunce@pobox.com
1 attachment(s)

This is the third of the patches to be split out from the former 'plperl
feature patch 1'.

Changes in this patch:

- Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP)
SPI functions are not available when the code is run.

- Added normal interpreter destruction behaviour
END blocks, if any, are run then objects are
destroyed, calling their DESTROY methods, if any.
SPI functions will die if called at this time.
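
As a reading aid for the diff below: the on_perl_init mechanism amounts to appending one more "-e" argument pair to the interpreter's argument vector when the GUC is set. The following is a minimal standalone sketch of that idea only, with no Perl embedding involved. The helper name build_embedding_args and the MAX_EMBED_ARGS macro are hypothetical and do not appear in the patch, which does this inline in plperl_init_interp().

```c
#include <assert.h>
#include <stddef.h>

/* Three fixed arguments plus room for one optional "-e", code pair. */
#define MAX_EMBED_ARGS (3 + 2)

/*
 * Fill argv with the fixed boot arguments and, when on_init is non-NULL,
 * one extra "-e" pair. Returns the number of arguments stored.
 * Hypothetical helper, for illustration only.
 */
int
build_embedding_args(const char *boot_code, const char *on_init,
                     const char *argv[MAX_EMBED_ARGS])
{
    int nargs = 0;

    assert(boot_code != NULL);

    argv[nargs++] = "";         /* program name placeholder */
    argv[nargs++] = "-e";
    argv[nargs++] = boot_code;  /* built-in boot code comes first */

    if (on_init != NULL)
    {
        argv[nargs++] = "-e";
        argv[nargs++] = on_init;    /* DBA-supplied code follows it */
    }

    return nargs;
}
```

Calling it with a NULL on_init yields three arguments, and with a non-NULL string yields five, which is why the patch grows the embedding array from 3 to 3+2 slots.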

Tim.

Attachments:

plperl-initend.patch (text/x-patch; charset=us-ascii)
diff --git a/doc/src/sgml/plperl.sgml b/doc/src/sgml/plperl.sgml
index 6fee031..0054f5a 100644
*** a/doc/src/sgml/plperl.sgml
--- b/doc/src/sgml/plperl.sgml
*************** CREATE TRIGGER test_valid_id_trig
*** 1030,1036 ****
    </para>
   </sect1>
  
!  <sect1 id="plperl-missing">
    <title>Limitations and Missing Features</title>
  
    <para>
--- 1030,1100 ----
    </para>
   </sect1>
  
!  <sect1 id="plperl-under-the-hood">
!   <title>PL/Perl Under the Hood</title>
! 
!  <sect2 id="plperl-config">
!   <title>Configuration</title>
! 
!   <para>
!   This section lists configuration parameters that affect <application>PL/Perl</>.
!   To set any of these parameters before <application>PL/Perl</> has been loaded,
!   it is necessary to have added <quote><literal>plperl</></> to the
!   <xref linkend="guc-custom-variable-classes"> list in
!   <filename>postgresql.conf</filename>.
!   </para>
! 
!   <variablelist>
! 
!      <varlistentry id="guc-plperl-on-perl-init" xreflabel="plperl.on_perl_init">
!       <term><varname>plperl.on_perl_init</varname> (<type>string</type>)</term>
!       <indexterm>
!        <primary><varname>plperl.on_perl_init</> configuration parameter</primary>
!       </indexterm>
!       <listitem>
!        <para>
!        Specifies perl code to be executed when a perl interpreter is first initialized.
!        The SPI functions are not available when this code is executed.
!        If the code fails with an error it will abort the initialization of the interpreter
!        and propagate out to the calling query, causing the current transaction
!        or subtransaction to be aborted.
!        </para>
!        <para>
! 	   The perl code is limited to a single string. Longer code can be placed
! 	   into a module and loaded by the <literal>on_perl_init</> string.
! 	   Examples:
! <programlisting>
! plperl.on_perl_init = '$ENV{NYTPROF}="start=no"; require Devel::NYTProf::PgPLPerl'
! plperl.on_perl_init = 'use lib "/my/app"; use MyApp::PgInit;'
! </programlisting>
!        </para>
!        <para>
!        Initialization will happen in the postmaster if the plperl library is included
!        in <literal>shared_preload_libraries</> (see <xref linkend="shared_preload_libraries">),
!        in which case extra consideration should be given to the risk of destabilizing the postmaster.
!        </para>
!        <para>
!        This parameter can only be set in the postgresql.conf file or on the server command line.
!        </para>
!       </listitem>
!      </varlistentry>
! 
!      <varlistentry id="guc-plperl-use-strict" xreflabel="plperl.use_strict">
!       <term><varname>plperl.use_strict</varname> (<type>boolean</type>)</term>
!       <indexterm>
!        <primary><varname>plperl.use_strict</> configuration parameter</primary>
!       </indexterm>
!       <listitem>
!        <para>
!        When set true subsequent compilations of PL/Perl functions have the <literal>strict</> pragma enabled.
!        This parameter does not affect functions already compiled in the current session.
!        </para>
!       </listitem>
!      </varlistentry>
! 
!   </variablelist>
! 
!  <sect2 id="plperl-missing">
    <title>Limitations and Missing Features</title>
  
    <para>
diff --git a/src/pl/plperl/plc_perlboot.pl b/src/pl/plperl/plc_perlboot.pl
index 769721d..5f6ae91 100644
*** a/src/pl/plperl/plc_perlboot.pl
--- b/src/pl/plperl/plc_perlboot.pl
***************
*** 1,5 ****
  PostgreSQL::InServer::Util::bootstrap();
- PostgreSQL::InServer::SPI::bootstrap();
  
  use strict;
  use warnings;
--- 1,4 ----
diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c
index 9277072..8315d5a 100644
*** a/src/pl/plperl/plperl.c
--- b/src/pl/plperl/plperl.c
*************** static HTAB *plperl_proc_hash = NULL;
*** 138,143 ****
--- 138,145 ----
  static HTAB *plperl_query_hash = NULL;
  
  static bool plperl_use_strict = false;
+ static char *plperl_on_perl_init = NULL;
+ static bool plperl_ending = false;
  
  /* this is saved and restored by plperl_call_handler */
  static plperl_call_data *current_call_data = NULL;
*************** Datum		plperl_validator(PG_FUNCTION_ARGS
*** 151,156 ****
--- 153,160 ----
  void		_PG_init(void);
  
  static PerlInterpreter *plperl_init_interp(void);
+ static void plperl_destroy_interp(PerlInterpreter **);
+ static void plperl_fini(void);
  
  static Datum plperl_func_handler(PG_FUNCTION_ARGS);
  static Datum plperl_trigger_handler(PG_FUNCTION_ARGS);
*************** _PG_init(void)
*** 237,242 ****
--- 241,254 ----
  							 PGC_USERSET, 0,
  							 NULL, NULL);
  
+ 	DefineCustomStringVariable("plperl.on_perl_init",
+ 							gettext_noop("Perl code to execute when the perl interpreter is initialized."),
+ 							NULL,
+ 							&plperl_on_perl_init,
+ 							NULL,
+ 							PGC_SIGHUP, 0,
+ 							NULL, NULL);
+ 
  	EmitWarningsOnPlaceholders("plperl");
  
  	MemSet(&hash_ctl, 0, sizeof(hash_ctl));
*************** _PG_init(void)
*** 261,266 ****
--- 273,293 ----
  	inited = true;
  }
  
+ 
+ /*
+  * Cleanup perl interpreters, including running END blocks.
+  * Does not fully undo the actions of _PG_init() nor make it callable again.
+  */
+ static void
+ plperl_fini(void)
+ {
+ 	plperl_ending = true;
+ 	plperl_destroy_interp(&plperl_trusted_interp);
+ 	plperl_destroy_interp(&plperl_untrusted_interp);
+ 	plperl_destroy_interp(&plperl_held_interp);
+ }
+ 
+ 
  #define SAFE_MODULE \
  	"require Safe; $Safe::VERSION"
  
*************** _PG_init(void)
*** 277,282 ****
--- 304,311 ----
  static void
  select_perl_context(bool trusted)
  {
+ 	EXTERN_C void boot_PostgreSQL__InServer__SPI(pTHX_ CV *cv);
+ 
  	/*
  	 * handle simple cases
  	 */
*************** select_perl_context(bool trusted)
*** 288,293 ****
--- 317,326 ----
  	 */
  	if (interp_state == INTERP_HELD)
  	{
+ 		/* first actual use of a perl interpreter */
+ 
+ 		atexit(plperl_fini);
+ 
  		if (trusted)
  		{
  			plperl_trusted_interp = plperl_held_interp;
*************** select_perl_context(bool trusted)
*** 325,330 ****
--- 358,379 ----
  		plperl_safe_init();
  		PL_ppaddr[OP_REQUIRE] = pp_require_safe;
  	}
+ 
+ 	/*
+ 	 * enable access to the database
+ 	 */
+ 	newXS("PostgreSQL::InServer::SPI::bootstrap",
+ 		boot_PostgreSQL__InServer__SPI, __FILE__);
+ 
+ 	eval_pv("PostgreSQL::InServer::SPI::bootstrap()", FALSE);
+ 	if (SvTRUE(ERRSV))
+ 	{
+ 		ereport(ERROR,
+ 			(errcode(ERRCODE_INTERNAL_ERROR),
+ 			errmsg("%s", strip_trailing_ws(SvPV_nolen(ERRSV))),
+ 			errdetail("While executing PostgreSQL::InServer::SPI::bootstrap")));
+ 	}
+ 
  }
  
  /*
*************** plperl_init_interp(void)
*** 361,371 ****
  	PerlInterpreter *plperl;
  	static int perl_sys_init_done;
  
! 	static char *embedding[3] = {
  		"", "-e", PLC_PERLBOOT
  	};
  	int			nargs = 3;
  
  #ifdef WIN32
  
  	/*
--- 410,426 ----
  	PerlInterpreter *plperl;
  	static int perl_sys_init_done;
  
! 	static char *embedding[3+2] = {
  		"", "-e", PLC_PERLBOOT
  	};
  	int			nargs = 3;
  
+ 	if (plperl_on_perl_init)
+ 	{
+ 		embedding[nargs++] = "-e";
+ 		embedding[nargs++] = plperl_on_perl_init;
+ 	}
+ 
  #ifdef WIN32
  
  	/*
*************** plperl_init_interp(void)
*** 437,442 ****
--- 492,500 ----
  	PERL_SET_CONTEXT(plperl);
  	perl_construct(plperl);
  
+ 	/* run END blocks in perl_destruct instead of perl_run */
+ 	PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
+ 
  	/*
  	 * Record the original function for the 'require' opcode.
  	 * Ensure it's used for new interpreters.
*************** plperl_init_interp(void)
*** 446,454 ****
  	else
  		PL_ppaddr[OP_REQUIRE] = pp_require_orig;
  
! 	perl_parse(plperl, plperl_init_shared_libs,
! 			   nargs, embedding, NULL);
! 	perl_run(plperl);
  
  #ifdef WIN32
  
--- 504,521 ----
  	else
  		PL_ppaddr[OP_REQUIRE] = pp_require_orig;
  
! 	if (perl_parse(plperl, plperl_init_shared_libs,
! 			   nargs, embedding, NULL) != 0)
! 		ereport(ERROR,
! 			(errcode(ERRCODE_INTERNAL_ERROR),
! 				errmsg("while parsing perl initialization"),
! 				errdetail("%s", strip_trailing_ws(SvPV_nolen(ERRSV))) ));
! 
! 	if (perl_run(plperl) != 0)
! 		ereport(ERROR,
! 			(errcode(ERRCODE_INTERNAL_ERROR),
! 				errmsg("while running perl initialization"),
! 				errdetail("%s", strip_trailing_ws(SvPV_nolen(ERRSV))) ));
  
  #ifdef WIN32
  
*************** pp_require_safe(pTHX)
*** 524,529 ****
--- 591,608 ----
  
  
  static void
+ plperl_destroy_interp(PerlInterpreter **interp)
+ {
+ 	if (interp && *interp)
+ 	{
+ 		perl_destruct(*interp);
+ 		perl_free(*interp);
+ 		*interp = NULL;
+ 	}
+ }
+ 
+ 
+ static void
  plperl_safe_init(void)
  {
  	SV		   *safe_version_sv;
*************** plperl_safe_init(void)
*** 544,551 ****
  		{
  			ereport(ERROR,
  				(errcode(ERRCODE_INTERNAL_ERROR),
! 				 errmsg("%s", strip_trailing_ws(SvPV_nolen(ERRSV))),
! 				 errdetail("While executing PLC_SAFE_BAD")));
  		}
  
  	}
--- 623,630 ----
  		{
  			ereport(ERROR,
  				(errcode(ERRCODE_INTERNAL_ERROR),
! 				 errmsg("while executing PLC_SAFE_BAD"),
! 				 errdetail("%s", strip_trailing_ws(SvPV_nolen(ERRSV))) ));
  		}
  
  	}
*************** plperl_safe_init(void)
*** 556,563 ****
  		{
  			ereport(ERROR,
  				(errcode(ERRCODE_INTERNAL_ERROR),
! 				 errmsg("%s", strip_trailing_ws(SvPV_nolen(ERRSV))),
! 				 errdetail("While executing PLC_SAFE_OK")));
  		}
  
  		if (GetDatabaseEncoding() == PG_UTF8)
--- 635,642 ----
  		{
  			ereport(ERROR,
  				(errcode(ERRCODE_INTERNAL_ERROR),
! 				 errmsg("while executing PLC_SAFE_OK"),
! 				 errdetail("%s", strip_trailing_ws(SvPV_nolen(ERRSV))) ));
  		}
  
  		if (GetDatabaseEncoding() == PG_UTF8)
*************** plperl_create_sub(plperl_proc_desc *prod
*** 1150,1167 ****
   *
   **********************************************************************/
  
- EXTERN_C void boot_DynaLoader(pTHX_ CV *cv);
- EXTERN_C void boot_PostgreSQL__InServer__SPI(pTHX_ CV *cv);
- EXTERN_C void boot_PostgreSQL__InServer__Util(pTHX_ CV *cv);
- 
  static void
  plperl_init_shared_libs(pTHX)
  {
  	char	   *file = __FILE__;
  
  	newXS("DynaLoader::boot_DynaLoader", boot_DynaLoader, file);
- 	newXS("PostgreSQL::InServer::SPI::bootstrap",
- 		  boot_PostgreSQL__InServer__SPI, file);
  	newXS("PostgreSQL::InServer::Util::bootstrap",
  		boot_PostgreSQL__InServer__Util, file);
  }
--- 1229,1242 ----
   *
   **********************************************************************/
  
  static void
  plperl_init_shared_libs(pTHX)
  {
  	char	   *file = __FILE__;
+ 	EXTERN_C void boot_DynaLoader(pTHX_ CV *cv);
+ 	EXTERN_C void boot_PostgreSQL__InServer__Util(pTHX_ CV *cv);
  
  	newXS("DynaLoader::boot_DynaLoader", boot_DynaLoader, file);
  	newXS("PostgreSQL::InServer::Util::bootstrap",
  		boot_PostgreSQL__InServer__Util, file);
  }
*************** plperl_hash_from_tuple(HeapTuple tuple, 
*** 1897,1902 ****
--- 1972,1987 ----
  }
  
  
+ static void
+ check_spi_usage_allowed()
+ {
+ 	if (plperl_ending) {
+ 		/* simple croak as we don't want to involve PostgreSQL code */
+ 		croak("SPI functions can not be used in END blocks");
+ 	}
+ }
+ 
+ 
  HV *
  plperl_spi_exec(char *query, int limit)
  {
*************** plperl_spi_exec(char *query, int limit)
*** 1909,1914 ****
--- 1994,2001 ----
  	MemoryContext oldcontext = CurrentMemoryContext;
  	ResourceOwner oldowner = CurrentResourceOwner;
  
+ 	check_spi_usage_allowed();
+ 
  	BeginInternalSubTransaction(NULL);
  	/* Want to run inside function's memory context */
  	MemoryContextSwitchTo(oldcontext);
*************** plperl_spi_execute_fetch_result(SPITuple
*** 1972,1977 ****
--- 2059,2066 ----
  {
  	HV		   *result;
  
+ 	check_spi_usage_allowed();
+ 
  	result = newHV();
  
  	hv_store_string(result, "status",
*************** plperl_spi_query(char *query)
*** 2145,2150 ****
--- 2234,2241 ----
  	MemoryContext oldcontext = CurrentMemoryContext;
  	ResourceOwner oldowner = CurrentResourceOwner;
  
+ 	check_spi_usage_allowed();
+ 
  	BeginInternalSubTransaction(NULL);
  	/* Want to run inside function's memory context */
  	MemoryContextSwitchTo(oldcontext);
*************** plperl_spi_fetchrow(char *cursor)
*** 2223,2228 ****
--- 2314,2321 ----
  	MemoryContext oldcontext = CurrentMemoryContext;
  	ResourceOwner oldowner = CurrentResourceOwner;
  
+ 	check_spi_usage_allowed();
+ 
  	BeginInternalSubTransaction(NULL);
  	/* Want to run inside function's memory context */
  	MemoryContextSwitchTo(oldcontext);
*************** plperl_spi_fetchrow(char *cursor)
*** 2297,2303 ****
  void
  plperl_spi_cursor_close(char *cursor)
  {
! 	Portal		p = SPI_cursor_find(cursor);
  
  	if (p)
  		SPI_cursor_close(p);
--- 2390,2400 ----
  void
  plperl_spi_cursor_close(char *cursor)
  {
! 	Portal		p;
! 
! 	check_spi_usage_allowed();
! 
! 	p = SPI_cursor_find(cursor);
  
  	if (p)
  		SPI_cursor_close(p);
*************** plperl_spi_prepare(char *query, int argc
*** 2315,2320 ****
--- 2412,2419 ----
  	MemoryContext oldcontext = CurrentMemoryContext;
  	ResourceOwner oldowner = CurrentResourceOwner;
  
+ 	check_spi_usage_allowed();
+ 
  	BeginInternalSubTransaction(NULL);
  	MemoryContextSwitchTo(oldcontext);
  
*************** plperl_spi_exec_prepared(char *query, HV
*** 2450,2455 ****
--- 2549,2556 ----
  	MemoryContext oldcontext = CurrentMemoryContext;
  	ResourceOwner oldowner = CurrentResourceOwner;
  
+ 	check_spi_usage_allowed();
+ 
  	BeginInternalSubTransaction(NULL);
  	/* Want to run inside function's memory context */
  	MemoryContextSwitchTo(oldcontext);
*************** plperl_spi_query_prepared(char *query, i
*** 2592,2597 ****
--- 2693,2700 ----
  	MemoryContext oldcontext = CurrentMemoryContext;
  	ResourceOwner oldowner = CurrentResourceOwner;
  
+ 	check_spi_usage_allowed();
+ 
  	BeginInternalSubTransaction(NULL);
  	/* Want to run inside function's memory context */
  	MemoryContextSwitchTo(oldcontext);
*************** plperl_spi_freeplan(char *query)
*** 2715,2720 ****
--- 2818,2825 ----
  	plperl_query_desc *qdesc;
  	plperl_query_entry *hash_entry;
  
+ 	check_spi_usage_allowed();
+ 
  	hash_entry = hash_search(plperl_query_hash, query,
  							 HASH_FIND, NULL);
  	if (hash_entry == NULL)
diff --git a/src/pl/plperl/sql/plperl_end.sql b/src/pl/plperl/sql/plperl_end.sql
index ...d4f1444 .
*** a/src/pl/plperl/sql/plperl_end.sql
--- b/src/pl/plperl/sql/plperl_end.sql
***************
*** 0 ****
--- 1,18 ----
+ -- test END block handling
+ 
+ -- Not included in the normal testing
+ -- because it's beyond the scope of the test harness.
+ -- Available here for manual developer testing.
+ 
+ DO $do$
+ 	open my $fh, ">/tmp/plperl_end.$$.log";
+ 	$SIG{__WARN__} = sub { printf $fh "Warn: @_" };
+ 	$SIG{__DIE__}  = sub { printf $fh "Die: @_"; die @_ };
+ 	END {
+ 		warn "end\n";
+ 		eval {
+ 			spi_exec_query("select 1");
+ 		};
+ 		warn "spi_exec_query: $@";
+ 	}
+ $do$ language plperlu;
diff --git a/src/pl/plperl/sql/plperl_plperlu.sql b/src/pl/plperl/sql/plperl_plperlu.sql
index fc2bb7b..15b5aa2 100644
*** a/src/pl/plperl/sql/plperl_plperlu.sql
--- b/src/pl/plperl/sql/plperl_plperlu.sql
*************** $$ LANGUAGE plperlu; -- compile plperlu 
*** 16,19 ****
  SELECT * FROM bar(); -- throws exception normally (running plperl)
  SELECT * FROM foo(); -- used to cause backend crash (after switching to plperlu)
  
- 
--- 16,18 ----
#2 Andrew Dunstan
andrew@dunslane.net
In reply to: Tim Bunce (#1)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Tim Bunce wrote:

- Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP)
SPI functions are not available when the code is run.

- Added normal interpreter destruction behaviour
END blocks, if any, are run then objects are
destroyed, calling their DESTROY methods, if any.
SPI functions will die if called at this time.

OK, we've made good progress with the PL/Perl patches, and this one is
next on the queue.

It should also be noted that as proposed END blocks will not run at all
in the postmaster, even if perl is preloaded in the postmaster and the
preloaded code sets END handlers. That makes setting them rather safer,
ISTM.

So, are there still objections to applying this patch?

(Note, this is different from the proposal to specify on_trusted_init
and on_untrusted_init handlers. The on_perl_init handler would be run on
library load, and is mainly for the purpose of preloading perl modules
and the like).

cheers

andrew

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#2)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Andrew Dunstan <andrew@dunslane.net> writes:

Tim Bunce wrote:

- Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP)
SPI functions are not available when the code is run.

- Added normal interpreter destruction behaviour
END blocks, if any, are run then objects are
destroyed, calling their DESTROY methods, if any.
SPI functions will die if called at this time.

So, are there still objections to applying this patch?

Yes.

regards, tom lane

#4 Alex Hunsaker
badalex@gmail.com
In reply to: Tom Lane (#3)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Tue, Jan 26, 2010 at 23:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tim Bunce wrote:

- Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP)
SPI functions are not available when the code is run.

- Added normal interpreter destruction behaviour
END blocks, if any, are run then objects are
destroyed, calling their DESTROY methods, if any.
SPI functions will die if called at this time.

So, are there still objections to applying this patch?

Yes.

FWIW the atexit scares me too. I was thinking a good workaround
perhaps would be to provide a function that destroys the interpreter
(so that the END blocks get called). Tim, would that work OK? If we
are still worried about that hanging we can probably do something
hacky with alarm() and/or signals...

Maybe a good solid use case will help figure this out? I'm assuming
the current one is to profile plperl functions and dump a prof file in
/tmp/ or some such (which happens at END time). Or did I miss the use
case in one of the other threads?

#5 Tim Bunce
Tim.Bunce@pobox.com
In reply to: Alex Hunsaker (#4)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Wed, Jan 27, 2010 at 12:46:42AM -0700, Alex Hunsaker wrote:

On Tue, Jan 26, 2010 at 23:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tim Bunce wrote:

- Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP)
SPI functions are not available when the code is run.

- Added normal interpreter destruction behaviour
END blocks, if any, are run then objects are
destroyed, calling their DESTROY methods, if any.
SPI functions will die if called at this time.

So, are there still objections to applying this patch?

Yes.

FWIW the atexit scares me too.

In what way, specifically?

I understand concerns about interacting with the database, so the
patch ensures that any use of spi functions throws an exception.
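
The guard described here is a one-way flag: the exit handler sets it before the interpreters are torn down, and every SPI entry point checks it first. The following is a simplified standalone sketch of that pattern, not the patch's code. The patch calls Perl's croak() from check_spi_usage_allowed(), whereas this sketch returns an error code, and the names begin_shutdown, spi_exec_guarded and guard_status are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Set once shutdown begins and never cleared (plays the role of plperl_ending). */
static bool ending = false;

enum guard_status { GUARD_OK = 0, GUARD_REFUSED = -1 };

/* Called from the exit handler before any cleanup code runs. */
void
begin_shutdown(void)
{
    ending = true;
}

/*
 * Stand-in for an SPI entry point: refuse to touch the database once
 * shutdown has begun. Hypothetical function, for illustration only.
 */
enum guard_status
spi_exec_guarded(const char *query)
{
    assert(query != NULL);

    if (ending)
        return GUARD_REFUSED;

    /* ... the real database work would happen here ... */
    return GUARD_OK;
}
```

A call made before begin_shutdown() returns GUARD_OK, and any call made afterwards returns GUARD_REFUSED. The flag covers only the entry points that check it, so it does not by itself address other paths between Perl code and the backend.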

I don't recall any other concrete concerns.

Specifically, how is code that starts executing at the end of a session
different in risk to code that starts executing before the end of a session?

DO $$ while (1) { } $$ language plperl;

Tim.

#6 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#3)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tim Bunce wrote:

- Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP)
SPI functions are not available when the code is run.

- Added normal interpreter destruction behaviour
END blocks, if any, are run then objects are
destroyed, calling their DESTROY methods, if any.
SPI functions will die if called at this time.

So, are there still objections to applying this patch?

Yes.

I see I asked the wrong question. Start again.

What more should be done to make all or some of it acceptable?

cheers

andrew

#7 Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tom Lane (#3)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Wed, Jan 27, 2010 at 01:14:16AM -0500, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tim Bunce wrote:

- Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP)
SPI functions are not available when the code is run.

- Added normal interpreter destruction behaviour
END blocks, if any, are run then objects are
destroyed, calling their DESTROY methods, if any.
SPI functions will die if called at this time.

So, are there still objections to applying this patch?

Yes.

To focus the discussion I've looked back through all the messages from
you that relate to this issue so I can summarize and try to address your
objections.

Some I've split or presented out of order, most relate to earlier (less
restricted) versions of the patch before it was split out, and naturally
they are lacking some context, so I've included archive URLs.

Please forgive and correct me if I misrepresent you or your intent here.

Regarding the utility of plperl.on_perl_init and END:

http://archives.postgresql.org/message-id/18338.1260033447@sss.pgh.pa.us
The question is not about whether we think it's useful; the question
is about whether it's safe.

I agree.

Regarding visibility of changes to plperl.on_perl_init:

http://archives.postgresql.org/message-id/28618.1259952660@sss.pgh.pa.us
What is to happen if the admin changes the value when the system is already up?

If a GUC could be defined as PGC_BACKEND and only settable by superuser,
perhaps that would be a good fit. [GucContext seems to conflate some things.]
Meanwhile the _init name is meant to convey the fact that it's a
before-first-use GUC, like temp_buffers.

I'm happy to accept whatever you'd recommend by way of PGC_* GUC selection.
Documentation can note any caveats associated with combining
plperl.on_perl_init with shared_preload_libraries.

http://archives.postgresql.org/message-id/4516.1263168347@sss.pgh.pa.us
However, I think PGC_SIGHUP would be enough to address my basic
worry, which is that people shouldn't be depending on the ability to set
these things within an individual session.

The patch uses PGC_SIGHUP for plperl.on_perl_init.

http://archives.postgresql.org/message-id/8950.1259994082@sss.pgh.pa.us

Tom, what's your objection to Shlib load time being user-visible?

It's not really designed to be user-visible. Let me give you just
two examples:

* We call a plperl function for the first time in a session, causing
plperl.so to be loaded. Later the transaction fails and is rolled
back. If loading plperl.so caused some user-visible things to happen,
should those be rolled back? If so, how do we get perl to play along?
If not, how do we get postgres to play along?

I believe that's addressed by spi functions being disabled when init code runs.

* We call a plperl function for the first time in a session, causing
plperl.so to be loaded. This happens in the context of a superuser
calling a non-superuser security definer function, or perhaps vice
versa. Whose permissions apply to whatever the on_load code tries
to do? (Hint: every answer is wrong.)

I think that relates to on_*trusted_init, not plperl.on_perl_init, and
is also addressed by spi functions being disabled when init code runs.

That doesn't even begin to cover the problems with allowing any of
this to happen inside the postmaster. Recall that the postmaster
does not have any database access.

I believe that's addressed by spi functions being disabled when init code runs.

Furthermore, it is a very long
established reliability principle around here that the postmaster
process should do as little as possible, because every thing that it
does creates another opportunity to have a nonrecoverable failure.
The postmaster can recover if a child crashes, but the other way
round, not so much.

I understand that concern. Ultimately, though, that comes down to the
judgement of DBAs and the trust placed in them. They can already
load arbitrary code via shared_preload_libraries.

http://archives.postgresql.org/message-id/18338.1260033447@sss.pgh.pa.us

I think if we do this the on_perl_init setting should probably be
PGC_POSTMASTER, which would remove any issue about it changing
underneath us.

Yes, if the main intended usage is in combination with preloading perl
at postmaster start, it would be pointless to imagine that PGC_SIGHUP
is useful anyway.

http://archives.postgresql.org/message-id/17793.1260031296@sss.pgh.pa.us
Yeah, in the shower this morning I was thinking that not loading
SPI till after the on_init code runs would alleviate the concerns
about transactionality and permissions --- that would ensure that
whatever on_init does affects only the Perl world and not the database
world.

That's included in the current patch (and also applies to END blocks).

However, we're not out of the woods yet. In a trusted interpreter
(plperl not plperlu), is the on_init code executed before we lock down
the interpreter with Safe? I would think it has to be since the main
point AFAICS is to let you preload code via "use". But then what is
left of the security guarantees of plperl? I can hardly imagine DBAs
wanting to vet a few thousand lines of random Perl code to see if it
contains anything that could be subverted.

plperl.on_perl_init code, set by the DBA, runs before the Safe
compartment is created. Without explicit extra steps the Safe
compartment has no access to code loaded by plperl.on_perl_init.

The Safe compartment (plperl) could get access to loaded code in one of
these ways:
1. by using SQL to call a plperlu function that accesses the code.
2. by the DBA 'sharing' a specific subroutine with the compartment.
3. by the DBA loading a module into the compartment.

There's no formal interface for 2. and 3. at the moment, so the only
official option is 1. (The final patch in the series includes some
building blocks towards an interface for 2 & 3, but doesn't add one.)

If you're willing to also confine the feature to plperlu, then maybe
the risk level could be decreased from insane to merely unreasonable.

I think you could reasonably describe plperl.on_perl_init as effectively
confined to plperlu (because plperl has no access to any new code).

http://archives.postgresql.org/message-id/26766.1263149361@sss.pgh.pa.us
For the record, I think it's a bad idea to run arbitrary
user-defined code in the postmaster, and I think it's a worse idea to
run arbitrary user-defined code at backend shutdown (the END-blocks bit).
I do not care in the least what applications you think this might
enable --- the negative consequences for overall system stability seem
to me to outweigh any possible arguments on that side.
- What happens when the supplied code has errors,

For on_perl_init it throws an exception that propagates to the user
statement that triggered the initialization of perl. It also ensures
that perl is left in a non-initialized state, so any further uses
also fail.

For END blocks an error triggers an exception that's caught by perl.

(As noted above, there's no access to postgres from init or END code.)

- takes an unreasonable amount of time to run,

Unreasonable is in the eye of the DBA, of course, and they
have the discretion to set on_perl_init to fit their needs.

For END blocks, I don't see how this issue is any different from
"users might do something dumb", like DO 'while(1){}' language plperl;
(or plpython, pltcl, or plpgsql for that matter).

- does something unsafe,

Such as? The code can't do anything more unsafe than is already possible.

- depends on the backend not being in an error state already,

The code has no access to postgres, whatever the state.

- etc. etc?

I'd welcome more concrete examples of potential issues.

Tim.

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim Bunce (#5)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Tim Bunce <Tim.Bunce@pobox.com> writes:

On Wed, Jan 27, 2010 at 12:46:42AM -0700, Alex Hunsaker wrote:

FWIW the atexit scares me too.

In what way, specifically?

It runs too late, and too unpredictably, during the shutdown sequence.
(In particular note that shutdown itself might be fired as an atexit
callback, a move forced on us by exactly the sort of random user code
that you want to add more of. It's not clear whether a Perl-added
atexit would fire before or after that.)

I understand concerns about interacting with the database, so the
patch ensures that any use of spi functions throws an exception.

That assuages my fears to only a tiny degree. SPI is not the only
possible connection between perl code and the rest of the backend.
Indeed, AFAICS the major *point* of these additions is to allow people
to insert unknown other functionality that is likely to interact
with the rest of the backend; a prospect that doesn't make me feel
better about it.

Specifically, how is code that starts executing at the end of a session
different in risk to code that starts executing before the end of a session?

If it runs before the shutdown sequence starts, we know we have a
functioning backend. Once shutdown starts, it's unknown and mostly
untested exactly what subsystems will still work and which won't.
Injecting arbitrary user-written code into an unspecified point in
that sequence is not a recipe for good results.

Lastly, an atexit trigger will still fire during FATAL or PANIC aborts,
which scares me even more. When the house is already afire, it's
not prudent to politely let user-written perl code do whatever it wants
before you get the heck out of there.

regards, tom lane

#9 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#8)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Tom Lane wrote:

Indeed, AFAICS the major *point* of these additions is to allow people
to insert unknown other functionality that is likely to interact
with the rest of the backend; a prospect that doesn't make me feel
better about it.

No. The major use case we've seen for END blocks is to allow a profiler
to write its data out. That should have zero interaction with the rest
of the backend.

cheers

andrew

#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#9)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

Indeed, AFAICS the major *point* of these additions is to allow people
to insert unknown other functionality that is likely to interact
with the rest of the backend; a prospect that doesn't make me feel
better about it.

No. The major use case we've seen for END blocks is to allow a profiler
to write its data out. That should have zero interaction with the rest
of the backend.

Really? We've found that gprof, for instance, doesn't exactly have
"zero interaction with the rest of the backend" --- there's actually
a couple of different bits in there to help it along, including a
behavioral change during shutdown. I rather doubt that Perl profilers
would turn out much different.

But in any case, I don't believe for a moment that profiling is the only
or even the largest use to which people would try to put this.

regards, tom lane

#11 Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tom Lane (#8)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Wed, Jan 27, 2010 at 11:13:43AM -0500, Tom Lane wrote:

Tim Bunce <Tim.Bunce@pobox.com> writes:

On Wed, Jan 27, 2010 at 12:46:42AM -0700, Alex Hunsaker wrote:

FWIW the atexit scares me too.

In what way, specifically?

It runs too late, and too unpredictably, during the shutdown sequence.
(In particular note that shutdown itself might be fired as an atexit
callback, a move forced on us by exactly the sort of random user code
that you want to add more of. It's not clear whether a Perl-added
atexit would fire before or after that.)

man atexit says "Functions so registered are called in reverse order".
Since the plperl atexit is called only when a plperl SP or DO is
executed it would fire before any atexit() registered during startup.

The timing and predictability shouldn't be a significant concern if the
plperl subsystem can't interact with the rest of the backend - which is
the intent.

I understand concerns about interacting with the database, so the
patch ensures that any use of spi functions throws an exception.

That assuages my fears to only a tiny degree. SPI is not the only
possible connection between perl code and the rest of the backend.

Could you give me some examples of others?

Indeed, AFAICS the major *point* of these additions is to allow people
to insert unknown other functionality that is likely to interact
with the rest of the backend; a prospect that doesn't make me feel
better about it.

The major point is *not at all* to allow people to interact with the
rest of the backend. I'm specifically trying to limit that.
The major point is simply to allow perl code to clean itself up properly.

Specifically, how is code that starts executing at the end of a session
different in risk to code that starts executing before the end of a session?

If it runs before the shutdown sequence starts, we know we have a
functioning backend. Once shutdown starts, it's unknown and mostly
untested exactly what subsystems will still work and which won't.
Injecting arbitrary user-written code into an unspecified point in
that sequence is not a recipe for good results.

The plperl subsystem is isolated from, and can't interact with, the rest
of the backend during shutdown.
Can you give me examples where that's not the case?

Lastly, an atexit trigger will still fire during FATAL or PANIC aborts,
which scares me even more. When the house is already afire, it's
not prudent to politely let user-written perl code do whatever it wants
before you get the heck out of there.

Again, that point rests on your underlying concern about interaction
between plperl and the rest of the backend. Examples?

Is there some way for plperl.c to detect a FATAL or PANIC abort?
If so, or if one could be added, then we could skip the END code in
those circumstances.

I don't really want to add more GUCs, but perhaps controlling END
block execution via a plperl.destroy_end=bool (default false) would
help address your concerns.

Tim.

#12Noname
fche@redhat.com
In reply to: Tom Lane (#8)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Tom Lane <tgl@sss.pgh.pa.us> writes:

[...]
Lastly, an atexit trigger will still fire during FATAL or PANIC aborts,
which scares me even more. When the house is already afire, it's
not prudent to politely let user-written perl code do whatever it wants
before you get the heck out of there.

Is there a reason that these panics don't use _exit(3) to bypass
atexit hooks?

- FChE
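
The distinction behind this question can be seen in a small POSIX C sketch. It is not PostgreSQL code, and the names `atexit_hook_ran`, `exit_hook`, and `report_fd` are invented for the illustration. A child process registers an atexit() handler and then terminates with either exit() or _exit(); the parent reports whether the handler ran.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe the child uses to report that its hook ran. */
static int report_fd = -1;

/* atexit() handler: leave one byte in the pipe as evidence that it ran. */
static void exit_hook(void)
{
    const char mark = 'x';

    if (write(report_fd, &mark, 1) != 1)
    {
        /* Nothing useful can be done about a failed write at exit time. */
    }
}

/*
 * Fork a child that registers exit_hook and then terminates.
 * bypass_hooks != 0: the child calls _exit(), which skips atexit handlers.
 * bypass_hooks == 0: the child calls exit(), which runs them.
 * Returns 1 if the hook ran, 0 if it did not, -1 on setup failure.
 */
int atexit_hook_ran(int bypass_hooks)
{
    int     fds[2];
    pid_t   pid;
    char    mark;
    ssize_t got;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* Child: register the hook, then leave by the requested route. */
        close(fds[0]);
        report_fd = fds[1];
        if (atexit(exit_hook) != 0)
            _exit(2);
        if (bypass_hooks)
            _exit(0);
        exit(0);
    }

    /* Parent: a byte means the hook ran; end-of-file means it did not. */
    close(fds[1]);
    got = read(fds[0], &mark, 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return got == 1 ? 1 : 0;
}
```

Calling `atexit_hook_ran(0)` should return 1 and `atexit_hook_ran(1)` should return 0. Note that _exit() typically also skips flushing stdio buffers, so it is a blunter tool than just "exit without the hooks".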

#13Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#10)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Tom Lane wrote:

But in any case, I don't believe for a moment that profiling is the only
or even the largest use to which people would try to put this.

Well, ISTR there have been requests over the years for event handlers
for (among other things) session shutdown, so if you're speculating that
people would use this as an end run around our lack of such things you
could be right. Maybe providing for such handlers in a more general and
at the same time more safe way would be an alternative.

cheers

andrew

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#6)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Andrew Dunstan <andrew@dunslane.net> writes:

I see I asked the wrong question. Start again.
What more should be done to make all or some of it acceptable?

I think a "must" is to get rid of the use of atexit(). Possibly an
on_proc_exit callback could be used instead, although I'm not sure how
you'd handle the case of code loaded in the postmaster that would like
corresponding exit-time code to happen in child processes. (OTOH, it
seems likely that it's impossible to make that work correctly anyway.
It certainly isn't going to work the same on EXEC_BACKEND platforms
as anywhere else, and I don't particularly want to see us documenting
that the feature works differently on Windows than elsewhere.)

Dropping the ability to make the postmaster run any such code would go a
very long way towards fixing the above, as well as assuaging other
fears.

The other thing that I find entirely unconvincing is Tim's idea that
shutting off SPI isolates perl from the rest of the backend. I have
no confidence in that, but no real idea of how to do better either :-(.
If you think that shutting off SPI is sufficient, you can find
counterexamples in the CVS history, for instance where we had to take
special measures to prevent Perl from screwing up the locale settings.
I'm afraid that on_perl_init is going to vastly expand the opportunities
for that kind of unwanted side-effect; and the earlier that it runs, the
more likely it's going to be that we can't recover easily.

regards, tom lane

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#12)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

fche@redhat.com (Frank Ch. Eigler) writes:

Tom Lane <tgl@sss.pgh.pa.us> writes:

Lastly, an atexit trigger will still fire during FATAL or PANIC aborts,
which scares me even more. When the house is already afire, it's
not prudent to politely let user-written perl code do whatever it wants
before you get the heck out of there.

Is there a reason that these panics don't use _exit(3) to bypass
atexit hooks?

Well, I don't really want to entirely forbid the use of atexit() ---
I'm just concerned about using it to run arbitrary user-written code.
There might be more limited purposes for which it's a reasonable choice.

regards, tom lane

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim Bunce (#11)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Tim Bunce <Tim.Bunce@pobox.com> writes:

On Wed, Jan 27, 2010 at 11:13:43AM -0500, Tom Lane wrote:

(In particular note that shutdown itself might be fired as an atexit
callback, a move forced on us by exactly the sort of random user code
that you want to add more of. It's not clear whether a Perl-added
atexit would fire before or after that.)

man atexit says "Functions so registered are called in reverse order".
Since the plperl atexit is called only when a plperl SP or DO is
executed it would fire before any atexit() registered during startup.

Right, which means that it would occur either before or after
on_proc_exit processing, depending on whether we got there through
an exit() call or via the normal proc_exit sequence. That's just
the kind of instability I don't want to have to debug.

The plperl subsystem is isolated from, and can't interact with, the rest
of the backend during shutdown.

This is exactly the claim that I have zero confidence in. Quite
frankly, the problem with Perl as an extension language is that Perl was
never designed to be a subsystem: it feels free to mess around with the
entire state of the process. We've been burnt multiple times by that
even with the limited use we make of Perl now, and these proposed
additions are going to make it a lot worse IMO.

regards, tom lane

#17David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#16)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Jan 27, 2010, at 9:08 AM, Tom Lane wrote:

This is exactly the claim that I have zero confidence in. Quite
frankly, the problem with Perl as an extension language is that Perl was
never designed to be a subsystem: it feels free to mess around with the
entire state of the process. We've been burnt multiple times by that
even with the limited use we make of Perl now, and these proposed
additions are going to make it a lot worse IMO.

Can you provide an example? Such concerns are impossible to address without concrete examples.

Best,

David

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#17)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

"David E. Wheeler" <david@kineticode.com> writes:

On Jan 27, 2010, at 9:08 AM, Tom Lane wrote:

This is exactly the claim that I have zero confidence in. Quite
frankly, the problem with Perl as an extension language is that Perl was
never designed to be a subsystem: it feels free to mess around with the
entire state of the process. We've been burnt multiple times by that
even with the limited use we make of Perl now, and these proposed
additions are going to make it a lot worse IMO.

Can you provide an example? Such concerns are impossible to address without concrete examples.

Two examples that I can find in a quick review of our CVS history: perl
stomping on the process's setlocale state, and perl stomping on the
stdio state (Windows only).

regards, tom lane

#19David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#18)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Jan 27, 2010, at 10:08 AM, Tom Lane wrote:

Two examples that I can find in a quick review of our CVS history: perl
stomping on the process's setlocale state, and perl stomping on the
stdio state (Windows only).

Are there links to those commits?

Thanks,

David

#20Tim Bunce
Tim.Bunce@pobox.com
In reply to: David E. Wheeler (#19)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Wed, Jan 27, 2010 at 12:08:48PM -0500, Tom Lane wrote:

Tim Bunce <Tim.Bunce@pobox.com> writes:

On Wed, Jan 27, 2010 at 11:13:43AM -0500, Tom Lane wrote:

(In particular note that shutdown itself might be fired as an atexit
callback, a move forced on us by exactly the sort of random user code
that you want to add more of. It's not clear whether a Perl-added
atexit would fire before or after that.)

man atexit says "Functions so registered are called in reverse order".
Since the plperl atexit is called only when a plperl SP or DO is
executed it would fire before any atexit() registered during startup.

Right, which means that it would occur either before or after
on_proc_exit processing, depending on whether we got there through
an exit() call or via the normal proc_exit sequence. That's just
the kind of instability I don't want to have to debug.

Okay. I could change the callback code to ignore calls if
proc_exit_inprogress is false. So an abnormal shutdown via exit()
wouldn't involve plperl at all. (Alternatively I could use
on_proc_exit() instead of atexit() to register the callback.)

Would that address this call sequence instability issue?

The plperl subsystem is isolated from, and can't interact with, the
rest of the backend during shutdown.

This is exactly the claim that I have zero confidence in. Quite
frankly, the problem with Perl as an extension language is that Perl was
never designed to be a subsystem: it feels free to mess around with the
entire state of the process. We've been burnt multiple times by that
even with the limited use we make of Perl now, and these proposed
additions are going to make it a lot worse IMO.

On Wed, Jan 27, 2010 at 09:53:44AM -0800, David E. Wheeler wrote:

Can you provide an example? Such concerns are impossible to address
without concrete examples.

On Wed, Jan 27, 2010 at 01:08:56PM -0500, Tom Lane wrote:

Two examples that I can find in a quick review of our CVS history: perl
stomping on the process's setlocale state, and perl stomping on the
stdio state (Windows only).

Neither of those relate to the actions of perl source code.
To address that, instead of calling perl_destruct() to perform a
complete destruction I could just execute END blocks and object
destructors. That would avoid executing any system-level actions.

Do you have any examples of how a user could write code in a plperl END
block that would interact with the rest of the backend?

Tim.
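
The guard proposed in this message can be sketched with a toy model in C. Everything here is hypothetical and is not PostgreSQL's implementation: `toy_on_exit` and `toy_orderly_shutdown` stand in for an on_proc_exit()/proc_exit() style registry, `toy_exit_in_progress` stands in for proc_exit_inprogress, and the reverse-order execution simply mirrors what the atexit man page quoted above describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CALLBACKS 8

typedef void (*toy_exit_callback)(void);

static toy_exit_callback toy_callbacks[TOY_MAX_CALLBACKS];
static size_t toy_callback_count = 0;

/* Stand-in for proc_exit_inprogress: true only during an orderly shutdown. */
static bool toy_exit_in_progress = false;

/* Counts how often the language cleanup actually did its work. */
static int toy_cleanup_runs = 0;

/* Register a callback for orderly shutdown; false if the table is full. */
bool toy_on_exit(toy_exit_callback cb)
{
    assert(cb != NULL);
    if (toy_callback_count >= TOY_MAX_CALLBACKS)
        return false;
    toy_callbacks[toy_callback_count++] = cb;
    return true;
}

/*
 * Orderly shutdown path: raise the flag, then run the callbacks in
 * reverse registration order. The toy does not terminate the process.
 */
void toy_orderly_shutdown(void)
{
    toy_exit_in_progress = true;
    while (toy_callback_count > 0)
        toy_callbacks[--toy_callback_count]();
}

/*
 * Guarded cleanup: if it is reached outside an orderly shutdown (for
 * example from an atexit() hook after a bare exit()), it does nothing.
 */
void toy_language_cleanup(void)
{
    if (!toy_exit_in_progress)
        return;
    toy_cleanup_runs++;     /* END blocks and destructors would run here */
}
```

Calling `toy_language_cleanup()` directly leaves `toy_cleanup_runs` at 0, while registering it with `toy_on_exit()` and then calling `toy_orderly_shutdown()` raises it to 1. Registering with the orderly-shutdown registry in the first place makes the guard unnecessary, which is the alternative mentioned in parentheses above.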

#21Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tim Bunce (#20)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Wed, Jan 27, 2010 at 11:28:02AM -0500, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

Indeed, AFAICS the major *point* of these additions is to allow people
to insert unknown other functionality that is likely to interact
with the rest of the backend; a prospect that doesn't make me feel
better about it.

No. The major use case we've seen for END blocks is to allow a profiler
to write its data out. That should have zero interaction with the rest
of the backend.

Really? We've found that gprof, for instance, doesn't exactly have
"zero interaction with the rest of the backend" --- there's actually
a couple of different bits in there to help it along, including a
behavioral change during shutdown. I rather doubt that Perl profilers
would turn out much different.

Devel::NYTProf (http://blog.timbunce.org/tag/nytprof/) has zero
interaction with the rest of the backend.

It works in PostgreSQL 8.4, although greatly handicapped by the lack of
END blocks. http://search.cpan.org/perldoc?Devel::NYTProf::PgPLPerl

But in any case, I don't believe for a moment that profiling is the only
or even the largest use to which people would try to put this.

Can you give any examples?

Tim.

#22David E. Wheeler
david@kineticode.com
In reply to: Tim Bunce (#20)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Jan 27, 2010, at 1:27 PM, Tim Bunce wrote:

Okay. I could change the callback code to ignore calls if
proc_exit_inprogress is false. So an abnormal shutdown via exit()
wouldn't involve plperl at all. (Alternatively I could use
on_proc_exit() instead of atexit() to register the callback.)

Given Tom's hesitance about atexit(), perhaps that would be best.

Neither of those relate to the actions of perl source code.
To address that, instead of calling perl_destruct() to perform a
complete destruction I could just execute END blocks and object
destructors. That would avoid executing any system-level actions.

Does perl_destruct() execute system-level actions, then? If so, then it seems prudent to either audit such actions or, as you say, call destructors directly.

Do you have any examples of how a user could write code in a plperl END
block that would interact with the rest of the backend?

I appreciate you taking the time to look at ways to address the issues Tom has raised, Tim. Good on you.

There is so much benefit to this level of interaction, as shown by the success of mod_perl and other forking environments that support pre-loading code, that I think it'd be extremely valuable to get these features in 9.0. They really allow Perl to be a first-class PL in a way that it wasn't before.

So if there is no way other than SPI for Perl code to interact with the backend, and system-level actions in Perl itself are disabled, it seems to me that the major issues are addressed. Am I wrong, Tom?

Best,

David

#23Robert Haas
robertmhaas@gmail.com
In reply to: David E. Wheeler (#22)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Wed, Jan 27, 2010 at 4:51 PM, David E. Wheeler <david@kineticode.com> wrote:

Neither of those relate to the actions of perl source code.
To address that, instead of calling perl_destruct() to perform a
complete destruction I could just execute END blocks and object
destructors. That would avoid executing any system-level actions.

Does perl_destruct() execute system-level actions, then? If so, then it seems prudent to either audit such actions or, as you say, call destructors directly.

What exactly do we mean by "system-level actions"? I mean, END blocks
can execute arbitrary code....

...Robert

#24David E. Wheeler
david@kineticode.com
In reply to: Robert Haas (#23)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Jan 27, 2010, at 1:53 PM, Robert Haas wrote:

What exactly do we mean by "system-level actions"? I mean, END blocks
can execute arbitrary code....

Yeah. In Perl. What part of Perl can access the backend systems without SPI? And that it couldn't do at any other point in runtime?

Best,

David

#25Tim Bunce
Tim.Bunce@pobox.com
In reply to: David E. Wheeler (#24)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Wed, Jan 27, 2010 at 01:51:47PM -0800, David E. Wheeler wrote:

On Jan 27, 2010, at 1:27 PM, Tim Bunce wrote:

Okay. I could change the callback code to ignore calls if
proc_exit_inprogress is false. So an abnormal shutdown via exit()
wouldn't involve plperl at all. (Alternatively I could use
on_proc_exit() instead of atexit() to register the callback.)

Given Tom's hesitance about atexit(), perhaps that would be best.

I've a draft patch using !proc_exit_inprogress implemented now
(appended) and I'll test it tomorrow. That was the simplest to do first.
Once I've a reasonable way of testing the exit() and proc_exit() code
paths I'll try using on_proc_exit().

Neither of those relate to the actions of perl source code.
To address that, instead of calling perl_destruct() to perform a
complete destruction I could just execute END blocks and object
destructors. That would avoid executing any system-level actions.

Does perl_destruct() execute system-level actions, then?

Potentially: Kills threads it knows about (should be none), waits for
children (should be none), flushes all open *perl* file handles (should
be none for plperl), tears down PerlIO layers, etc. etc. In practice
none of that should affect the backend, but it's possible, especially
for the Windows port. Since none of that is needed it can be skipped.

If so, then it seems prudent to either audit such actions or, as you
say, call destructors directly.

The patch now just calls END blocks and DESTROY methods.

Do you have any examples of how a user could write code in a plperl END
block that would interact with the rest of the backend?

I appreciate you taking the time to look at ways to address the issues
Tom has raised, Tim. Good on you.

Thanks David. I appreciate the visible support!

Tom and the team set the bar high, rightly, so it's certainly been a
challenging introduction to PostgreSQL development!

Tim.

diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c
index 8315d5a..38f2d35 100644
*** a/src/pl/plperl/plperl.c
--- b/src/pl/plperl/plperl.c
***************
*** 27,32 ****
--- 27,33 ----
  #include "miscadmin.h"
  #include "nodes/makefuncs.h"
  #include "parser/parse_type.h"
+ #include "storage/ipc.h"
  #include "utils/builtins.h"
  #include "utils/fmgroids.h"
  #include "utils/guc.h"
*************** _PG_init(void)
*** 281,287 ****
  static void
  plperl_fini(void)
  {
! 	plperl_ending = true;
  	plperl_destroy_interp(&plperl_trusted_interp);
  	plperl_destroy_interp(&plperl_untrusted_interp);
  	plperl_destroy_interp(&plperl_held_interp);
--- 282,297 ----
  static void
  plperl_fini(void)
  {
! 	plperl_ending = true; /* disables use of spi_* functions */
! 
! 	/*
! 	 * Only perform perl cleanup if we're exiting cleanly via proc_exit().
! 	 * If proc_exit_inprogress is false then exit() was called directly
! 	 * (because we call atexit() very late, so get called early).
! 	 */
! 	if (!proc_exit_inprogress)
! 		return;
! 
  	plperl_destroy_interp(&plperl_trusted_interp);
  	plperl_destroy_interp(&plperl_untrusted_interp);
  	plperl_destroy_interp(&plperl_held_interp);
*************** plperl_destroy_interp(PerlInterpreter **
*** 595,602 ****
  {
  	if (interp && *interp)
  	{
! 		perl_destruct(*interp);
! 		perl_free(*interp);
  		*interp = NULL;
  	}
  }
--- 605,640 ----
  {
  	if (interp && *interp)
  	{
! 		/*
! 		 * Only a very minimal destruction is performed.
! 		 * Just END blocks and object destructors, no system-level actions.
! 		 * Code here extracted from perl's perl_destruct().
! 		 */
! 
! 		/* Run END blocks */
! 		if (PL_exit_flags & PERL_EXIT_DESTRUCT_END) {
! 			dJMPENV;
! 			int x = 0;
! 
! 			JMPENV_PUSH(x);
! 			PERL_UNUSED_VAR(x);
! 			if (PL_endav && !PL_minus_c)
! 				call_list(PL_scopestack_ix, PL_endav);
! 			JMPENV_POP;
! 		}
! 		LEAVE;
! 		FREETMPS;
! 
! 		PL_dirty = TRUE;
! 
! 		/* destroy objects - call DESTROY methods */
! 		if (PL_sv_objcount) {
! 			Perl_sv_clean_objs(aTHX);
! 			PL_sv_objcount = 0;
! 			if (PL_defoutgv && !SvREFCNT(PL_defoutgv))
! 				PL_defoutgv = NULL; /* may have been freed */
! 		}
! 
  		*interp = NULL;
  	}
  }
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#24)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

"David E. Wheeler" <david@kineticode.com> writes:

On Jan 27, 2010, at 1:53 PM, Robert Haas wrote:

What exactly do we mean by "system-level actions"? I mean, END blocks
can execute arbitrary code....

Yeah. In Perl. What part of Perl can access the backend systems without SPI? And that it couldn't do at any other point in runtime?

You still aren't letting go of the notion that Perl could only affect
the rest of the backend via SPI. The point I'm trying to impress on you
is that there are any number of other possible pathways, and that Perl's
historical assumption that it owns all resources of the process makes
those pathways a nontrivial hazard. Anything that Perl does to libc
state, open file handles, etc etc carries a high risk of breaking the
backend.

Now it is certainly true that any such hazards can be created just from
use of plperlu (we hope only plperlu, and not plperl ...) today,
without any use of the proposed additional features. What is bothering
me about these features is that their entire reason for existence is to
encourage people to use parts of Perl that have time-extended effects on
the process state. That means that (a) the probability of problems goes
up substantially, and (b) our ability to fix such problems goes down
substantially. Right now, the canonical approach to trying to undo
anything bad Perl does is to save/restore process state around a plperl
call. If we're trying to support usages in which Perl has time-extended
effects on process state, that solution goes out the window, and we have
to think of some other way to coexist with Perl. (Where, I note,
"coexist" means "Perl does what it damn pleases and we have to pick up
the pieces" --- we're not likely to get any cooperation on limiting
damage from that side. Nobody even suggested that we treat stomping on
setlocale state as a Perl bug, for example, rather than a fact of life
that we just had to work around however we could.)

So the real bottom line here is that I foresee this patch as being
destabilizing and requiring us to put large amounts of time into
figuring out workarounds for whatever creative things people decide to
try to do with Perl. I'd feel better about it if I thought that we
could get away with a policy of "if it breaks it's your problem", but
I do not think that will fly from a PR standpoint. It hasn't in the
past.

regards, tom lane
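
The save/restore approach described in this message can be sketched for one piece of process state, the locale. This is an illustration under stated assumptions, not the backend's actual code: `call_with_locale_restored`, `embedded_call`, and `call_that_changes_locale` are hypothetical names, and only the standard setlocale() call is relied on.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "call into the embedded interpreter". */
typedef void (*embedded_call)(void);

/*
 * Run fn() and put the process locale back afterwards.
 * Returns 0 on success, -1 if the locale could not be saved or restored.
 */
int call_with_locale_restored(embedded_call fn)
{
    const char *current;
    char       *saved;
    int         restored;

    assert(fn != NULL);

    /* Query only; the returned string may be overwritten by later calls. */
    current = setlocale(LC_ALL, NULL);
    if (current == NULL)
        return -1;

    /* Take a private copy before fn() gets a chance to change anything. */
    saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, current);

    fn();

    /* Undo whatever fn() did to the locale. */
    restored = (setlocale(LC_ALL, saved) != NULL);
    free(saved);
    return restored ? 0 : -1;
}

/* Example of a call that switches the locale to the environment's. */
void call_that_changes_locale(void)
{
    setlocale(LC_ALL, "");
}
```

The wrapper only helps when the side effect is confined to the duration of the call. Code that runs at interpreter start-up or at process exit falls outside any such bracket, which is the concern raised above about time-extended effects.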

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim Bunce (#20)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Tim Bunce <Tim.Bunce@pobox.com> writes:

Okay. I could change the callback code to ignore calls if
proc_exit_inprogress is false. So an abnormal shutdown via exit()
wouldn't involve plperl at all. (Alternatively I could use
on_proc_exit() instead of atexit() to register the callback.)

Use on_proc_exit please. I will continue to object to any attempt
to hang arbitrary processing on atexit().

An advantage of on_proc_exit from your end is that it should allow
you to not have to try to prevent the END blocks from using SPI,
as that would still be perfectly functional when your callback
gets called. (Starting a new transaction would be a good idea
though, cf Async_UnlistenOnExit.)

regards, tom lane

#28David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#26)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Jan 27, 2010, at 3:11 PM, Tom Lane wrote:

You still aren't letting go of the notion that Perl could only affect
the rest of the backend via SPI. The point I'm trying to impress on you
is that there are any number of other possible pathways, and that Perl's
historical assumption that it owns all resources of the process makes
those pathways a nontrivial hazard. Anything that Perl does to libc
state, open file handles, etc etc carries a high risk of breaking the
backend.

As could any other code that executes then, including C libraries installed from pgFoundry and loaded by a DBA.

Now it is certainly true that any such hazards can be created just from
use of plperlu (we hope only plperlu, and not plperl ...) today,
without any use of the proposed additional features. What is bothering
me about these features is that their entire reason for existence is to
encourage people to use parts of Perl that have time-extended effects on
the process state.

Well, mainly it's to avoid the overhead of loading the code except at startup.

That means that (a) the probability of problems goes
up substantially,

Why? Arbitrary code can already execute at start time. Is Perl special somehow?

and (b) our ability to fix such problems goes down
substantially.

Why is it your problem?

Right now, the canonical approach to trying to undo
anything bad Perl does is to save/restore process state around a plperl
call. If we're trying to support usages in which Perl has time-extended
effects on process state, that solution goes out the window, and we have
to think of some other way to coexist with Perl. (Where, I note,
"coexist" means "Perl does what it damn pleases and we have to pick up
the pieces" --- we're not likely to get any cooperation on limiting
damage from that side. Nobody even suggested that we treat stomping on
setlocale state as a Perl bug, for example, rather than a fact of life
that we just had to work around however we could.)

How is that different from any other code that gets loaded when the server starts, exactly?

Do, however, feel free to report Perl bugs. Just run `perlbug`.

So the real bottom line here is that I foresee this patch as being
destabilizing and requiring us to put large amounts of time into
figuring out workarounds for whatever creative things people decide to
try to do with Perl. I'd feel better about it if I thought that we
could get away with a policy of "if it breaks it's your problem", but
I do not think that will fly from a PR standpoint. It hasn't in the
past.

mod_perl has for many years. Provide lots of caveats in the documentation. Point users to it when they write in about a problem.

Truth is, the vast majority of Perl modules are pretty well-behaved. I sincerely doubt you'd hear much complaint. Have the Apache guys had to take any special steps to protect httpd from mod_perl?

Best,

David

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim Bunce (#21)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Tim Bunce <Tim.Bunce@pobox.com> writes:

On Wed, Jan 27, 2010 at 11:28:02AM -0500, Tom Lane wrote:

Really? We've found that gprof, for instance, doesn't exactly have
"zero interaction with the rest of the backend" --- there's actually
a couple of different bits in there to help it along, including a
behavioral change during shutdown. I rather doubt that Perl profilers
would turn out much different.

Devel::NYTProf (http://blog.timbunce.org/tag/nytprof/) has zero
interaction with the rest of the backend.

I don't have to read any further than the place where it says "doesn't
work if you call both plperl and plperlu" to realize that that's quite
false. Maybe we have different definitions of what a software
interaction is...

regards, tom lane

#30David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#29)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Jan 27, 2010, at 3:33 PM, Tom Lane wrote:

I don't have to read any further than the place where it says "doesn't
work if you call both plperl and plperlu" to realize that that's quite
false. Maybe we have different definitions of what a software
interaction is...

I think that dates from when plperl and plperlu couldn't co-exist, which was fixed a few months ago, n'est-ce pas?

Best,

David

#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#30)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

"David E. Wheeler" <david@kineticode.com> writes:

On Jan 27, 2010, at 3:33 PM, Tom Lane wrote:

I don't have to read any further than the place where it says "doesn't
work if you call both plperl and plperlu" to realize that that's quite
false. Maybe we have different definitions of what a software
interaction is...

I think that dates from when plperl and plperlu couldn't co-exist, which was fixed a few months ago, n'est-ce pas?

No, that was fixed years ago, at least if you have a modern Perl build
that supports multiplicity at all.

regards, tom lane

#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#28)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

"David E. Wheeler" <david@kineticode.com> writes:

On Jan 27, 2010, at 3:11 PM, Tom Lane wrote:

... Anything that Perl does to libc
state, open file handles, etc etc carries a high risk of breaking the
backend.

As could any other code that executes then, including C libraries installed from pgFoundry and loaded by a DBA.

Absolutely. The difference here is in who is going to be expected to
try to deal with any problems. When somebody says "if I do this in
plperlu, my database crashes! Postgres sux!" it's not going to help to
say "that's a Perl bug", even if an independent observer might agree.
It's going to be *our* problem, and I don't see any reason to expect
a shred of help from the Perl side.

regards, tom lane

#33David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#32)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Jan 27, 2010, at 4:10 PM, Tom Lane wrote:

Absolutely. The difference here is in who is going to be expected to
try to deal with any problems. When somebody says "if I do this in
plperlu, my database crashes! Postgres sux!" it's not going to help to
say "that's a Perl bug", even if an independent observer might agree.
It's going to be *our* problem, and I don't see any reason to expect
a shred of help from the Perl side.

Is that not the case with plperlu already?

Best,

David

#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#33)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

"David E. Wheeler" <david@kineticode.com> writes:

On Jan 27, 2010, at 4:10 PM, Tom Lane wrote:

Absolutely. The difference here is in who is going to be expected to
try to deal with any problems. When somebody says "if I do this in
plperlu, my database crashes! Postgres sux!" it's not going to help to
say "that's a Perl bug", even if an independent observer might agree.
It's going to be *our* problem, and I don't see any reason to expect
a shred of help from the Perl side.

Is that not the case with plperlu already?

Sure. Which is why I'm resisting expanding our exposure to it.

regards, tom lane

#35Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#31)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Tom Lane wrote:

"David E. Wheeler" <david@kineticode.com> writes:

On Jan 27, 2010, at 3:33 PM, Tom Lane wrote:

I don't have to read any further than the place where it says "doesn't
work if you call both plperl and plperlu" to realize that that's quite
false. Maybe we have different definitions of what a software
interaction is...

I think that dates from when plperl and plperlu couldn't co-exist, which was fixed a few months ago, n'est-ce pas?

No, that was fixed years ago, at least if you have a modern Perl build
that supports multiplicity at all.

To be strictly accurate, what we fixed years ago was that we used to run
plperl and plperlu in the same interpreter, and that caused all sorts of
information leaks, so we switched to running in two interpreters, which
in turn became a problem for perl builds that didn't define multiplicity.

The problem here is that NYTProf is apparently not multiplicity safe. I
guess the question is what would happen if you tried to load it with
both plperl and plperlu. In any case, it's a known and documented issue,
so it's not one I'd be terribly worried about.

cheers

andrew
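
The multiplicity point above can be illustrated with a small stand-alone C sketch. It is a toy model only: `ToyInterp`, `unsafe_record_call` and `safe_record_call` are hypothetical names, not part of Perl, PL/Perl or NYTProf, and the thread does not say which mechanism actually makes NYTProf unsafe. The sketch assumes one common cause, namely state kept in process-wide variables instead of per interpreter, so that two interpreters in one backend (plperl and plperlu) end up sharing it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one Perl interpreter (not the real PerlInterpreter). */
typedef struct ToyInterp
{
    const char *name;           /* e.g. "plperl" or "plperlu" */
    int         profiled_calls; /* state owned by this interpreter only */
} ToyInterp;

/* Process-wide counter, shared by every interpreter in the process. */
static int global_profiled_calls = 0;

/* Not multiplicity-safe: ignores which interpreter made the call. */
void
unsafe_record_call(ToyInterp *interp)
{
    assert(interp != NULL);
    global_profiled_calls++;
}

/* Multiplicity-safe: the count lives in the calling interpreter. */
void
safe_record_call(ToyInterp *interp)
{
    assert(interp != NULL);
    interp->profiled_calls++;
}

/* Read back the shared counter. */
int
unsafe_call_count(void)
{
    return global_profiled_calls;
}
```

With one call recorded for a plperl interpreter and two for a plperlu interpreter, the shared counter reports three calls with no way to attribute them, while the per-interpreter fields keep the two counts separate.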

#36David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#34)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Jan 27, 2010, at 4:15 PM, Tom Lane wrote:

Is that not the case with plperlu already?

Sure. Which is why I'm resisting expanding our exposure to it.

I don't understand how it's expanding core's exposure to it.

Best,

David

#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#36)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

"David E. Wheeler" <david@kineticode.com> writes:

On Jan 27, 2010, at 4:15 PM, Tom Lane wrote:

Sure. Which is why I'm resisting expanding our exposure to it.

I don't understand how it's expanding core's exposure to it.

[ shrug...] I see little point in repeating myself yet again.
It's obvious that the people who want this are entirely willing
to adopt a Pollyanna-ishly optimistic view about its potential
to cause serious problems that we may or may not be able to fix.

I don't really expect to be able to prevent something along this line
from getting committed --- I'm merely hoping to circumscribe it as much
as possible and get large WARNING items into the manual's description.

regards, tom lane

#38David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#37)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Jan 27, 2010, at 4:33 PM, Tom Lane wrote:

[ shrug...] I see little point in repeating myself yet again.
It's obvious that the people who want this are entirely willing
to adopt a Pollyanna-ishly optimistic view about its potential
to cause serious problems that we may or may not be able to fix.

Well, no. The problems you raise already exist in plperlu. And I would argue that they're worse there, as the DBA can give others permission to create PL/PerlU functions, and those users can do all kinds of crazy shit with them. on_perl_init can be set by the DBA only. Its scope is far narrower. This is *safer* than PL/PerlU, while giving more capability to PL/Perl.

I don't really expect to be able to prevent something along this line
from getting committed --- I'm merely hoping to circumscribe it as much
as possible and get large WARNING items into the manual's description.

Oh, absolutely. Your sober attention to security issues is greatly appreciated by us fanboys.

Best,

David

PS: I'm a PostgreSQL fanboy, not a Tom Lane fanboy. ;-P

#39Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#37)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Tom Lane wrote:

It's obvious that the people who want this are entirely willing
to adopt a Pollyanna-ishly optimistic view about its potential
to cause serious problems that we may or may not be able to fix.

I don't really expect to be able to prevent something along this line
from getting committed --- I'm merely hoping to circumscribe it as much
as possible and get large WARNING items into the manual's description.

Well, we seem to have got much closer to what you can live with on the
END block issue, although we took a rather roundabout route to get there.

Perhaps with a little more work we can achieve something similar on the
on_perl_init front (which, in any case, I don't regard as being as
important as the END blocks, although Tim might not agree.)

cheers

andrew

#40Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tom Lane (#29)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Wed, Jan 27, 2010 at 06:33:19PM -0500, Tom Lane wrote:

Tim Bunce <Tim.Bunce@pobox.com> writes:

On Wed, Jan 27, 2010 at 11:28:02AM -0500, Tom Lane wrote:

Really? We've found that gprof, for instance, doesn't exactly have
"zero interaction with the rest of the backend" --- there's actually
a couple of different bits in there to help it along, including a
behavioral change during shutdown. I rather doubt that Perl profilers
would turn out much different.

Devel::NYTProf (http://blog.timbunce.org/tag/nytprof/) has zero
interaction with the rest of the backend.

I don't have to read any further than the place where it says "doesn't
work if you call both plperl and plperlu" to realize that that's quite
false.

NYTProf is not, currently, multiplicity-safe. That's a limitation I
intend to fix.

Maybe we have different definitions of what a software interaction is...

Doing _anything_ in the backend is an interaction of some kind, e.g.,
shifting later memory allocations to a different address. But that's not
a very practical basis for a definition.

From what you said, quoted above, it seemed that your definition of
"interaction with the rest of the backend" was much more direct.
The specific example you gave related to the backend code needing to be
modified to support the gprof profiler. Clearly that's not the case for
NYTProf.

We're splitting hairs now.

Tim.

#41Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tom Lane (#27)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Wed, Jan 27, 2010 at 06:27:50PM -0500, Tom Lane wrote:

Tim Bunce <Tim.Bunce@pobox.com> writes:

Okay. I could change the callback code to ignore calls if
proc_exit_inprogress is false. So an abnormal shutdown via exit()
wouldn't involve plperl at all. (Alternatively I could use
on_proc_exit() instead of atexit() to register the callback.)

Use on_proc_exit please. I will continue to object to any attempt
to hang arbitrary processing on atexit().

Ok.

An advantage of on_proc_exit from your end is that it should allow
you to not have to try to prevent the END blocks from using SPI,
as that would still be perfectly functional when your callback
gets called. (Starting a new transaction would be a good idea
though, cf Async_UnlistenOnExit.)

I'm surprised that you're suggesting that END blocks should be allowed to
interact with the backend via SPI. It seems to go against what you've
said previously about code running at shutdown.

I've no use-case for that so I'm happy to leave it disabled. If someone
does have a sane use-case, please let me know. It can always be enabled later.

Tim.
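
The two options discussed above (guarding the callback with proc_exit_inprogress, or registering it through on_proc_exit() instead of atexit()) can be sketched as a stand-alone C model. This is an illustration under stated assumptions, not the patch's code and not the backend's implementation: `model_on_proc_exit`, `model_proc_exit`, `model_exit_inprogress` and `model_plperl_fini` are hypothetical stand-ins, and the real on_proc_exit() callbacks also take a Datum argument, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for backend state; NOT the real PostgreSQL definitions. */
#define MAX_EXIT_CALLBACKS 8

typedef void (*exit_callback) (int code);

static exit_callback callbacks[MAX_EXIT_CALLBACKS];
static size_t n_callbacks = 0;
static bool model_exit_inprogress = false;  /* models proc_exit_inprogress */
static int  end_blocks_run = 0;             /* how often teardown really ran */

/* Register a shutdown callback, in the spirit of on_proc_exit(). */
void
model_on_proc_exit(exit_callback cb)
{
    assert(n_callbacks < MAX_EXIT_CALLBACKS);
    callbacks[n_callbacks++] = cb;
}

/* Orderly shutdown: set the flag, then run callbacks newest-first. */
void
model_proc_exit(int code)
{
    model_exit_inprogress = true;
    while (n_callbacks > 0)
        callbacks[--n_callbacks] (code);
}

/* Hypothetical interpreter teardown (where END blocks would run). */
void
model_plperl_fini(int code)
{
    (void) code;
    /* Guard: do nothing unless an orderly shutdown is in progress. */
    if (!model_exit_inprogress)
        return;
    end_blocks_run++;
}
```

In this model, calling `model_plperl_fini` directly (as a bare atexit() handler might be invoked after an abnormal exit()) leaves `end_blocks_run` at zero, whereas registering it with `model_on_proc_exit` and then calling `model_proc_exit` runs it exactly once. Registering through the shutdown-callback list makes the guard largely redundant, since the callback is then only reached from the orderly path.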

#42Andrew Dunstan
andrew@dunslane.net
In reply to: Tim Bunce (#41)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Tim Bunce wrote:

I've no use-case for that so I'm happy to leave it disabled. If someone
does have a sane use-case, please let me know. It can always be enabled later.

As I noted upthread, there have been requests for user level session end
handlers before. With SPI enabled as Tom suggests, this would just about
buy us that for free.

But if you're uncomfortable about it we can take that up at a later date.

cheers

andrew

#43Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim Bunce (#41)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

Tim Bunce <Tim.Bunce@pobox.com> writes:

On Wed, Jan 27, 2010 at 06:27:50PM -0500, Tom Lane wrote:

An advantage of on_proc_exit from your end is that it should allow
you to not have to try to prevent the END blocks from using SPI,
as that would still be perfectly functional when your callback
gets called. (Starting a new transaction would be a good idea
though, cf Async_UnlistenOnExit.)

I'm surprised that you're suggesting that END blocks should be allowed to
interact with the backend via SPI. It seems to go against what you've
said previously about code running at shutdown.

I think you have completely misunderstood what I'm complaining about.
What I'm not happy about is executing operations at a point where
they're likely to be ill-defined because the code is in the wrong state.
In an early on_proc_exit hook, the system is for all practical purposes
still fully functional, and so I don't see a reason for an arbitrary
restriction on what the END blocks should be able to do.

(Or, to repeat myself in a different way: the no-SPI restriction is
utterly useless to guard against my real concerns anyway. I see no
point in it either here or elsewhere.)

regards, tom lane

#44Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tom Lane (#43)
Re: Add on_perl_init and proper destruction to plperl [PATCH]

On Thu, Jan 28, 2010 at 10:39:33AM -0500, Tom Lane wrote:

Tim Bunce <Tim.Bunce@pobox.com> writes:

On Wed, Jan 27, 2010 at 06:27:50PM -0500, Tom Lane wrote:

An advantage of on_proc_exit from your end is that it should allow
you to not have to try to prevent the END blocks from using SPI,
as that would still be perfectly functional when your callback
gets called. (Starting a new transaction would be a good idea
though, cf Async_UnlistenOnExit.)

I'm surprised that you're suggesting that END blocks should be allowed to
interact with the backend via SPI. It seems to go against what you've
said previously about code running at shutdown.

I think you have completely misunderstood what I'm complaining about.
What I'm not happy about is executing operations at a point where
they're likely to be ill-defined because the code is in the wrong state.
In an early on_proc_exit hook, the system is for all practical purposes
still fully functional, and so I don't see a reason for an arbitrary
restriction on what the END blocks should be able to do.

Ah, okay. I guess I missed your underlying concerns in:

http://archives.postgresql.org/message-id/26766.1263149361@sss.pgh.pa.us
For the record, [...] and I think it's a worse idea to run
arbitrary user-defined code at backend shutdown (the END-blocks bit).

(Or, to repeat myself in a different way: the no-SPI restriction is
utterly useless to guard against my real concerns anyway. I see no
point in it either here or elsewhere.)

I've left it in the updated patch I've just posted.
There are two more plperl patches in the current commitfest that I'd
like to chaperone through to commit (in some form or other) first.

Thanks for your help Tom.

Tim.