BUG #5066: plperl issues with perl_destruct() and END blocks

Started by Tim Bunce, over 16 years ago; 23 messages; bugs
#1 Tim Bunce
Tim.Bunce@pobox.com

The following bug has been logged online:

Bug reference: 5066
Logged by: Tim Bunce
Email address: Tim.Bunce@pobox.com
PostgreSQL version: 8.4.1
Operating system: darwin
Description: plperl issues with perl_destruct() and END blocks
Details:

The plperl implementation doesn't call perl_destruct() during server
shutdown.

So any resources held by references, in %_SHARED for example, are not
properly freed. The perl interpreter never gets a chance to clean up; it's
simply discarded.

Related to the above, plperl should also set PL_exit_flags |=
PERL_EXIT_DESTRUCT_END. Currently any END blocks defined during
initialization get executed at initialization (just before perl_run()
returns). Any END blocks defined later never get run.

Setting PL_exit_flags |= PERL_EXIT_DESTRUCT_END in plperl_init_interp() and
calling perl_destruct() will fix the issue.

The timing of the perl_destruct() call (i.e., early or late in the shutdown
sequence) doesn't matter much. You might want to make the spi_* functions
return an error if there's a shutdown in progress.

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tim Bunce (#1)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

"Tim Bunce" <Tim.Bunce@pobox.com> writes:

The plperl implementation doesn't call perl_destruct() during server
shutdown.

Who cares? The process is going away anyway.

regards, tom lane

#3 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#2)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

On Sat, Sep 19, 2009 at 3:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Tim Bunce" <Tim.Bunce@pobox.com> writes:

The plperl implementation doesn't call perl_destruct() during server
shutdown.

Who cares?  The process is going away anyway.

END {} blocks can execute arbitrary code. Perl users will expect them
to be executed.

...Robert

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#3)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

Robert Haas <robertmhaas@gmail.com> writes:

On Sat, Sep 19, 2009 at 3:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Tim Bunce" <Tim.Bunce@pobox.com> writes:

The plperl implementation doesn't call perl_destruct() during server
shutdown.

Who cares?  The process is going away anyway.

END {} blocks can execute arbitrary code. Perl users will expect them
to be executed.

[ shrug... ] As a database geek I find the lack of guarantees about
that to be entirely unsatisfying. What exactly are you going to put
in that block? If it's actually important, are you going to trust
a mechanism that doesn't have any crash safeness?

regards, tom lane

#5 Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tom Lane (#4)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

On Sat, Sep 19, 2009 at 11:43:26PM -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Sat, Sep 19, 2009 at 3:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Tim Bunce" <Tim.Bunce@pobox.com> writes:

The plperl implementation doesn't call perl_destruct() during server
shutdown.

Who cares?  The process is going away anyway.

END {} blocks can execute arbitrary code. Perl users will expect them
to be executed.

[ shrug... ] As a database geek I find the lack of guarantees about
that to be entirely unsatisfying. What exactly are you going to put
in that block? If it's actually important, are you going to trust
a mechanism that doesn't have any crash safeness?

Can you expand on what you mean by 'guarantees' and 'crash safeness'?

In my particular case I'm trying to enable Devel::NYTProf, the perl
source code performance profiler, to profile PL/Perl code.
See http://www.slideshare.net/Tim.Bunce/develnytprof-200907
(starting

NYTProf uses an END block to finish up the profile and write final
details to the data file. One of the first problems I ran into was that
Postgres was executing the END block before the first plperl sub was
even executed. Very counterintuitive. Since NYTProf is implemented in C
(XS) I've worked around that problem by adding an option to enable
PL_exit_flags |= PERL_EXIT_DESTRUCT_END.

But because postgres doesn't call perl_destruct() the problem has just
moved from END blocks being called too early to END blocks not being
called at all. The effect is that the profile data file is unterminated
and so corrupt and unusable.

To finish the profiling users currently have to execute a SQL statement
to trigger a plperl sub that calls an NYTProf sub that finalizes the
profile. Ideally I'd like users to be able to finish the profiling
cleanly with a shutdown (to then restart with profiling disabled).

Calling perl_destruct() during shutdown would fix that.

Tim.

p.s. As a random data point, google code search finds about 7000
perl modules containing an END block:
http://www.google.com/codesearch?as_q=%5EEND%5C+%7B&btnG=Search+Code&hl=en&as_lang=perlas_filename=%5C.pm%24&as_case=y

#6 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tim Bunce (#5)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

There's a definitional problem here however. When should we call the
destructor? My impression is that it should happen when the calling
query terminates, not when the backend shuts down. I'm sure this will
cause other issues -- for example %_SHARED will be destroyed way too
early.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#7 Tim Bunce
Tim.Bunce@pobox.com
In reply to: Alvaro Herrera (#6)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

On Sun, Sep 20, 2009 at 10:00:01PM -0400, Alvaro Herrera wrote:

There's a definitional problem here however. When should we call the
destructor? My impression is that it should happen when the calling
query terminates, not when the backend shuts down. I'm sure this will
cause other issues -- for example %_SHARED will be destroyed way too
early.

The perlmod man page says:

An "END" code block is executed as late as possible, that is, after
perl has finished running the program and just before the interpreter
is being exited, even if it is exiting as a result of a die() function.
[...]
Note that "END" code blocks are not executed at the end of a string
"eval()": if any "END" code blocks are created in a string "eval()",
they will be executed just as any other "END" code block of that
package in LIFO order just before the interpreter is being exited.

so executing at the end of query, a transaction, or a session would be
wrong. They should execute "as late as possible", "just before the
interpreter being exited".

Tim.

#8 David Fetter
david@fetter.org
In reply to: Tim Bunce (#7)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

On Mon, Sep 21, 2009 at 11:05:43AM +0100, Tim Bunce wrote:

On Sun, Sep 20, 2009 at 10:00:01PM -0400, Alvaro Herrera wrote:

There's a definitional problem here however. When should we call the
destructor? My impression is that it should happen when the calling
query terminates, not when the backend shuts down. I'm sure this will
cause other issues -- for example %_SHARED will be destroyed way too
early.

The perlmod man page says:

An "END" code block is executed as late as possible, that is, after
perl has finished running the program and just before the interpreter
is being exited, even if it is exiting as a result of a die() function.
[...]
Note that "END" code blocks are not executed at the end of a string
"eval()": if any "END" code blocks are created in a string "eval()",
they will be executed just as any other "END" code block of that
package in LIFO order just before the interpreter is being exited.

so executing at the end of query, a transaction, or a session would be
wrong. They should execute "as late as possible", "just before the
interpreter being exited".

Taken literally, that would mean, "the last action before the backend
exits," but at least to me, that sounds troubling for the same reasons
that "end of transaction" triggers do. What happens when there are
two different END blocks in a session? With connection poolers,
backends can last quite awhile. Is it OK for the END block to run
hours after the rest of the code?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#9 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: David Fetter (#8)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

David Fetter wrote:

Taken literally, that would mean, "the last action before the backend
exits," but at least to me, that sounds troubling for the same reasons
that "end of transaction" triggers do. What happens when there are
two different END blocks in a session?

The manual is clear that both are executed.

With connection poolers, backends can last quite awhile. Is it OK for
the END block to run hours after the rest of the code?

This is an interesting point -- should END blocks be called on DISCARD ALL?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#10 David Fetter
david@fetter.org
In reply to: Alvaro Herrera (#9)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

On Mon, Sep 21, 2009 at 12:06:30PM -0400, Alvaro Herrera wrote:

David Fetter wrote:

Taken literally, that would mean, "the last action before the
backend exits," but at least to me, that sounds troubling for the
same reasons that "end of transaction" triggers do. What happens
when there are two different END blocks in a session?

The manual is clear that both are executed.

So it is, but does order matter, and if so, how would PostgreSQL know?

With connection poolers, backends can last quite awhile. Is it OK
for the END block to run hours after the rest of the code?

This is an interesting point -- should END blocks be called on
DISCARD ALL?

ENOCLUE

Cheers,
David.

#11 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: David Fetter (#10)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

David Fetter wrote:

On Mon, Sep 21, 2009 at 12:06:30PM -0400, Alvaro Herrera wrote:

David Fetter wrote:

Taken literally, that would mean, "the last action before the
backend exits," but at least to me, that sounds troubling for the
same reasons that "end of transaction" triggers do. What happens
when there are two different END blocks in a session?

The manual is clear that both are executed.

So it is, but does order matter, and if so, how would PostgreSQL know?

The fine manual saith

You may have multiple "END" blocks within a file--they will execute in
reverse order of definition; that is: last in, first out (LIFO).

But then, why would we care? We just call the destructor and Perl
ensures that the blocks are called in the right order.


#12 Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#9)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

On Mon, Sep 21, 2009 at 12:06 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

David Fetter wrote:

Taken literally, that would mean, "the last action before the backend
exits," but at least to me, that sounds troubling for the same reasons
that "end of transaction" triggers do.  What happens when there are
two different END blocks in a session?

The manual is clear that both are executed.

With connection poolers, backends can last quite awhile.  Is it OK for
the END block to run hours after the rest of the code?

This is an interesting point -- should END blocks be called on DISCARD ALL?

It seems pretty reasonable that it would. The intention of DISCARD
ALL is to completely reset the entire session.

...Robert

#13 David Fetter
david@fetter.org
In reply to: Alvaro Herrera (#11)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

On Mon, Sep 21, 2009 at 01:06:17PM -0400, Alvaro Herrera wrote:

David Fetter wrote:

On Mon, Sep 21, 2009 at 12:06:30PM -0400, Alvaro Herrera wrote:

David Fetter wrote:

Taken literally, that would mean, "the last action before the
backend exits," but at least to me, that sounds troubling for
the same reasons that "end of transaction" triggers do. What
happens when there are two different END blocks in a session?

The manual is clear that both are executed.

So it is, but does order matter, and if so, how would PostgreSQL
know?

The fine manual saith

You may have multiple "END" blocks within a file--they will
execute in reverse order of definition; that is: last in, first
out (LIFO).

But then, why would we care? We just call the destructor and Perl
ensures that the blocks are called in the right order.

This is not quite what I meant. Let's say we have two or more different
PL/Perl functions executed over the course of a backend. Which one's
END block gets executed last? Do we need to warn people about this?
Generate a WARNING, even?

Cheers,
David.

#14 Robert Haas
robertmhaas@gmail.com
In reply to: David Fetter (#13)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

On Mon, Sep 21, 2009 at 2:17 PM, David Fetter <david@fetter.org> wrote:

On Mon, Sep 21, 2009 at 01:06:17PM -0400, Alvaro Herrera wrote:

David Fetter wrote:

On Mon, Sep 21, 2009 at 12:06:30PM -0400, Alvaro Herrera wrote:

David Fetter wrote:

Taken literally, that would mean, "the last action before the
backend exits," but at least to me, that sounds troubling for
the same reasons that "end of transaction" triggers do.  What
happens when there are two different END blocks in a session?

The manual is clear that both are executed.

So it is, but does order matter, and if so, how would PostgreSQL
know?

The fine manual saith

      You may have multiple "END" blocks within a file--they will
      execute in reverse order of definition; that is: last in, first
      out (LIFO).

But then, why would we care?  We just call the destructor and Perl
ensures that the blocks are called in the right order.

This is not quite what I meant.  Let's say we have two or more different
PL/Perl functions executed over the course of a backend.  Which one's
END block gets executed last?  Do we need to warn people about this?
Generate a WARNING, even?

This is a feature of the Perl language. I don't think it's our job to
second-guess the language design, however good or bad it may be. As a
long-time Perl programmer, I would certainly say that if you are
counting on the execution ordering of your END blocks, you are
probably playing with fire and likely ought to rethink your
application design, because there are all kinds of ways this could
fail spectacularly as a result of apparently innocuous application
changes (like, say, alphabetizing the list of "use" declarations in
some package). But that's true not only with PL/perl but with just
plain old perl, and I don't see that it's substantially more dangerous
here than anywhere else.

...Robert

#15 David Fetter
david@fetter.org
In reply to: Robert Haas (#14)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

On Mon, Sep 21, 2009 at 02:28:11PM -0400, Robert Haas wrote:

On Mon, Sep 21, 2009 at 2:17 PM, David Fetter <david@fetter.org> wrote:

On Mon, Sep 21, 2009 at 01:06:17PM -0400, Alvaro Herrera wrote:

David Fetter wrote:

On Mon, Sep 21, 2009 at 12:06:30PM -0400, Alvaro Herrera wrote:

David Fetter wrote:

Taken literally, that would mean, "the last action before the
backend exits," but at least to me, that sounds troubling for
the same reasons that "end of transaction" triggers do.  What
happens when there are two different END blocks in a session?

The manual is clear that both are executed.

So it is, but does order matter, and if so, how would PostgreSQL
know?

The fine manual saith

      You may have multiple "END" blocks within a file--they will
      execute in reverse order of definition; that is: last in, first
      out (LIFO).

But then, why would we care?  We just call the destructor and Perl
ensures that the blocks are called in the right order.

This is not quite what I meant.  Let's say we have two or more different
PL/Perl functions executed over the course of a backend.  Which one's
END block gets executed last?  Do we need to warn people about this?
Generate a WARNING, even?

This is a feature of the Perl language. I don't think it's our job to
second-guess the language design, however good or bad it may be. As a
long-time Perl programmer, I would certainly say that if you are
counting on the execution ordering of your END blocks, you are
probably playing with fire and likely ought to rethink your
application design, because there are all kinds of ways this could
fail spectacularly as a result of apparently innocuous application
changes (like, say, alphabetizing the list of "use" declarations in
some package). But that's true not only with PL/perl but with just
plain old perl, and I don't see that it's substantially more dangerous
here than anywhere else.

OK, we've considered it and decided it's people's own foot-gun :)

Cheers,
David.

#16 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: David Fetter (#13)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

David Fetter wrote:

On Mon, Sep 21, 2009 at 01:06:17PM -0400, Alvaro Herrera wrote:

The fine manual saith

You may have multiple "END" blocks within a file--they will
execute in reverse order of definition; that is: last in, first
out (LIFO).

But then, why would we care? We just call the destructor and Perl
ensures that the blocks are called in the right order.

This is not quite what I meant. Let's say we have two or more different
PL/Perl functions executed over the course of a backend. Which one's
END block gets executed last?

I think the manual is quite clear on this point. It talks about "files"
which we don't have, but other than that it doesn't seem like there
should be any problem.

Now that I think about it, this only affects loaded modules, not the
plperl functions themselves, right? I mean, you can't define an END
block inside a function.

Do we need to warn people about this?

I don't see why not -- in the docs, of course.

Generate a WARNING, even?

"Log spam" anyone?


#17 Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#16)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

On Mon, Sep 21, 2009 at 3:08 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

David Fetter wrote:

On Mon, Sep 21, 2009 at 01:06:17PM -0400, Alvaro Herrera wrote:

The fine manual saith

    You may have multiple "END" blocks within a file--they will
    execute in reverse order of definition; that is: last in, first
    out (LIFO).

But then, why would we care?  We just call the destructor and Perl
ensures that the blocks are called in the right order.

This is not quite what I meant.  Let's say we have two or more different
PL/Perl functions executed over the course of a backend.  Which one's
END block gets executed last?

I think the manual is quite clear on this point.  It talks about "files"
which we don't have, but other than that it doesn't seem like there
should be any problem.

Now that I think about it, this only affects loaded modules, not the
plperl functions themselves, right?  I mean, you can't define an END
block inside a function.

You might think that, but it turns out the world of Perl is crazier
than the ordinary mind can fathom.

$ perl -e 'sub foo { END { print "hi\n" } }'
hi

...Robert

#18 Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Fetter (#10)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

David Fetter <david@fetter.org> writes:

On Mon, Sep 21, 2009 at 12:06:30PM -0400, Alvaro Herrera wrote:

With connection poolers, backends can last quite awhile. Is it OK
for the END block to run hours after the rest of the code?

This is an interesting point -- should END blocks be called on
DISCARD ALL?

ENOCLUE

And in the same vein, should they be called inside a transaction,
or not? What if they fail?

I don't see any reason whatsoever that we couldn't just document this
as a Perl feature not supported in plperl. If you do something like
creating threads inside plperl, we're going to give you the raspberry
when you complain about it breaking. END blocks can perfectly well
go into the same category.

regards, tom lane

#19 Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tom Lane (#18)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

On Mon, Sep 21, 2009 at 07:30:51PM -0400, Tom Lane wrote:

David Fetter <david@fetter.org> writes:

On Mon, Sep 21, 2009 at 12:06:30PM -0400, Alvaro Herrera wrote:

With connection poolers, backends can last quite awhile. Is it OK
for the END block to run hours after the rest of the code?

Yes.

This is an interesting point -- should END blocks be called on
DISCARD ALL?

ENOCLUE

And in the same vein, should they be called inside a transaction,
or not? What if they fail?

As I said in the original ticket, I'd be quite happy for plperl END
blocks to have no access to postgres at all, other than warnings going
to the log. The spi_* functions could return an error if postgres is
being shut down (perhaps they already would if perl_destruct is called
late in the shutdown sequence). So transactions are moot.
Also, perl_destruct() will catch any exceptions from END blocks.

I don't see any reason whatsoever that we couldn't just document this
as a Perl feature not supported in plperl. If you do something like
creating threads inside plperl, we're going to give you the raspberry
when you complain about it breaking. END blocks can perfectly well
go into the same category.

Returning to my original use case, the NYTProf profiler needs END blocks
to work otherwise the generated profile data will be corrupt.

I don't see any reason not to add PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
to plperl_init_interp(), and for perl_destruct() to be called late in
the shutdown sequence.

Tim.

#20 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#18)
Re: BUG #5066: plperl issues with perl_destruct() and END blocks

On Mon, Sep 21, 2009 at 7:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Fetter <david@fetter.org> writes:

On Mon, Sep 21, 2009 at 12:06:30PM -0400, Alvaro Herrera wrote:

With connection poolers, backends can last quite awhile.  Is it OK
for the END block to run hours after the rest of the code?

This is an interesting point -- should END blocks be called on
DISCARD ALL?

ENOCLUE

And in the same vein, should they be called inside a transaction,
or not?  What if they fail?

I don't see any reason whatsoever that we couldn't just document this
as a Perl feature not supported in plperl.  If you do something like
creating threads inside plperl, we're going to give you the raspberry
when you complain about it breaking.  END blocks can perfectly well
go into the same category.

If the changes are simple, as Tim seems to believe, exactly what do we
lose by doing this?

...Robert

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#20)
#22Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#21)
#23Tim Bunce
Tim.Bunce@pobox.com
In reply to: Tom Lane (#21)