Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Started by Oleg Serov about 17 years ago · 26 messages
#1Oleg Serov
serovov@gmail.com

When a Perl function is executed for the first time in a connection it is
very slow, but executing a Perl function again (even a different one than
the first) runs about 1000 times faster. Why, and how can I optimize it?
Setting shared_preload_libraries = '$libdir/plperl' or
local_preload_libraries = '$libdir/plugins/plperl' does not help.
Code to reproduce:
-- First of all, create the functions:
CREATE OR REPLACE FUNCTION perl_test_speed() RETURNS void
    AS $body$;$body$
    LANGUAGE plperl VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;
CREATE OR REPLACE FUNCTION perl_test_speed_two() RETURNS void
    AS $body$;$body$
    LANGUAGE plperl VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;

-- Reopen the connection (THIS IS AN IMPORTANT STEP!)

-- Execute:
EXPLAIN ANALYZE SELECT perl_test_speed();
-- Result:
Result (cost=0.00..0.26 rows=1 width=0) (actual
time=149.994..149.999 rows=1 loops=1)
Total runtime: 150.363 ms -- WTF?
-- Other calls in this connection:
EXPLAIN ANALYZE SELECT perl_test_speed();
-- Result:
Result (cost=0.00..0.26 rows=1 width=0) (actual time=0.121..0.122
rows=1 loops=1)
Total runtime: 0.176 ms

-- Reopen the connection (THIS IS AN IMPORTANT STEP!)

EXPLAIN ANALYZE SELECT perl_test_speed_two();
----------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual
time=173.713..173.716 rows=1 loops=1)
Total runtime: 174.073 ms -- WTF ?

EXPLAIN ANALYZE SELECT perl_test_speed();
------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=0.973..0.975
rows=1 loops=1)
Total runtime: 1.035 ms

EXPLAIN ANALYZE SELECT perl_test_speed();
------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=0.122..0.123
rows=1 loops=1)
Total runtime: 0.171 ms
-- END;

How can I make it run faster after every reconnect?
Thanks!
PS. Sorry for my English.

#2Andrew Dunstan
andrew@dunslane.net
In reply to: Oleg Serov (#1)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Oleg Serov wrote:

When a Perl function is executed for the first time in a connection it is
very slow, but executing a Perl function again (even a different one than
the first) runs about 1000 times faster. Why, and how can I optimize it?
Setting shared_preload_libraries = '$libdir/plperl' or
local_preload_libraries = '$libdir/plugins/plperl' does not help.

The function is recompiled on the first call in each backend. (The same
is true of most PLs, BTW, it's not a perl-only problem.) There is no
immediate cure, unfortunately. Using pooled connections might help to
mitigate the effect.

One might imagine providing for certain functions to be loaded on
backend startup, or even (where we aren't using EXEC_BACKEND) on
postmaster start. But that would be a new feature, and we are past
feature freeze for the upcoming release, so it's not going to happen any
time soon.

cheers

andrew

#3Oleg Serov
serovov@gmail.com
In reply to: Andrew Dunstan (#2)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Why doesn't pl/pgsql have this effect?


#4Oleg Serov
serovov@gmail.com
In reply to: Oleg Serov (#3)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Hey, you are wrong: the compile time is 1 ms, not 100 ms. It happens on
the first plperl function call; it is a Perl initialization problem, not
function compilation.


#5Andrew Dunstan
andrew@dunslane.net
In reply to: Oleg Serov (#4)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Oleg Serov wrote:

Hey, you are wrong: the compile time is 1 ms, not 100 ms. It happens on
the first plperl function call; it is a Perl initialization problem, not
function compilation.

Please do NOT top-answer - it makes threads totally unreadable.

Is a one-off cost per connection of 100ms so bothersome?

The alternative would be to start it up when the backend starts, but
backend startup cost is already high, so that doesn't seem like a win.

cheers

andrew

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#5)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Andrew Dunstan <andrew@dunslane.net> writes:

Is a one-off cost per connection of 100ms so bothersome?

I think the complaint is that there's no obvious reason for it to be so
high.

On my Fedora 9 machine, the overhead to start plperl seems to be about
40 msec. This compares unfavorably to the time to start perl from the
command line, which is under 4 msec. I see that /usr/bin/perl pulls in
libperl.so, so it's paying the same shlib overhead as we do. How is it
that we take ten times longer to start up?

regards, tom lane

#7Martijn van Oosterhout
kleptog@svana.org
In reply to: Andrew Dunstan (#5)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

On Sun, Nov 16, 2008 at 09:15:28AM -0500, Andrew Dunstan wrote:

Oleg Serov wrote:

Hey, you are wrong: the compile time is 1 ms, not 100 ms. It happens on
the first plperl function call; it is a Perl initialization problem, not
function compilation.

Please do NOT top-answer - it makes threads totally unreadable.

Is a one-off cost per connection of 100ms so bothersome?

The alternative would be to start it up when the backend starts, but
backend startup cost is already high, so that doesn't seem like a win.

If per-connection costs are a problem, then connection pooling usually
helps.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


Please line up in a tree and maintain the heap invariant while
boarding. Thank you for flying nlogn airlines.
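
Concretely, the pooling Martijn suggests would typically sit in front of the server. A minimal, hypothetical PgBouncer configuration (database name, addresses, and auth mode invented for illustration) that reuses server backends, so the plperl startup cost is paid once per pooled backend rather than once per client connection:

```ini
; pgbouncer.ini -- illustrative sketch, not from the thread
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = trust
; session pooling keeps one server connection per client session,
; so repeated client connects reuse an already-initialized backend
pool_mode = session
```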

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#6)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

I wrote:

On my Fedora 9 machine, the overhead to start plperl seems to be about
40 msec. This compares unfavorably to the time to start perl from the
command line, which is under 4 msec. I see that /usr/bin/perl pulls in
libperl.so, so it's paying the same shlib overhead as we do. How is it
that we take ten times longer to start up?

The above number was for plperl in a SQL_ASCII database.
Some more data points:

            plperl   plperlu   (msec)

SQL_ASCII     40       18
UTF8          67       18

which leads to the conclusion that those random little startup things
plperl does are just unbelievably expensive.

regards, tom lane

#9Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#8)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Tom Lane wrote:

I wrote:

On my Fedora 9 machine, the overhead to start plperl seems to be about
40 msec. This compares unfavorably to the time to start perl from the
command line, which is under 4 msec. I see that /usr/bin/perl pulls in
libperl.so, so it's paying the same shlib overhead as we do. How is it
that we take ten times longer to start up?

The above number was for plperl in a SQL_ASCII database.
Some more data points:

            plperl   plperlu   (msec)

SQL_ASCII     40       18
UTF8          67       18

which leads to the conclusion that those random little startup things
plperl does are just unbelievably expensive.

I suspect much of this time is taken in loading perl modules (strict,
Safe, utf8, SPI, among others), not in anything very special that we do:

[andrew@constanza plperl]$ time perl -e ';'

real 0m0.004s
user 0m0.003s
sys 0m0.001s
[andrew@constanza plperl]$ time perl -e 'use strict; use Safe;use utf8;'

real 0m0.053s
user 0m0.016s
sys 0m0.007s

cheers

andrew

#10Oleg Serov
serovov@gmail.com
In reply to: Andrew Dunstan (#9)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

So, how should I optimize it?


#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#9)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

... which leads to the conclusion that those random little startup things
plperl does are just unbelievably expensive.

I suspect much of this time is taken in loading perl modules, (strict,
Safe, utf8, SPI among others), not in anything very special that we do.:

I did some more poking and determined that on my machine, it takes about
14 msec to do plperl_init_interp and about 20 msec to do
plperl_safe_init.

Most of the former time (over 9 msec) is going into the perl_parse call;
I'm not entirely sure why it's so expensive. Taking out the SPI module
initialization does *not* save anything worth noticing. I tried
trimming various bits of the PERLBOOT script and the only thing that
made a dent was removing "use vars qw(%_SHARED);", but that was still
not much (about 5 msec).

In plperl_safe_init, it's entirely clear that "require Safe" is the
expensive part; I can duplicate a comparable time expenditure when doing
that from the command line.

So about the only real answer is going to be preloading. It seems worth
considering that on machines where can_run_two is true, we should just
go ahead and initialize both interps at _PG_init time, so as to allow
the "require Safe" overhead to be bought back by preloading. I'm not
sure how to deal with the UTF8 hack though.

regards, tom lane

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Oleg Serov (#10)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

"Oleg Serov" <serovov@gmail.com> writes:

So, how i must optimize it?

The short-term answer seems to be "preload and use plperlu".

regards, tom lane
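
A sketch of what that short-term workaround could look like (the preload setting is the one from the first message; plperlu skips the expensive "require Safe" step, but note it is an untrusted language, so only superusers can create functions in it):

```sql
-- postgresql.conf (restart required):
--   shared_preload_libraries = '$libdir/plperl'

-- Declare the function in the untrusted language instead:
CREATE OR REPLACE FUNCTION perl_test_speed() RETURNS void
    AS $body$;$body$ LANGUAGE plperlu;
```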

#13Oleg Serov
serovov@gmail.com
In reply to: Tom Lane (#12)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

We need to use shared memory for passing one BIGINT value (a session
value used throughout triggers); can you advise the method to get/set it
with the best performance?


#14Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#11)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Tom Lane wrote:

So about the only real answer is going to be preloading. It seems worth
considering that on machines where can_run_two is true, we should just
go ahead and initialize both interps at _PG_init time, so as to allow
the "require Safe" overhead to be bought back by preloading.

Even if only one language is defined?

cheers

andrew

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#14)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

So about the only real answer is going to be preloading. It seems worth
considering that on machines where can_run_two is true, we should just
go ahead and initialize both interps at _PG_init time, so as to allow
the "require Safe" overhead to be bought back by preloading.

Even if only one language is defined?

The point here is to do the work at postmaster start time. You won't
get a chance to find out whether both languages are defined in some
database or other. (This is the same thing as the point about the
UTF8 hack --- you can't tell if it's needed or not.)

regards, tom lane

#16Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#15)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

So about the only real answer is going to be preloading. It seems worth
considering that on machines where can_run_two is true, we should just
go ahead and initialize both interps at _PG_init time, so as to allow
the "require Safe" overhead to be bought back by preloading.

Even if only one language is defined?

The point here is to do the work at postmaster start time. You won't
get a chance to find out whether both languages are defined in some
database or other. (This is the same thing as the point about the
UTF8 hack --- you can't tell if it's needed or not.)

w.r.t. UTF8, I guess we'll need a way of knowing if we're preloading or
not, and if so we'd need to skip the calls to GetDatabaseEncoding().

I assume this will all happen in the 8.5 dev cycle (or later).

cheers

andrew

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#16)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

The point here is to do the work at postmaster start time. You won't
get a chance to find out whether both languages are defined in some
database or other. (This is the same thing as the point about the
UTF8 hack --- you can't tell if it's needed or not.)

w.r.t. UTF8, I guess we'll need a way of knowing if we're preloading or
not, and if so we'd need to skip the calls to GetDatabaseEncoding().

If you mean that you want to still do the UTF8 hack at backend start
time, I think you might as well not bother with any of this. That hack
in itself is slow enough to be 50% of the problem (presumably because
it's loading some modules down in there somewhere). We need to arrange
to either not do it at all or do it at preload.

regards, tom lane

#18Hannu Krosing
hannu@krosing.net
In reply to: Oleg Serov (#13)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

On Sun, 2008-11-16 at 23:20 +0300, Oleg Serov wrote:

We need to use shared memory for passing one BIGINT value (a session
value used throughout triggers); can you advise the method to get/set it
with the best performance?

Have you tried "setval(seq, value)" to set it and "select last_value from
seq" to read it?

Or just use a table, if concurrency control is important:

hannu=# create table shvaltable(id int, value bigint);
CREATE TABLE
Time: 34.704 ms
hannu=# insert into shvaltable values(1,0);
INSERT 0 1
Time: 0.742 ms
hannu=# explain analyse update shvaltable set value=47 where id=1;
                        QUERY PLAN
-------------------------------------------------------------------------------------------------------
Seq Scan on shvaltable (cost=0.00..34.25 rows=10 width=10) (actual
time=0.015..0.017 rows=1 loops=1)
Filter: (id = 1)
Total runtime: 0.073 ms
(3 rows)

Time: 3.503 ms
hannu=# explain analyse update shvaltable set value=47 where id=1;
                        QUERY PLAN
-------------------------------------------------------------------------------------------------------
Seq Scan on shvaltable (cost=0.00..34.25 rows=10 width=10) (actual
time=0.014..0.016 rows=1 loops=1)
Filter: (id = 1)
Total runtime: 0.058 ms
(3 rows)

Time: 3.298 ms
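
For completeness, the sequence variant mentioned at the top of this message might look like this (sequence name invented):

```sql
CREATE SEQUENCE shval;
SELECT setval('shval', 47);      -- set the shared value
SELECT last_value FROM shval;    -- read it, from any session
```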


#19Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#17)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

The point here is to do the work at postmaster start time. You won't
get a chance to find out whether both languages are defined in some
database or other. (This is the same thing as the point about the
UTF8 hack --- you can't tell if it's needed or not.)

w.r.t. UTF8, I guess we'll need a way of knowing if we're preloading or
not, and if so we'd need to skip the calls to GetDatabaseEncoding().

If you mean that you want to still do the UTF8 hack at backend start
time, I think you might as well not bother with any of this. That hack
in itself is slow enough to be 50% of the problem (presumably because
it's loading some modules down in there somewhere). We need to arrange
to either not do it at all or do it at preload.

I don't mean that we should do it at backend start even if we are
preloading. But you're not intending to make preloading compulsory, are
you?

The whole cost of loading that has been complained about is 1/10 of a
second per backend (or maybe more if you use both plperl and plperlu).
I'm fine with providing a preloading solution, but plenty of us who use
plperl from time to time are not disturbed by the current cost,
especially when it's used in web apps that use connection pooling, where
it disappears entirely in the noise.

Does preloading work at all in the EXEC_BACKEND case?

cheers

andrew

#20Andrew Dunstan
andrew@dunslane.net
In reply to: Hannu Krosing (#18)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Hannu Krosing wrote:

On Sun, 2008-11-16 at 23:20 +0300, Oleg Serov wrote:

We need to use shared memory for passing one BIGINT value (a session
value used throughout triggers); can you advise the method to get/set it
with the best performance?

have you tried "setval(seq, value)" to set and "select last_value from
seq" to read it.

or just use a table, if concurrency control is important

Neither of these would be the same as using plperl's %_SHARED, which I
gather is what is being done. But using a temp table would be.

cheers

andrew
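
For reference, the %_SHARED mechanism being discussed looks like this; it is documented plperl behavior, but the hash is private to each backend's interpreter, not shared across connections (function names invented):

```sql
CREATE OR REPLACE FUNCTION set_shared_val(bigint) RETURNS void AS $$
    $_SHARED{val} = $_[0];    # visible only within this backend
$$ LANGUAGE plperl;

CREATE OR REPLACE FUNCTION get_shared_val() RETURNS bigint AS $$
    return $_SHARED{val};
$$ LANGUAGE plperl;
```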

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#19)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Andrew Dunstan <andrew@dunslane.net> writes:

I don't mean that we should do it at backend start even if we are
preloading. But you're not intending to make preloading compulsory, are
you?

No, certainly not. But people who are doing preloading should get as
much benefit out of doing so as possible.

Does preloading work at all in the EXEC_BACKEND case?

No.

regards, tom lane

#22Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#15)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

The point here is to do the work at postmaster start time.

What happens if I upgrade one of the perl modules that is getting
preloaded here? It seems this change will require the entire database
to be shut down to pick up the changes, vs. just needing to terminate
individual backends.

...Robert

#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#22)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

"Robert Haas" <robertmhaas@gmail.com> writes:

The point here is to do the work at postmaster start time.

What happens if I upgrade one of the perl modules that is getting
preloaded here? It seems this change will require the entire database
to be shut down to pick up the changes, vs. just needing to terminate
individual backends.

Well, that would be one of the tradeoffs to consider when deciding
to preload plperl.

regards, tom lane

#24Hannu Krosing
hannu@2ndQuadrant.com
In reply to: Andrew Dunstan (#20)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

On Mon, 2008-11-17 at 12:37 -0500, Andrew Dunstan wrote:

Hannu Krosing wrote:

On Sun, 2008-11-16 at 23:20 +0300, Oleg Serov wrote:

We need to use shared memory for passing one BIGINT value (a session
value used throughout triggers); can you advise the method to get/set it
with the best performance?

have you tried "setval(seq, value)" to set and "select last_value from
seq" to read it.

or just use a table, if concurrency control is important

Neither of these would be the same as using plperl's %_SHARED, which I
gather is what is being done.

He asked for a method to get/set a shared value "with best performance"
and updating a table row seems about twice as fast as calling a plperl
function.

But using a temp table would be.

You can't access a temp table from several sessions, and cross-session
access is what I think they want.

--
------------------------------------------
Hannu Krosing http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
Services, Consulting and Training

#25Andrew Dunstan
andrew@dunslane.net
In reply to: Hannu Krosing (#24)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Hannu Krosing wrote:

Neither of these would be the same as using plperl's %_SHARED, which I
gather is what is being done.

He asked for a method to get/set a shared value "with best performance"
and updating a table row seems about twice as fast as calling a plperl
function.

But using a temp table would be.

You can't access temp table from several sessions, which is what I think
they want.

Well, neither is %_SHARED shared across backends, so if that's what they
want I'm wondering what exactly they are doing. Maybe using plperlu to
talk to some shared memory, or memcached, or some such.

cheers

andrew
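
Andrew's memcached idea might look roughly like this (entirely hypothetical: assumes plperlu, a local memcached, and the Cache::Memcached CPAN module are available):

```sql
CREATE OR REPLACE FUNCTION memcached_set_val(bigint) RETURNS void AS $$
    use Cache::Memcached;
    # A real implementation would cache the client handle in %_SHARED
    # instead of reconnecting on every call.
    my $memd = Cache::Memcached->new({ servers => ['127.0.0.1:11211'] });
    $memd->set('shared_val', $_[0]);
$$ LANGUAGE plperlu;
```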

#26Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#21)
Re: Pl/Perl function: Speed of the First time executing pl/perl function in connection;

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

I don't mean that we should do it at backend start even if we are
preloading. But you're not intending to make preloading compulsory, are
you?

No, certainly not. But people who are doing preloading should get as
much benefit out of doing so as possible.

OK, I think we're in furious agreement.

cheers

andrew