why does plperl cache functions using just a bool for is_trigger

Started by Jan Urbańskiover 15 years ago37 messageshackers
Jump to latest
#1Jan Urbański
wulczer@wulczer.org

I see that plperl uses a triple of (function oid, is_trigger flag, user
id) as a hash key for caching compiled functions. OTOH pltcl and plpgsql
both use (oid, trigger relation oid, user id). Is there any reason why
just using a bool as plperl does would be wrong?

I'm trying to write a validator function for plpython and I'm not sure
if I should copy plperl's or plpgsql's logic.

Cheers,
Jan

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Urbański (#1)
Re: why does plperl cache functions using just a bool for is_trigger

=?UTF-8?B?SmFuIFVyYmHFhHNraQ==?= <wulczer@wulczer.org> writes:

I see that plperl uses a triple of (function oid, is_trigger flag, user
id) as a hash key for caching compiled functions. OTOH pltcl and plpgsql
both use (oid, trigger relation oid, user id). Is there any reason why
just using a bool as plperl does would be wrong?

plpgsql needs to consider the trigger relation OID because datatypes of
the trigger relation's columns will make their way into cached plans
for the function. The same function, if applied as a trigger on two
different rels, could therefore have different cached plans so it needs
two separate cache entries.

In PLs where this kind of dependency isn't possible, there's no need for
separate function cache entries.

I'm not certain that plperl is actually correct to do it that way,
but that's the basic idea.

regards, tom lane

#3Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#2)
Re: why does plperl cache functions using just a bool for is_trigger

On 10/24/2010 06:44 PM, Tom Lane wrote:

=?UTF-8?B?SmFuIFVyYmHFhHNraQ==?=<wulczer@wulczer.org> writes:

I see that plperl uses a triple of (function oid, is_trigger flag, user
id) as a hash key for caching compiled functions. OTOH pltcl and plpgsql
both use (oid, trigger relation oid, user id). Is there any reason why
just using a bool as plperl does would be wrong?

plpgsql needs to consider the trigger relation OID because datatypes of
the trigger relation's columns will make their way into cached plans
for the function. The same function, if applied as a trigger on two
different rels, could therefore have different cached plans so it needs
two separate cache entries.

In PLs where this kind of dependency isn't possible, there's no need for
separate function cache entries.

I'm not certain that plperl is actually correct to do it that way,
but that's the basic idea.

Why do we need the is_trigger flag at all for the plperl hash key? At
first glance it strikes me as unnecessary.

cheers

andrew

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#3)
Re: why does plperl cache functions using just a bool for is_trigger

Andrew Dunstan <andrew@dunslane.net> writes:

On 10/24/2010 06:44 PM, Tom Lane wrote:

I'm not certain that plperl is actually correct to do it that way,
but that's the basic idea.

Why do we need the is_trigger flag at all for the plperl hash key? At
first glance it strikes me as unnecessary.

We might not. Does the presence or absence of the $_TD hash reference
have any impact on what we cache, or what Perl might cache internally?

regards, tom lane

#5Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#4)
Re: why does plperl cache functions using just a bool for is_trigger

On 10/24/2010 07:50 PM, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

On 10/24/2010 06:44 PM, Tom Lane wrote:

I'm not certain that plperl is actually correct to do it that way,
but that's the basic idea.

Why do we need the is_trigger flag at all for the plperl hash key? At
first glance it strikes me as unnecessary.

We might not. Does the presence or absence of the $_TD hash reference
have any impact on what we cache, or what Perl might cache internally?

For both trigger and non-trigger functions, we compile this ahead of the
user-set function code:

our $_TD; local $_TD=shift;

Non-trigger functions get passed "undef" to correspond to this invisible
argument, while trigger functions get passed the hashref that the
trigger calling code has set up.

cheers

andrew

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#5)
Re: why does plperl cache functions using just a bool for is_trigger

Andrew Dunstan <andrew@dunslane.net> writes:

On 10/24/2010 07:50 PM, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

Why do we need the is_trigger flag at all for the plperl hash key? At
first glance it strikes me as unnecessary.

We might not. Does the presence or absence of the $_TD hash reference
have any impact on what we cache, or what Perl might cache internally?

For both trigger and non-trigger functions, we compile this ahead of the
user-set function code:
our $_TD; local $_TD=shift;
Non-trigger functions get passed "undef" to correspond to this invisible
argument, while trigger functions get passed the hashref that the
trigger calling code has set up.

Seems like we don't need it then. You going to get rid of it?

regards, tom lane

#7Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#6)
Re: why does plperl cache functions using just a bool for is_trigger

On 10/24/2010 09:34 PM, Tom Lane wrote:

For both trigger and non-trigger functions, we compile this ahead of the
user-set function code:
our $_TD; local $_TD=shift;
Non-trigger functions get passed "undef" to correspond to this invisible
argument, while trigger functions get passed the hashref that the
trigger calling code has set up.

Seems like we don't need it then. You going to get rid of it?

Ok, will do.

cheers

andrew

#8Jan Urbański
wulczer@wulczer.org
In reply to: Andrew Dunstan (#7)
Re: why does plperl cache functions using just a bool for is_trigger

On 25/10/10 03:59, Andrew Dunstan wrote:

On 10/24/2010 09:34 PM, Tom Lane wrote:

For both trigger and non-trigger functions, we compile this ahead of the
user-set function code:
our $_TD; local $_TD=shift;
Non-trigger functions get passed "undef" to correspond to this invisible
argument, while trigger functions get passed the hashref that the
trigger calling code has set up.

Seems like we don't need it then. You going to get rid of it?

Ok, will do.

Seems that this circumverts some output conversion error checking, since
adding the attached to the regression suite results in a segfault during
the plperl installcheck.

Reverting 2d01ec0708d571eef926f3f5795aa73759df5d9a fixes it. Noticed
while fooling around with plpython and hitting a similar error (since
plpython does have a regression test for trigger functions being called
directly).

Cheers,
Jan

Attachments:

plperl-regression.difftext/x-patch; name=plperl-regression.diffDownload+12-0
#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Urbański (#8)
Re: why does plperl cache functions using just a bool for is_trigger

=?UTF-8?B?SmFuIFVyYmHFhHNraQ==?= <wulczer@wulczer.org> writes:

Seems that this circumverts some output conversion error checking, since
adding the attached to the regression suite results in a segfault during
the plperl installcheck.

Reverting 2d01ec0708d571eef926f3f5795aa73759df5d9a fixes it.

Good catch, patch reverted (and regression test added).

regards, tom lane

#10Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#9)
Re: why does plperl cache functions using just a bool for is_trigger

On 10/31/2010 11:44 AM, Tom Lane wrote:

=?UTF-8?B?SmFuIFVyYmHFhHNraQ==?=<wulczer@wulczer.org> writes:

Seems that this circumverts some output conversion error checking, since
adding the attached to the regression suite results in a segfault during
the plperl installcheck.
Reverting 2d01ec0708d571eef926f3f5795aa73759df5d9a fixes it.

Good catch, patch reverted (and regression test added).

Well, I guess that answers the question of why we needed it, which
nobody could answer before. I'm not sure I exactly understand what's
going on here, though - I guess I need to look at it closer. At least I
think we need a code comment on why the trigger flag is needed as part
of the hash key.

cheers

andrew

#11Alex Hunsaker
badalex@gmail.com
In reply to: Andrew Dunstan (#10)
Re: why does plperl cache functions using just a bool for is_trigger

On Sun, Oct 31, 2010 at 12:00, Andrew Dunstan <andrew@dunslane.net> wrote:

On 10/31/2010 11:44 AM, Tom Lane wrote:

Good catch, patch reverted (and regression test added).

Well, I guess that answers the question of why we needed it, which nobody
could answer before. I'm not sure I exactly understand what's going on here,
though - I guess I need to look at it closer. At least I think we need a
code comment on why the trigger flag is needed as part of the hash key.

The stack trace is:
#0 0x0000000000000000 in ?? ()
#1 0x00000000006c18e9 in InputFunctionCall (flinfo=0x2a039a0,
str=0x0, typioparam=0, typmod=-1)
#2 0x00007ff6d2bdf950 in plperl_func_handler (fcinfo=0x7fff4743bec0)
at plperl.c:1729

which happens because prodesc->result_in_func.fn_addr (flinfo) is
NULL. That happens because when we are a trigger we don't setup
input/output conversion. And with the change we get the same
proc_desc for triggers and non triggers, so if the trigger function
gets called first, any call to the direct function will use the same
proc_desc with the wrong input/out conversion.

There is a check so that trigger functions can not be called as plain
functions, but it only gets called when we do not have an entry in
plperl_proc_hash. I think just moving that up, something the like the
attached should be enough. Thoughts?

Attachments:

plperl_rem_proc_key_trig.patchtext/x-patch; charset=US-ASCII; name=plperl_rem_proc_key_trig.patchDownload+13-16
#12Andrew Dunstan
andrew@dunslane.net
In reply to: Alex Hunsaker (#11)
Re: why does plperl cache functions using just a bool for is_trigger

On 10/31/2010 04:40 PM, Alex Hunsaker wrote:

On Sun, Oct 31, 2010 at 12:00, Andrew Dunstan<andrew@dunslane.net> wrote:

On 10/31/2010 11:44 AM, Tom Lane wrote:

Good catch, patch reverted (and regression test added).

Well, I guess that answers the question of why we needed it, which nobody
could answer before. I'm not sure I exactly understand what's going on here,
though - I guess I need to look at it closer. At least I think we need a
code comment on why the trigger flag is needed as part of the hash key.

The stack trace is:
#0 0x0000000000000000 in ?? ()
#1 0x00000000006c18e9 in InputFunctionCall (flinfo=0x2a039a0,
str=0x0, typioparam=0, typmod=-1)
#2 0x00007ff6d2bdf950 in plperl_func_handler (fcinfo=0x7fff4743bec0)
at plperl.c:1729

which happens because prodesc->result_in_func.fn_addr (flinfo) is
NULL. That happens because when we are a trigger we don't setup
input/output conversion. And with the change we get the same
proc_desc for triggers and non triggers, so if the trigger function
gets called first, any call to the direct function will use the same
proc_desc with the wrong input/out conversion.

How does that happen given that the function Oid is part of the hash key?

There is a check so that trigger functions can not be called as plain
functions, but it only gets called when we do not have an entry in
plperl_proc_hash. I think just moving that up, something the like the
attached should be enough.

This looks like the right fix, though.

cheers

andrew

#13Alex Hunsaker
badalex@gmail.com
In reply to: Andrew Dunstan (#12)
Re: why does plperl cache functions using just a bool for is_trigger

On Sun, Oct 31, 2010 at 15:17, Andrew Dunstan <andrew@dunslane.net> wrote:

On 10/31/2010 04:40 PM, Alex Hunsaker wrote:

And with the change we get the same
proc_desc for triggers and non triggers, so if the trigger function
gets called first, any call to the direct function will use the same
proc_desc with the wrong input/out conversion.

How does that happen given that the function Oid is part of the hash key?

They are the same function and so have the same Oid ?

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#12)
Re: why does plperl cache functions using just a bool for is_trigger

Andrew Dunstan <andrew@dunslane.net> writes:

On 10/31/2010 04:40 PM, Alex Hunsaker wrote:

which happens because prodesc->result_in_func.fn_addr (flinfo) is
NULL. That happens because when we are a trigger we don't setup
input/output conversion. And with the change we get the same
proc_desc for triggers and non triggers, so if the trigger function
gets called first, any call to the direct function will use the same
proc_desc with the wrong input/out conversion.

How does that happen given that the function Oid is part of the hash key?

I think the crash is dependent on the fact that the function is created
and called in the same session. That means the validator gets called on
it first, and the validator not unreasonably assumes istrigger = true,
and then it calls compile_plperl_function which sets up a cache entry
on that basis. So then when the "regular" call comes along, it tries
to reuse the cache entry in the other style. Kaboom.

There is a check so that trigger functions can not be called as plain
functions, but it only gets called when we do not have an entry in
plperl_proc_hash. I think just moving that up, something the like the
attached should be enough.

This looks like the right fix, though.

No, that is just moving a test that only needs to be done once into a
place where it has to be done every time. You might as well argue that
we shouldn't cache any of the setup but just redo it all every time.

The fundamental issue here is that the contents of plperl_proc_desc
structs are different between the trigger and non-trigger cases.
Unless you're prepared to make them the same, and guarantee that they
always will be the same in future, I think that including the istrigger
flag in the hash key is a good safety feature. It's also the same way
that the other three PLs do things, and I see no good excuse for plperl
to do things differently here.

IOW, it's not broke, let's not fix it.

regards, tom lane

#15Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#14)
Re: why does plperl cache functions using just a bool for is_trigger

On 11/01/2010 11:28 AM, Tom Lane wrote:

The fundamental issue here is that the contents of plperl_proc_desc
structs are different between the trigger and non-trigger cases.
Unless you're prepared to make them the same, and guarantee that they
always will be the same in future, I think that including the istrigger
flag in the hash key is a good safety feature. It's also the same way
that the other three PLs do things, and I see no good excuse for plperl
to do things differently here.

IOW, it's not broke, let's not fix it.

Ok, then let's make a note in the code to this effect. When the question
was asked before about why it was there nobody seemed to have any good
answer.

cheers

andrew

#16Alex Hunsaker
badalex@gmail.com
In reply to: Tom Lane (#14)
Re: why does plperl cache functions using just a bool for is_trigger

On Mon, Nov 1, 2010 at 09:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think the crash is dependent on the fact that the function is created
and called in the same session.  That means the validator gets called on
it first, and the validator not unreasonably assumes istrigger = true,
and then it calls compile_plperl_function which sets up a cache entry
on that basis.  So then when the "regular" call comes along, it tries
to reuse the cache entry in the other style.  Kaboom.

The other Kaboom happens if the trigger gets called as a trigger first
and then directly.

There is a check so that trigger functions can not be called as plain
functions... I think just moving that up...

No, that is just moving a test that only needs to be done once into a
place where it has to be done every time.  You might as well argue that
we shouldn't cache any of the setup but just redo it all every time.

Huh? I might try and argue that if the new test was more complex than
2 compares :P. In-fact the way it stands now we uselessly grab the
functions pg_proc entry in the common case. Would you object to
trying to clean that up across all pls? Im thinking add an is_trigger
or context to each proc_desc, then check that in the corresponding
handlers. While im at it get rid of at least one SysCache lookup.
Thoughts? We can still keep the is_trigger bool in the hash key, as
you said below it is a good safety feature. I just wish the logic was
spelled out a bit more. Maybe im alone here.

It's also the same way
that the other three PLs do things, and I see no good excuse for plperl
to do things differently here.

Im all in favor of keeping things between the pls as close as possible.

Speaking of which, pltcl stores the trigger reloid instead of a flag
(it also uses tg_reloid in the internal proname). It seems a tad
excessive to have one function *per* trigger table. I looked through
the history to see if there was some reason, it goes all the way back
to the initial commit. I assume its this way because it copied
plpgsql, which needs it as the rowtype might be different per table.
pltcl should not have that issue. Find attached a patch to clean that
up and make it match the other pls (err um plperl). It passes its
regression tests and some additional limited testing. Thoughts?

Attachments:

pltcl_rm_tgrelod_key.patchtext/x-patch; charset=US-ASCII; name=pltcl_rm_tgrelod_key.patchDownload+20-20
#17Alex Hunsaker
badalex@gmail.com
In reply to: Alex Hunsaker (#16)
Re: why does plperl cache functions using just a bool for is_trigger

On Mon, Nov 1, 2010 at 15:24, Alex Hunsaker <badalex@gmail.com> wrote:
houldn't cache any of the setup but just redo it all every time.

Huh?  I might try and argue that if the new test was more complex than
2 compares :P.  In-fact the way it stands now we uselessly grab the
functions pg_proc entry in the common case.

This is bogus, I missed the fact that we need it to make sure the
function is uptodate for the OR REPLACE in CREATE OR REPLACE.

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alex Hunsaker (#16)
Re: why does plperl cache functions using just a bool for is_trigger

Alex Hunsaker <badalex@gmail.com> writes:

Speaking of which, pltcl stores the trigger reloid instead of a flag
(it also uses tg_reloid in the internal proname). It seems a tad
excessive to have one function *per* trigger table. I looked through
the history to see if there was some reason, it goes all the way back
to the initial commit. I assume its this way because it copied
plpgsql, which needs it as the rowtype might be different per table.
pltcl should not have that issue. Find attached a patch to clean that
up and make it match the other pls (err um plperl). It passes its
regression tests and some additional limited testing. Thoughts?

Surely, removing the internal name's dependency on the istrigger flag is
wrong. If you're going to maintain separate hash entries at the pltcl
level, why would you want to risk collisions underneath that?

regards, tom lane

#19Alex Hunsaker
badalex@gmail.com
In reply to: Tom Lane (#18)
Re: why does plperl cache functions using just a bool for is_trigger

On Mon, Nov 1, 2010 at 16:59, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alex Hunsaker <badalex@gmail.com> writes:

Speaking of which, pltcl stores the trigger reloid instead of a flag
(it also uses tg_reloid in the internal proname).  It seems a tad
excessive to have one function *per* trigger table.

Surely, removing the internal name's dependency on the istrigger flag is
wrong.  If you're going to maintain separate hash entries at the pltcl
level, why would you want to risk collisions underneath that?

Good catch. I was basing it off plperl which uses the same proname
for both (sprintf(subname, %s__%u", prodesc->proname, fn_oid)). Its
OK for plperl because when we compile we save a reference to it and
use that directly (more or less). The name does not really matter.

Attachments:

pltcl_rm_tgrelod_key_v2.patchtext/x-patch; charset=US-ASCII; name=pltcl_rm_tgrelod_key_v2.patchDownload+29-29
#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alex Hunsaker (#19)
Re: why does plperl cache functions using just a bool for is_trigger

Alex Hunsaker <badalex@gmail.com> writes:

On Mon, Nov 1, 2010 at 16:59, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Surely, removing the internal name's dependency on the istrigger flag is
wrong.  If you're going to maintain separate hash entries at the pltcl
level, why would you want to risk collisions underneath that?

Good catch. I was basing it off plperl which uses the same proname
for both (sprintf(subname, %s__%u", prodesc->proname, fn_oid)). Its
OK for plperl because when we compile we save a reference to it and
use that directly (more or less). The name does not really matter.

OK, applied.

I notice that plpython is also using the trigger relation's OID, but I
don't know that language well enough to tell whether it really needs to.

regards, tom lane

#21Alex Hunsaker
badalex@gmail.com
In reply to: Tom Lane (#20)
#22Jan Urbański
wulczer@wulczer.org
In reply to: Alex Hunsaker (#21)
#23Alex Hunsaker
badalex@gmail.com
In reply to: Jan Urbański (#22)
#24David E. Wheeler
david@kineticode.com
In reply to: Alex Hunsaker (#23)
#25Hannu Krosing
hannu@tm.ee
In reply to: Jan Urbański (#22)
#26Hannu Krosing
hannu@tm.ee
In reply to: Hannu Krosing (#25)
#27Peter Eisentraut
peter_e@gmx.net
In reply to: David E. Wheeler (#24)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#25)
#29Alex Hunsaker
badalex@gmail.com
In reply to: Hannu Krosing (#26)
#30Hannu Krosing
hannu@tm.ee
In reply to: Alex Hunsaker (#29)
#31David E. Wheeler
david@kineticode.com
In reply to: Peter Eisentraut (#27)
#32Alex Hunsaker
badalex@gmail.com
In reply to: Hannu Krosing (#30)
#33Alex Hunsaker
badalex@gmail.com
In reply to: Alex Hunsaker (#32)
#34Jan Urbański
wulczer@wulczer.org
In reply to: Hannu Krosing (#30)
#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Urbański (#34)
#36Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Jan Urbański (#34)
#37Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#35)