late binding of shared libs for C functions

Started by Geoff Winkless over 7 years ago, 14 messages
#1Geoff Winkless
pgsqladmin@geoff.dj

Hi All

Is it possible to use CREATE FUNCTION to link a shared library that
doesn't yet exist? I don't think it is, but I might be missing
something.

If not, would it be something that people would be open to a patch
for? I'm thinking of e.g.

CREATE [ OR REPLACE ] FUNCTION
    name ( [ [ argmode ] [ argname ] argtype [ { DEFAULT | = } default_expr ] [, ...] ] )
    [ RETURNS rettype
      | RETURNS TABLE ( column_name column_type [, ...] ) ]
  { LANGUAGE lang_name
    | TRANSFORM { FOR TYPE type_name } [, ... ]
    | WINDOW
    | IMMUTABLE | STABLE | VOLATILE | [ NOT ] LEAKPROOF
    | CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT | STRICT
    | [ EXTERNAL ] SECURITY INVOKER | [ EXTERNAL ] SECURITY DEFINER
    | PARALLEL { UNSAFE | RESTRICTED | SAFE }
    | COST execution_cost
    | ROWS result_rows
    | SET configuration_parameter { TO value | = value | FROM CURRENT }
    | AS 'definition'
-    | AS 'obj_file', 'link_symbol'
+    | AS 'obj_file', 'link_symbol' [UNBOUNDED]
 } ...
    [ WITH ( attribute [, ...] ) ]

(I know UNBOUNDED isn't quite the word - BINDLATE would be better -
but I figured I should try to use an existing reserved keyword...)

We run our SQL scripts before we install binaries (because our
binaries are started by the installer, so having the database in place
is a Good Thing). The binary installer includes the .so. We're now
stuck in a catch-22 where I can't run the SQL script because it
requires the .so to be in place, but I can't run the binary installer
because if I do the SQL won't be updated...

This specific problem is obviously workaround-able, but it occurred to
me that since the libraries are bound late anyway it seems like this
wouldn't cause any serious problems.
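
To illustrate what "bound late" means here, the following is a minimal sketch, not PostgreSQL's actual loader code. It assumes a POSIX system with dlopen()/dlsym(); the names LateBoundFn, late_bound and late_bound_resolve are hypothetical and exist only for this example. Registration records the file and symbol names without touching the filesystem, and the library is opened only when the function is first needed.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record of a C function whose symbol is resolved lazily. */
typedef struct LateBoundFn {
    const char *lib_path;   /* shared library file; may not exist yet */
    const char *symbol;     /* link symbol to look up inside it */
    void       *handle;     /* dlopen() handle, NULL until first use */
    void       *addr;       /* resolved address, NULL until first use */
} LateBoundFn;

/* Registration only records the names; nothing is opened or checked here. */
LateBoundFn late_bound(const char *lib_path, const char *symbol)
{
    LateBoundFn fn = { lib_path, symbol, NULL, NULL };
    return fn;
}

/* Resolve on first use; returns NULL and reports why if binding fails. */
void *late_bound_resolve(LateBoundFn *fn)
{
    assert(fn != NULL);

    if (fn->addr != NULL)
        return fn->addr;            /* already bound by an earlier call */

    fn->handle = dlopen(fn->lib_path, RTLD_NOW);
    if (fn->handle == NULL) {
        fprintf(stderr, "could not load \"%s\": %s\n",
                fn->lib_path, dlerror());
        return NULL;
    }

    fn->addr = dlsym(fn->handle, fn->symbol);
    if (fn->addr == NULL) {
        fprintf(stderr, "could not find \"%s\" in \"%s\"\n",
                fn->symbol, fn->lib_path);
        dlclose(fn->handle);
        fn->handle = NULL;
    }
    return fn->addr;
}
```

With this shape, a misspelled symbol or a missing file is only reported at the first call rather than at registration time, which is the trade-off the proposal would introduce. On older glibc versions the program may need to be linked with -ldl.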

Of course chances are I've missed something...

Geoff

In reply to: Geoff Winkless (#1)
Re: late binding of shared libs for C functions

This thing also bites PostGIS upgrades.

When a distro's packaging system decides to upgrade PostGIS, or both
Postgres and PostGIS at the same time, you can often end up in a situation
where only one version of the PostGIS .so is installed, and it's not the one
referenced in the catalog. There are workarounds that tell you to symlink
the newer PostGIS .so file to the spot where the older one is being looked
for, and then run ALTER EXTENSION ... UPDATE to get out of this inconsistent
state.

This also means PostGIS has to ship stubs for all the functions that should
have been deleted but may still be needed during such a hacky upgrade
process.
For example,
https://github.com/postgis/postgis/blob/16270b9352e84bc989b9b946d279f16e0de5c2b9/postgis/lwgeom_accum.c#L55

On Tue, 12 Jun 2018 at 13:48, Geoff Winkless <pgsqladmin@geoff.dj> wrote:

#3Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Geoff Winkless (#1)
Re: late binding of shared libs for C functions

On 06/12/2018 06:48 AM, Geoff Winkless wrote:

> (I know UNBOUNDED isn't quite the word - BINDLATE would be better -
> but I figured I should try to use an existing reserved keyword...)

UNBOUNDED would be terrible. It does not mean the same thing as UNBOUND.

Perhaps something like NO CHECK would meet the case, i.e. we're not
checking the link at function creation time.

I haven't thought through the other implications yet.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#4Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Andrew Dunstan (#3)
Re: late binding of shared libs for C functions

"Andrew" == Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

Andrew> Perhaps something like NO CHECK would meet the case, i.e. we're
Andrew> not checking the link at function creation time.

The real question is why check_function_bodies doesn't cover this;
there's a comment in fmgr_c_validator that this is deliberate, but it's
rather unclear what the advantage is supposed to be:

* It'd be most consistent to skip the check if !check_function_bodies,
* but the purpose of that switch is to be helpful for pg_dump loading,
* and for pg_dump loading it's much better if we *do* check.

--
Andrew (irc:RhodiumToad)

#5Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Andrew Gierth (#4)
Re: late binding of shared libs for C functions

On 06/12/2018 08:46 AM, Andrew Gierth wrote:

> "Andrew" == Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

> Andrew> Perhaps something like NO CHECK would meet the case, i.e. we're
> Andrew> not checking the link at function creation time.

> The real question is why check_function_bodies doesn't cover this;
> there's a comment in fmgr_c_validator that this is deliberate, but it's
> rather unclear what the advantage is supposed to be:

> * It'd be most consistent to skip the check if !check_function_bodies,
> * but the purpose of that switch is to be helpful for pg_dump loading,
> * and for pg_dump loading it's much better if we *do* check.

Maybe we need a function that will validate all the links, that could be
called after everything is installed.
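
A rough sketch of what such a validate-everything pass could look like follows. Nothing like it exists in the thread or, as far as this example assumes, in PostgreSQL; CFunctionEntry, LinkProbe, validate_all_links and fake_probe are all hypothetical names. The probe is passed in as a callback so the loop stays independent of how a link is actually tested; a real probe might wrap dlopen()/dlsym(), while fake_probe below is only a stand-in that pretends a single library is installed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical catalog entry: one C-language function and where it lives. */
typedef struct CFunctionEntry {
    const char *name;       /* SQL-level function name */
    const char *lib_path;   /* 'obj_file' from CREATE FUNCTION */
    const char *symbol;     /* 'link_symbol' from CREATE FUNCTION */
} CFunctionEntry;

/* Hypothetical probe: true if symbol can be resolved in lib_path. */
typedef bool (*LinkProbe)(const char *lib_path, const char *symbol);

/* Check every entry, report each broken link, return how many failed. */
size_t validate_all_links(const CFunctionEntry *entries, size_t n,
                          LinkProbe probe)
{
    size_t broken = 0;

    assert(probe != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!probe(entries[i].lib_path, entries[i].symbol)) {
            fprintf(stderr, "function %s: cannot resolve \"%s\" in \"%s\"\n",
                    entries[i].name, entries[i].symbol, entries[i].lib_path);
            broken++;
        }
    }
    return broken;
}

/* Stand-in probe for demonstration: only "installed.so" is present. */
bool fake_probe(const char *lib_path, const char *symbol)
{
    (void) symbol;          /* the stand-in ignores the symbol name */
    return strcmp(lib_path, "installed.so") == 0;
}
```

Run after both the SQL scripts and the binaries are in place, a pass like this would recover the typo detection that deferring the creation-time check gives up.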

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#4)
Re: late binding of shared libs for C functions

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

> The real question is why check_function_bodies doesn't cover this;
> there's a comment in fmgr_c_validator that this is deliberate, but it's
> rather unclear what the advantage is supposed to be:

Error detection, ie did you spell the C symbol name correctly.

regards, tom lane

#7Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Tom Lane (#6)
Re: late binding of shared libs for C functions

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

>> The real question is why check_function_bodies doesn't cover this;
>> there's a comment in fmgr_c_validator that this is deliberate, but it's
>> rather unclear what the advantage is supposed to be:

Tom> Error detection, ie did you spell the C symbol name correctly.

Right, but surely restoring a dump is not the place to be doing that
error check?

--
Andrew (irc:RhodiumToad)

In reply to: Andrew Gierth (#7)
Re: late binding of shared libs for C functions

>>> The real question is why check_function_bodies doesn't cover this;
>>> there's a comment in fmgr_c_validator that this is deliberate, but it's
>>> rather unclear what the advantage is supposed to be:

> Tom> Error detection, ie did you spell the C symbol name correctly.

> Right, but surely restoring a dump is not the place to be doing that
> error check?

A similar check also happens in pg_upgrade in link mode. It would be more
useful if it attempted to upgrade the extension when it hits a
missing-function or missing-library error.

#9Geoff Winkless
pgsqladmin@geoff.dj
In reply to: Andrew Dunstan (#3)
Re: late binding of shared libs for C functions

On Tue, 12 Jun 2018 at 13:41, Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:

> On 06/12/2018 06:48 AM, Geoff Winkless wrote:

>> + | AS 'obj_file', 'link_symbol' [UNBOUNDED]
>> (I know UNBOUNDED isn't quite the word - BINDLATE would be better -
>> but I figured I should try to use an existing reserved keyword...)

> UNBOUNDED would be terrible. It does not mean the same thing as UNBOUND.

Indeed. I agree.

> Perhaps something like NO CHECK would meet the case, i.e. we're not
> checking the link at function creation time.

I did wonder about "NO CHECK" but wasn't sure if having two words
would make the parser change more complex.

Geoff

#10Christian Ullrich
chris@chrullrich.net
In reply to: Geoff Winkless (#9)
Re: late binding of shared libs for C functions

* On 2018-06-12 16:35, Geoff Winkless wrote:

> On Tue, 12 Jun 2018 at 13:41, Andrew Dunstan wrote:

>> UNBOUNDED would be terrible. It does not mean the same thing as UNBOUND.

> Indeed. I agree.

>> Perhaps something like NO CHECK would meet the case, i.e. we're not
>> checking the link at function creation time.

> I did wonder about "NO CHECK" but wasn't sure if having two words
> would make the parser change more complex.

DEFERRED?

--
Christian

#11Geoff Winkless
pgsqladmin@geoff.dj
In reply to: Christian Ullrich (#10)
Re: late binding of shared libs for C functions

On Tue, 12 Jun 2018 at 15:44, Christian Ullrich <chris@chrullrich.net> wrote:

>> I did wonder about "NO CHECK" but wasn't sure if having two words
>> would make the parser change more complex.

> DEFERRED?

That's a good shout. I wouldn't mind either of those choices.

So can I assume at least that no-one has an objection to the general principle?

I don't currently have a PG dev environment set up so it's non-trivial
for me to implement, which I'm ok with but not if I'm just wasting my
(and everyone else's) time :)

Cheers

Geoff

#12Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Geoff Winkless (#11)
Re: late binding of shared libs for C functions

On 06/12/2018 11:09 AM, Geoff Winkless wrote:

> On Tue, 12 Jun 2018 at 15:44, Christian Ullrich <chris@chrullrich.net> wrote:

>>> I did wonder about "NO CHECK" but wasn't sure if having two words
>>> would make the parser change more complex.

>> DEFERRED?

> That's a good shout. I wouldn't mind either of those choices.

> So can I assume at least that no-one has an objection to the general principle?

> I don't currently have a PG dev environment set up so it's non-trivial
> for me to implement, which I'm ok with but not if I'm just wasting my
> (and everyone else's) time :)

I would wait a little while. The idea's only been floated for a few hours.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#13Andres Freund
andres@anarazel.de
In reply to: Andrew Gierth (#7)
Re: late binding of shared libs for C functions

On 2018-06-12 15:05:16 +0100, Andrew Gierth wrote:

> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

>> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

>>> The real question is why check_function_bodies doesn't cover this;
>>> there's a comment in fmgr_c_validator that this is deliberate, but it's
>>> rather unclear what the advantage is supposed to be:

> Tom> Error detection, ie did you spell the C symbol name correctly.

> Right, but surely restoring a dump is not the place to be doing that
> error check?

I'm not convinced that that's true. Checking that the target system has
the right shared library [version] installed isn't crazy, and you can't
do it at dump time.

If I wanted to do something about it - which I don't really - I'd argue
that check_function_bodies should become an enum or such.
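
As a sketch of what turning the setting into an enum might mean, consider the following. The level names and the values "none", "link" and "all" are invented for this example and were not proposed in the thread. The middle level corresponds to the behaviour described above for C functions when check_function_bodies is off: bodies are not validated, but the library and symbol are still looked up.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical three-level replacement for a boolean setting. */
typedef enum FunctionBodyCheck {
    FUNCTION_CHECK_NONE,    /* skip everything, including library lookup */
    FUNCTION_CHECK_LINK,    /* skip body validation, still check the link */
    FUNCTION_CHECK_ALL      /* validate bodies and check the link */
} FunctionBodyCheck;

/* Parse a setting string; returns false and leaves *out alone if unknown. */
bool parse_function_body_check(const char *value, FunctionBodyCheck *out)
{
    assert(value != NULL && out != NULL);

    if (strcmp(value, "none") == 0)
        *out = FUNCTION_CHECK_NONE;
    else if (strcmp(value, "link") == 0)
        *out = FUNCTION_CHECK_LINK;
    else if (strcmp(value, "all") == 0)
        *out = FUNCTION_CHECK_ALL;
    else
        return false;
    return true;
}

/* Should creation of a C function verify the library and symbol? */
bool should_check_c_link(FunctionBodyCheck level)
{
    return level != FUNCTION_CHECK_NONE;
}

/* Should creation validate the function body itself? */
bool should_check_body(FunctionBodyCheck level)
{
    return level == FUNCTION_CHECK_ALL;
}
```

Under this scheme a restore that wants the library check keeps "link", while an installer that creates functions before the .so exists would pick "none".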

Greetings,

Andres Freund

#14Robert Haas
robertmhaas@gmail.com
In reply to: Andrew Dunstan (#3)
Re: late binding of shared libs for C functions

On Tue, Jun 12, 2018 at 8:41 AM, Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:

> UNBOUNDED would be terrible. It does not mean the same thing as UNBOUND.

> Perhaps something like NO CHECK would meet the case, i.e. we're not checking
> the link at function creation time.

> I haven't thought through the other implications yet.

It seems like it might be better to control this through a GUC than
dedicated syntax, because you probably want it for purposes of
restoring an otherwise-unrestorable dump, and you want to make the
decision at restore time, not dump time. If it's a GUC, that's a lot
easier than if you have to edit the dump.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company