contrib function naming, and upgrade issues
Note that I'm talking here about the names of the C functions, not
the SQL names.
The existing hstore has some very dubious choices of function names
(for non-static functions) in the C code; functions like each(),
delete(), fetchval(), defined(), tconvert(), etc. which all look to me
like prime candidates for name collisions and consequent hilarity.
The patch I'm working on could include fixes for this; but there's an
obvious impact on anyone upgrading from an earlier version... is it
worth it?
--
Andrew (irc:RhodiumToad)
On Fri, Mar 20, 2009 at 9:57 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
Note that I'm talking here about the names of the C functions, not
the SQL names.
The existing hstore has some very dubious choices of function names
(for non-static functions) in the C code; functions like each(),
delete(), fetchval(), defined(), tconvert(), etc. which all look to me
like prime candidates for name collisions and consequent hilarity.
The patch I'm working on could include fixes for this; but there's an
obvious impact on anyone upgrading from an earlier version... is it
worth it?
Based on that description, +1 from me. That kind of hilarity can be a
huge time sink when debugging, and it makes it hard to use grep to
find all references to a particular function (or #define, typedef,
etc.).
...Robert
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Note that I'm talking here about the names of the C functions, not
the SQL names.
The existing hstore has some very dubious choices of function names
(for non-static functions) in the C code; functions like each(),
delete(), fetchval(), defined(), tconvert(), etc. which all look to me
like prime candidates for name collisions and consequent hilarity.
The patch I'm working on could include fixes for this; but there's an
obvious impact on anyone upgrading from an earlier version... is it
worth it?
I agree that this wasn't an amazingly good choice, but I think there's
no real risk of name collisions because fmgr only searches for such names
within the particular .so. As you say, renaming *will* break existing
dumps. I'd be inclined to leave it alone, at least for now. I hope
that someone will step up and implement a decent module system for us
sometime soon, which might fix the upgrade problem for changes of this
sort.
regards, tom lane
On Sat, 2009-03-21 at 01:57 +0000, Andrew Gierth wrote:
Note that I'm talking here about the names of the C functions, not
the SQL names.
The existing hstore has some very dubious choices of function names
(for non-static functions) in the C code; functions like each(),
delete(), fetchval(), defined(), tconvert(), etc. which all look to me
like prime candidates for name collisions and consequent hilarity.
The patch I'm working on could include fixes for this; but there's an
obvious impact on anyone upgrading from an earlier version... is it
worth it?
Perhaps you can have two sets of functions, yet just one .so? One with
the old naming for compatibility, and a set of dehilarified function
names for future use. Two .sql files, giving the user choice.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
"Simon" == Simon Riggs <simon@2ndQuadrant.com> writes:
On Sat, 2009-03-21 at 01:57 +0000, Andrew Gierth wrote:
Note that I'm talking here about the names of the C functions, not
the SQL names.
The existing hstore has some very dubious choices of function names
(for non-static functions) in the C code; functions like each(),
delete(), fetchval(), defined(), tconvert(), etc. which all look to me
like prime candidates for name collisions and consequent hilarity.
The patch I'm working on could include fixes for this; but there's an
obvious impact on anyone upgrading from an earlier version... is it
worth it?
Simon> Perhaps you can have two sets of functions, yet just one .so?
Simon> One with the old naming for compatibility, and a set of
Simon> dehilarified function names for future use. Two .sql files,
Simon> giving the user choice.
Two .sql files would be pointless. Remember we're talking about the C
function names, not the SQL names; the only time the user should notice
the difference is when restoring an old dump.
As I see it there are three options:
1) do nothing; keep the existing C function names. dump/restore from
older versions will still work, but new functionality won't be
available without messing with the SQL.
2) hard cutover; rename all the dubious C functions. dump/restore from
older versions will get lots of errors, for which the workaround will
be "install the new hstore.sql into the database before trying to
restore".
3) some sort of compatibility hack involving optionally duplicating the
names in the C module.
--
Andrew.
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> I agree that this wasn't an amazingly good choice, but I think
Tom> there's no real risk of name collisions because fmgr only
Tom> searches for such names within the particular .so.
Oh, if only life were so simple.
Consider two modules mod1 (source files mod1a.c and mod1b.c) and mod2
(source files mod2a.c and mod2b.c).
mod1a.c: contains sql-callable function foo() which calls an extern
function bar() defined in mod1b.c. mod1a.o and mod1b.o are linked to
make mod1.so.
mod2a.c: contains sql-callable function baz() which calls an extern
function bar() defined in mod2b.c. These are linked to make mod2.so.
Guess what happens when foo() and baz() are both called from within
the same backend....
(Perhaps we should be linking contrib and pgxs modules with -Bsymbolic
on those platforms where it matters?)
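The scenario above can be sketched with a toy simulation (plain Python, not real dlopen): the dict stands in for the dynamic linker's flat global symbol table under RTLD_GLOBAL-style loading, and the module/function names (foo, bar, baz) follow the mail's example. All of this is an analogy, not actual linker code.

```python
# Toy model of a flat, RTLD_GLOBAL-style symbol namespace: the first
# definition of each symbol wins, later duplicates are silently ignored.
global_symbols = {}

def load_module(symbols):
    """Load a module's extern symbols into the shared flat namespace."""
    for name, impl in symbols.items():
        global_symbols.setdefault(name, impl)  # first definition sticks

# mod1: sql-callable foo() calls extern helper bar() from mod1b.c
load_module({"foo": lambda: "foo -> " + global_symbols["bar"](),
             "bar": lambda: "mod1's bar"})
# mod2: sql-callable baz() calls a *different* extern bar() from mod2b.c
load_module({"baz": lambda: "baz -> " + global_symbols["bar"](),
             "bar": lambda: "mod2's bar"})

print(global_symbols["foo"]())  # foo -> mod1's bar   (correct)
print(global_symbols["baz"]())  # baz -> mod1's bar   (wrong bar!)
```

The second print shows the hilarity: baz() was written against mod2's bar(), but symbol resolution through the shared namespace finds mod1's first.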
--
Andrew.
On Sat, Mar 21, 2009 at 01:05:35PM +0000, Andrew Gierth wrote:
(Perhaps we should be linking contrib and pgxs modules with -Bsymbolic
on those platforms where it matters?)
Another possibility is to use the visibility attributes such as those
provided in GCC. Maybe the version-1 declaration of a function could add
the appropriate magic to set the visibility to public and alter PGXS to
set the default visibility to hidden. Voila, modules whose only
exported symbols are those declared with a version-1 declaration.
Perhaps a little too much magic :)
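A rough model of that idea, as a Python analogy rather than real compiler or linker behavior (all names here are illustrative): default visibility is hidden, and only symbols passed through the version-1 declaration get exported.

```python
# Symbols in a hypothetical module; only the SQL-callable entry points
# should end up visible outside the .so.
all_symbols = {"hstore_each", "hstore_delete", "internal_helper"}
exported = set()  # empty: PGXS would set default visibility to hidden

def pg_function_info_v1(name):
    """Stand-in for the V1 declaration adding the 'export me' magic."""
    exported.add(name)

# Only the SQL-callable entry points carry the V1 declaration.
pg_function_info_v1("hstore_each")
pg_function_info_v1("hstore_delete")

hidden = all_symbols - exported
print(sorted(hidden))  # ['internal_helper'] stays module-private
```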
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Please line up in a tree and maintain the heap invariant while
boarding. Thank you for flying nlogn airlines.
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> I agree that this wasn't an amazingly good choice, but I think
Tom> there's no real risk of name collisions because fmgr only
Tom> searches for such names within the particular .so.
Oh, if only life were so simple.
I think you are missing the point. There are certainly *potential*
problems from common function names in different .so's, but that does not
translate to evidence of *actual* problems in the Postgres environment.
In particular, I believe that we load .so's without adding their symbols
to those globally known by the linker --- at least on platforms where
that's possible. Not to mention that the universe of other .so's we
might load is not all that large. So I think the actual risks posed by
contrib/hstore are somewhere between minimal and nonexistent.
The past discussions we've had about developing a proper module facility
included ways to replace not-quite-compatible C functions. I think that
we can afford to let hstore go on as it is for another release or two,
in hopes that we'll have something that makes a fix for this transparent
to users. The risks don't look to me to be large enough to justify
imposing any upgrade pain on users.
regards, tom lane
On Saturday 21 March 2009 12:27:27 Tom Lane wrote:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> I agree that this wasn't an amazingly good choice, but I think
Tom> there's no real risk of name collisions because fmgr only
Tom> searches for such names within the particular .so.
Oh, if only life were so simple.
I think you are missing the point. There are certainly *potential*
problems from common function names in different .so's, but that does not
translate to evidence of *actual* problems in the Postgres environment.
In particular, I believe that we load .so's without adding their symbols
to those globally known by the linker --- at least on platforms where
that's possible. Not to mention that the universe of other .so's we
might load is not all that large. So I think the actual risks posed by
contrib/hstore are somewhere between minimal and nonexistent.
The past discussions we've had about developing a proper module facility
included ways to replace not-quite-compatible C functions. I think that
we can afford to let hstore go on as it is for another release or two,
in hopes that we'll have something that makes a fix for this transparent
to users. The risks don't look to me to be large enough to justify
imposing any upgrade pain on users.
We've been talking about this magical "proper module facility" for a few
releases now... are we still opposed to putting contrib modules in their own
schema? People who took my advice and did that for tsearch were mighty happy
when 8.2 broke at the C level, and when 8.3 broke all around. Doing that for
hstore now would make the transition a little easier in the future as well.
--
Robert Treat
Conjecture: http://www.xzilla.net
Consulting: http://www.omniti.com
Robert Treat <xzilla@users.sourceforge.net> writes:
We've been talking about this magical "proper module facility" for a few
releases now... are we still opposed to putting contrib modules in their own
schema?
I'm hesitant to do that when we don't yet have either a design or a
migration plan for the module facility. We might find we'd shot
ourselves in the foot, or at least complicated the migration situation
unduly.
regards, tom lane
On Sat, Mar 21, 2009 at 9:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Treat <xzilla@users.sourceforge.net> writes:
We've been talking about this magical "proper module facility" for a few
releases now... are we still opposed to putting contrib modules in their own
schema?
I'm hesitant to do that when we don't yet have either a design or a
migration plan for the module facility. We might find we'd shot
ourselves in the foot, or at least complicated the migration situation
unduly.
I think there have been a few designs proposed, but I think part of
the problem is a lack of agreement on the requirements. "module
facility" seems to mean a lot of different things to different people.
...Robert
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> I agree that this wasn't an amazingly good choice, but I think
Tom> there's no real risk of name collisions because fmgr only
Tom> searches for such names within the particular .so.
Oh, if only life were so simple.
Tom> I think you are missing the point.
Nope.
Tom> There are certainly *potential* problems from common function
Tom> names in different .so's, but that does not translate to
Tom> evidence of *actual* problems in the Postgres environment.
It is true that I have no reason to believe that anyone has ever
encountered any problems due to name collisions between hstore and
something else. The only question is how to trade off the potential
risks against the known difficulties regarding upgrading; I'm quite
happy to accept the conclusion that the risk is not sufficient to
justify the upgrade pain, but only if the risk is being correctly
assessed.
Tom> In particular, I believe that we load .so's without adding their
Tom> symbols to those globally known by the linker --- at least on
Tom> platforms where that's possible.
This is false; in the exact reverse of the above, we explicitly
request RTLD_GLOBAL on platforms where it exists.
Tom> Not to mention that the universe of other .so's we might load is
Tom> not all that large. So I think the actual risks posed by
Tom> contrib/hstore are somewhere between minimal and nonexistent.
The problem extends not only to other loaded .so's, but also to every
library linked into the postmaster itself, every library linked into
another loaded .so, and every .so (and associated libs) dynamically
loaded by another .so (e.g. modules loaded by pls).
(-Bsymbolic (or equivalent) would negate all of these, as far as I can
tell.)
Tom> The risks don't look to me to be large enough to justify
Tom> imposing any upgrade pain on users.
OK. I will maintain binary compatibility in my patch.
--
Andrew.
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Robert Treat <xzilla@users.sourceforge.net> writes:
We've been talking about this magical "proper module facility" for
a few releases now... are we still opposed to putting contrib
modules in their own schema?
Tom> I'm hesitant to do that when we don't yet have either a design
Tom> or a migration plan for the module facility. We might find we'd
Tom> shot ourselves in the foot, or at least complicated the
Tom> migration situation unduly.
I've been thinking about this, and my conclusion is that schemas as
they currently exist are the wrong tool for making modules/packages.
Partly that's based on the relative inflexibility of the search_path
setting; it's hard to modify the search_path without completely
replacing it, so knowledge of the "default" search path ends up being
propagated to a lot of places.
There's a parallel here with operating-system package mechanisms; for
the most part, the more usable / successful packaging systems don't
rely on putting everything in separate directories, instead they have
an out-of-band method for specifying what files belong to what package.
We already have a mechanism we could use for this: pg_depend. If an
"installed package" was a type of object, the functions, types,
operators, or any other kind of object installed by the package could
have dependency links to it; that would (a) make it trivial to drop,
and (b) pg_dump could check for package dependencies and, for objects
depending on a package, emit only a package installation command rather
than the object definition.
(I distinguish an "installed package" from whatever the package
definition might be, since it's possible that a package might want to
provide multiple APIs, for example for different versions, and these
might be installed simultaneously in different schemas.)
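The pg_depend idea above can be sketched as follows (a simulation with invented names, not PostgreSQL's actual catalog code): each object installed by a package carries a dependency record pointing at the "installed package" object, so (a) DROP is a cascade over those records and (b) a dump collapses packaged objects into one installation command.

```python
# Catalog state: objects present, and each object's owning package
# (None = plain user object with no package dependency).
objects = {"hstore", "each(hstore)", "delete(hstore,text)", "my_table"}
pg_depend = {
    "hstore": "hstore_pkg", "each(hstore)": "hstore_pkg",
    "delete(hstore,text)": "hstore_pkg", "my_table": None,
}

def drop_package(pkg):
    """(a) dropping the package drops everything depending on it."""
    for obj, owner in list(pg_depend.items()):
        if owner == pkg:
            objects.discard(obj)
            del pg_depend[obj]

def dump():
    """(b) packaged objects become one install command, not definitions."""
    cmds = sorted({f"INSTALL PACKAGE {p};" for p in pg_depend.values() if p})
    cmds += sorted(f"-- definition of {o}" for o, p in pg_depend.items()
                   if p is None)
    return cmds

print(dump())            # one INSTALL PACKAGE line covers three objects
drop_package("hstore_pkg")
print(sorted(objects))   # ['my_table']
```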
--
Andrew.
"Robert" == Robert Haas <robertmhaas@gmail.com> writes:
I'm hesitant to do that when we don't yet have either a design or
a migration plan for the module facility. We might find we'd shot
ourselves in the foot, or at least complicated the migration
situation unduly.
Robert> I think there have been a few designs proposed, but I think
Robert> part of the problem is a lack of agreement on the
Robert> requirements. "module facility" seems to mean a lot of
Robert> different things to different people.
Some ideas:
- want to be able to do INSTALL PACKAGE foo; without needing to
mess with .sql files. This might default to looking for
$libdir/foo.so, or there might be a mechanism to register packages
globally or locally.
- want to be able to do INSTALL PACKAGE foo VERSION 1; and get
the version 1 API rather than whatever the latest is.
- want to be able to do INSTALL PACKAGE foo SCHEMA bar; rather
than having to edit some .sql file.
- want to be able to do DROP PACKAGE foo;
- want pg_dump to not output the definitions of any objects that
belong to a package, but instead to output an INSTALL PACKAGE foo
VERSION n SCHEMA x;
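The wishlist above could look something like this as a tiny registry sketch (everything here, names and API alike, is invented for illustration): install resolves the latest version unless one is requested, takes a target schema, and the dump side emits a single reproducible command.

```python
# Registered packages and their available API versions ($libdir analogue).
available = {"foo": {1: "v1 API", 2: "v2 API"}}
installed = {}  # package name -> (version, schema)

def install_package(name, version=None, schema="public"):
    """INSTALL PACKAGE foo [VERSION n] [SCHEMA s]; no .sql editing."""
    if version is None:
        version = max(available[name])  # default to the latest API
    assert version in available[name]
    installed[name] = (version, schema)

def drop_package(name):
    """DROP PACKAGE foo;"""
    del installed[name]

def dump_command(name):
    """What pg_dump would emit instead of the object definitions."""
    v, s = installed[name]
    return f"INSTALL PACKAGE {name} VERSION {v} SCHEMA {s};"

install_package("foo")  # no version given: latest wins
print(dump_command("foo"))  # INSTALL PACKAGE foo VERSION 2 SCHEMA public;
install_package("foo", version=1, schema="bar")
print(dump_command("foo"))  # INSTALL PACKAGE foo VERSION 1 SCHEMA bar;
```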
--
Andrew.
On Sun, Mar 22, 2009 at 11:54 AM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
- want to be able to do INSTALL PACKAGE foo; without needing to
mess with .sql files. This might default to looking for
$libdir/foo.so, or there might be a mechanism to register packages
globally or locally.
- want to be able to do INSTALL PACKAGE foo VERSION 1; and get
the version 1 API rather than whatever the latest is.
- want to be able to do INSTALL PACKAGE foo SCHEMA bar; rather
than having to edit some .sql file.
- want to be able to do DROP PACKAGE foo;
- want pg_dump to not output the definitions of any objects that
belong to a package, but instead to output an INSTALL PACKAGE foo
VERSION n SCHEMA x;
I think using PACKAGE is a bad idea as it'll confuse people used to
Oracle. MODULE perhaps?
--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
Andrew Gierth wrote:
I've been thinking about this, and my conclusion is that schemas as
they currently exist are the wrong tool for making modules/packages.
This has been discussed at length previously, and we even had an
incomplete but substantive patch posted. Did you review that? Some of
it appears to be in line of what you're proposing here. If you're
interested in this area, perhaps you could pick up where Tom Dunstan
left off.
See URLs here:
http://wiki.postgresql.org/wiki/Todo#Source_Code
under "Improve the module installation experience (/contrib, etc)"
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Dave Page wrote:
I think using PACKAGE is a bad idea as it'll confuse people used to
Oracle. MODULE perhaps?
Right. We debated this extensively in the past. Module was the consensus
name.
cheers
andrew
Hi,
Le 22 mars 09 à 12:42, Andrew Gierth a écrit :
Tom> I'm hesitant to do that when we don't yet have either a design
Tom> or a migration plan for the module facility. We might find we'd
Tom> shot ourselves in the foot, or at least complicated the
Tom> migration situation unduly.
I've been thinking about this, and my conclusion is that schemas as
they currently exist are the wrong tool for making modules/packages.
Agreed.
Still, schemas are useful and using them should be encouraged, I think.
Partly that's based on the relative inflexibility of the search_path
setting; it's hard to modify the search_path without completely
replacing it, so knowledge of the "default" search path ends up being
propagated to a lot of places.
pg_catalog is implicit in the search_path; what about having user
schemas with the implicit capability too?
Then you have the problem of ordering more than one implicit schema;
the easy solution is to solve that the same way we solve trigger
ordering: alphabetically. Now, that could mean ugly user-facing
schema names: we already know we need synonyms, don't we?
There's a parallel here with operating-system package mechanisms; for
the most part, the more usable / successful packaging systems don't
rely on putting everything in separate directories, instead they have
an out-of-band method for specifying what files belong to what
package.
We already have a mechanism we could use for this: pg_depend. If an
"installed package" was a type of object, the functions, types,
operators, or any other kind of object installed by the package could
have dependency links to it; that would (a) make it trivial to drop,
and (b) pg_dump could check for package dependencies and, for objects
depending on a package, emit only a package installation command
rather
than the object definition.
Here's a sketch of what I came up with:
http://wiki.postgresql.org/wiki/ExtensionPackaging
It still needs some work before it's a solid proposal, like for
example handling cases where you want to pg_restore a database and
insist on *not* caring about some extensions (pgq, londiste, slony
things, cron restoring into pre-live systems). Or working out some
versioning information and dependencies between modules.
What it misses the most is hackers' acceptance of the proposed
concepts, though.
(I distinguish an "installed package" from whatever the package
definition might be, since it's possible that a package might want to
provide multiple APIs, for example for different versions, and these
might be installed simultaneously in different schemas.)
Version tracking is yet to be designed in the document.
--
dim
Hi,
Heard about http://wiki.postgresql.org/wiki/ExtensionPackaging ? :)
Le 22 mars 09 à 14:29, Dave Page a écrit :
- want to be able to do INSTALL PACKAGE foo; without needing to
mess with .sql files. This might default to looking for
$libdir/foo.so, or there might be a mechanism to register packages
globally or locally.
Part of the proposal.
- want to be able to do INSTALL PACKAGE foo VERSION 1; and get
the version 1 API rather than whatever the latest is.
To be added to the proposal.
- want to be able to do INSTALL PACKAGE foo SCHEMA bar; rather
than having to edit some .sql file.
Part of the proposal (install time variables/options/parameters).
- want to be able to do DROP PACKAGE foo;
Part of the proposal.
- want pg_dump to not output the definitions of any objects that
belong to a package, but instead to output an INSTALL PACKAGE foo
VERSION n SCHEMA x;
Part of the proposal.
I think using PACKAGE is a bad idea as it'll confuse people used to
Oracle. MODULE perhaps?
Using PACKAGE would tie us into supporting Oracle syntax, which nobody
actually wants, it seems. Or at least we would have to reserve the
keyword for meaning "Oracle-compliant facility".
Module, on the other hand, is already used in PostgreSQL to refer to the
dynamic lib you get when installing C-coded extensions (.so or .dll);
what we lack here is a way to refer to them in pure SQL and have their
existence tracked in the catalogs. That's the part Tom Dunstan
worked on, IIRC.
He also worked out some OS level tools for module handling, but I
think I'd prefer to have another notion in between, the extension.
The extension would be a new SQL object referring to zero, one or more
modules and one or more SQL scripts creating new SQL objects (schemas,
tables, views, tablespaces, functions, types, casts, operator classes
and families, etc, whatever SQL scripting we support now --- yes,
index am would be great too). Those would depend (pg_depend) on the
package SQL object. I don't think we need to be able to nest a package
creation inside the package SQL scripts, but hey, why not.
So my vote is for us to talk about modules (.so) and extensions (the
packaging and distribution of them). And this term isn't even new in
PostgreSQL glossary ;)
Regards,
--
dim
Dimitri Fontaine <dfontaine@hi-media.com> writes:
He also worked out some OS level tools for module handling, but I
think I'd prefer to have another notion in between, the extension.
The extension would be a new SQL object referring to zero, one or more
modules and one or more SQL scripts creating new SQL objects (schemas,
tables, views, tablespaces, functions, types, casts, operator classes
and families, etc, whatever SQL scripting we support now --- yes,
index am would be great too).
This seems drastically overengineered. What do we need two levels of
objects for?
regards, tom lane