search_path vs extensions
Hi,
Preliminary note: I'm using the term "extension" as if it's what we
already agree to call them, feel free to ignore this and use whatever
term you see fit. We'll have the naming issue tackled, please not now
though.
Following-up to discussions we had at the Developer Meeting and
subsequent pub events, I'd like us to agree upon the relations of
extensions and search_path. We basically have to choose one of those:
Proposal: do nothing
What's good about it:
it's already there, folks!
What's not good about it:
Users are alone on deciding where to put what, and the system won't
help them: either public is a complete mess, or they have to manually
care about search_path for their extensions and their own application
needs. Installations where DBA and application folks are separate
teams will suffer, ones where the application is heavily using schemas
will suffer too.
Proposal: pg_extension, a new dedicated system schema for extensions
Good:
It's easy to see SQL objects (\df) of extensions (think contribs) you
installed, and as extension developpers are required to use it, you
don't have to care about it any more.
As you have only one namespace for everyone, the collisions are
detected early.
Not good:
As you have only one namespace for everyone, collisions prevent users
from installing several extensions using the same SQL object name, so
we'd need a way for extension authors to share a catalog of free
names, like internally we do for systems OIDs in the bootstrap,
IIUC. But in a distributed fashion.
We would have to add ways for the user to see which extension which
object belongs to, so you'd have extension | schema | object_name
columns in all \dX things, e.g.
Proposal: allow user schema to behave the same as pg_catalog
Good:
Tell the system your schema is implicit and be done with it, object
searching won't need users to manage search_path explicitly.
Not good:
Breaking existing application code by adding an implicit schema in an
existing database is damn too easy. And how to choose if the implicit
schemas are to be searched in before or after the search_path?
Proposal: Separate search_path into components: pre_search_path,
search_path, post_search_path
Good:
This allows to easily separate who changes what: typically DBAs will
edit pre and post search_path components while application will care
about search_path the same way as now.
Not good:
2 new GUCs (but no new semantics, and defaults to empty)
My vote is to go with the pre/post search_path components proposal as
it's the one allowing the more flexibility, and we tend to value this a
lot around here.
Regards,
--
dim
On May 25, 2009, at 2:16 AM, Dimitri Fontaine wrote:
Proposal: pg_extension, a new dedicated system schema for extensions
Good:
It's easy to see SQL objects (\df) of extensions (think contribs) you
installed, and as extension developpers are required to use it, you
don't have to care about it any more.As you have only one namespace for everyone, the collisions are
detected early.
Not good:
As you have only one namespace for everyone, collisions prevent users
from installing several extensions using the same SQL object name, so
we'd need a way for extension authors to share a catalog of free
names, like internally we do for systems OIDs in the bootstrap,
IIUC. But in a distributed fashion.We would have to add ways for the user to see which extension which
object belongs to, so you'd have extension | schema | object_name
columns in all \dX things, e.g.
I like this, although I'd want to be able, as a user, to override that
default and tell an extension to install in some other schema. That
would allow me to immediately overcome conflicts, and to organize my
extensions if I want, rather than throw them all in one place.
Proposal: Separate search_path into components: pre_search_path,
search_path, post_search_path
Good:
This allows to easily separate who changes what: typically DBAs will
edit pre and post search_path components while application will care
about search_path the same way as now.
Not good:
2 new GUCs (but no new semantics, and defaults to empty)
I don't follow this at all. How to the three components effect
behavior? And what does this mean for where extensions are installed
in schemas?
Best,
David
Hi,
"David E. Wheeler" <david@kineticode.com> writes:
On May 25, 2009, at 2:16 AM, Dimitri Fontaine wrote:
Proposal: pg_extension, a new dedicated system schema for extensions
I like this, although I'd want to be able, as a user, to override that
default and tell an extension to install in some other schema. That would
allow me to immediately overcome conflicts, and to organize my extensions
if I want, rather than throw them all in one place.
The moment you're adding specific schemas where to put extensions into,
you have to adapt your search_path. Some applications already have to
manage search_path for their own needs, so we're trying to avoid having
those people to care about extensions schemas and application schema at
the same time.
It could even not be the same people caring about those search_path
parts.
Proposal: Separate search_path into components: pre_search_path,
search_path, post_search_pathI don't follow this at all. How to the three components effect behavior? And
what does this mean for where extensions are installed in schemas?
This proposal tries to solve previous one limitations. It's very good in
the typical case when you want each extension to be installed in one (or
more) schemas but don't want to have the application to care about it.
Then you add your extensions schemas into pre_search_path and
application schemas into search_path, so that the application doesn't
have to manage pre_search_path.
Now it could be that your application is historically using the same
function names as some extension you're now adding to the server, and
you want to control which function is called when not schema
qualified. So you have the post_search_path to play with too.
The idea being that application developpers will maintain search_path
for the application schemas (and this search_path can vary depending on
the application role which connects to the database, of course), and the
DBA team will make extensions available transparently to the application
by adding the extension's schemas in either pre_search_path or
post_search_path.
I hope I've added clarity to the point, rather than only some extra
verbosity... :)
Regards,
--
dim
On May 27, 2009, at 1:50 AM, Dimitri Fontaine wrote:
The moment you're adding specific schemas where to put extensions
into,
you have to adapt your search_path. Some applications already have to
manage search_path for their own needs, so we're trying to avoid
having
those people to care about extensions schemas and application schema
at
the same time.
That doesn't seem like much of a problem to me. I already do this for
extensions. I agree that what you suggest should be the default, but I
should be able to optionally install extensions in whatever schema I
deem appropriate, especially if I want to avoid conflicts.
This proposal tries to solve previous one limitations. It's very
good in
the typical case when you want each extension to be installed in one
(or
more) schemas but don't want to have the application to care about it.
Then you add your extensions schemas into pre_search_path and
application schemas into search_path, so that the application doesn't
have to manage pre_search_path.
So are pre_search_path and search_path and post_search_path basically
just concatenated into that order? That doesn't seem to buy you much.
Now it could be that your application is historically using the same
function names as some extension you're now adding to the server, and
you want to control which function is called when not schema
qualified. So you have the post_search_path to play with too.
It seems to me you'd just schema-qualify in this case. I mean, that's
kind of the point of schemas.
The idea being that application developpers will maintain search_path
for the application schemas (and this search_path can vary depending
on
the application role which connects to the database, of course), and
the
DBA team will make extensions available transparently to the
application
by adding the extension's schemas in either pre_search_path or
post_search_path.
I think more useful would be a way to append or prepend schemas to the
search path within a given context (in a transaction or a connection).
That way, instead of doing stuff like this:
BEGIN;
SET search_path = foo,bar,public;
-- ...
COMMIT;
RESET search_path;
…which suffers from an inability to easily modify an existing path
(yes, I know I can look it up and parse it, but please), I could just
do something like this:
BEGIN;
prepend_search_path('foo,bar');
COMMIT;
And then it would be reverted at the end of the transaction. Or it
could be for the duration of a connection; that probably makes more
sense.
I hope I've added clarity to the point, rather than only some extra
verbosity... :)
Yes, but it just seems like unnecessary complexity to me. We don't
want to learn the lessons of Java's CLASSPATH by making things *more*
complicated.
Best,
David
"David" == "David E Wheeler" <david@kineticode.com> writes:
The moment you're adding specific schemas where to put extensions
into, you have to adapt your search_path. Some applications
already have to manage search_path for their own needs, so we're
trying to avoid having those people to care about extensions
schemas and application schema at the same time.
David> That doesn't seem like much of a problem to me.
Unfortunately, the fact that something doesn't seem like much of a
problem to you doesn't actually make it less of a problem.
Splitting up search_path is something I've been thinking about for a
while (and threw out on IRC as a suggestion, which is where Dimitri
got it); it was based on actual experience running an app that set the
search path in the connection parameters in order to select which of
several different schemas to use for part (not all) of the data. When
setting search_path this way, there is no way to set only part of it;
the client-supplied value overrides everything.
Obviously there are other possible solutions, but pretending there
isn't a problem will get nowhere.
(Setting the search path using a function or sql statement _after_
connecting was not an option; it would have confused the connection
persistance layer, which needed different parameters to tell the
connections apart.)
--
Andrew (irc:RhodiumToad)
Andrew Gierth wrote:
Splitting up search_path is something I've been thinking about for a
while (and threw out on IRC as a suggestion, which is where Dimitri
got it); it was based on actual experience running an app that set the
search path in the connection parameters in order to select which of
several different schemas to use for part (not all) of the data. When
setting search_path this way, there is no way to set only part of it;
the client-supplied value overrides everything.Obviously there are other possible solutions, but pretending there
isn't a problem will get nowhere.(Setting the search path using a function or sql statement _after_
connecting was not an option; it would have confused the connection
persistance layer, which needed different parameters to tell the
connections apart.)
Another way of handling this might be to provide for prepending or
appending to the search path (or even for removing items from it).
examples - something like:
alter database foo set search_path = '+bar, baz'; -- append
alter database foo set search_path = 'bar, baz+'; -- prepend
cheers
andrew
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Splitting up search_path is something I've been thinking about for a
while (and threw out on IRC as a suggestion, which is where Dimitri
got it); it was based on actual experience running an app that set the
search path in the connection parameters in order to select which of
several different schemas to use for part (not all) of the data. When
setting search_path this way, there is no way to set only part of it;
the client-supplied value overrides everything.
Obviously there are other possible solutions, but pretending there
isn't a problem will get nowhere.
I agree that some more flexibility in search_path seems reasonable,
but what we've got at the moment is pretty handwavy. Dimitri didn't
suggest what the uses of the different parts of a three-part path
would be, and also failed to say what the implications for the default
creation namespace would be, as well as the existing special handling
of pg_temp and pg_catalog. That stuff all works together pretty
closely; it'd be easy to end up making it less usable not more so.
regards, tom lane
Andrew Dunstan <andrew@dunslane.net> writes:
Another way of handling this might be to provide for prepending or
appending to the search path (or even for removing items from it).
I was just about to raise that as a requirement. Some folks on this
list might recognize the following coding pattern:
create schema rhn_channel;
--make rhn_channel be the default creation schema
update pg_settings set setting = 'rhn_channel,' || setting where name = 'search_path';
... create a bunch of objects in schema rhn_channel ...
-- restore the original setting
update pg_settings set setting = overlay( setting placing '' from 1 for (length('rhn_channel')+1) ) where name = 'search_path';
I agree that a nicer way to do that would be good.
alter database foo set search_path = '+bar, baz'; -- append
alter database foo set search_path = 'bar, baz+'; -- prepend
... but that ain't it :-(. SET should mean SET, not "do something magic".
Particularly in ALTER DATABASE/ALTER USER, whose execution order
relative to other stuff isn't especially well defined.
regards, tom lane
All,
Wait, I thought we'd given up on the search path model and wanted to
track extensions via dependencies. No?
--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes:
Wait, I thought we'd given up on the search path model and wanted to
track extensions via dependencies. No?
I think what this discussion is about is trying to gauge just what
amount of support we could give someone who insisted on dropping each
extension into a different schema. It's not really related to how
we track which objects belong to which extension.
regards, tom lane
On May 27, 2009, at 1:49 PM, Andrew Gierth wrote:
Splitting up search_path is something I've been thinking about for a
while (and threw out on IRC as a suggestion, which is where Dimitri
got it); it was based on actual experience running an app that set the
search path in the connection parameters in order to select which of
several different schemas to use for part (not all) of the data. When
setting search_path this way, there is no way to set only part of it;
the client-supplied value overrides everything.
Right, which is why I was thinking about an interface to push schemas
onto the front of the path. Or the end.
Obviously there are other possible solutions, but pretending there
isn't a problem will get nowhere.
Yeah, it was just the splitting bit that seemed a bit much to me.
(Setting the search path using a function or sql statement _after_
connecting was not an option; it would have confused the connection
persistance layer, which needed different parameters to tell the
connections apart.)
Okay, then maybe it's the names of the paths in Dimitri's suggestion
that were confusing me. prepend_search_path and append_search_path, or
something like that, might be better.
Best,
David
On May 27, 2009, at 2:14 PM, Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
Another way of handling this might be to provide for prepending or
appending to the search path (or even for removing items from it).I was just about to raise that as a requirement.
Yeah, I likes.
Some folks on this
list might recognize the following coding pattern:create schema rhn_channel;
--make rhn_channel be the default creation schema
update pg_settings set setting = 'rhn_channel,' || setting where
name = 'search_path';... create a bunch of objects in schema rhn_channel ...
-- restore the original setting
update pg_settings set setting = overlay( setting placing '' from 1
for (length('rhn_channel')+1) ) where name = 'search_path';I agree that a nicer way to do that would be good.
Oh, yes please.
alter database foo set search_path = '+bar, baz'; -- append
alter database foo set search_path = 'bar, baz+'; -- prepend... but that ain't it :-(. SET should mean SET, not "do something
magic".
Particularly in ALTER DATABASE/ALTER USER, whose execution order
relative to other stuff isn't especially well defined.
Perhaps a MODIFY keyword?
Best,
David
On Mon, May 25, 2009 at 11:16 AM, Dimitri Fontaine
<dfontaine@hi-media.com> wrote:
Hi,
Preliminary note: I'm using the term "extension" as if it's what we
already agree to call them, feel free to ignore this and use whatever
term you see fit. We'll have the naming issue tackled, please not now
though.
[...]
Few thoughts about the ideas:
Basically I sort of don't like the idea of playing with search_path.
In past I have set up a system where each extension had a separate
schema. Maintaining per user search_path wasn't a very nice experience.
And trying to alter it later on for whatever reason, especially from
command line was even worse. :) I tend to avoid such designs now. :)
I think it is much better to store objects in one schema (like public)
and maintain access rights via roles. Like GRANT ltree_pkg TO userfoo;
...and build upon this idea.
One of advantages of roles here is that you can DROP OWNED BY ltree_pkg;
just as well as you did DROP SCHEMA ltree_pkg;
And they take effect immediately, not requiring all sessions to restart
to take up new search_path.
Furthermore, I think it would be nice to have a cluster-wide pg_extension
table which would list all the available (installed) packages available in
the system (much like pg_database lists all databases present).
This pg_extension should be used to "rewrite" extension objects into
given schema using given role (which would be either fixed or user defined).
The idea is that whenever user installs a RPM, DEB or whatever package
the system registers the extension. Or she compiles from source and registers
extension. Or we get a CPAN style utility which installs source, compiles and
register the extension. Then administrator can copy over given extension
into specific database, into specific schema.
Simplest implementation would be that the pg_extension would contain
a package name, package version (we can have multiple versions of
the same package installed), install script (series of CREATE FUNCTION
or whatever), uninstall script (may not be present) and some upgrade
path would be needed as well.
The installation would CREATE ROLE <packagename>_pkg and execute
all CREATE FUNCTION inside schema PUBLIC. Then GRANT access.
If administrator instructs so it might CREATE ROLE <packagename>_<schema>_pkg
and execute all CREATE FUNCTION in schema <schema>.
Uninstall would mean DROP OWNED BY <packagename>_pkg;
OK, enough of my proposal. :-)
Coming back to the pre_search_path -- it sounds somewhat like Oracle's
PACKAGEs, only different (completely parallel hierarchy, but similar to
schemas). I like the Oracle approach better though -- no messing with
search_paths please...
Best regards,
Dawid
--
.................. ``The essence of real creativity is a certain
: *Dawid Kuroczko* : playfulness, a flitting from idea to idea
: qnex42@gmail.com : without getting bogged down by fixated demands.''
`..................' Sherkaner Underhill, A Deepness in the Sky, V. Vinge
Tom,
I think what this discussion is about is trying to gauge just what
amount of support we could give someone who insisted on dropping each
extension into a different schema. It's not really related to how
we track which objects belong to which extension.
Really, they're on their own.
Either we drop everything into a standard pg_extensions schema (which is
then programmatically part of the search path, like pg_catalog is) or we
don't install them to any particular schema and leave it up to the DBA
to work out any search_path issues on their own.
Personally, if we're tracking stuff through special dependancies which
pg_dump will be aware of anyway, I don't see why extension objects
should go into a special schema.
--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes:
Personally, if we're tracking stuff through special dependancies which
pg_dump will be aware of anyway, I don't see why extension objects
should go into a special schema.
Well, we could certainly take that attitude and eliminate all this
hassle ;-). However, I think that more-flexible search path handling
might have other uses, so I don't see any reason not to think about it.
regards, tom lane
Tom Lane wrote:
Josh Berkus <josh@agliodbs.com> writes:
Personally, if we're tracking stuff through special dependancies which
pg_dump will be aware of anyway, I don't see why extension objects
should go into a special schema.Well, we could certainly take that attitude and eliminate all this
hassle ;-). However, I think that more-flexible search path handling
might have other uses, so I don't see any reason not to think about it.
+1
I think Josh is right about extensions, but we certainly do need more
powerful tools to manipulate the search path.
cheers
andrew
Tom,
Well, we could certainly take that attitude and eliminate all this
hassle ;-). However, I think that more-flexible search path handling
might have other uses, so I don't see any reason not to think about it.
Sure. I think that having better search path management would be a
wonderful thing; it would encourage people to use schema more in general.
However, that doesn't mean that I think it should be part of the
extensions design, or even a gating factor.
For example, I could see these kinds of settings:
search_path_override (suset) would set all users to a specific search
path and raise an error at any set search_path attempts. This would be
mainly for secure applications.
search_path_suffix (suset) would append a certain set of schema to the
end of the search path regardless of what else the user put in, e.g.:
search_path_suffix = 'pg_modules, information_schema'
search_path = 'main,web,accounts'
... would mean that any object named would search in
main,web,accounts,pg_modules,information_schema. This would be one way
to solve the issue of having extra schema for extensions or other
"utilities" in applications.
--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
On Wed, May 27, 2009 at 9:01 PM, Josh Berkus <josh@agliodbs.com> wrote:
Sure. I think that having better search path management would be a
wonderful thing; it would encourage people to use schema more in general.However, that doesn't mean that I think it should be part of the extensions
design, or even a gating factor.
Agreed, I think this is largely a tangent.
However, since we're on that tangent, I'm not completely convinced
that additional lists of search paths that get prepended or appended
to the main search path are the right way to go. It seems like that's
just chopping up the problem into smaller bits without really fixing
anything. I wonder if the right solution might be to associate with
each schema a list of other schemas to be searched if the object isn't
found in that schema. This means that the contents of search_path
would really become the roots of the trees of schemas to be searched.
Then we could provide DDL commands to do things like:
ALTER SCHEMA pg_extensions INHERIT SCHEMA my_new_extension;
ALTER SCHEMA pg_extensions NO INHERIT SCHEMA extension_i_want_to_remove;
</handwaving>
...Robert
Robert,
However, since we're on that tangent, I'm not completely convinced
that additional lists of search paths that get prepended or appended
to the main search path are the right way to go. It seems like that's
just chopping up the problem into smaller bits without really fixing
anything. I wonder if the right solution might be to associate with
each schema a list of other schemas to be searched if the object isn't
found in that schema. This means that the contents of search_path
would really become the roots of the trees of schemas to be searched.
See, that strikes me a completely unmanageable and likely to give rise
to application security holes. But you're a smart guy ... so, *why*
would that be a better idea than some superuser settings? What am I not
thinking of?
--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
On Wed, May 27, 2009 at 10:02 PM, Josh Berkus <josh@agliodbs.com> wrote:
Robert,
However, since we're on that tangent, I'm not completely convinced
that additional lists of search paths that get prepended or appended
to the main search path are the right way to go. It seems like that's
just chopping up the problem into smaller bits without really fixing
anything. I wonder if the right solution might be to associate with
each schema a list of other schemas to be searched if the object isn't
found in that schema. This means that the contents of search_path
would really become the roots of the trees of schemas to be searched.See, that strikes me a completely unmanageable and likely to give rise to
application security holes. But you're a smart guy ... so, *why* would that
be a better idea than some superuser settings? What am I not thinking of?
Hey, you're a smart guy too, plus you've been around here longer than
I have and have more experience. If my idea sounds like it sucks,
there's a real possibility that it does.
With that caveat, my thought process was approximately as follows.
The contents of a particular schema are more or less analagous to an
application. In most programming languages, an application informs
the system of the libraries that it needs and the system goes off and
loads the symbols in those libraries into the application's namespace.
Using search path basically requires the user to tell the application
where to find those symbols, which ISTM is exactly backwards.
In other words, suppose we have an application in schema S that is
designed to use make use of extensions installed in scheams E1, E4,
and E7. With the pre- and post- search path approach, it's not
sufficient for the user to set his or her search_path to S and then
use the application. Instead, the user has to know that the
search_path must be set to S, E1, E4, E7, or else E1, E4, and E7 have
to be present in the system default post-search-path. And what
happens if there is another application in schema S2 that uses E1, E2,
and E4, where E2 is an older version of E7 with an overlapping set of
names? There's no possible way of configuring your search paths that
will make this work, short of explicitly setting the full search path
to exactly the right thing for each application when using that
application.
Also, it seems to me that we could create a system schema called
something like pg_extension and make it empty. Every extension could
install in its own schema and then tell pg_extension to inherit it
that schema. Then if you want to just get all the extensions, you can
just set your search path to include pg_extension, and as new
extensions are added or old ones are removed, you'll still have all
the extensions without changing anything. I don't see how this could
be made to work with the pre- and post- search_path idea; you'll be
manually fiddling those settings in postgresql.conf, or on a per-user
basis, or wherever you set them up, every time you add or remove an
extension.
I Just Work Here, You Want To Talk To The Boss.
...Robert