mapping object names to role IDs

Started by Robert Haasover 15 years ago19 messages
#1Robert Haas
robertmhaas@gmail.com

Suppose you have an object name as a CString and you want to convert
it to an OID. The method of doing this varies widely depending on the
object type:

oid = get_database_oid(name);
oid = get_tablespace_oid(name);
oid = GetForeignDataWrapperOidByName(name, true);
oid = GetForeignServerOidByName(name, true);
oid = LookupFuncNameTypeNames(name, args, true);
oid = LookupAggNameTypeNames(name, args, true);
oid = LookupFuncNameTypeNames(name, args, true);
oid = LookupOperNameTypeNames(name, args, true);
oid = GetSysCacheOid1(LANGNAME, CStringGetDatum(name));
oid = GetSysCacheOid1(NAMESPACENAME, CStringGetDatum(name));
oid = GetSysCacheOid1(AMNAME, CStringGetDatum(name));
oid = get_roleid(name);
oid = GetConstraintByName(reloid, name);
oid = FindConversionByName(list_make1(name));
oid = TSDictionaryGetDictid(list_make1(name), true);
oid = TSTemplateGetTmplid(list_make1(name), true);
oid = TSConfigGetCfgid(list_make1(name), true);

If you'd like to throw an error if the object doesn't exist, then you
can change the "true" parameter in the above examples to false, where
it exists, or you can use get_roleid_checked for roles. For the types
where a direct syscache lookup is used, you get to write the check
yourself. For constraints, GetConstraintByName throws an error
anyway; there's no equivalent that doesn't. Some other object types -
like rules and casts - have even more complex logic that is sometimes
duplicated in multiple places; and others (like tablespaces) that have
convenience functions still manage to duplicate the logic anyway.
Long story short, this is kind of a mess.

Looking at the different cases, there seem to be three main cases:

1. Objects that just have one name, like databases and procedural languages.
2. Objects that have possibly-qualified names, like conversions and
text search dictionaries.
3. Objects that have both names and argument, like functions and operators.

There's enough complexity about the stuff in category #3 that I'm
inclined to leave it alone, but try to make the other stuff more
standardized. What I would propose is that we create a new source
file somewhere (maybe utils/cache), move all of the other functions of
this type there, give them standardized names, and provide them all
with an argument that specifies whether an error is to be thrown if
the object doesn't exist. Begin bikeshedding here. I suggest that
the names be get_<object-type>_oid and that they take two parameters,
either a List * for possibly-qualified names (a little wonky, but it's
what we do now) or a char * for unqualified names, and a boolean
indicating whether to throw an error if the object isn't found. Thus:

Oid get_<object-type>_oid(List *qualname, bool missingok);
-or-
Oid get_<object-type>_oid(char *name, bool missingok);

Thus get_database_oid and get_tablespace_oid would remain unchanged
except for taking a second argument, get_roleid and get_roleid_checked
would merge, and all of the others would change to match that style.

Thoughts?

P.S. Purple.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#2Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#1)
Re: mapping object names to role IDs

And of course I meant for the subject line to be "mapping object names
to OIDs", not "role IDs".

Sigh.

...Robert

#3Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#1)
Re: mapping object names to role IDs

* Robert Haas (robertmhaas@gmail.com) wrote:

Long story short, this is kind of a mess.

I agree that it's a bit of a mess.

What I would propose is that we create a new source
file somewhere (maybe utils/cache), move all of the other functions of
this type there, give them standardized names, and provide them all
with an argument that specifies whether an error is to be thrown if
the object doesn't exist.

Something which has come up in the past is that putting all the
functions that do the same kind of thing, but operate on different
types of objects, into the same backend file/area ends up meaning that
such an area has an untold amount of knowledge about everything.
Additionally, what *does* go into said area has tons of things that are
only related by the kind of operation- not because they actually have
anything to do with each other.

This was one of the complaints levied at the approach for moving all the
ACL checking into one place. I think it would be good to have a
consistant naming/calling scheme for these various functions, but I'm
not sure that moving them all to the same place makes sense.

Thanks,

Stephen

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephen Frost (#3)
Re: mapping object names to role IDs

Stephen Frost <sfrost@snowman.net> writes:

* Robert Haas (robertmhaas@gmail.com) wrote:

Long story short, this is kind of a mess.

... I think it would be good to have a
consistant naming/calling scheme for these various functions, but I'm
not sure that moving them all to the same place makes sense.

I'm with Stephen on this one. I agree that standardizing the function
names and behavior would be a good idea, but don't try to put them all
in one place.

BTW, the plain-name cases should be "const char *", else some callers
will have to cast away const. You could possibly make an argument for
"const List *" in the qualified-name cases, but we don't do that
anywhere else so I think it'd just look funny here (and would require
internally casting away const, too).

regards, tom lane

#5Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#4)
Re: mapping object names to role IDs

On Sun, May 23, 2010 at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Stephen Frost <sfrost@snowman.net> writes:

* Robert Haas (robertmhaas@gmail.com) wrote:

Long story short, this is kind of a mess.

... I think it would be good to have a
consistant naming/calling scheme for these various functions, but I'm
not sure that moving them all to the same place makes sense.

I'm with Stephen on this one.  I agree that standardizing the function
names and behavior would be a good idea, but don't try to put them all
in one place.

Some of the existing functions are in lsyscache.c, some are in files
in the commands directory, and some are in files in the parser
directory; also, even between commands and parser, not every object
type has its own file. It would be nice to bring some consistency to
where the functions are located as well as what they do. Any thoughts
on how to achieve that?

BTW, the plain-name cases should be "const char *", else some callers
will have to cast away const.  You could possibly make an argument for
"const List *" in the qualified-name cases, but we don't do that
anywhere else so I think it'd just look funny here (and would require
internally casting away const, too).

Makes sense.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#5)
Re: mapping object names to role IDs

Robert Haas <robertmhaas@gmail.com> writes:

On Sun, May 23, 2010 at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm with Stephen on this one. �I agree that standardizing the function
names and behavior would be a good idea, but don't try to put them all
in one place.

Some of the existing functions are in lsyscache.c, some are in files
in the commands directory, and some are in files in the parser
directory; also, even between commands and parser, not every object
type has its own file. It would be nice to bring some consistency to
where the functions are located as well as what they do. Any thoughts
on how to achieve that?

I think both Stephen and I are saying we don't see merit in that.
Moving around pre-existing functions won't accomplish much except
causing include-list churn. Let's just standardize the names/APIs
and be done.

regards, tom lane

#7Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#6)
Re: mapping object names to role IDs

On Sun, May 23, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Sun, May 23, 2010 at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm with Stephen on this one.  I agree that standardizing the function
names and behavior would be a good idea, but don't try to put them all
in one place.

Some of the existing functions are in lsyscache.c, some are in files
in the commands directory, and some are in files in the parser
directory; also, even between commands and parser, not every object
type has its own file.  It would be nice to bring some consistency to
where the functions are located as well as what they do.  Any thoughts
on how to achieve that?

I think both Stephen and I are saying we don't see merit in that.
Moving around pre-existing functions won't accomplish much except
causing include-list churn.  Let's just standardize the names/APIs
and be done.

Where do we put the new functions?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#7)
Re: mapping object names to role IDs

Robert Haas <robertmhaas@gmail.com> writes:

On Sun, May 23, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think both Stephen and I are saying we don't see merit in that.
Moving around pre-existing functions won't accomplish much except
causing include-list churn. �Let's just standardize the names/APIs
and be done.

Where do we put the new functions?

Probably in files that are already concerned with each object type.

regards, tom lane

#9Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#8)
Re: mapping object names to role IDs

On Sun, May 23, 2010 at 1:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Sun, May 23, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think both Stephen and I are saying we don't see merit in that.
Moving around pre-existing functions won't accomplish much except
causing include-list churn.  Let's just standardize the names/APIs
and be done.

Where do we put the new functions?

Probably in files that are already concerned with each object type.

Not every object type has a file, and the existing functions are split
across three different directories, sometimes in files that don't
really pertain to the object type being dealt with. I think this is
going to be difficult to maintain if we intentionally spread out the
parallel code across essentially the entire backend. But I guess I
can code it up and we can argue about it then.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#9)
Re: mapping object names to role IDs

Robert Haas <robertmhaas@gmail.com> writes:

Not every object type has a file, and the existing functions are split
across three different directories, sometimes in files that don't
really pertain to the object type being dealt with. I think this is
going to be difficult to maintain if we intentionally spread out the
parallel code across essentially the entire backend. But I guess I
can code it up and we can argue about it then.

The only thing that seems really significantly parallel is the error
message to be issued for object-not-found. I would suggest maybe
putting the code in lsyscache.c, except that lsyscache functions
generally are not expected to throw error on object-not-found.

As for "not every object type has a file", there is certainly code
someplace that would be calling these functions. Whereever (the
preponderance of) such calls are would be an appropriate place for the
function, IMO.

regards, tom lane

#11Peter Eisentraut
peter_e@gmx.net
In reply to: Robert Haas (#1)
Re: mapping object names to role IDs

On sön, 2010-05-23 at 00:50 -0400, Robert Haas wrote:

Oid get_<object-type>_oid(List *qualname, bool missingok);
-or-
Oid get_<object-type>_oid(char *name, bool missingok);

Thus get_database_oid and get_tablespace_oid would remain unchanged
except for taking a second argument, get_roleid and get_roleid_checked
would merge, and all of the others would change to match that style.

If you are doing some refactoring work in that area, maybe you can also
take care of the issue I talked about there:
http://archives.postgresql.org/pgsql-hackers/2008-12/msg00725.php

"""
Our code contains about 200 copies of the following code:

tuple = SearchSysCache[Copy](FOOOID, ObjectIdGetDatum(fooid), 0, 0, 0);
if (!HeapTupleIsValid(tuple))
elog(ERROR, "cache lookup failed for foo %u", fooid);
"""

It looks like your proposal would reduce that number significantly.

#12Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#11)
Re: mapping object names to role IDs

On Wed, May 26, 2010 at 5:27 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

On sön, 2010-05-23 at 00:50 -0400, Robert Haas wrote:

Oid get_<object-type>_oid(List *qualname, bool missingok);
-or-
Oid get_<object-type>_oid(char *name, bool missingok);

Thus get_database_oid and get_tablespace_oid would remain unchanged
except for taking a second argument, get_roleid and get_roleid_checked
would merge, and all of the others would change to match that style.

If you are doing some refactoring work in that area, maybe you can also
take care of the issue I talked about there:
http://archives.postgresql.org/pgsql-hackers/2008-12/msg00725.php

"""
Our code contains about 200 copies of the following code:

tuple = SearchSysCache[Copy](FOOOID, ObjectIdGetDatum(fooid), 0, 0, 0);
if (!HeapTupleIsValid(tuple))
   elog(ERROR, "cache lookup failed for foo %u", fooid);
"""

It looks like your proposal would reduce that number significantly.

Well, not directly, because I was proposing to do something about
wanting to turn an object identified by name into an OID, rather than
wanting to look up an OID and find a syscache tuple. But the same
approach could be applied to the problem you mention.

I still feel that we'd be better off putting all the functions that
use the same design pattern in a single file, rather than spreading
them out all over the backend. It's true that that one file will then
depend on all the catalog stuff, but it actually can limit
dependencies a little bit on the other end, because if someone wants
to call a bunch of these functions from the same file, they only need
to include the one header where they are all declared, rather than all
the individual files that contain the individual functions. And more
to the point, it's way easier from a maintenance standpoint: there is
much less chance that someone will change one function without
changing all the others if they are all in the same place.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#12)
Re: mapping object names to role IDs

Robert Haas <robertmhaas@gmail.com> writes:

I still feel that we'd be better off putting all the functions that
use the same design pattern in a single file, rather than spreading
them out all over the backend. It's true that that one file will then
depend on all the catalog stuff, but it actually can limit
dependencies a little bit on the other end, because if someone wants
to call a bunch of these functions from the same file, they only need
to include the one header where they are all declared,

This is nonsense, because the call sites are going to be places that are
actually *doing* something with that catalog, and so will need not only
the catalog .h file but any other support functions associated with
doing work on that catalog. Centralizing the lookups will do nothing
whatsoever to reduce dependencies; it'll just create a central file
dependent on everything in addition to every dependency we have now.

The closest equivalent we have now is lsyscache.c, which is not exactly
a sterling example of how to design a module: it's got no conceptual
consistency whatsoever.

I'm for standardizing the API of lookup functions, but not for
relocating them.

regards, tom lane

#14Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#13)
Re: mapping object names to role IDs

On Wed, May 26, 2010 at 9:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This is nonsense

You can assert that, but I don't agree. We certainly have places
(comment.c being the obvious example) where we need to look up a name
and map it to an OID without doing anything else, and actually I
believe there are useful ways to refactor the code that might lead to
more of this. Anyway, I think the code maintenance argument ought to
carry a lot more weight than whether one or two small files get
rebuilt for dependencies slightly more often. lsyscache.c might have
no conceptual consistency but it's extremely useful, and there are
plenty of other examples of where we've put code for different object
types into a single file to simplify maintenance and reduce code
complexity (e.g. copyfuncs, equalfuncs, outfuncs, etc.).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#15alvherre
alvherre@commandprompt.com
In reply to: Robert Haas (#12)
Re: mapping object names to role IDs

Excerpts from Robert Haas's message of mié may 26 07:20:30 -0400 2010:

I still feel that we'd be better off putting all the functions that
use the same design pattern in a single file, rather than spreading
them out all over the backend. It's true that that one file will then
depend on all the catalog stuff, but it actually can limit
dependencies a little bit on the other end, because if someone wants
to call a bunch of these functions from the same file, they only need
to include the one header where they are all declared, rather than all
the individual files that contain the individual functions.

This doesn't buy you anything, because that one header will likely have
to #include all the other headers anyway. And if this is so, then all
those headers will now be included in all files that require even a
single one of these functions.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#16Robert Haas
robertmhaas@gmail.com
In reply to: alvherre (#15)
Re: mapping object names to role IDs

On Wed, May 26, 2010 at 11:01 AM, alvherre <alvherre@commandprompt.com> wrote:

Excerpts from Robert Haas's message of mié may 26 07:20:30 -0400 2010:

I still feel that we'd be better off putting all the functions that
use the same design pattern in a single file, rather than spreading
them out all over the backend.  It's true that that one file will then
depend on all the catalog stuff, but it actually can limit
dependencies a little bit on the other end, because if someone wants
to call a bunch of these functions from the same file, they only need
to include the one header where they are all declared, rather than all
the individual files that contain the individual functions.

This doesn't buy you anything, because that one header will likely have
to #include all the other headers anyway.  And if this is so, then all
those headers will now be included in all files that require even a
single one of these functions.

Well, at any rate, I'm giving up on the argument.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: alvherre (#15)
Re: mapping object names to role IDs

alvherre <alvherre@commandprompt.com> writes:

Excerpts from Robert Haas's message of mié may 26 07:20:30 -0400 2010:

I still feel that we'd be better off putting all the functions that
use the same design pattern in a single file, rather than spreading
them out all over the backend.

This doesn't buy you anything, because that one header will likely have
to #include all the other headers anyway. And if this is so, then all
those headers will now be included in all files that require even a
single one of these functions.

For the particular case Robert is proposing, the *header* isn't a
problem, because the only types it would deal in are Oid, bool,
const char *, and List *. But you're right that in general this design
pattern carries a risk of having to include the world in a commonly-used
header file, which is certainly not a good idea.

regards, tom lane

#18Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#17)
Re: mapping object names to role IDs

On Wed, May 26, 2010 at 1:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

alvherre <alvherre@commandprompt.com> writes:

Excerpts from Robert Haas's message of mié may 26 07:20:30 -0400 2010:

I still feel that we'd be better off putting all the functions that
use the same design pattern in a single file, rather than spreading
them out all over the backend.

This doesn't buy you anything, because that one header will likely have
to #include all the other headers anyway.  And if this is so, then all
those headers will now be included in all files that require even a
single one of these functions.

For the particular case Robert is proposing, the *header* isn't a
problem, because the only types it would deal in are Oid, bool,
const char *, and List *.  But you're right that in general this design
pattern carries a risk of having to include the world in a commonly-used
header file, which is certainly not a good idea.

Right. I am very cognizant of the problem, but it isn't really an
issue in this case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#19alvherre
alvherre@commandprompt.com
In reply to: Robert Haas (#14)
Re: mapping object names to role IDs

Excerpts from Robert Haas's message of mié may 26 10:34:00 -0400 2010:

lsyscache.c might have no conceptual consistency but it's extremely
useful,

I know I've been annoyed by lsyscache: looking for accessors to catalog
stuff, not finding them and so creating my own by using syscache
directly, only to find out later that they already existed there.
I think we should be moving in the direction of *removing* lsyscache,
not replicating it.

BTW I quite agree with both the suggestion you give in this thread
(modulo this issue), and Peter's idea of getting rid of the repetitive
syscache coding pattern.

and there are
plenty of other examples of where we've put code for different object
types into a single file to simplify maintenance and reduce code
complexity (e.g. copyfuncs, equalfuncs, outfuncs, etc.).

Well, that's all related to node manipulation, so I'm not so sure it's
exactly the same.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support