refactoring comment.c
At PGCon, we discussed the possibility that a minimal SE-PostgreSQL
implementation would need little more than a hook in
ExecCheckRTPerms() [which we've since added] and a security label
facility [for which KaiGai has submitted a patch]. I actually sat
down to write the security label patch myself while we were in Ottawa,
but quickly ran into difficulties: while the hook we have now can't do
anything useful with objects other than relations, it's pretty clear
from previous discussions on this topic that the demand for labels on
other kinds of objects is not going to go away. Rather than adding
additional syntax to every object type in the system (some of which
don't even have ALTER commands at present), I suggested basing the
syntax on the existing COMMENT syntax. After some discussion[1]http://archives.postgresql.org/pgsql-hackers/2010-07/msg01328.php, we
seem to have settled on the following:
SECURITY LABEL [ FOR <provider> ] ON <object class> <object name> IS '<label>';
At present, there are some difficulties with generalizing this syntax
to other object types. As I found out when I initially set out to
write this patch, it'd basically require duplicating all of comment.c,
which is an unpleasant prospect, because that file is big and crufty;
it has a large amount of internal duplication. Furthermore, the
existing locking mechanism that we're using for comments is known to
be inadequate[2]http://archives.postgresql.org/pgsql-hackers/2010-07/msg00351.php. Dropping a comment while someone else is in the
midst of commenting on it leaves orphaned comments lying around in
pg_(sh)description that could later be inherited by a new object.
That's a minor nuisance for comments and would be nice to fix, but is
obviously a far larger problem for security labels, where even a small
chance of randomly mislabeling an object is no good.
So I wrote a patch. The attached patch factors out all of the code in
comment.c that is responsible for translating parser representations
into a new file parser/parse_object.c, leaving just the
comment-specific stuff in commands/comment.c. It also adds
appropriate locking, so that concurrent COMMENT/DROP scenarios don't
leave behind leftovers. It's a fairly large patch, but the changes
are extremely localized: comment.c gets a lot smaller, and
parse_object.c gets bigger by a slightly smaller amount.
Any comments? (ha ha ha...)
[1]: http://archives.postgresql.org/pgsql-hackers/2010-07/msg01328.php
[2]: http://archives.postgresql.org/pgsql-hackers/2010-07/msg00351.php
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Attachments:
refactor_comment-v1.patchapplication/octet-stream; name=refactor_comment-v1.patchDownload+795-809
Excerpts from Robert Haas's message of vie ago 06 11:02:58 -0400 2010:
Any comments? (ha ha ha...)
Interesting idea. The patch looks fine on a quick once-over. Two small
things: this comment
+ /*
+ * Databases, tablespaces, and roles are cluster-wide objects, so any
+ * comments on those objects are record in the shared pg_shdescription
+ * catalog. Comments on all other objects are recorded in pg_description.
+ */
says "record" where it should say "recorded".
Also, it strikes me that perhaps the ObjectAddress struct definition
should be moved to the new header file; seems a more natural place for
it (but then, it seems like a lot of files will need to include the new
header, so perhaps that should be left to a subsequent patch).
Thirdly, is it just me or just could replace a lot of code in DropFoo
functions with this new auxiliary code? Seems it's just missing
"missing_ok" support ...
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Fri, Aug 6, 2010 at 11:36 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
Excerpts from Robert Haas's message of vie ago 06 11:02:58 -0400 2010:
Any comments? (ha ha ha...)
Interesting idea. The patch looks fine on a quick once-over.
Thanks for taking a look.
Two small
things: this comment+ /* + * Databases, tablespaces, and roles are cluster-wide objects, so any + * comments on those objects are record in the shared pg_shdescription + * catalog. Comments on all other objects are recorded in pg_description. + */says "record" where it should say "recorded".
Thanks, good eye.
Also, it strikes me that perhaps the ObjectAddress struct definition
should be moved to the new header file; seems a more natural place for
it (but then, it seems like a lot of files will need to include the new
header, so perhaps that should be left to a subsequent patch).
I thought about that, but erred on the side of being conservative and
didn't move it. I like the idea, though.
Thirdly, is it just me or just could replace a lot of code in DropFoo
functions with this new auxiliary code? Seems it's just missing
"missing_ok" support ...
I am not sure how much code it would save you at the level of the
individual DropFoo() functions; I'd have to look through them more
carefully. But now that you mention it, what about getting rid of all
of the individual parse nodes for drop statements? Right now we have:
DropTableSpaceStmt
DropFdwStmt
DropForeignServerStmt
DropUserMappingStmt
DropPLangStmt
DropRoleStmt
DropStmt (table, sequence, view, index, domain, conversion, schema,
text search {parser, dictionary, template, configuration}
DropPropertyStmt (rules and triggers)
DropdbStmt (capitalized differently, just for fun)
DropCastStmt
RemoveFuncStmt (so you can't find it by grepping for Drop!)
RemoveOpClassStmt
RemoveOpFamilyStmt
It seems like we could probably unify all of these into a single
DropStmt, following the same pattern as CommentStmt, although I think
perhaps that should be a follow-on patch rather than doing it as part
of this one. GRANT also has some code to translate object names to
OIDs, which I thought might be able to use this machinery as well,
though I haven't really checked whether it makes sense.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Fri, 2010-08-06 at 11:02 -0400, Robert Haas wrote:
At PGCon, we discussed the possibility that a minimal SE-PostgreSQL
implementation would need little more than a hook in
ExecCheckRTPerms() [which we've since added] and a security label
facility [for which KaiGai has submitted a patch]. I actually sat
down to write the security label patch myself while we were in Ottawa,
but quickly ran into difficulties: while the hook we have now can't do
anything useful with objects other than relations, it's pretty clear
from previous discussions on this topic that the demand for labels on
other kinds of objects is not going to go away. Rather than adding
additional syntax to every object type in the system (some of which
don't even have ALTER commands at present), I suggested basing the
syntax on the existing COMMENT syntax. After some discussion[1], we
seem to have settled on the following:SECURITY LABEL [ FOR <provider> ] ON <object class> <object name> IS '<label>';
I understand the concept and it seems like it might work. Not too keen
on pretending a noun is a verb. That leads to erroring.
<verb> SECURITY LABEL? verb = CREATE, ADD, ...
Can't objects have more than one label?
How will you set default security labels on objects?
Where do you define labels?
Will there be a new privilege to define this? Presumably object owners
would not be able to set that themselves, otherwise you could create an
object, add a security label to it and then use it to see other things
at that level.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
On Fri, Aug 6, 2010 at 12:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
I understand the concept and it seems like it might work. Not too keen
on pretending a noun is a verb. That leads to erroring.<verb> SECURITY LABEL? verb = CREATE, ADD, ...
Can't objects have more than one label?
How will you set default security labels on objects?
Where do you define labels?
Will there be a new privilege to define this? Presumably object owners
would not be able to set that themselves, otherwise you could create an
object, add a security label to it and then use it to see other things
at that level.
Uh, these are all good questions, but I think they'd be more
appropriate on the thread about the security label patch, to which I
linked in my previous email. Many of them have already been discussed
there.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
(2010/08/07 0:02), Robert Haas wrote:
At PGCon, we discussed the possibility that a minimal SE-PostgreSQL
implementation would need little more than a hook in
ExecCheckRTPerms() [which we've since added] and a security label
facility [for which KaiGai has submitted a patch]. I actually sat
down to write the security label patch myself while we were in Ottawa,
but quickly ran into difficulties: while the hook we have now can't do
anything useful with objects other than relations, it's pretty clear
from previous discussions on this topic that the demand for labels on
other kinds of objects is not going to go away. Rather than adding
additional syntax to every object type in the system (some of which
don't even have ALTER commands at present), I suggested basing the
syntax on the existing COMMENT syntax. After some discussion[1], we
seem to have settled on the following:SECURITY LABEL [ FOR<provider> ] ON<object class> <object name> IS '<label>';
At present, there are some difficulties with generalizing this syntax
to other object types. As I found out when I initially set out to
write this patch, it'd basically require duplicating all of comment.c,
which is an unpleasant prospect, because that file is big and crufty;
it has a large amount of internal duplication. Furthermore, the
existing locking mechanism that we're using for comments is known to
be inadequate[2]. Dropping a comment while someone else is in the
midst of commenting on it leaves orphaned comments lying around in
pg_(sh)description that could later be inherited by a new object.
That's a minor nuisance for comments and would be nice to fix, but is
obviously a far larger problem for security labels, where even a small
chance of randomly mislabeling an object is no good.So I wrote a patch. The attached patch factors out all of the code in
comment.c that is responsible for translating parser representations
into a new file parser/parse_object.c, leaving just the
comment-specific stuff in commands/comment.c. It also adds
appropriate locking, so that concurrent COMMENT/DROP scenarios don't
leave behind leftovers. It's a fairly large patch, but the changes
are extremely localized: comment.c gets a lot smaller, and
parse_object.c gets bigger by a slightly smaller amount.Any comments? (ha ha ha...)
[1] http://archives.postgresql.org/pgsql-hackers/2010-07/msg01328.php
[2] http://archives.postgresql.org/pgsql-hackers/2010-07/msg00351.php
Thanks for your efforts.
I believe the get_object_address() enables to implement security
label features on various kind of database objects.
I tried to look at the patch. Most part is fine, but I found out
two issues.
On the object_exists(), when we verify existence of a large object,
it needs to scan pg_largeobject_metadata, instead of pg_largeobject.
When we implement pg_largeobject_metadata catalog, we decided to set
LargeObjectRelationId on object.classId due to the backend compatibility.
| /*
| * For object types that have a relevant syscache, we use it; for
| * everything else, we'll have to do an index-scan. This switch
| * sets either the cache to be used for the syscache lookup, or the
| * index to be used for the index scan.
| */
| switch (address.classId)
| {
| case RelationRelationId:
| cache = RELOID;
| break;
| :
| case LargeObjectRelationId:
| indexoid = LargeObjectMetadataOidIndexId;
| break;
| :
| }
|
| /* Found a syscache? */
| if (cache != -1)
| return SearchSysCacheExists1(cache, ObjectIdGetDatum(address.objectId));
|
| /* No syscache, so examine the table directly. */
| Assert(OidIsValid(indexoid));
| ScanKeyInit(&skey[0], ObjectIdAttributeNumber, BTEqualStrategyNumber,
| F_OIDEQ, ObjectIdGetDatum(address.objectId));
| rel = heap_open(address.classId, AccessShareLock);
^^^^^^^^^^^^^^^ <- It tries to open pg_largeobject
| sd = systable_beginscan(rel, indexoid, true, SnapshotNow, 1, skey);
| found = HeapTupleIsValid(systable_getnext(sd));
| systable_endscan(sd);
| heap_close(rel, AccessShareLock);
| return found;
| }
Although no caller invokes get_object_address() with lockmode = NoLock,
isn't it necessary to skip locking if NoLock was given.
| /*
| * If we're dealing with a relation or attribute, then the relation is
| * already locked. If we're dealing with any other type of object, we need
| * to lock it and then verify that it still exists.
| */
| if (address.classId != RelationRelationId)
| {
| if (IsSharedRelation(address.classId))
| LockSharedObject(address.classId, address.objectId, 0, lockmode);
| else
| LockDatabaseObject(address.classId, address.objectId, 0, lockmode);
| /* Did it go away while we were waiting for the lock? */
| if (!object_exists(address))
| elog(ERROR, "cache lookup failed for class %u object %u subobj %d",
| address.classId, address.objectId, address.objectSubId);
| }
Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>
On Fri, Aug 6, 2010 at 11:15 PM, KaiGai Kohei <kaigai@kaigai.gr.jp> wrote:
[brief review]
OK, here's an updated patch:
1. I fixed the typo Alvaro spotted.
2. I haven't done anything about moving the definition of
ObjectAddress elsewhere, as Alvaro suggested, because I'm not sure
quite where it ought to go. I still think it's a good idea, though
I'm not dead set on it, either. Suggestions?
3. I fixed the issue Kaigai Kohei spotted, regarding
LargeObjectRelationId vs. LargeObjectMetadataRelationId, by adding a
grotty hack. However, I feel that I'm not so much adding a new grotty
hack as working around an existing grotty hack which was added for
reasons I'm unclear on. Is there a pg_upgrade-related reason not to
revert the original hack instead?
4. In response to Kaigai Kohei's complaint about lockmode possibly
being NoLock, I've just added an Assert() that it isn't, in lieu of
trying to do something sensible in that case. I can't at present
think of a situation in which being able to call it that way would be
useful, and the Assert() seems like it ought to be enough warning to
anyone coming along later that they'd better think twice before
thinking that will work.
5. Since I'm hoping Tom will read this, I ran it through filterdiff. :-)
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Attachments:
refactor_comment-v2-context.patchapplication/octet-stream; name=refactor_comment-v2-context.patchDownload+1411-889
(2010/08/16 11:50), Robert Haas wrote:
On Fri, Aug 6, 2010 at 11:15 PM, KaiGai Kohei<kaigai@kaigai.gr.jp> wrote:
[brief review]
OK, here's an updated patch:
1. I fixed the typo Alvaro spotted.
2. I haven't done anything about moving the definition of
ObjectAddress elsewhere, as Alvaro suggested, because I'm not sure
quite where it ought to go. I still think it's a good idea, though
I'm not dead set on it, either. Suggestions?3. I fixed the issue Kaigai Kohei spotted, regarding
LargeObjectRelationId vs. LargeObjectMetadataRelationId, by adding a
grotty hack. However, I feel that I'm not so much adding a new grotty
hack as working around an existing grotty hack which was added for
reasons I'm unclear on. Is there a pg_upgrade-related reason not to
revert the original hack instead?
When we were developing largeobject access controls, Tom Lane commented
as follows:
* Re: [HACKERS] [PATCH] Largeobject access controls
http://marc.info/?l=pgsql-hackers&m=125548822906571&w=2
| I notice that the patch decides to change the pg_description classoid for
| LO comments from pg_largeobject's OID to pg_largeobject_metadata's. This
| will break existing clients that look at pg_description (eg, pg_dump and
| psql, plus any other clients that have any intelligence about comments,
| for instance it probably breaks pgAdmin). And there's just not a lot of
| return that I can see. I agree that using pg_largeobject_metadata would
| be more consistent given the new catalog layout, but I'm inclined to think
| we should stick to the old convention on compatibility grounds. Given
| that choice, for consistency we'd better also use pg_largeobject's OID not
| pg_largeobject_metadata's in pg_shdepend entries for LOs.
He concerned about existing applications which have knowledge about internal
layout of system catalogs, then I fixed up the patch according to the suggestion.
4. In response to Kaigai Kohei's complaint about lockmode possibly
being NoLock, I've just added an Assert() that it isn't, in lieu of
trying to do something sensible in that case. I can't at present
think of a situation in which being able to call it that way would be
useful, and the Assert() seems like it ought to be enough warning to
anyone coming along later that they'd better think twice before
thinking that will work.5. Since I'm hoping Tom will read this, I ran it through filterdiff. :-)
--
KaiGai Kohei <kaigai@ak.jp.nec.com>
Excerpts from KaiGai Kohei's message of lun ago 16 00:19:54 -0400 2010:
(2010/08/16 11:50), Robert Haas wrote:
When we were developing largeobject access controls, Tom Lane commented
as follows:* Re: [HACKERS] [PATCH] Largeobject access controls
http://marc.info/?l=pgsql-hackers&m=125548822906571&w=2
| I notice that the patch decides to change the pg_description classoid for
| LO comments from pg_largeobject's OID to pg_largeobject_metadata's. This
| will break existing clients that look at pg_description (eg, pg_dump and
| psql, plus any other clients that have any intelligence about comments,
| for instance it probably breaks pgAdmin). And there's just not a lot of
| return that I can see. I agree that using pg_largeobject_metadata would
| be more consistent given the new catalog layout, but I'm inclined to think
| we should stick to the old convention on compatibility grounds. Given
| that choice, for consistency we'd better also use pg_largeobject's OID not
| pg_largeobject_metadata's in pg_shdepend entries for LOs.He concerned about existing applications which have knowledge about internal
layout of system catalogs, then I fixed up the patch according to the suggestion.
I think that with this patch we have the return for the change that we
didn't have previously. A patch that changes it should also fix pg_dump
and psql at the same time, but IMHO it doesn't make sense to keep adding
grotty hacks on top of it.
Maybe we could do with a grotty hack in obj_description() instead?
(...checks...)
Oh, it's defined as a SQL function directly in pg_proc.h :-(
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Robert Haas <robertmhaas@gmail.com> writes:
5. Since I'm hoping Tom will read this, I ran it through filterdiff. :-)
OK, I looked ;-). It mostly looks good, but of course I've got some
opinions...
2. I haven't done anything about moving the definition of
ObjectAddress elsewhere, as Alvaro suggested, because I'm not sure
quite where it ought to go. I still think it's a good idea, though
I'm not dead set on it, either. Suggestions?
I think the problem is you're trying to put this into backend/parser
which is not really the right place for it. It's an execution-time
animal not a parse-time animal. I would put it into backend/catalog,
perhaps named objectaddress.c, and similarly the header file would be
objectaddress.h. Then it would be reasonable to move struct
ObjectAddress into this header and have dependency.h #include it.
There might be some other stuff in dependency.c that more naturally
belongs here, too.
3. I fixed the issue Kaigai Kohei spotted, regarding
LargeObjectRelationId vs. LargeObjectMetadataRelationId, by adding a
grotty hack. However, I feel that I'm not so much adding a new grotty
hack as working around an existing grotty hack which was added for
reasons I'm unclear on. Is there a pg_upgrade-related reason not to
revert the original hack instead?
It's not pg_upgrade (nor psql, nor pg_dump) we are protecting here.
It's third-party applications that might understand the contents of
pg_description, pg_depend, etc. I think that hack is quite small
and localized enough to live with, rather than causing a flag day
for an unknown number of clients by changing the representation.
+ /*
+ * Translate the parser representation which identifies this object into
+ * an ObjectAddress. get_object_address() will throw an error if the
+ * object does not exist.
+ */
+ address = get_object_address(stmt->objtype, stmt->objname, stmt->objargs,
+ &relation, ShareUpdateExclusiveLock);
I think this comment should also explicitly mention that we're getting a
lock to protect against concurrent DROPs.
+ /*
+ * Translate an object name and arguments (as passed by the parser) to an
+ * ObjectAddress.
+ *
+ * The returned object will be locked using the specified lockmode. If a
+ * sub-object is looked up, the parent object will be locked instead.
+ *
+ * We don't currently provide a function to release the locks acquired here;
+ * typically, the lock must be held until commit to guard against a concurrent
+ * drop operation.
+ */
+ ObjectAddress
+ get_object_address(ObjectType objtype, List *objname, List *objargs,
+ Relation *relp, LOCKMODE lockmode)
This comment's a bit shy of a load too, since it totally fails to
mention the relp argument, or specify what the caller is required to
do with it, or explain how the locking on the relation works.
As a matter of style, I'd suggest putting the single externally callable
function (ie get_object_address) at the top of the file not the bottom.
People shouldn't have to read through the entire file before finding out
what API it is supposed to provide to the outside world.
regards, tom lane
On Mon, Aug 16, 2010 at 3:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
2. I haven't done anything about moving the definition of
ObjectAddress elsewhere, as Alvaro suggested, because I'm not sure
quite where it ought to go. I still think it's a good idea, though
I'm not dead set on it, either. Suggestions?I think the problem is you're trying to put this into backend/parser
which is not really the right place for it. It's an execution-time
animal not a parse-time animal. I would put it into backend/catalog,
perhaps named objectaddress.c, and similarly the header file would be
objectaddress.h. Then it would be reasonable to move struct
ObjectAddress into this header and have dependency.h #include it.
There might be some other stuff in dependency.c that more naturally
belongs here, too.
If this isn't parse analysis, then you and I have very different ideas
of what parse analysis is. And under this theory, what are routines
like LookupAggNameTypeNames() doing in src/backend/parser?
I'll make the rest of the changes you suggest...
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Aug 16, 2010 at 3:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think the problem is you're trying to put this into backend/parser
which is not really the right place for it.
If this isn't parse analysis, then you and I have very different ideas
of what parse analysis is.
Maybe so, but the parser is expected to put out a representation that
will still be valid when the command is executed some time later.
That is exactly why utility statements have the barely-more-than-source
parsetree representation they do: because we do not hold locks on the
objects from parsing to execution, we could not expect an OID-level
representation to remain good. This is a lot different from what we
do with DML statements, but there are good reasons for it.
I repeat my observation that this code doesn't belong in /parser.
The code you're replacing was not in /parser, and that was because
it didn't belong there, not because somebody didn't understand the
system structure.
regards, tom lane
I wrote:
Maybe so, but the parser is expected to put out a representation that
will still be valid when the command is executed some time later.
Rereading this, I see I didn't make my point very clearly. The reason
this code doesn't belong in parser/ is that there's no prospect the
parser itself would ever use it. ObjectAddress is an execution-time
creature because we don't want utility statement representations to be
resolved to OID-level detail before they execute.
regards, tom lane
On Tue, Aug 17, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
Maybe so, but the parser is expected to put out a representation that
will still be valid when the command is executed some time later.Rereading this, I see I didn't make my point very clearly. The reason
this code doesn't belong in parser/ is that there's no prospect the
parser itself would ever use it. ObjectAddress is an execution-time
creature because we don't want utility statement representations to be
resolved to OID-level detail before they execute.
Well, that is a good reason for doing it your way, but I'm slightly
fuzzy on why we need a crisp separation between parse-time and
execution-time.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Aug 17, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Rereading this, I see I didn't make my point very clearly. �The reason
this code doesn't belong in parser/ is that there's no prospect the
parser itself would ever use it. �ObjectAddress is an execution-time
creature because we don't want utility statement representations to be
resolved to OID-level detail before they execute.
Well, that is a good reason for doing it your way, but I'm slightly
fuzzy on why we need a crisp separation between parse-time and
execution-time.
I don't insist that the separation has to be crisp. I'm merely saying
that putting a large chunk of useful-only-at-execution-time code into
backend/parser is the Wrong Thing.
regards, tom lane
On Tue, Aug 17, 2010 at 2:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Aug 17, 2010 at 11:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Rereading this, I see I didn't make my point very clearly. The reason
this code doesn't belong in parser/ is that there's no prospect the
parser itself would ever use it. ObjectAddress is an execution-time
creature because we don't want utility statement representations to be
resolved to OID-level detail before they execute.Well, that is a good reason for doing it your way, but I'm slightly
fuzzy on why we need a crisp separation between parse-time and
execution-time.I don't insist that the separation has to be crisp. I'm merely saying
that putting a large chunk of useful-only-at-execution-time code into
backend/parser is the Wrong Thing.
OK, but there should be a reason for that. For example, if there are
circumstances when we parse a statement, and then time passes, and
then we execute it later, that's a good argument for what you're
saying here. But otherwise, the fact that these bits of barely-parsed
stuff get passed all over the backend seems like an inconvenient wart.
I was actually thinking of proposing that we make more things happen
during the parsing process and postpone less to the execution phase,
and I need to make sure that I understand why you don't want to do
that.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Aug 17, 2010 at 2:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I don't insist that the separation has to be crisp. �I'm merely saying
that putting a large chunk of useful-only-at-execution-time code into
backend/parser is the Wrong Thing.
OK, but there should be a reason for that. For example, if there are
circumstances when we parse a statement, and then time passes, and
then we execute it later, that's a good argument for what you're
saying here.
Yeah, and that's exactly what happens with utility statements that (for
example) get into the plan cache.
I was actually thinking of proposing that we make more things happen
during the parsing process and postpone less to the execution phase,
and I need to make sure that I understand why you don't want to do
that.
The reason to not do that is that you'd have to hold more locks,
and do more management of those locks, in order to protect the
more-thoroughly-analyzed statements from parsing to execution. In the
case of utility statements, particularly ones that affect a lot of
objects, I think that would be a seriously bad idea. In fact it would
fly in the face of lessons we learned the hard way. The whole of
parser/parse_utilcmd.c is code that we used to execute "at parse time",
and had to rearrange to postpone to execution time, in order to avoid
nasty bugs. It's still in backend/parser because it fits well there
and uses a lot of parser infrastructure that's shared with parse-time
analysis of DML statements. But neither of those statements hold for
the ObjectAddress resolution code.
regards, tom lane
On Tue, Aug 17, 2010 at 2:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Aug 17, 2010 at 2:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I don't insist that the separation has to be crisp. I'm merely saying
that putting a large chunk of useful-only-at-execution-time code into
backend/parser is the Wrong Thing.OK, but there should be a reason for that. For example, if there are
circumstances when we parse a statement, and then time passes, and
then we execute it later, that's a good argument for what you're
saying here.Yeah, and that's exactly what happens with utility statements that (for
example) get into the plan cache.
OK. I'll have to look through that code some time.
Here's v3.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Attachments:
refactor_comment-v3.patchapplication/octet-stream; name=refactor_comment-v3.patchDownload+1440-907
Excerpts from Robert Haas's message of mié ago 18 21:32:48 -0400 2010:
Here's v3.
The header comment in objectaddress.c contains a funny mistake: it says
it works with ObjectAddresses. However, ObjectAddresses is a different
type altogether, so I recommend not using that as plural for
ObjectAddress. Maybe "ObjectAddress objects"? :-D
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
Excerpts from Robert Haas's message of mié ago 18 21:32:48 -0400 2010:
Here's v3.
The header comment in objectaddress.c contains a funny mistake: it says
it works with ObjectAddresses. However, ObjectAddresses is a different
type altogether, so I recommend not using that as plural for
ObjectAddress. Maybe "ObjectAddress objects"? :-D
Alternatively, maybe ObjectAddresses was a bad choice of type name,
and it should be ObjectAddressList or ObjectAddressArray or some such.
But changing that might be more trouble than it's worth.
regards, tom lane