ProcessUtility_hook
We can hook SELECT, INSERT, UPDATE and DELETE with ExecutorXxx_hooks,
but there is no way to hook DDL commands. Can I submit a patch to add
ProcessUtility_hook?
pg_stat_statements module also handle DDL commands as same as normal
queries. Fortunately, autovacuum calls call vacuum() directly,
so the statistics will not filled with VACUUM and ANALYZE commands
by autovacuum.
There might be another usage for the hook. Since we can filter all
SQL commands using ExecutorRun_hook and ProcessUtility_hook, so we
could write query text for database audit with programmable filter
in the hooks.
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Itagaki Takahiro escreveu:
We can hook SELECT, INSERT, UPDATE and DELETE with ExecutorXxx_hooks,
but there is no way to hook DDL commands. Can I submit a patch to add
ProcessUtility_hook?
+1. Hey, I was thinking about that yesterday. ;)
pg_stat_statements module also handle DDL commands as same as normal
queries. Fortunately, autovacuum calls call vacuum() directly,
so the statistics will not filled with VACUUM and ANALYZE commands
by autovacuum.
I don't look at the code but maybe there is a way to hook something there too.
But I'd leave it alone for now, unless it's few lines of code.
--
Euler Taveira de Oliveira
http://www.timbira.com/
Here is a patch to add ProcessUtility_hook to handle all DDL
commands in user modules. (ProcessUtility_hook_20091021.patch)
It adds a global variable 'ProcessUtility_hook' and
export the original function as 'standard_ProcessUtility'.
Euler Taveira de Oliveira <euler@timbira.com> wrote:
pg_stat_statements module also handle DDL commands as same as normal
queries. Fortunately, autovacuum calls call vacuum() directly,
so the statistics will not filled with VACUUM and ANALYZE commands
by autovacuum.I don't look at the code but maybe there is a way to hook something there too.
But I'd leave it alone for now, unless it's few lines of code.
I see it is debatable whether pg_stat_statements should support the hook.
So I split the change in another patch. (pgss_utility_hook_20091021.patch)
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Itagaki Takahiro escreveu:
Here is a patch to add ProcessUtility_hook to handle all DDL
commands in user modules. (ProcessUtility_hook_20091021.patch)
It adds a global variable 'ProcessUtility_hook' and
export the original function as 'standard_ProcessUtility'.
I've reviewed your patch. It applies cleanly and compiles without warnings. It
lacks documentation. Could you update it?
The functionality is divided in two parts. The first part is a hook in the
utility module. The idea is capture the commands that doesn't pass through
executor. I'm afraid that that hook will be used only for capturing non-DML
queries. If so, why don't we hack the tcop/postgres.c and grab those queries
from the same point we log statements? This feature is similar to the executor
one. But in the executor case, we use the plan for other things. Other
utilities are (i) using that hook to filter out some non-DML commands and (ii)
developing some non-DML replication mechanism for trigger-based replication
solutions.
The second part is to use that hook to capture non-DML commands for
pg_stat_statements module. Do we need to have rows = 0 for non-DML commands?
Maybe NULL could be an appropriate value. The PREPARE command stopped to count
the number of rows. Should we count the rows in EXECUTE command or in the
PREPARE command? The other command that doesn't count properly is COPY. Could
you fix that? I'm concerned about storing some commands that expose passwords
like CREATE ROLE foo PASSWORD 'secret'; I know that the queries are only
showed to superusers but maybe we should add this information to documentation
or provide a mechanism to exclude those commands.
I don't know if it is worth the trouble adding some code to capture VACUUM and
ANALYZE commands called inside autovacuum.
--
Euler Taveira de Oliveira
http://www.timbira.com/
Euler Taveira de Oliveira <euler@timbira.com> wrote:
The functionality is divided in two parts. The first part is a hook in the
utility module. The idea is capture the commands that doesn't pass through
executor. I'm afraid that that hook will be used only for capturing non-DML
queries. If so, why don't we hack the tcop/postgres.c and grab those queries
from the same point we log statements?
DDLs can be used from user defined functions. It has the same reason
why we have executor hooks instead of tcop hooks.
The second part is to use that hook to capture non-DML commands for
pg_stat_statements module.
- I fixed a bug that it should handle only isTopLevel command.
- A new parameter pg_stat_statements.track_ddl (boolean) is
added to enable or disable the feature.
Do we need to have rows = 0 for non-DML commands?
Maybe NULL could be an appropriate value.
Yes, but it requires additional management to handle 0 and NULL
separately. I don't think it is worth doing.
The PREPARE command stopped to count
the number of rows. Should we count the rows in EXECUTE command or in the
PREPARE command?
It requires major rewrites of EXECUTE command to pass the number of
affected rows to caller. I doubt it is worth fixing because almost all
drivers use protocol-level prepared statements instead of PREPARE+EXECUTE.
The other command that doesn't count properly is COPY. Could
you fix that?
I added codes for it.
I'm concerned about storing some commands that expose passwords
like CREATE ROLE foo PASSWORD 'secret'; I know that the queries are only
showed to superusers but maybe we should add this information to documentation
or provide a mechanism to exclude those commands.
I think there is no additonal problem there because we can see the 'secret'
command using pg_stat_activity even now.
I don't know if it is worth the trouble adding some code to capture VACUUM and
ANALYZE commands called inside autovacuum.
I'd like to exclude VACUUM and ANALYZE by autovacuum because
pg_stat_statements should not filled with those commands.
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Attachments:
ProcessUtility_hook_20091130.patchapplication/octet-stream; name=ProcessUtility_hook_20091130.patchDownload+138-14
I have applied this patch, with only a minor wording improvement:
Specify <literal>on</> to track DDL commands, which excludes <command>SELECT</>,
^^^^^^^^^^^^^^
Thanks.
---------------------------------------------------------------------------
Itagaki Takahiro wrote:
Euler Taveira de Oliveira <euler@timbira.com> wrote:
The functionality is divided in two parts. The first part is a hook in the
utility module. The idea is capture the commands that doesn't pass through
executor. I'm afraid that that hook will be used only for capturing non-DML
queries. If so, why don't we hack the tcop/postgres.c and grab those queries
from the same point we log statements?DDLs can be used from user defined functions. It has the same reason
why we have executor hooks instead of tcop hooks.The second part is to use that hook to capture non-DML commands for
pg_stat_statements module.- I fixed a bug that it should handle only isTopLevel command.
- A new parameter pg_stat_statements.track_ddl (boolean) is
added to enable or disable the feature.Do we need to have rows = 0 for non-DML commands?
Maybe NULL could be an appropriate value.Yes, but it requires additional management to handle 0 and NULL
separately. I don't think it is worth doing.The PREPARE command stopped to count
the number of rows. Should we count the rows in EXECUTE command or in the
PREPARE command?It requires major rewrites of EXECUTE command to pass the number of
affected rows to caller. I doubt it is worth fixing because almost all
drivers use protocol-level prepared statements instead of PREPARE+EXECUTE.The other command that doesn't count properly is COPY. Could
you fix that?I added codes for it.
I'm concerned about storing some commands that expose passwords
like CREATE ROLE foo PASSWORD 'secret'; I know that the queries are only
showed to superusers but maybe we should add this information to documentation
or provide a mechanism to exclude those commands.I think there is no additonal problem there because we can see the 'secret'
command using pg_stat_activity even now.I don't know if it is worth the trouble adding some code to capture VACUUM and
ANALYZE commands called inside autovacuum.I'd like to exclude VACUUM and ANALYZE by autovacuum because
pg_stat_statements should not filled with those commands.Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
[ Attachment, skipping... ]
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes:
I have applied this patch, with only a minor wording improvement:
Uh, we weren't even done reviewing this were we?
regards, tom lane
Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
I have applied this patch, with only a minor wording improvement:
Uh, we weren't even done reviewing this were we?
Uh, I am new to this commitfest wiki thing, but it did have a review by
Euler Taveira de Oliveira:
https://commitfest.postgresql.org/action/patch_view?id=196
and the author replied. Is there more that needs to be done?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes:
Tom Lane wrote:
Uh, we weren't even done reviewing this were we?
Uh, I am new to this commitfest wiki thing, but it did have a review by
Euler Taveira de Oliveira:
https://commitfest.postgresql.org/action/patch_view?id=196
and the author replied. Is there more that needs to be done?
It wasn't marked Ready For Committer, so presumably the reviewer
wasn't done with it. I know I hadn't looked at it at all, because
I was waiting for the commitfest review process to finish.
regards, tom lane
Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
Tom Lane wrote:
Uh, we weren't even done reviewing this were we?
Uh, I am new to this commitfest wiki thing, but it did have a review by
Euler Taveira de Oliveira:
https://commitfest.postgresql.org/action/patch_view?id=196
and the author replied. Is there more that needs to be done?It wasn't marked Ready For Committer, so presumably the reviewer
wasn't done with it. I know I hadn't looked at it at all, because
I was waiting for the commitfest review process to finish.
So, if someone writes a patch, and it is reviewed, and the patch author
updates the patch and replies, it still should be reviewed again before
being committed? I was unclear on that. The updated patch only
appeared today, so maybe it was ready, but the commit fest manager has
to indicate that in the status before I review/apply it? Should I
revert the patch?
So there is nothing for me to do to help? The only two patches I see as
ready for committer are HS and SR; not going to touch those. ;-)
Also, we are two weeks into the commit fest and we have more unapplied
patches than applied ones.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
I wrote:
It wasn't marked Ready For Committer, so presumably the reviewer
wasn't done with it. I know I hadn't looked at it at all, because
I was waiting for the commitfest review process to finish.
... and now that I have, I find at least four highly questionable
things about it:
1. The placement of the hook. Why is it three lines down in
ProcessUtility? It's probably reasonable to have the Assert first,
but I don't see why the hook function should have the ability to
editorialize on the behavior of everything about ProcessUtility
*except* the read-only-xact check.
2. The naming and documentation of the added GUC setting for
pg_stat_statements. "track_ddl" seems pretty bizarre to me because
there are many utility statements that no one would call DDL. COPY,
for example, is certainly not DDL. Why not call it "track_utility"?
3. The enable-condition test in pgss_ProcessUtility. Is it really
appropriate to be gating this by isTopLevel? I should think that
the nested_level check in pgss_enabled would be sufficient and
more likely to do what's expected.
4. The special case for CopyStmt. That's just weird, and it adds
a maintenance requirement we don't need. I don't see a really good
argument why COPY (alone among utility statements) deserves to have
a rowcount tracked by pg_stat_statements, but even if you want that
it'd be better to rely on examining the completionTag after the fact.
The fact that the tag is "COPY nnnn" is part of the user-visible API
for COPY and won't change lightly. The division of labor between
ProcessUtility and copy.c is far more volatile, but this patch has
injected itself into that.
regards, tom lane
Bruce Momjian <bruce@momjian.us> writes:
So, if someone writes a patch, and it is reviewed, and the patch author
updates the patch and replies, it still should be reviewed again before
being committed?
Well, that's for the reviewer to say --- if the update satisfies his
concerns, he should sign off on it, if not not. I've tried to avoid
pre-empting that process.
Also, we are two weeks into the commit fest and we have more unapplied
patches than applied ones.
Yup. Lots of unfinished reviews out there. Robert spent a good deal
of effort in the last two fests trying to light fires under reviewers;
do you want to take up that cudgel? I think wholesale commits of things
that haven't finished review is mostly going to send a signal that the
review process doesn't matter, which is *not* the signal I think we
should send.
regards, tom lane
OK, reverted and placed back in "Needs Review" status.
---------------------------------------------------------------------------
Tom Lane wrote:
I wrote:
It wasn't marked Ready For Committer, so presumably the reviewer
wasn't done with it. I know I hadn't looked at it at all, because
I was waiting for the commitfest review process to finish.... and now that I have, I find at least four highly questionable
things about it:1. The placement of the hook. Why is it three lines down in
ProcessUtility? It's probably reasonable to have the Assert first,
but I don't see why the hook function should have the ability to
editorialize on the behavior of everything about ProcessUtility
*except* the read-only-xact check.2. The naming and documentation of the added GUC setting for
pg_stat_statements. "track_ddl" seems pretty bizarre to me because
there are many utility statements that no one would call DDL. COPY,
for example, is certainly not DDL. Why not call it "track_utility"?3. The enable-condition test in pgss_ProcessUtility. Is it really
appropriate to be gating this by isTopLevel? I should think that
the nested_level check in pgss_enabled would be sufficient and
more likely to do what's expected.4. The special case for CopyStmt. That's just weird, and it adds
a maintenance requirement we don't need. I don't see a really good
argument why COPY (alone among utility statements) deserves to have
a rowcount tracked by pg_stat_statements, but even if you want that
it'd be better to rely on examining the completionTag after the fact.
The fact that the tag is "COPY nnnn" is part of the user-visible API
for COPY and won't change lightly. The division of labor between
ProcessUtility and copy.c is far more volatile, but this patch has
injected itself into that.regards, tom lane
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
So, if someone writes a patch, and it is reviewed, and the patch author
updates the patch and replies, it still should be reviewed again before
being committed?Well, that's for the reviewer to say --- if the update satisfies his
concerns, he should sign off on it, if not not. I've tried to avoid
pre-empting that process.
OK, so the reviewer knows he has to reply to the author's comments, OK.
Also, we are two weeks into the commit fest and we have more unapplied
patches than applied ones.Yup. Lots of unfinished reviews out there. Robert spent a good deal
of effort in the last two fests trying to light fires under reviewers;
do you want to take up that cudgel? I think wholesale commits of things
I am afraid I am then duplicating work the commit fest manager is doing,
and if someone is bugged by me and the CF manager, they might get upset.
that haven't finished review is mostly going to send a signal that the
review process doesn't matter, which is *not* the signal I think we
should send.
True.
Maybe I am best focusing on open issues like the threading and psql -1
patches I worked on today. There is certainly enough of that stuff to
keep me busy. I thought I could help with the commit fest load, but now
I am unsure. That non-commit-fest stuff has to be done too so maybe
managing that will help.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote:
So, if someone writes a patch, and it is reviewed, and the patch author
updates the patch and replies, it still should be reviewed again before
being committed?
That's generally how things were expected to happen--to at least
double-check that the proposed fixes match what was expected. I don't
think it was spelled out very well in the past though.
Also, we are two weeks into the commit fest and we have more unapplied
patches than applied ones.
Considering that one of those was a holiday week with a lot of schedule
disruption proceeding it, I don't know how much faster things could have
moved. There were large chunks of the reviewer volunteers who wanted
only jobs they could finish before the holiday, and others who weren't
available at all until afterwards. And I don't know about every else,
but I took all four days off and started today behind by several hundred
list messages. I was planning to start nagging again tomorrow, hoping
that most would be caught up from any holiday time off too by then.
Right now, of the 16 patches listed in "Needs Review" besides the ECPG
ones Michael is working through, the breakdown is like this:
Not reviewed at all yet: 6
Reviewed once, updated, waiting for re-review: 10
So the bulk of the problem for keeping the pipeline moving is in these
re-reviews holding up transitions to "Ready for committer". I've had
some discussion with Robert about how to better distinguish in the
management app when the reviewer has work they're expected to do vs.
when they think they're done with things. We're converging toward a
more clear set of written guidelines to provide for managing future
CommitFests as part of that, right now there's a few too many fuzzy
parts for my liking.
If the need here is to speed up how fast things are fed to committers,
we can certainly do that. The current process still favors having
reviewers do as much as possible first, as shown by all the stuff
sitting in the re-review queue. The work we're waiting on them for
could be done by the committers instead if we want to shorten the whole
process a bit. I don't think that's really what you want though.
--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com
Greg Smith wrote:
If the need here is to speed up how fast things are fed to committers,
we can certainly do that. The current process still favors having
reviewers do as much as possible first, as shown by all the stuff
sitting in the re-review queue. The work we're waiting on them for
could be done by the committers instead if we want to shorten the
whole process a bit. I don't think that's really what you want though.
As I have observed before, I think we need some infrastructure to help
committers claim items, so we don't duplicate work.
Right now the only items marked "ready for reviewer" are Streaming
Replication and Hot Standby, which I imagine Heiki will be handling.
I'm going to look at the YAML format for EXPLAIN patch shortly.
cheers
andrew
Andrew Dunstan <andrew@dunslane.net> writes:
As I have observed before, I think we need some infrastructure to help
committers claim items, so we don't duplicate work.
Robert acknowledged the need for a "claimed by committer" field in the
fest application, but he hasn't got round to it yet. In the meantime
I've been adding a "Taking this one..." type of comment to an entry
I want to claim.
I'm going to look at the YAML format for EXPLAIN patch shortly.
Do we have consensus yet that we want YAML? It seemed, well,
yet another format without all that much advantage over what's
there.
regards, tom lane
On Nov 30, 2009, at 8:08 PM, Tom Lane wrote:
I'm going to look at the YAML format for EXPLAIN patch shortly.
Do we have consensus yet that we want YAML? It seemed, well,
yet another format without all that much advantage over what's
there.
Legibility++
David
Tom Lane wrote:
I'm going to look at the YAML format for EXPLAIN patch shortly.
Do we have consensus yet that we want YAML? It seemed, well,
yet another format without all that much advantage over what's
there.
I think you and I are the two main skeptics ;-) But we seem to be rather
in the minority, and it's not terribly invasive from what I have seen so
far.
cheers
andrew
On Mon, Nov 30, 2009 at 10:16 PM, Greg Smith <greg@2ndquadrant.com> wrote:
Considering that one of those was a holiday week with a lot of schedule
disruption proceeding it, I don't know how much faster things could have
moved. There were large chunks of the reviewer volunteers who wanted only
jobs they could finish before the holiday, and others who weren't available
at all until afterwards. And I don't know about every else, but I took all
four days off and started today behind by several hundred list messages. I
was planning to start nagging again tomorrow, hoping that most would be
caught up from any holiday time off too by then.Right now, of the 16 patches listed in "Needs Review" besides the ECPG ones
Michael is working through, the breakdown is like this:Not reviewed at all yet: 6
Reviewed once, updated, waiting for re-review: 10So the bulk of the problem for keeping the pipeline moving is in these
re-reviews holding up transitions to "Ready for committer". I've had some
discussion with Robert about how to better distinguish in the management app
when the reviewer has work they're expected to do vs. when they think
they're done with things. We're converging toward a more clear set of
written guidelines to provide for managing future CommitFests as part of
that, right now there's a few too many fuzzy parts for my liking.If the need here is to speed up how fast things are fed to committers, we
can certainly do that. The current process still favors having reviewers do
as much as possible first, as shown by all the stuff sitting in the
re-review queue. The work we're waiting on them for could be done by the
committers instead if we want to shorten the whole process a bit. I don't
think that's really what you want though.
I think the pressure has to be applied all through the process.
People who haven't provided a review by now are certainly overdue for
a polite poke, Thanksgiving or no Thanksgiving. If the first review
doesn't happen until the third week of the CommitFest, how is the
patch going to get a fair shake by the end of the fourth one? I mean,
if that happens to a small number of patches, OK, but when it's 20% of
what's in the CommitFest, it seems like it's bound to lead to a huge
crunch at the end.
In any case, unlike last CommitFest, where the problem was a lack of
adequate committer activity, it's pretty clear that the the problem
this CommitFest has been a combination of slow reviews and slow
updates by patch authors. I've been keeping a loose eye on the patch
queue and my impression is that there has rarely been more than 1
patch other than HS + SR in the "Ready for Committer" state, and many
times zero. That's means the pipeline is stalling, and that's bad.
...Robert