replication commands and log_statements
Hi,
Replication commands like IDENTIFY_COMMAND are not logged even when
log_statements is set to all. Some users who use log_statements to
audit *all* statements might dislike this current situation. So I'm
thinking to change log_statements or add something like log_replication
so that we can log replication commands. Thought?
Regards,
--
Fujii Masao
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2014-06-11 19:34:25 +0900, Fujii Masao wrote:
Hi,
Replication commands like IDENTIFY_COMMAND are not logged even when
log_statements is set to all. Some users who use log_statements to
audit *all* statements might dislike this current situation. So I'm
thinking to change log_statements or add something like log_replication
so that we can log replication commands. Thought?
Yes, I think we should do that. I can't really see it causing too much
volume...
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jun 11, 2014 at 12:34 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
Hi,
Replication commands like IDENTIFY_COMMAND are not logged even when
log_statements is set to all. Some users who use log_statements to
audit *all* statements might dislike this current situation. So I'm
thinking to change log_statements or add something like log_replication
so that we can log replication commands. Thought?
+1. I think adding a separate parameter is the way to go.
The other option would be to turn log_statements into a parameter that you
specify multiple ones - so instead of "all" today it would be "ddl,dml,all"
or something like that, and then you'd also add "replication" as an option.
But that would cause all sorts of backwards compatibility annoyances.. And
do you really want to be able to say things like "ddl,all" meanin you'd get
ddl and select but not dml?
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 2014-06-11 13:42:58 +0200, Magnus Hagander wrote:
On Wed, Jun 11, 2014 at 12:34 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
Hi,
Replication commands like IDENTIFY_COMMAND are not logged even when
log_statements is set to all. Some users who use log_statements to
audit *all* statements might dislike this current situation. So I'm
thinking to change log_statements or add something like log_replication
so that we can log replication commands. Thought?+1. I think adding a separate parameter is the way to go.
Why? I can't really see a case where the log volume by replication
connections is going to be significant in comparison to 'all'?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jun 11, 2014 at 2:17 PM, Andres Freund <andres@2ndquadrant.com>
wrote:
On 2014-06-11 13:42:58 +0200, Magnus Hagander wrote:
On Wed, Jun 11, 2014 at 12:34 PM, Fujii Masao <masao.fujii@gmail.com>
wrote:
Hi,
Replication commands like IDENTIFY_COMMAND are not logged even when
log_statements is set to all. Some users who use log_statements to
audit *all* statements might dislike this current situation. So I'm
thinking to change log_statements or add something like log_replication
so that we can log replication commands. Thought?+1. I think adding a separate parameter is the way to go.
Why? I can't really see a case where the log volume by replication
connections is going to be significant in comparison to 'all'?
Yes, but how would you specify for example "i want DDL and all replication
commands" (which is quite a reasonable thing to log, I believe). If you
actually require it to be set to "all", most people won't have any use at
all for it...
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 2014-06-11 14:22:43 +0200, Magnus Hagander wrote:
Yes, but how would you specify for example "i want DDL and all replication
commands" (which is quite a reasonable thing to log, I believe). If you
actually require it to be set to "all", most people won't have any use at
all for it...
That's a reasonable feature request - but imo it's a separate
one. It seems pretty noncontroversial and simple to make
log_statement=all log replication commands. What you want - and you're
sure not alone with that - is something considerably more complex...
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jun 11, 2014 at 2:35 PM, Andres Freund <andres@2ndquadrant.com>
wrote:
On 2014-06-11 14:22:43 +0200, Magnus Hagander wrote:
Yes, but how would you specify for example "i want DDL and all
replication
commands" (which is quite a reasonable thing to log, I believe). If you
actually require it to be set to "all", most people won't have any use at
all for it...That's a reasonable feature request - but imo it's a separate
one. It seems pretty noncontroversial and simple to make
log_statement=all log replication commands. What you want - and you're
sure not alone with that - is something considerably more complex...
I'm just saying if we're going to do it, let's do it right from the
beginning.
Or are you suggesting this would be something we'd backpatch? Given the
lack of complaints about it, I'd rather suggest we don't backpatch
anything, and instead make the correct feature in the first pace for the
next release.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 2014-06-11 14:50:34 +0200, Magnus Hagander wrote:
On Wed, Jun 11, 2014 at 2:35 PM, Andres Freund <andres@2ndquadrant.com>
wrote:On 2014-06-11 14:22:43 +0200, Magnus Hagander wrote:
Yes, but how would you specify for example "i want DDL and all
replication
commands" (which is quite a reasonable thing to log, I believe). If you
actually require it to be set to "all", most people won't have any use at
all for it...That's a reasonable feature request - but imo it's a separate
one. It seems pretty noncontroversial and simple to make
log_statement=all log replication commands. What you want - and you're
sure not alone with that - is something considerably more complex...
I'm just saying if we're going to do it, let's do it right from the
beginning.
Your wish just seems like a separate feature to me. Including
replication commands in 'all' seems correct independent of the desire
for a more granular control.
Or are you suggesting this would be something we'd backpatch?
Thought about it for a sec, and then decided it doesn't seem warranted.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@2ndquadrant.com> writes:
Your wish just seems like a separate feature to me. Including
replication commands in 'all' seems correct independent of the desire
for a more granular control.
No, I think I've got to vote with the other side on that.
The reason we can have log_statement as a scalar progression
"none < ddl < mod < all" is that there's little visible use-case
for logging DML but not DDL, nor for logging SELECTS but not
INSERT/UPDATE/DELETE. However, logging replication commands seems
like something people would reasonably want an orthogonal control for.
There's no nice way to squeeze such a behavior into log_statement.
I guess you could say that log_statement treats replication commands
as if they were DDL, but is that really going to satisfy users?
I think we should consider log_statement to control logging of
SQL only, and invent a separate GUC (or, in the future, likely
more than one GUC?) for logging of replication activity.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jun 11, 2014 at 11:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@2ndquadrant.com> writes:
Your wish just seems like a separate feature to me. Including
replication commands in 'all' seems correct independent of the desire
for a more granular control.No, I think I've got to vote with the other side on that.
The reason we can have log_statement as a scalar progression
"none < ddl < mod < all" is that there's little visible use-case
for logging DML but not DDL, nor for logging SELECTS but not
INSERT/UPDATE/DELETE. However, logging replication commands seems
like something people would reasonably want an orthogonal control for.
There's no nice way to squeeze such a behavior into log_statement.I guess you could say that log_statement treats replication commands
as if they were DDL, but is that really going to satisfy users?I think we should consider log_statement to control logging of
SQL only, and invent a separate GUC (or, in the future, likely
more than one GUC?) for logging of replication activity.
Seems reasonable. OK. The attached patch adds log_replication_command
parameter which causes replication commands to be logged. I added this to
next CF.
Regards,
--
Fujii Masao
Attachments:
0001-Add-log_replication_command-configuration-parameter.patchtext/x-patch; charset=US-ASCII; name=0001-Add-log_replication_command-configuration-parameter.patchDownload+36-3
On Wed, Jun 11, 2014 at 7:42 AM, Magnus Hagander <magnus@hagander.net> wrote:
Replication commands like IDENTIFY_COMMAND are not logged even when
log_statements is set to all. Some users who use log_statements to
audit *all* statements might dislike this current situation. So I'm
thinking to change log_statements or add something like log_replication
so that we can log replication commands. Thought?+1. I think adding a separate parameter is the way to go.
The other option would be to turn log_statements into a parameter that you
specify multiple ones
I kind of like this idea, but...
- so instead of "all" today it would be "ddl,dml,all"
or something like that, and then you'd also add "replication" as an option.
But that would cause all sorts of backwards compatibility annoyances.. And
do you really want to be able to say things like "ddl,all" meanin you'd get
ddl and select but not dml?
...you lost me here. I mean, I think it could be quite useful to
redefine the existing GUC as a list. We could continue to have ddl,
dml, and all as tokens that would be in the list, but you wouldn't
write "ddl,dml,all" because "all" would include everything that those
other ones would log. But then you could have combinations like
"dml,replication" and so on. And you could do much more fine-grained
things, like allow log_statement=create,alter,drop to log all such
statements but not, for example, cluster.
I think if we go the route of adding a separate GUC for this, we're
going to get tired of adding GUCs way before we've come close to
meeting the actual requirements in this area. A comma-separated list
of tokens seems to offer a lot more flexibility.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 12/06/14 20:37, Fujii Masao wrote:
On Wed, Jun 11, 2014 at 11:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@2ndquadrant.com> writes:
Your wish just seems like a separate feature to me. Including
replication commands in 'all' seems correct independent of the desire
for a more granular control.No, I think I've got to vote with the other side on that.
The reason we can have log_statement as a scalar progression
"none < ddl < mod < all" is that there's little visible use-case
for logging DML but not DDL, nor for logging SELECTS but not
INSERT/UPDATE/DELETE. However, logging replication commands seems
like something people would reasonably want an orthogonal control for.
There's no nice way to squeeze such a behavior into log_statement.I guess you could say that log_statement treats replication commands
as if they were DDL, but is that really going to satisfy users?I think we should consider log_statement to control logging of
SQL only, and invent a separate GUC (or, in the future, likely
more than one GUC?) for logging of replication activity.Seems reasonable. OK. The attached patch adds log_replication_command
parameter which causes replication commands to be logged. I added this to
next CF.
A quick review:
- Compiles against HEAD
- Works as advertised
- Code style looks fine
A couple of suggestions:
- minor rewording for the description, mentioning that statements will
still be logged as DEBUG1 even if parameter set to 'off' (might
prevent reports of the kind "I set it to 'off', why am I still seeing
log entries?").
<para>
Causes each replication command to be logged in the server log.
See <xref linkend="protocol-replication"> for more information about
replication commands. The default value is <literal>off</>. When set to
<literal>off</>, commands will be logged at log level <literal>DEBUG1</literal>.
Only superusers can change this setting.
</para>
- I feel it would be more consistent to use the plural form
for this parameter, i.e. "log_replication_commands", in line with
"log_lock_waits", "log_temp_files", "log_disconnections" etc.
- It might be an idea to add a cross-reference to this parameter from
the "Streaming Replication Protocol" page:
http://www.postgresql.org/docs/devel/static/protocol-replication.html
Regards
Ian Barwick
--
Ian Barwick http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jun 12, 2014 at 10:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jun 11, 2014 at 7:42 AM, Magnus Hagander <magnus@hagander.net> wrote:
Replication commands like IDENTIFY_COMMAND are not logged even when
log_statements is set to all. Some users who use log_statements to
audit *all* statements might dislike this current situation. So I'm
thinking to change log_statements or add something like log_replication
so that we can log replication commands. Thought?+1. I think adding a separate parameter is the way to go.
The other option would be to turn log_statements into a parameter that you
specify multiple onesI kind of like this idea, but...
- so instead of "all" today it would be "ddl,dml,all"
or something like that, and then you'd also add "replication" as an option.
But that would cause all sorts of backwards compatibility annoyances.. And
do you really want to be able to say things like "ddl,all" meanin you'd get
ddl and select but not dml?...you lost me here. I mean, I think it could be quite useful to
redefine the existing GUC as a list. We could continue to have ddl,
dml, and all as tokens that would be in the list, but you wouldn't
write "ddl,dml,all" because "all" would include everything that those
other ones would log. But then you could have combinations like
"dml,replication" and so on.
Yep, that seems useful, even aside from logging of replication command topic.
OK, I've just implemented the patch (attached) which does this, i.e., redefines
log_statement as a list. Thanks to the patch, log_statement can be set
to "none",
"ddl", "mod", "dml", "all", and any combinations of them. The meanings of
"none", "ddl", "mod" and "all" are the same as they are now. New setting value
"dml" loggs both data modification statements like INSERT and query access
statements like SELECT and COPY TO.
What about applying this patch first?
Regards,
--
Fujii Masao
Attachments:
0001-Redefine-log_statement-as-a-list.patchtext/x-patch; charset=US-ASCII; name=0001-Redefine-log_statement-as-a-list.patchDownload+132-62
Fujii Masao <masao.fujii@gmail.com> writes:
OK, I've just implemented the patch (attached) which does this, i.e., redefines
log_statement as a list. Thanks to the patch, log_statement can be set
to "none",
"ddl", "mod", "dml", "all", and any combinations of them. The meanings of
"none", "ddl", "mod" and "all" are the same as they are now. New setting value
"dml" loggs both data modification statements like INSERT and query access
statements like SELECT and COPY TO.
I still don't like this one bit. It's turning log_statement from a
reasonably clean design into a complete mess, which will be made even
worse after you add replication control to it.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Jun 20, 2014 at 10:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Fujii Masao <masao.fujii@gmail.com> writes:
OK, I've just implemented the patch (attached) which does this, i.e., redefines
log_statement as a list. Thanks to the patch, log_statement can be set
to "none",
"ddl", "mod", "dml", "all", and any combinations of them. The meanings of
"none", "ddl", "mod" and "all" are the same as they are now. New setting value
"dml" loggs both data modification statements like INSERT and query access
statements like SELECT and COPY TO.I still don't like this one bit. It's turning log_statement from a
reasonably clean design into a complete mess, which will be made even
worse after you add replication control to it.
Yeah, I'd vote as well to let log_statement as it is (without
mentioning the annoyance it would cause to users), and to have
replication statement logging managed with a separate GUC for clarity.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Jun 20, 2014 at 9:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Fujii Masao <masao.fujii@gmail.com> writes:
OK, I've just implemented the patch (attached) which does this, i.e., redefines
log_statement as a list. Thanks to the patch, log_statement can be set
to "none",
"ddl", "mod", "dml", "all", and any combinations of them. The meanings of
"none", "ddl", "mod" and "all" are the same as they are now. New setting value
"dml" loggs both data modification statements like INSERT and query access
statements like SELECT and COPY TO.I still don't like this one bit. It's turning log_statement from a
reasonably clean design into a complete mess, which will be made even
worse after you add replication control to it.
Well, I don't object to having a separate GUC for replication command
logging if people prefer that design. But let's not have any
delusions of adequacy about log_statement. I've had more than one
conversation with customers about that particular parameter, all of
which involved the customer being unhappy that there were only four
choices and they couldn't log the stuff that they cared about without
logging a lot of other stuff that they didn't care about. Now,
providing more choices there will, indisputably, add complexity, but
it will also provide utility.
What we're talking about here is not unlike what we went through with
EXPLAIN syntax. We repeatedly rejected patches that might have added
useful functionality to EXPLAIN on the grounds that (1) we didn't want
to make new keywords and (2) even if we did add new keywords,
extending the EXPLAIN productions would produce grammar conflicts.
Then, we implemented the extensible-options stuff, and suddenly it
became possible for people to write patches adding useful
functionality to EXPLAIN that didn't get sunk before it got out of the
gate, and since then we've gotten a new EXPLAIN option every release
or two, and IMHO all of those options are pretty useful. Similarly,
building a logging facility that meets the needs of real users is
going to require a configuration method more flexible than a total
order with four choices. I happen to think a list of comma-separated
tokens is a pretty good choice, but something else could be OK, too.
We need something better than what we've got now, though.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
* Robert Haas (robertmhaas@gmail.com) wrote:
Similarly,
building a logging facility that meets the needs of real users is
going to require a configuration method more flexible than a total
order with four choices. I happen to think a list of comma-separated
tokens is a pretty good choice, but something else could be OK, too.
We need something better than what we've got now, though.
Absolutely, and not all of that should be done through postgresql.conf,
imv. The level of flexibility that is being asked for, from what I've
seen and heard, really needs to be met with new catalog tables or
extending existing ones. The nearby thread on pg_audit would be a good
example.
I certainly don't want to be specifying specific table names or
providing a list of roles through any GUC (or configuration file of any
kind). I'm not adverse to improving log_statement to a list with tokens
that indicate various meaning levels but anything done through that
mechanism will be very coarse.
Thanks,
Stephen
On Mon, Jun 23, 2014 at 11:15 AM, Stephen Frost <sfrost@snowman.net> wrote:
* Robert Haas (robertmhaas@gmail.com) wrote:
Similarly,
building a logging facility that meets the needs of real users is
going to require a configuration method more flexible than a total
order with four choices. I happen to think a list of comma-separated
tokens is a pretty good choice, but something else could be OK, too.
We need something better than what we've got now, though.Absolutely, and not all of that should be done through postgresql.conf,
imv. The level of flexibility that is being asked for, from what I've
seen and heard, really needs to be met with new catalog tables or
extending existing ones. The nearby thread on pg_audit would be a good
example.I certainly don't want to be specifying specific table names or
providing a list of roles through any GUC (or configuration file of any
kind). I'm not adverse to improving log_statement to a list with tokens
that indicate various meaning levels but anything done through that
mechanism will be very coarse.
True, but it would be enough to let you make a list of commands you'd
like to audit, and I think that might be valuable enough to justify
the price of admission. I certainly agree that logging based on which
object is being accessed needs a different design, tied into the
catalogs.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi.
Do we have any consensus about what to do with these two patches?
1. Introduce a "log_replication_command" setting.
2. Change log_statement to be a list of tokens.
If I understand correctly, there weren't any strong objections to the
former, but the situation is less clear when it comes to the second.
-- Abhijit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Jul 2, 2014 at 4:26 AM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote:
Hi.
Do we have any consensus about what to do with these two patches?
1. Introduce a "log_replication_command" setting.
2. Change log_statement to be a list of tokens.If I understand correctly, there weren't any strong objections to the
former, but the situation is less clear when it comes to the second.
On second thought, I'd like to propose the third option. This is the same idea
as Andres suggested upthread. That is, we log replication commands only
when log_statement is set to all. Neither new parameter is introduced nor
log_statement is redefined as a list.
Firstly, since log_statement=all means that all statements are logged, it's
basically natural and intuitive to log even replication commands in that
setting. Secondly, any statements except DDL and data-modifying statements,
e.g., VACUUM, CHECKPOINT, BEGIN, ..etc, are logged only when log_statement=all.
So even if some users want to control the logging of DDL and maintenance
commands orthogonally, which is impossible for now. Therefore, even if the
logging of replication commands which are neither DDL nor data-modifying
statements also cannot be controlled orthogonally, I don't think that users
get so surprised. Of course I have no objection to address the issue by, e.g.,
enabling such orthogonal logging control, in the future, though.
Thought? What about introducing that simple but not so confusing change first?
Regards,
--
Fujii Masao
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers