log_statement and syslog severity

Started by G Duttonabout 16 years ago10 messagesgeneral
Jump to latest
#1G Dutton
gdutton+pgsql@inf.ed.ac.uk

Hello all,

I've seen some rather tangential postings about means of filtering log
messages, but none quite match up to (what I believe to be) my
requirement, so here goes:

As a means of auditing our database server, I would like to use the
PostgreSQL 'log_statement' mechanism. Having set log_statement = 'all' I
was disappointed to find that statement messages are logged with INFO
severity, alongside more general operational messages such as shutdown or
startup.

This means that, even using syslog as a destination, it's not possible for
me to filter statements without some sort of log-text parsing, which I'd
prefer to avoid on effort, performance and data-integrity grounds.

For my purposes, I'd like SQL statement logging to be completely separable
from other forms of logging, so that statements can be set aside for
several reasons, notably performance (logging the heavy statement traffic
to another set of spindles or even /dev/shm with rotation to persistent
storage, for instance) and administrative convenience (to make the human
portion of the auditing process more straightforward).

The most straightforward way in which I can think to do this, would be to
make the log_statement syslog (and therefore postgresql) severity
configurable. Does anyone think that a

# combined with syslog logging destination, statements go to "local0.debug"
log_statement_severity = <pgsql-severity, e.g. 'debug1'>

configuration parameter is sensible? out of the question? Is it a good
idea to generalise this even further? Or is there perhaps an alternative
that I've not considered, for easy and performant redirection of just my
logged statements?

Cheers,

Graham

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

#2Bruce Momjian
bruce@momjian.us
In reply to: G Dutton (#1)
Re: log_statement and syslog severity

G Dutton wrote:

Hello all,

I've seen some rather tangential postings about means of filtering log
messages, but none quite match up to (what I believe to be) my
requirement, so here goes:

As a means of auditing our database server, I would like to use the
PostgreSQL 'log_statement' mechanism. Having set log_statement = 'all' I
was disappointed to find that statement messages are logged with INFO
severity, alongside more general operational messages such as shutdown or
startup.

This means that, even using syslog as a destination, it's not possible for
me to filter statements without some sort of log-text parsing, which I'd
prefer to avoid on effort, performance and data-integrity grounds.

For my purposes, I'd like SQL statement logging to be completely separable
from other forms of logging, so that statements can be set aside for
several reasons, notably performance (logging the heavy statement traffic
to another set of spindles or even /dev/shm with rotation to persistent
storage, for instance) and administrative convenience (to make the human
portion of the auditing process more straightforward).

The most straightforward way in which I can think to do this, would be to
make the log_statement syslog (and therefore postgresql) severity
configurable. Does anyone think that a

# combined with syslog logging destination, statements go to "local0.debug"
log_statement_severity = <pgsql-severity, e.g. 'debug1'>

configuration parameter is sensible? out of the question? Is it a good
idea to generalise this even further? Or is there perhaps an alternative
that I've not considered, for easy and performant redirection of just my
logged statements?

Our logging system is very flexible, but not work-free on the user end.
I don't see us changing things in that area.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do

#3Greg Sabino Mullane
greg@turnstep.com
In reply to: Bruce Momjian (#2)
Re: log_statement and syslog severity

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

Bruce replied:
...

This means that, even using syslog as a destination, it's not possible for
me to filter statements without some sort of log-text parsing, which I'd
prefer to avoid on effort, performance and data-integrity grounds.

Our logging system is very flexible, but not work-free on the user end.
I don't see us changing things in that area.

Bruce, that's a little harsh, I think the original poster has a legitimate
request. Personally, I'd love to be able to split the logs on various things,
the most important to me being durations and per-database. I looked at the
code about a year ago to see how hard this would be and found it non-trivial
(for me), as we're really assuming hard things go to a single filehandle.
It's definitely an area for improvement, and should be a TODO if not already.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 201003091022
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkuWZ88ACgkQvJuQZxSWSsgVvgCfaWaOqcJEzfKBQiN5ttvU/EMB
lVYAn1Ud1AccynciWGvhfqIInNZK1+DZ
=qEnX
-----END PGP SIGNATURE-----

#4Bruce Momjian
bruce@momjian.us
In reply to: Greg Sabino Mullane (#3)
Re: log_statement and syslog severity

Greg Sabino Mullane wrote:

Bruce replied:
...

This means that, even using syslog as a destination, it's not possible for
me to filter statements without some sort of log-text parsing, which I'd
prefer to avoid on effort, performance and data-integrity grounds.

Our logging system is very flexible, but not work-free on the user end.
I don't see us changing things in that area.

Bruce, that's a little harsh, I think the original poster has a legitimate
request. Personally, I'd love to be able to split the logs on various things,
the most important to me being durations and per-database. I looked at the
code about a year ago to see how hard this would be and found it non-trivial
(for me), as we're really assuming hard things go to a single filehandle.
It's definitely an area for improvement, and should be a TODO if not already.

This issue has been discussed and I think the community conclusion was
that this should not be done by the database but rather by external
tools. I think I was giving an accurate portrayal of the odds of this
getting added. I do not think there is enough support for this to be a
TODO item.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do

#5Stuart Bishop
stuart@stuartbishop.net
In reply to: Bruce Momjian (#4)
Re: log_statement and syslog severity

On Wed, Mar 10, 2010 at 8:51 AM, Bruce Momjian <bruce@momjian.us> wrote:

Greg Sabino Mullane wrote:

Bruce replied:
...

This means that, even using syslog as a destination, it's not possible for
me to filter statements without some sort of log-text parsing, which I'd
prefer to avoid on effort, performance and data-integrity grounds.

Our logging system is very flexible, but not work-free on the user end.
I don't see us changing things in that area.

Bruce, that's a little harsh, I think the original poster has a legitimate
request. Personally, I'd love to be able to split the logs on various things,
the most important to me being durations and per-database. I looked at the
code about a year ago to see how hard this would be and found it non-trivial
(for me), as we're really assuming hard things go to a single filehandle.
It's definitely an area for improvement, and should be a TODO if not already.

This issue has been discussed and I think the community conclusion was
that this should not be done by the database but rather by external
tools.  I think I was giving an accurate portrayal of the odds of this
getting added.  I do not think there is enough support for this to be a
TODO item.

How do you plug in this external tool?

Installing a filter on stderr doesn't play well with packaged versions of PostgreSQL and probably not Windows either. You also don't get easily machine readable output.

It might be possible to trick csvlog to log to a static filename, and perhaps substituting that with a named pipe might work (under unix at least).

syslog doesn't give you easily machine readable output. I'm not sure how syslog implementations handle high load (our sysadmins won't use it, so I haven't investigated this further).

I need to be analyzing log messages from PostgreSQL in real time, so am starting to investigate solutions. It seems painful, which would be avoidable for future generations if PostgreSQL could spawn a subprocess and send log messages to that in a machine readable format. Perhaps useful filters might start to exist and eventually end up in contrib or core.

--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/

#6Magnus Hagander
magnus@hagander.net
In reply to: Stuart Bishop (#5)
Re: log_statement and syslog severity

2010/3/10 Stuart Bishop <stuart@stuartbishop.net>:

On Wed, Mar 10, 2010 at 8:51 AM, Bruce Momjian <bruce@momjian.us> wrote:

Greg Sabino Mullane wrote:

Bruce replied:
...

This means that, even using syslog as a destination, it's not possible for
me to filter statements without some sort of log-text parsing, which I'd
prefer to avoid on effort, performance and data-integrity grounds.

Our logging system is very flexible, but not work-free on the user end.
I don't see us changing things in that area.

Bruce, that's a little harsh, I think the original poster has a legitimate
request. Personally, I'd love to be able to split the logs on various things,
the most important to me being durations and per-database. I looked at the
code about a year ago to see how hard this would be and found it non-trivial
(for me), as we're really assuming hard things go to a single filehandle.
It's definitely an area for improvement, and should be a TODO if not already.

This issue has been discussed and I think the community conclusion was
that this should not be done by the database but rather by external
tools.  I think I was giving an accurate portrayal of the odds of this
getting added.  I do not think there is enough support for this to be a
TODO item.

How do you plug in this external tool?

Installing a filter on stderr doesn't play well with packaged versions of PostgreSQL and probably not Windows either. You also don't get easily machine readable output.

It might be possible to trick csvlog to log to a static filename, and perhaps substituting that with a named pipe might work (under unix at least).

syslog doesn't give you easily machine readable output. I'm not sure how syslog implementations handle high load (our sysadmins won't use it, so I haven't investigated this further).

I need to be analyzing log messages from PostgreSQL in real time, so am starting to investigate solutions. It seems painful, which would be avoidable for future generations if PostgreSQL could spawn a subprocess and send log messages to that in a machine readable format. Perhaps useful filters might start to exist and eventually end up in contrib or core.

We already have a subprocess that takes this, and if we use CSV format
it's machine readable.

I had a patch sometime back last autumn that did a fairly major
restructuring to allow some of this kind of refactoring, but it was
rejected (on reasonable grounds). My next thought around that was to
add a "pipe" style log_destination to just make it possible to hand
things off to a different process. The reasonable way to do it would
be to send it out in CSV format. It'll cause a fairly large amount of
parsing overhead and such compared to a builtin solution, but it'll
give the flexibility to develop all such filters outside of core.

But that's all 9.1 material, of course.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#7Ben
bench@silentmedia.com
In reply to: Stuart Bishop (#5)
Re: log_statement and syslog severity

On Mar 10, 2010, at 12:15 AM, Stuart Bishop wrote:

syslog doesn't give you easily machine readable output. I'm not sure how syslog implementations handle high load (our sysadmins won't use it, so I haven't investigated this further).

Have you looked into syslog-ng? I believe it does just this sort of thing for you.

#8dennis jenkins
dennis.jenkins.75@gmail.com
In reply to: Ben (#7)
Re: log_statement and syslog severity

On Wed, Mar 10, 2010 at 10:55 AM, Ben Chobot <bench@silentmedia.com> wrote:

On Mar 10, 2010, at 12:15 AM, Stuart Bishop wrote:

syslog doesn't give you easily machine readable output. I'm not sure how

syslog implementations handle high load (our sysadmins won't use it, so I
haven't investigated this further).

Have you looked into syslog-ng? I believe it does just this sort of thing
for you.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

We use syslog-ng for sorting our syslogs into different log files. It works
well enough.

What annoys me is that postgresql will split a SQL statement across several
syslog "events". I know that syslog itself has a maximum message size and
that this is required for really long SQL. However, I wanted to log each
SQL statement as a single "event". Syslog-ng can direct output to a pipe.
Putting these together I had begun working on a system where out syslog host
(which logs this from lots of other, unrelated systems) would sort the
postgresql logs to STDIN on a perl program that would reassemble the
individual SQL statements and write them out in a different format. I never
completed the project (just got busy with more important things). However,
I thought that this approach was feasible.

That being said, I would love it if Postgresql had logging mechanism
plugin. Maybe it could load a so/dll that would handle logging. That
so/dll would export a function like this:

void pg_log_init (/* some params related to the database instance */);
void pg_log_done (void); // called on database shutdown.

void pg_log_sql (int runtime, int rows, int status, const char *sql,
const char *user);

"status" would be some code to indicate if the sql was successful or not.

The above is just off the top of my head. I've done no real research on if
the above would be sufficient or correct.

Just an idea...

#9Bruce Momjian
bruce@momjian.us
In reply to: Stuart Bishop (#5)
Re: log_statement and syslog severity

Stuart Bishop wrote:
-- Start of PGP signed section.

On Wed, Mar 10, 2010 at 8:51 AM, Bruce Momjian <bruce@momjian.us> wrote:

Greg Sabino Mullane wrote:

Bruce replied:
...

This means that, even using syslog as a destination, it's not possible for
me to filter statements without some sort of log-text parsing, which I'd
prefer to avoid on effort, performance and data-integrity grounds.

Our logging system is very flexible, but not work-free on the user end.
I don't see us changing things in that area.

Bruce, that's a little harsh, I think the original poster has a legitimate
request. Personally, I'd love to be able to split the logs on various things,
the most important to me being durations and per-database. I looked at the
code about a year ago to see how hard this would be and found it non-trivial
(for me), as we're really assuming hard things go to a single filehandle.
It's definitely an area for improvement, and should be a TODO if not already.

This issue has been discussed and I think the community conclusion was
that this should not be done by the database but rather by external
tools. ?I think I was giving an accurate portrayal of the odds of this
getting added. ?I do not think there is enough support for this to be a
TODO item.

How do you plug in this external tool?

I think the real problem with log splitting in the server is designing a
clean API to do this, and I haven't seen one yet.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do

#10Greg Smith
gsmith@gregsmith.com
In reply to: Stuart Bishop (#5)
Re: log_statement and syslog severity

Stuart Bishop wrote:

It might be possible to trick csvlog to log to a static filename, and
perhaps substituting that with a named pipe might work (under unix at
least).

As someone who did a bit of the work on the CSV log feature, I'll tell
you the way you have to note the log filename, account for rotations,
and everything else involved makes for a painful API to actually use was
obvious from day one. What I suggested was that many admins would want
a "tail-f like" API available to grabs at they come out, without having
to care about the underlying name. But no one has dumped enough
development resources into actually building one of them. At the time,
there were a host of genuine bugs in the logging approached used for CVS
logs, and just closing them all up before release time was difficult
enough. And there hasn't been enough asking about it to inspire
development since.

I need to be analyzing log messages from PostgreSQL in real time, so
am starting to investigate solutions. It seems painful, which would be
avoidable for future generations if PostgreSQL could spawn a
subprocess and send log messages to that in a machine readable format.

That is the only direction something like this is going to get built
in. What Bruce was suggesting is that the idea of building any more
logging intelligence into the database itself will never go anywhere.
The alternate question of "how do I get a better API for exporting
real-time logging messages I can process?" is still quite open in my
mind. The idea Magnus was already suggesting here, to add an alternate
"pipe" destination, would be one useful step forward here.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us