log_destination=file
Attached is a very much WIP patch (AKA rough draft) for $subject.
Right now we have two parameters controlling logging destination, and they
work in interesting combinations:
log_destination=stderr, logging_collector=off -> log to stderr (makes sense)
log_destination=stderr, logging_collector=on -> log to file (*highly*
illogical for most users, to set stderr when they want file)
log_destination=csvlog, logging_collector=on -> log to csvfile (makes sense)
log_destination=csvlog, logging_collector=off -> fail (ugh)
(beyond that we also have syslog and eventlog, but those are different and
not touched by this patch)
My understanding is that the main reason for this is that we cannot change
logging_collector without restarting postmaster, whereas we can change
log_destination.
My suggestion is we work around this by just always starting the logging
collector, even if we're not planning to use it. Then we'd just have:
log_destination = stderr -> log to stderr
log_destination = file -> log to file
log_destination = csvlog -> log to csv file
The main difference here is that log_destination=stderr would also go
through the logging collector which would then assemble it and write it out
to its own stderr.
Do people see an actual problem with that? I agree it's an extra round of
indirection there, but is that a problem? It would also cause one more
background process to always be running, but again, is that really a
problem? The overhead is not exactly large.
It would make configuration a lot more logical for logging. It would also
make it possible to switch between all logging configurations without
restarting.
The attached WIP is mostly working for the simple cases. It fails
completely if the syslogger is restarted at this point, simply because I
haven't figured out how to pass the original stderr down to the syslogger.
I'm sure there are also many other smaller issues around it, but I wanted
to get the discussion going before I spend the time to go through those.
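One direction I can think of -- and this is only a rough sketch, not something
the attached patch does, with all names made up for illustration -- would be to
dup() the original stderr in the postmaster before the redirection happens and
let the syslogger inherit that descriptor:

/*
 * Rough sketch only, not part of the attached patch: keep a copy of the
 * postmaster's original stderr around before it gets redirected into the
 * syslogger pipe, so a restarted syslogger could still write to the
 * "real" stderr.  All names here are made up.
 */
#include <unistd.h>

static int  postmaster_original_stderr = -1;

static void
preserve_original_stderr(void)
{
    if (postmaster_original_stderr < 0)
        postmaster_original_stderr = dup(STDERR_FILENO);

    /*
     * With fork-only backends the duplicated descriptor is simply
     * inherited; under EXEC_BACKEND it would have to be passed down to
     * the child some other way, which is the part that still needs
     * figuring out.
     */
}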
Thoughts?
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
Attachments:
log_destination_file.patch (text/x-patch; charset=US-ASCII)
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 95180b2ef5..7c4eb5743f 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1764,7 +1764,7 @@ ServerLoop(void)
}
/* If we have lost the log collector, try to start a new one */
- if (SysLoggerPID == 0 && Logging_collector)
+ if (SysLoggerPID == 0)
SysLoggerPID = SysLogger_Start();
/*
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index 3255b42c7d..15fc606d24 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -57,10 +57,8 @@
/*
- * GUC parameters. Logging_collector cannot be changed after postmaster
- * start, but the rest can change at SIGHUP.
+ * GUC parameters. Can change at SIGHUP.
*/
-bool Logging_collector = false;
int Log_RotationAge = HOURS_PER_DAY * MINS_PER_HOUR;
int Log_RotationSize = 10 * 1024;
char *Log_directory = NULL;
@@ -182,6 +180,10 @@ SysLoggerMain(int argc, char *argv[])
* assumes that all interesting messages generated in the syslogger will
* come through elog.c and will be sent to write_syslogger_file.
*/
+ /*
+ * XXX: this does not work when we want to use stderr in the syslogger.
+ * needs fixing!
+ */
if (redirection_done)
{
int fd = open(DEVNULL, O_WRONLY, 0);
@@ -370,10 +372,11 @@ SysLoggerMain(int argc, char *argv[])
if (!rotation_requested && Log_RotationSize > 0 && !rotation_disabled)
{
/* Do a rotation if file is too big */
- if (ftell(syslogFile) >= Log_RotationSize * 1024L)
+ if (syslogFile != NULL &&
+ ftell(syslogFile) >= Log_RotationSize * 1024L)
{
rotation_requested = true;
- size_rotation_for |= LOG_DESTINATION_STDERR;
+ size_rotation_for |= LOG_DESTINATION_FILE;
}
if (csvlogFile != NULL &&
ftell(csvlogFile) >= Log_RotationSize * 1024L)
@@ -390,7 +393,7 @@ SysLoggerMain(int argc, char *argv[])
* was sent by pg_rotate_logfile.
*/
if (!time_based_rotation && size_rotation_for == 0)
- size_rotation_for = LOG_DESTINATION_STDERR | LOG_DESTINATION_CSVLOG;
+ size_rotation_for = LOG_DESTINATION_FILE | LOG_DESTINATION_CSVLOG;
logfile_rotate(time_based_rotation, size_rotation_for);
}
@@ -522,9 +525,6 @@ SysLogger_Start(void)
pid_t sysloggerPid;
char *filename;
- if (!Logging_collector)
- return 0;
-
/*
* If first time through, create the pipe which will receive stderr
* output.
@@ -581,7 +581,10 @@ SysLogger_Start(void)
first_syslogger_file_time = time(NULL);
filename = logfile_getname(first_syslogger_file_time, NULL);
- syslogFile = logfile_open(filename, "a", false);
+ if (Log_destination & LOG_DESTINATION_FILE)
+ syslogFile = logfile_open(filename, "a", false);
+ else
+ syslogFile = NULL;
pfree(filename);
@@ -675,7 +678,8 @@ SysLogger_Start(void)
}
/* postmaster will never write the file; close it */
- fclose(syslogFile);
+ if (syslogFile != NULL)
+ fclose(syslogFile);
syslogFile = NULL;
return (int) sysloggerPid;
}
@@ -794,7 +798,7 @@ process_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
{
char *cursor = logbuffer;
int count = *bytes_in_logbuffer;
- int dest = LOG_DESTINATION_STDERR;
+ bool dest_csv = false;
/* While we have enough for a header, process data... */
while (count >= (int) (offsetof(PipeProtoHeader, data) + 1))
@@ -822,8 +826,7 @@ process_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
if (count < chunklen)
break;
- dest = (p.is_last == 'T' || p.is_last == 'F') ?
- LOG_DESTINATION_CSVLOG : LOG_DESTINATION_STDERR;
+ dest_csv = (p.is_last == 'T' || p.is_last == 'F');
/* Locate any existing buffer for this source pid */
buffer_list = buffer_lists[p.pid % NBUFFER_LISTS];
@@ -886,7 +889,7 @@ process_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
appendBinaryStringInfo(str,
cursor + PIPE_HEADER_SIZE,
p.len);
- write_syslogger_file(str->data, str->len, dest);
+ write_syslogger_file(str->data, str->len, dest_csv);
/* Mark the buffer unused, and reclaim string storage */
existing_slot->pid = 0;
pfree(str->data);
@@ -895,7 +898,7 @@ process_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
{
/* The whole message was one chunk, evidently. */
write_syslogger_file(cursor + PIPE_HEADER_SIZE, p.len,
- dest);
+ dest_csv);
}
}
@@ -922,7 +925,7 @@ process_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
break;
}
/* fall back on the stderr log as the destination */
- write_syslogger_file(cursor, chunklen, LOG_DESTINATION_STDERR);
+ write_syslogger_file(cursor, chunklen, false);
cursor += chunklen;
count -= chunklen;
}
@@ -960,7 +963,7 @@ flush_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
StringInfo str = &(buf->data);
write_syslogger_file(str->data, str->len,
- LOG_DESTINATION_STDERR);
+ LOG_DESTINATION_FILE);
/* Mark the buffer unused, and reclaim string storage */
buf->pid = 0;
pfree(str->data);
@@ -974,7 +977,7 @@ flush_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
*/
if (*bytes_in_logbuffer > 0)
write_syslogger_file(logbuffer, *bytes_in_logbuffer,
- LOG_DESTINATION_STDERR);
+ LOG_DESTINATION_FILE);
*bytes_in_logbuffer = 0;
}
@@ -985,23 +988,39 @@ flush_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
*/
/*
- * Write text to the currently open logfile
+ * Write text to the currently open logfile and/or stderr.
*
* This is exported so that elog.c can call it when am_syslogger is true.
* This allows the syslogger process to record elog messages of its own,
* even though its stderr does not point at the syslog pipe.
*/
void
-write_syslogger_file(const char *buffer, int count, int destination)
+write_syslogger_file(const char *buffer, int count, bool dest_csv)
{
- int rc;
- FILE *logfile;
-
- if (destination == LOG_DESTINATION_CSVLOG && csvlogFile == NULL)
- open_csvlogfile();
+ int rc = 0;
- logfile = destination == LOG_DESTINATION_CSVLOG ? csvlogFile : syslogFile;
- rc = fwrite(buffer, 1, count, logfile);
+ if (dest_csv)
+ {
+ if (csvlogFile == NULL)
+ open_csvlogfile();
+ rc = fwrite(buffer, 1, count, csvlogFile);
+ }
+ else
+ {
+ /* non-csv can go to file, stderr or both */
+ if (Log_destination & LOG_DESTINATION_STDERR)
+ {
+ rc = fwrite(buffer, 1, count, stderr);
+ /*
+ * if we fail to write to stderr, we're not likely to be able
+ * to write the error message about it either, so just
+ * ignore and let the result be overwritten if we log to both
+ * file and stderr.
+ */
+ }
+ if (Log_destination & LOG_DESTINATION_FILE)
+ rc = fwrite(buffer, 1, count, syslogFile);
+ }
/* can't use ereport here because of possible recursion */
if (rc != count)
@@ -1064,7 +1083,7 @@ pipeThread(void *arg)
*/
if (Log_RotationSize > 0)
{
- if (ftell(syslogFile) >= Log_RotationSize * 1024L ||
+ if ((syslogFile != NULL && ftell(syslogFile) >= Log_RotationSize * 1024L) ||
(csvlogFile != NULL && ftell(csvlogFile) >= Log_RotationSize * 1024L))
SetLatch(MyLatch);
}
@@ -1189,7 +1208,7 @@ logfile_rotate(bool time_based_rotation, int size_rotation_for)
*
* Note: last_file_name should never be NULL here, but if it is, append.
*/
- if (time_based_rotation || (size_rotation_for & LOG_DESTINATION_STDERR))
+ if (time_based_rotation || (size_rotation_for & LOG_DESTINATION_FILE))
{
if (Log_truncate_on_rotation && time_based_rotation &&
last_file_name != NULL &&
@@ -1220,7 +1239,8 @@ logfile_rotate(bool time_based_rotation, int size_rotation_for)
return;
}
- fclose(syslogFile);
+ if (syslogFile != NULL)
+ fclose(syslogFile);
syslogFile = fh;
/* instead of pfree'ing filename, remember it for next time */
@@ -1362,7 +1382,7 @@ update_metainfo_datafile(void)
{
FILE *fh;
- if (!(Log_destination & LOG_DESTINATION_STDERR) &&
+ if (!(Log_destination & LOG_DESTINATION_FILE) &&
!(Log_destination & LOG_DESTINATION_CSVLOG))
{
if (unlink(LOG_METAINFO_DATAFILE) < 0 && errno != ENOENT)
@@ -1382,7 +1402,7 @@ update_metainfo_datafile(void)
return;
}
- if (last_file_name && (Log_destination & LOG_DESTINATION_STDERR))
+ if (last_file_name && (Log_destination & LOG_DESTINATION_FILE))
{
if (fprintf(fh, "stderr %s\n", last_file_name) < 0)
{
diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 62341b84d1..5e7df05e01 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -349,13 +349,6 @@ pg_reload_conf(PG_FUNCTION_ARGS)
Datum
pg_rotate_logfile(PG_FUNCTION_ARGS)
{
- if (!Logging_collector)
- {
- ereport(WARNING,
- (errmsg("rotation not possible because log collection not active")));
- PG_RETURN_BOOL(false);
- }
-
SendPostmasterSignal(PMSIGNAL_ROTATE_LOGFILE);
PG_RETURN_BOOL(true);
}
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 918db0a8f2..2c23914b82 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -175,7 +175,7 @@ static const char *process_log_prefix_padding(const char *p, int *padding);
static void log_line_prefix(StringInfo buf, ErrorData *edata);
static void write_csvlog(ErrorData *edata);
static void send_message_to_server_log(ErrorData *edata);
-static void write_pipe_chunks(char *data, int len, int dest);
+static void write_pipe_chunks(char *data, int len, bool csv);
static void send_message_to_frontend(ErrorData *edata);
static char *expand_fmt_string(const char *fmt, ErrorData *edata);
static const char *useful_strerror(int errnum);
@@ -2831,7 +2831,7 @@ write_csvlog(ErrorData *edata)
if (am_syslogger)
write_syslogger_file(buf.data, buf.len, LOG_DESTINATION_CSVLOG);
else
- write_pipe_chunks(buf.data, buf.len, LOG_DESTINATION_CSVLOG);
+ write_pipe_chunks(buf.data, buf.len, true);
pfree(buf.data);
}
@@ -3008,7 +3008,7 @@ send_message_to_server_log(ErrorData *edata)
#endif /* WIN32 */
/* Write to stderr, if enabled */
- if ((Log_destination & LOG_DESTINATION_STDERR) || whereToSendOutput == DestDebug)
+ if ((Log_destination & (LOG_DESTINATION_STDERR | LOG_DESTINATION_FILE)) || whereToSendOutput == DestDebug)
{
/*
* Use the chunking protocol if we know the syslogger should be
@@ -3016,7 +3016,7 @@ send_message_to_server_log(ErrorData *edata)
* Otherwise, just do a vanilla write to stderr.
*/
if (redirection_done && !am_syslogger)
- write_pipe_chunks(buf.data, buf.len, LOG_DESTINATION_STDERR);
+ write_pipe_chunks(buf.data, buf.len, false);
#ifdef WIN32
/*
@@ -3088,7 +3088,7 @@ send_message_to_server_log(ErrorData *edata)
* rc to void to shut up the compiler.
*/
static void
-write_pipe_chunks(char *data, int len, int dest)
+write_pipe_chunks(char *data, int len, bool csv)
{
PipeProtoChunk p;
int fd = fileno(stderr);
@@ -3102,7 +3102,7 @@ write_pipe_chunks(char *data, int len, int dest)
/* write all but the last chunk */
while (len > PIPE_MAX_PAYLOAD)
{
- p.proto.is_last = (dest == LOG_DESTINATION_CSVLOG ? 'F' : 'f');
+ p.proto.is_last = (csv ? 'F' : 'f');
p.proto.len = PIPE_MAX_PAYLOAD;
memcpy(p.proto.data, data, PIPE_MAX_PAYLOAD);
rc = write(fd, &p, PIPE_HEADER_SIZE + PIPE_MAX_PAYLOAD);
@@ -3112,7 +3112,7 @@ write_pipe_chunks(char *data, int len, int dest)
}
/* write the last chunk */
- p.proto.is_last = (dest == LOG_DESTINATION_CSVLOG ? 'T' : 't');
+ p.proto.is_last = (csv ? 'T' : 't');
p.proto.len = len;
memcpy(p.proto.data, data, len);
rc = write(fd, &p, PIPE_HEADER_SIZE + len);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 246fea8693..26ae6bf296 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1436,15 +1436,6 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
{
- {"logging_collector", PGC_POSTMASTER, LOGGING_WHERE,
- gettext_noop("Start a subprocess to capture stderr output and/or csvlogs into log files."),
- NULL
- },
- &Logging_collector,
- false,
- NULL, NULL, NULL
- },
- {
{"log_truncate_on_rotation", PGC_SIGHUP, LOGGING_WHERE,
gettext_noop("Truncate existing log files of same name during log rotation."),
NULL
@@ -3325,7 +3316,7 @@ static struct config_string ConfigureNamesString[] =
{"log_destination", PGC_SIGHUP, LOGGING_WHERE,
gettext_noop("Sets the destination for server log output."),
gettext_noop("Valid values are combinations of \"stderr\", "
- "\"syslog\", \"csvlog\", and \"eventlog\", "
+ "\"file\", \"syslog\", \"csvlog\", and \"eventlog\", "
"depending on the platform."),
GUC_LIST_INPUT
},
@@ -10077,6 +10068,8 @@ check_log_destination(char **newval, void **extra, GucSource source)
if (pg_strcasecmp(tok, "stderr") == 0)
newlogdest |= LOG_DESTINATION_STDERR;
+ else if (pg_strcasecmp(tok, "file") == 0)
+ newlogdest |= LOG_DESTINATION_FILE;
else if (pg_strcasecmp(tok, "csvlog") == 0)
newlogdest |= LOG_DESTINATION_CSVLOG;
#ifdef HAVE_SYSLOG
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index df5d2f3f22..58f899c097 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -342,17 +342,10 @@
# - Where to Log -
#log_destination = 'stderr' # Valid values are combinations of
- # stderr, csvlog, syslog, and eventlog,
- # depending on platform. csvlog
- # requires logging_collector to be on.
-
-# This is used when logging to stderr:
-#logging_collector = off # Enable capturing of stderr and csvlog
- # into log files. Required to be on for
- # csvlogs.
- # (change requires restart)
+ # stderr, file, csvlog, syslog, and
+ # eventlog, depending on platform.
-# These are only used if logging_collector is on:
+# These are only used if log_destination is file or csvlog
#log_directory = 'log' # directory where log files are written,
# can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
diff --git a/src/include/postmaster/syslogger.h b/src/include/postmaster/syslogger.h
index f4248ef5d4..82eafc0241 100644
--- a/src/include/postmaster/syslogger.h
+++ b/src/include/postmaster/syslogger.h
@@ -62,7 +62,6 @@ typedef union
/* GUC options */
-extern bool Logging_collector;
extern int Log_RotationAge;
extern int Log_RotationSize;
extern PGDLLIMPORT char *Log_directory;
@@ -81,7 +80,7 @@ extern HANDLE syslogPipe[2];
extern int SysLogger_Start(void);
-extern void write_syslogger_file(const char *buffer, int count, int dest);
+extern void write_syslogger_file(const char *buffer, int count, bool dest_csv);
#ifdef EXEC_BACKEND
extern void SysLoggerMain(int argc, char *argv[]) pg_attribute_noreturn();
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 7bfd25a9e9..49dc7ea9e5 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -396,6 +396,7 @@ extern bool syslog_split_messages;
#define LOG_DESTINATION_SYSLOG 2
#define LOG_DESTINATION_EVENTLOG 4
#define LOG_DESTINATION_CSVLOG 8
+#define LOG_DESTINATION_FILE 16
/* Other exported functions */
extern void DebugFileOpen(void);
Magnus Hagander <magnus@hagander.net> writes:
My understanding is that the main reason for this is that we cannot change
logging_collector without restarting postmaster, whereas we can change
log_destination.
Right, because the decision whether to redirect stdout/stderr can't
be changed on the fly.
My suggestion is we work around this by just always starting the logging
collector, even if we're not planning to use it.
Umm....
Do people see an actual problem with that? I agree it's an extra round of
indirection there, but is that a problem? It would also cause one more
background process to always be running, but again, is that really a
problem? The overhead is not exactly large.
You just made three assertions about "this isn't a problem" without
providing any evidence in support of any of them. Maybe with some
measurements we could have a real discussion.
regards, tom lane
On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Magnus Hagander <magnus@hagander.net> writes:
My understanding is that the main reason for this is that we cannot change
logging_collector without restarting postmaster, whereas we can change
log_destination.
Right, because the decision whether to redirect stdout/stderr can't
be changed on the fly.
Right.
We could of course also say we only care about things generated by our
ereport framework, in which case we don't need to redirect stderr and can
just use a "regular pipe". But IIRC that's functionality we also
specifically wanted (though I don't think I've ever needed it myself,
presumably others have).
My suggestion is we work around this by just always starting the logging
collector, even if we're not planning to use it.
Umm....
Do people see an actual problem with that? I agree it's an extra round of
indirection there, but is that a problem? It would also cause one more
background process to always be running, but again, is that really a
problem? The overhead is not exactly large.
You just made three assertions about "this isn't a problem" without
providing any evidence in support of any of them. Maybe with some
measurements we could have a real discussion.
Well, not entirely. The first two are questions of "is this really a problem?".
The overhead of an extra process certainly isn't much; I'm willing to say
that as an assertion.
The other two, less so; those are more open questions.
Are you actually asking for a benchmark of whether logging gets slower? If so,
could you suggest a workload to make an actual benchmark of it (where
logging would be high enough that it could become a bottleneck -- and not
writing the log data to disk, but the actual logging). I'm not sure what a
good one would be.
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes:
On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Right, because the decision whether to redirect stdout/stderr can't
be changed on the fly.
Right.
We could of course also say we only care about things generated by our
ereport framework, in which case we don't need to redirect stderr and can
just use a "regular pipe". But IIRC that's functionality we also
specifically wanted (though I don't think I've ever needed it myself,
presumably others have).
Yes, it's pretty important, because of assorted stuff not-under-our-
control that doesn't know about ereport and will just print to stderr
anyway. Some examples: dynamic linker can't-resolve-symbol failure
messages, glibc malloc corruption error messages, just about any external
module in plperl or plpython. I don't find this to be negotiable.
Are you actually asking for a benchmark of whether logging gets slower?
Yes.
If so,
could you suggest a workload to make an actual benchmark of it (where
logging would be high enough that it could become a bottleneck -- and not
writing the log data to disk, but the actual logging). I'm not sure what a
good one would be.
pgbench with log_statement = all would be a pretty easy test case.
regards, tom lane
On 31 August 2017 at 13:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Magnus Hagander <magnus@hagander.net> writes:
On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yes, it's pretty important, because of assorted stuff not-under-our-
control that doesn't know about ereport and will just print to stderr
anyway. Some examples: dynamic linker can't-resolve-symbol failure
messages, glibc malloc corruption error messages, just about any external
module in plperl or plpython. I don't find this to be negotiable.
So what happens now with these messages? My understanding is that
they're missing from the CSV logs and are simply inserted into the
text logs without any log_line_prefix? The logging collector doesn't
recognize these messages and reformat them for the CSV logs does it?
I'm actually asking because I'm more concerned with JSON logs or
msgpack logs. Currently these are supported with an emit_log_hook but
they can't capture these non-ereport logs either.
Also the CSV and emit_log_hook based logs don't have any convenient
way to turn them on and off and control the location and filename of
the logs. It would be nice if we could have something like
log_destinations='stderr=text,syslog=json,postgresql-%Y-%m-%d_%H%M%S.csv=csv'
Are you actually asking for a benchmark of whether logging gets slower?
Yes.
Personally I don't think it's "performance" so much as operational
issues that are more concerning. For all we know there are people out
there who tried to use the logging collector and found it didn't work
well on some system -- perhaps it interacted with systemd or something
else on the system -- and they switched back to just using stderr. I
don't know how to flush these users out though if there are any. Just
making this change early in a release cycle is the best we can do.
--
greg
On Fri, Sep 1, 2017 at 6:00 PM, Greg Stark <stark@mit.edu> wrote:
On 31 August 2017 at 13:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Magnus Hagander <magnus@hagander.net> writes:
On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yes, it's pretty important, because of assorted stuff not-under-our-
control that doesn't know about ereport and will just print to stderr
anyway. Some examples: dynamic linker can't-resolve-symbol failure
messages, glibc malloc corruption error messages, just about any external
module in plperl or plpython. I don't find this to be negotiable.
So what happens now with these messages? My understanding is that
they're missing from the CSV logs and are simply inserted into the
text logs without any log_line_prefix? The logging collector doesn't
recognize these messages and reformat them for the CSV logs does it?
Yeah, that's pretty much it.
Fixing that is also on my plan, but for later :)
I'm actually asking because I'm more concerned with JSON logs or
msgpack logs. Currently these are supported with an emit_log_hook but
they can't capture these non-ereport logs either.
Also the CSV and emit_log_hook based logs don't have any convenient
way to turn them on and off and control the location and filename of
the logs. It would be nice if we could have something like
log_destinations='stderr=text,syslog=json,postgresql-%Y-%m-%d_%H%M%S.csv=csv'
That could definitely be an interesting improvement as well. Decoupling the
format from the destination would make a lot of sense. (Arguments can
certainly be made for example for csv-to-syslog, when you have a custom
syslog server)
Are you actually asking for a benchmark of whether logging gets slower?
Yes.
Personally I don't think it's "performance" so much as operational
issues that are more concerning. For all we know there are people out
there who tried to use the logging collector and found it didn't work
well on some system -- perhaps it interacted with systemd or something
else on the system -- and they switched back to just using stderr. I
don't know how to flush these users out though if there are any. Just
making this change early in a release cycle is the best we can do.
The most common use of stderr, I think, is Debian/Ubuntu, which uses that and
then a pg_ctl-based redirect. Or at least it used to (I remember being annoyed
by that at least once...)
AFAIK all other major platforms already use the logging collector in the
standard packages.
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes:
On Fri, Sep 1, 2017 at 6:00 PM, Greg Stark <stark@mit.edu> wrote:
So what happens now with these messages? My understanding is that
they're missing from the CSV logs and are simply inserted into the
text logs without any log_line_prefix? The logging collector doesn't
recognize these messages and reformat them for the CSV logs does it?
Yeah, that's pretty much it.
Fixing that is also on my plan, but for later :)
Keep in mind that you've got to be really, really conservative about
adding functionality in the logging collector process. If it ever
crashes, we have problems. If memory serves, the postmaster is able
to restart it, but we cannot restore the original stdout/stderr
destination, which is problematic if that's where its output had
been going. So it's pretty close to being as mission-critical as
the postmaster itself.
regards, tom lane
On Fri, Sep 1, 2017 at 6:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Magnus Hagander <magnus@hagander.net> writes:
On Fri, Sep 1, 2017 at 6:00 PM, Greg Stark <stark@mit.edu> wrote:
So what happens now with these messages? My understanding is that
they're missing from the CSV logs and are simply inserted into the
text logs without any log_line_prefix? The logging collector doesn't
recognize these messages and reformat them for the CSV logs does it?
Yeah, that's pretty much it.
Fixing that is also on my plan, but for later :)
Keep in mind that you've got to be really, really conservative about
adding functionality in the logging collector process. If it ever
crashes, we have problems. If memory serves, the postmaster is able
to restart it, but we cannot restore the original stdout/stderr
destination, which is problematic if that's where its output had
been going. So it's pretty close to being as mission-critical as
the postmaster itself.
Yeah. That's one of the reasons I'm not trying to do it all in one batch.
But yeah, the postmaster restarts it just fine. And with the WIP patch I
posted earlier, the message from the postmaster saying it did so gets lost if
you are logging to stderr. It does end up in the CSV file though. And I'm
sure there are plenty of other corner cases around that one to consider.
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
Hi
On 31 August 2017 at 14:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Are you actually asking for a benchmark of whether logging gets slower?
Yes.
If so,
could you suggest a workload to make an actual benchmark of it (where
logging would be high enough that it could become a bottleneck -- and not
writing the log data to disk, but the actual logging). I'm not sure what a
good one would be.
pgbench with log_statement = all would be a pretty easy test case.
This part of the discussion caught my attention, and I tried to perform such
an easy test. I was doing:
pgbench -S -j2 -c${clients} -T500 test
with `log_statement=all` and `log_destination=stderr`, which I assume should
be enough to get some approximate numbers.
scaling factor: 100
average latency:
clients patch master
10 1.827 1.456
20 4.027 3.300
30 6.284 4.921
40 8.409 6.767
50 10.985 8.646
It seems that for this particular workload it was about 20-25% slower.
Dmitry Dolgov <9erthalion6@gmail.com> writes:
On 31 August 2017 at 14:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
pgbench with log_statement = all would be a pretty easy test case.
It seems that for this particular workload it was about 20-25% slower.
Ouch. That seems like rather a large hit :-(. I actually expected
it to be close to negligible, but I don't think 20% qualifies.
regards, tom lane
On Thu, Sep 7, 2017 at 7:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Dmitry Dolgov <9erthalion6@gmail.com> writes:
On 31 August 2017 at 14:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
pgbench with log_statement = all would be a pretty easy test case.
It seems that for this particular workload it was about 20-25% slower.
Ouch. That seems like rather a large hit :-(. I actually expected
it to be close to negligible, but I don't think 20% qualifies.
Agreed, that's a lot more than I expected. A few questions though:
1. where was stderr actually sent? To the console, to /dev/null or to a
file?
2. Could you run the same thing (on the same machine) with the logging
collector on and logging to file, without the patch? I'd assume that gives
performance similar to running with the patch, but it would be good to
confirm that.
And thanks for running the benchmark, saved me some time!
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On 8 September 2017 at 01:32, Magnus Hagander <magnus@hagander.net> wrote:
1. where was stderr actually sent? To the console, to /dev/null or to a
file?
To the console (but I can also try other options, although I'm not sure if
it would have any impact).
2. Could you run the same thing (on the same machine) with the logging
collector on and logging to file, without the patch? I'd assume that gives
performance similar to running with the patch, but it would be good to
confirm that.
Sure, I'll do it this weekend.
On 7 September 2017 at 15:46, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
It seems that for this particular workload it was about 20-25% slower.
Hmm... looks like I provided misleading data, sorry. The numbers from the
previous email are correct and I'm able to reproduce them, but surprisingly
only for one particular situation: when `log_statement=all;
log_destination=stderr` and stderr is sent to a console with a tmux session
running in it (apparently that has some impact on the performance, but I'm
not sure how exactly). In all other cases (when stderr is sent to a plain
console, /dev/null, or we send logs to a file) the numbers are different,
and the difference between patch and master is much less significant.
average latency:
clients patch master
10 0.321 0.286
20 0.669 0.602
30 1.016 0.942
40 1.358 1.280
50 1.727 1.637
On Sun, Sep 10, 2017 at 5:29 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
average latency:
clients patch master
10 0.321 0.286
20 0.669 0.602
30 1.016 0.942
40 1.358 1.280
50 1.727 1.637
That's still a noticeable slowdown, though. And we've had previous
reports of the overhead of logging being significant as well:
/messages/by-id/CACLsApsA7U0GCFpojVQem6SGTEkv8vnwdBfhVi+dqO+gu5gdCA@mail.gmail.com
I seem to recall a discussion, perhaps in-person, around the time Theo
submitted that patch where it was reported that the logging collector
could not be used on some systems he was working with because it
became a major performance bottleneck. With each backend writing its
own messages to a file, it was tolerable, but when you tried to funnel
everything through a single process, the back-pressure slowed down the
entire system unacceptably.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Nov 15, 2017 at 12:33 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sun, Sep 10, 2017 at 5:29 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
average latency:
clients patch master
10 0.321 0.286
20 0.669 0.602
30 1.016 0.942
40 1.358 1.280
50 1.727 1.637
That's still a noticeable slowdown, though. And we've had previous
reports of the overhead of logging being significant as well:
/messages/by-id/CACLsApsA7U0GCFpojVQem6SGTEkv8vnwdBfhVi+dqO+gu5gdCA@mail.gmail.com
I seem to recall a discussion, perhaps in-person, around the time Theo
submitted that patch where it was reported that the logging collector
could not be used on some systems he was working with because it
became a major performance bottleneck. With each backend writing its
own messages to a file, it was tolerable, but when you tried to funnel
everything through a single process, the back-pressure slowed down the
entire system unacceptably.
I have moved this patch to the next CF for now. The original patch still applies.
--
Michael
On Tue, Nov 14, 2017 at 5:33 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sun, Sep 10, 2017 at 5:29 AM, Dmitry Dolgov <9erthalion6@gmail.com>
wrote:
average latency:
clients patch master
10 0.321 0.286
20 0.669 0.602
30 1.016 0.942
40 1.358 1.280
50 1.727 1.637
That's still a noticeable slowdown, though. And we've had previous
reports of the overhead of logging being significant as well:
/messages/by-id/CACLsApsA7U0GCFpojVQem6SGTEkv8vnwdBfhVi+dqO+gu5gdCA@mail.gmail.com
I seem to recall a discussion, perhaps in-person, around the time Theo
submitted that patch where it was reported that the logging collector
could not be used on some systems he was working with because it
became a major performance bottleneck. With each backend writing its
own messages to a file, it was tolerable, but when you tried to funnel
everything through a single process, the back-pressure slowed down the
entire system unacceptably.
Finally found myself back at this one, because I still think this is a
problem we definitely need to address (whether with this file or not).
The funneling into a single process is definitely an issue.
But we don't really solve that problem today with logging to stderr, do we?
Because somebody has to pick up the log as it came from stderr. Yes, you
get more overhead when sending the log to devnull, but that isn't really a
realistic scenario. The question is what to do when you actually want to
collect that much logging that quickly.
If each backend could actually log to *its own file*, then things would get
sped up. But we can't do that today. Unless you use the hooks and build it
yourself.
Per the thread referenced, using the hooks to handle the
very-high-rate-logging case seems to be the conclusion. But is that still
the conclusion, or do we feel we need to also have a native solution?
And if the conclusion is that hooks is the way to go for that, then is the
slowdown of this patch actually a relevant problem to it?
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On Sat, Jan 20, 2018 at 7:51 AM, Magnus Hagander <magnus@hagander.net> wrote:
Finally found myself back at this one, because I still think this is a
problem we definitely need to address (whether with this file or not).
The funneling into a single process is definitely an issue.
But we don't really solve that problem today with logging to stderr, do we?
Because somebody has to pick up the log as it came from stderr. Yes, you get
more overhead when sending the log to devnull, but that isn't really a
realistic scenario. The question is what to do when you actually want to
collect that much logging that quickly.
I think it depends on where the bottleneck is. If you're limited by
the speed at which a single process can write, shutting the logging
collector off and letting everyone write fixes it, because now you can
bring the CPU cycles of many processes to bear rather than just one.
If you're limited by the rate at which you can lay the file down on
disk, then turning off the logging collector doesn't help, but I don't
think that's the main problem. Now, of course, if you're writing the
file to disk faster than a single process could do all those writes,
then you're probably also going to need multiple processes to keep up
with reading it, parsing it, etc. But that's not a problem for
PostgreSQL core unless we decide to start shipping an in-core log
analyzer.
If each backend could actually log to *its own file*, then things would get
sped up. But we can't do that today. Unless you use the hooks and build it
yourself.
That seems like a useful thing to support in core.
Per the thread referenced, using the hooks to handle the
very-high-rate-logging case seems to be the conclusion. But is that still
the conclusion, or do we feel we need to also have a native solution?
And if the conclusion is that hooks is the way to go for that, then is the
slowdown of this patch actually a relevant problem to it?
I think that if we commit what you've proposed, we're making it harder
for people who have a high volume of logging but are not currently
using hooks. I think we should try really hard to avoid the situation
where our suggested workaround for a server change is "go write some C
code and maybe you can get back to the performance you had with
release N-1". That's just not friendly.
I wonder if it would be feasible to set things up so that the logging
collector was always started, but whether or not backends used it or
wrote directly to their original stderr was configurable (e.g. dup
stderr elsewhere, then dup whichever output source is currently
selected onto stderr, then dup the other one if the config is changed
later).
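Roughly along these lines, just to sketch the idea (none of these names exist
in the tree, and error handling is omitted):

/*
 * Illustrative sketch of the dup()/dup2() juggling described above; the
 * names are invented and error handling is omitted.
 */
#include <stdbool.h>
#include <unistd.h>

static int  original_stderr = -1;    /* saved copy of the real stderr */
static int  syslogger_pipe_fd = -1;  /* write end of the collector pipe */

static void
select_stderr_destination(bool use_collector)
{
    /* save a copy of the original stderr the first time through */
    if (original_stderr < 0)
        original_stderr = dup(STDERR_FILENO);

    /* point fd 2 at whichever output source is currently selected */
    dup2(use_collector ? syslogger_pipe_fd : original_stderr, STDERR_FILENO);
}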
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 01/22/2018 08:52 PM, Robert Haas wrote:
On Sat, Jan 20, 2018 at 7:51 AM, Magnus Hagander <magnus@hagander.net> wrote:
Finally found myself back at this one, because I still think this is a
problem we definitely need to address (whether with this file or not).
The funneling into a single process is definitely an issue.
But we don't really solve that problem today with logging to stderr, do we?
Because somebody has to pick up the log as it came from stderr. Yes, you get
more overhead when sending the log to devnull, but that isn't really a
realistic scenario. The question is what to do when you actually want to
collect that much logging that quickly.
I think it depends on where the bottleneck is. If you're limited by
the speed at which a single process can write, shutting the logging
collector off and letting everyone write fixes it, because now you
can bring the CPU cycles of many processes to bear rather than just
one. If you're limited by the rate at which you can lay the file down
on disk, then turning off the logging collector doesn't help, but I
don't think that's the main problem. Now, of course, if you're
writing the file to disk faster than a single process could do all
those writes, then you're probably also going to need multiple
processes to keep up with reading it, parsing it, etc. But that's not
a problem for PostgreSQL core unless we decide to start shipping an
in-core log analyzer.
Sorry for the naive question, but which of these bottlenecks are we
actually hitting? I don't recall dealing with an actual production
system where the log collector would be an issue, so I don't have a very
good idea where the actual bottleneck is in this case.
I find it hard to believe the collector would be limited by I/O when
writing the data to disk (buffered sequential writes and all that).
So I guess it's more about the process of collecting the data from all
the processes through the pipe, with the PIPE_MAX_PAYLOAD dance, etc.
I plan to do some experiments with log_statement=all, but my guess is
that the main issue is in transferring individual messages (which may be
further split into multiple chunks). What if we instead send the log
messages in larger batches? Of course, that would require an alternative
transport mechanism (say, through shared memory) and the collector would
have to merge the batches to maintain ordering. And then there's the issue
that the collector is a pretty critical component.
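To make that a little more concrete, the kind of thing I'm imagining is
something like this (all names and sizes made up, locking and the actual
merge logic omitted):

/*
 * Hand-wavy sketch of a shared-memory batch transport.  Each backend
 * appends complete messages into its own slot; the collector drains the
 * slots and merges them by sequence number to keep the ordering.  All
 * names and sizes are invented, and locking is omitted.
 */
#include <stdint.h>

#define LOG_BATCH_SLOT_SIZE 8192

typedef struct LogBatchSlot
{
    uint64_t    first_seqno;    /* taken from a global counter; lets the
                                 * collector merge slots in order */
    uint32_t    used;           /* bytes currently used in data[] */
    char        data[LOG_BATCH_SLOT_SIZE];  /* length-prefixed messages,
                                             * concatenated */
} LogBatchSlot;

/* one slot per backend, allocated in shared memory */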
FWIW I'm pretty sure we're not the only project facing the need to
collect a high volume of logs from many processes, so how do the other
projects handle this issue?
If each backend could actually log to *its own file*, then things would get
sped up. But we can't do that today. Unless you use the hooks and build it
yourself.
That seems like a useful thing to support in core.
Yeah, I agree with that.
Per the thread referenced, using the hooks to handle the
very-high-rate-logging case seems to be the conclusion. But is that still
the conclusion, or do we feel we need to also have a native solution?
And if the conclusion is that hooks is the way to go for that, then is the
slowdown of this patch actually a relevant problem to it?
I think that if we commit what you've proposed, we're making it harder
for people who have a high volume of logging but are not currently
using hooks. I think we should try really hard to avoid the situation
where our suggested workaround for a server change is "go write some C
code and maybe you can get back to the performance you had with
release N-1". That's just not friendly.I wonder if it would be feasible to set things up so that the logging
collector was always started, but whether or not backends used it or
wrote directly to their original stderr was configurable (e.g. dup
stderr elsewhere, then dup whichever output source is currently
selected onto stderr, then dup the other one if the config is changed
later).
I think the hook system is a really powerful tool, but it seems a bit
awkward to force people to use it to improve performance like this ...
That seems like something the core should do out of the box.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
On 2018-01-20 14:51:12 +0200, Magnus Hagander wrote:
Finally found myself back at this one, because I still think this is a
problem we definitely need to address (whether with this file or not).
This has an open CF entry for CF 2018-03. Given that there's still a lot
of uncertainty about what we want, I don't think that's a reasonable target.
I would suggest moving it to 'returned with feedback'. Arguments
against?
- Andres
On Fri, Mar 2, 2018 at 8:27 AM, Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2018-01-20 14:51:12 +0200, Magnus Hagander wrote:
Finally found myself back at this one, because I still think this is a
problem we definitely need to address (whether with this file or not).
This has an open CF entry for CF 2018-03. Given that there's still a lot
of uncertainty about what we want, I don't think that's a reasonable target.
I would suggest moving it to 'returned with feedback'. Arguments
against?
I agree, let's boot it. It needs more discussion about what to actually
do, and after that, how. Let's focus on other things for 11, and have this
discussion again when the time comes. I'll go mark it.
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On Mon, Jan 22, 2018 at 4:25 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Sorry for the naive question, but which of these bottlenecks are we
actually hitting? I don't recall dealing with an actual production
system where the log collector would be an issue, so I don't have a very
good idea where the actual bottleneck is in this case.
Unfortunately, neither do I -- but I do believe that the problem is real.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Jan 22, 2018 at 4:25 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Sorry for the naive question, but which of these bottlenecks are we
actually hitting? I don't recall dealing with an actual production
system where the log collector would be an issue, so I don't have a very
good idea where the actual bottleneck is in this case.
Unfortunately, neither do I -- but I do believe that the problem is real.
Not to state the obvious ... but if we do not know where the bottleneck
is, isn't it a bit premature to be proposing changes to "fix" it?
But that argument just leads to the same conclusion that seems to be
prevalent at the moment, ie boot this to v12.
regards, tom lane
On Fri, Mar 2, 2018 at 7:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Jan 22, 2018 at 4:25 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Sorry for the naive question, but which of these bottlenecks are we
actually hitting? I don't recall dealing with an actual production
system where the log collector would be an issue, so I don't have a very
good idea where the actual bottleneck is in this case.
Unfortunately, neither do I -- but I do believe that the problem is real.
Not to state the obvious ... but if we do not know where the bottleneck
is, isn't it a bit premature to be proposing changes to "fix" it?
But that argument just leads to the same conclusion that seems to be
prevalent at the moment, ie boot this to v12.
Just to be clear, it has already been booted to v12.
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/