log_destination=file

Started by Magnus Hagander over 8 years ago · 23 messages · pgsql-hackers
#1Magnus Hagander
magnus@hagander.net

Attached is a very much WIP (AKA rough draft) for $subject.

Right now we have two parameters controlling logging destination, and they
work in interesting combinations:

log_destination=stderr, logging_collector=off -> log to stderr (makes sense)
log_destination=stderr, logging_collector=on -> log to file (*highly*
illogical for most users, to set stderr when they want file)
log_destination=csvlog, logging_collector=on -> log to csvfile (makes sense)
log_destination=csvlog, logging_collector=off -> fail (ugh)

(beyond that we also have syslog and eventlog, but those are different and
not touched by this patch)

My understanding is that the main reason for this is that we cannot change
logging_collector without restarting postmaster, whereas we can change
log_destination.

My suggestion is we work around this by just always starting the logging
collector, even if we're not planning to use it. Then we'd just have:

log_destination = stderr -> log to stderr
log_destination = file -> log to file
log_destination = csvlog -> log to csv file
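Spelled out as a hypothetical postgresql.conf fragment (the 'file' value is what this patch proposes and does not exist in released PostgreSQL; log_directory and log_filename are existing settings):

```
log_destination = 'file'   # proposed: stderr | file | csvlog (plus syslog/eventlog)
log_directory = 'log'      # where the collector writes the files
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
```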

The main difference here is that log_destination=stderr would also go
through the logging collector which would then assemble it and write it out
to its own stderr.

Do people see an actual problem with that? I agree it's an extra round of
indirection there, but is that a problem? It would also cause one more
background process to always be running, but again, is that really a
problem? The overhead is not exactly large.

It would make configuration a lot more logical for logging. It would also
make it possible to switch between all logging configurations without
restarting.

The attached WIP is mostly working for the simple cases. It fails
completely if the syslogger is restarted at this point, simply because I
haven't figured out how to pass the original stderr down to the syslogger.
I'm sure there are also many other smaller issues around it, but I wanted
to have this discussion before I spend the time to go through those.

Thoughts?

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

Attachments:

log_destination_file.patch (text/x-patch, charset US-ASCII, +69/-70)
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#1)
Re: log_destination=file

Magnus Hagander <magnus@hagander.net> writes:

My understanding is that the main reason for this is that we cannot change
logging_collector without restarting postmaster, whereas we can change
log_destination.

Right, because the decision whether to redirect stdout/stderr can't
be changed on the fly.

My suggestion is we work around this by just always starting the logging
collector, even if we're not planning to use it.

Umm....

Do people see an actual problem with that? I agree it's an extra round of
indirection there, but is that a problem? It would also cause one more
backgorund process to always be running, but again, is that really a
problem? The overhead is not exactly large.

You just made three assertions about "this isn't a problem" without
providing any evidence in support of any of them. Maybe with some
measurements we could have a real discussion.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#2)
Re: log_destination=file

On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

My understanding is that the main reason for this is that we cannot change
logging_collector without restarting postmaster, whereas we can change
log_destination.

Right, because the decision whether to redirect stdout/stderr can't
be changed on the fly.

Right.

We could of course also say we only care about things generated by our
ereport framework, in which case we don't need to redirect stderr and can
just use a "regular pipe". But IIRC that's functionality we also
specifically wanted (though I don't think I've ever needed it myself,
presumably others have).

My suggestion is we work around this by just always starting the logging
collector, even if we're not planning to use it.

Umm....

Do people see an actual problem with that? I agree it's an extra round of
indirection there, but is that a problem? It would also cause one more
backgorund process to always be running, but again, is that really a
problem? The overhead is not exactly large.

You just made three assertions about "this isn't a problem" without
providing any evidence in support of any of them. Maybe with some
measurements we could have a real discussion.

Well, not entirely. The first two are genuine questions: "is this really a
problem?"

The overhead of an extra process certainly isn't much; I'm willing to state
that as an assertion.

The other two, less so; those are more open questions.

Are you actually asking for a benchmark of whether logging gets slower? If
so, could you suggest a workload to make an actual benchmark of it (where
logging would be high enough that it could become a bottleneck -- and not
writing the log data to disk, but the actual logging). I'm not sure what a
good one would be.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#3)
Re: log_destination=file

Magnus Hagander <magnus@hagander.net> writes:

On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Right, because the decision whether to redirect stdout/stderr can't
be changed on the fly.

Right.

We could of course also say we only care about things generated by our
ereport framework, in which case we don't need to redirect stderr and can
just use a "regular pipe". But IIRC that's functionality we also
specifically wanted (though I don't think I've ever needed it myself,
presumably others have).

Yes, it's pretty important, because of assorted stuff not-under-our-
control that doesn't know about ereport and will just print to stderr
anyway. Some examples: dynamic linker can't-resolve-symbol failure
messages, glibc malloc corruption error messages, just about any external
module in plperl or plpython. I don't find this to be negotiable.

Are you actually asking for a benchmark of whether logging gets slower?

Yes.

If so, could you suggest a workload to make an actual benchmark of it (where
logging would be high enough that it could become a bottleneck -- and not
writing the log data to disk, but the actual logging). I'm not sure what a
good one would be.

pgbench with log_statement = all would be a pretty easy test case.

regards, tom lane


#5Greg Stark
stark@mit.edu
In reply to: Tom Lane (#4)
Re: log_destination=file

On 31 August 2017 at 13:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yes, it's pretty important, because of assorted stuff not-under-our-
control that doesn't know about ereport and will just print to stderr
anyway. Some examples: dynamic linker can't-resolve-symbol failure
messages, glibc malloc corruption error messages, just about any external
module in plperl or plpython. I don't find this to be negotiable.

So what happens now with these messages? My understanding is that
they're missing from the CSV logs and are simply inserted into the
text logs without any log_line_prefix? The logging collector doesn't
recognize these messages and reformat them for the CSV logs does it?

I'm actually asking because I'm more concerned with JSON logs or
msgpack logs. Currently these are supported with an emit_log_hook but
they can't capture these non-ereport logs either.

Also the CSV and emit_log_hook based logs don't have any convenient
way to turn them on and off and control the location and filename of
the logs. It would be nice if we could have something like

log_destinations='stderr=text,syslog=json,postgresql-%Y-%m-%d_%H%M%S.csv=csv'

Are you actually asking for a benchmark of whether logging gets slower?

Yes.

Personally I don't think it's "performance" so much as operational
issues that are more concerning. For all we know there are people out
there who tried to use the logging collector and found it didn't work
well on some system -- perhaps it interacted with systemd or something
else on the system -- and they switched back to just using stderr. I
don't know how to flush these users out though if there are any. Just
making this change early in a release cycle is the best we can do.

--
greg


#6Magnus Hagander
magnus@hagander.net
In reply to: Greg Stark (#5)
Re: log_destination=file

On Fri, Sep 1, 2017 at 6:00 PM, Greg Stark <stark@mit.edu> wrote:

On 31 August 2017 at 13:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Thu, Aug 31, 2017 at 2:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yes, it's pretty important, because of assorted stuff not-under-our-
control that doesn't know about ereport and will just print to stderr
anyway. Some examples: dynamic linker can't-resolve-symbol failure
messages, glibc malloc corruption error messages, just about any external
module in plperl or plpython. I don't find this to be negotiable.

So what happens now with these messages? My understanding is that
they're missing from the CSV logs and are simply inserted into the
text logs without any log_line_prefix? The logging collector doesn't
recognize these messages and reformat them for the CSV logs does it?

Yeah, that's pretty much it.

Fixing that is also on my plan, but for later :)

I'm actually asking because I'm more concerned with JSON logs or
msgpack logs. Currently these are supported with an emit_log_hook but
they can't capture these non-ereport logs either.

Also the CSV and emit_log_hook based logs don't have any convenient
way to turn them on and off and control the location and filename of
the logs. It would be nice if we could have something like

log_destinations='stderr=text,syslog=json,postgresql-%Y-%m-%d_%H%M%S.csv=csv'

That could definitely be an interesting improvement as well. Decoupling the
format from the destination would make a lot of sense. (Arguments can
certainly be made for example for csv-to-syslog, when you have a custom
syslog server)

Are you actually asking for a benchmark of whether logging gets slower?

Yes.

Personally I don't think it's "performance" so much as operational
issues that are more concerning. For all we know there are people out
there who tried to use the logging collector and found it didn't work
well on some system -- perhaps it interacted with systemd or something
else on the system -- and they switched back to just using stderr. I
don't know how to flush these users out though if there are any. Just
making this change early in a release cycle is the best we can do.

The most common use of stderr I think is debian/ubuntu which uses that and
then pg_ctl based redirect. Or at least used to (I remember being annoyed
by that at least once..)

AFAIK all other major platforms already use the logging collector in the
standard packages.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#6)
Re: log_destination=file

Magnus Hagander <magnus@hagander.net> writes:

On Fri, Sep 1, 2017 at 6:00 PM, Greg Stark <stark@mit.edu> wrote:

So what happens now with these messages? My understanding is that
they're missing from the CSV logs and are simply inserted into the
text logs without any log_line_prefix? The logging collector doesn't
recognize these messages and reformat them for the CSV logs does it?

Yeah, that's pretty much it.

Fixing that is also on my plan, but for later :)

Keep in mind that you've got to be really, really conservative about
adding functionality in the logging collector process. If it ever
crashes, we have problems. If memory serves, the postmaster is able
to restart it, but we cannot restore the original stdout/stderr
destination, which is problematic if that's where its output had
been going. So it's pretty close to being as mission-critical as
the postmaster itself.

regards, tom lane


#8Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#7)
Re: log_destination=file

On Fri, Sep 1, 2017 at 6:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Fri, Sep 1, 2017 at 6:00 PM, Greg Stark <stark@mit.edu> wrote:

So what happens now with these messages? My understanding is that
they're missing from the CSV logs and are simply inserted into the
text logs without any log_line_prefix? The logging collector doesn't
recognize these messages and reformat them for the CSV logs does it?

Yeah, that's pretty much it.

Fixing that is also on my plan, but for later :)

Keep in mind that you've got to be really, really conservative about
adding functionality in the logging collector process. If it ever
crashes, we have problems. If memory serves, the postmaster is able
to restart it, but we cannot restore the original stdout/stderr
destination, which is problematic if that's where its output had
been going. So it's pretty close to being as mission-critical as
the postmaster itself.

Yeah. That's one of the reasons I'm not trying to do it all in one batch.

But yeah, the postmaster restarts it just fine. And with the WIP patch I
posted earlier, the message from the postmaster saying that it did so gets
lost if you are logging to stderr. It does end up in the CSV file though.
And I'm sure there are plenty of other corner cases around that one to
consider.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#9Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Magnus Hagander (#8)
Re: log_destination=file

Hi

On 31 August 2017 at 14:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Are you actually asking for a benchmark of whether logging gets slower?

Yes.

If so, could you suggest a workload to make an actual benchmark of it (where
logging would be high enough that it could become a bottleneck -- and not
writing the log data to disk, but the actual logging). I'm not sure what a
good one would be.

pgbench with log_statement = all would be a pretty easy test case.

This part of the discussion caught my attention, and I tried to perform
such an easy test. I was doing:

pgbench -S -j2 -c${clients} -T500 test

with `log_statement=all` and `log_destination=stderr`, which I assume
should be enough to get some approximate numbers.

scaling factor: 100
average latency:

clients   patch    master
10        1.827    1.456
20        4.027    3.300
30        6.284    4.921
40        8.409    6.767
50       10.985    8.646

It seems that for this particular workload it was about 20-25% slower.

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dmitry Dolgov (#9)
Re: log_destination=file

Dmitry Dolgov <9erthalion6@gmail.com> writes:

On 31 August 2017 at 14:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
pgbench with log_statement = all would be a pretty easy test case.

It seems that for this particular workload it was about 20-25% slower.

Ouch. That seems like rather a large hit :-(. I actually expected
it to be close to negligible, but I don't think 20% qualifies.

regards, tom lane


#11Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#10)
Re: log_destination=file

On Thu, Sep 7, 2017 at 7:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dmitry Dolgov <9erthalion6@gmail.com> writes:

On 31 August 2017 at 14:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:
pgbench with log_statement = all would be a pretty easy test case.

It seems that for this particular workload it was about 20-25% slower.

Ouch. That seems like rather a large hit :-(. I actually expected
it to be close to negligible, but I don't think 20% qualifies.

Agreed, that's a lot more than I expected. A few questions though:

1. where was stderr actually sent? To the console, to /dev/null or to a
file?

2. Could you run the same thing (on the same machine) with the logging
collector on and logging to file, without the patch? I'd assume that gives
performance similar to running with the patch, but it would be good to
confirm that.

And thanks for running the benchmark, saved me some time!

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#12Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Magnus Hagander (#11)
Re: log_destination=file

On 8 September 2017 at 01:32, Magnus Hagander <magnus@hagander.net> wrote:

1. where was stderr actually sent? To the console, to /dev/null or to a
file?

To the console (but I can also try other options, although I'm not sure if
it would have any impact).

2. Could you run the same thing (on the same machine) with the logging
collector on and logging to file, without the patch? I'd assume that gives
performance similar to running with the patch, but it would be good to
confirm that.

Sure, I'll do it this weekend.

#13Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Dmitry Dolgov (#12)
Re: log_destination=file

On 7 September 2017 at 15:46, Dmitry Dolgov <9erthalion6@gmail.com> wrote:

It seems that for this particular workload it was about 20-25% slower.

Hmm... looks like I provided misleading data, sorry. The numbers from the
previous email are correct and I'm able to reproduce them, but surprisingly
only in one particular situation: when `log_statement=all;
log_destination=stderr` and stderr is sent to a console with a tmux session
running in it (apparently that has some impact on performance, though I'm
not sure how exactly). In all other cases (stderr sent to a plain console,
to /dev/null, or logs sent to a file) the numbers are different, and the
difference between patch and master is much less significant.

average latency:

clients   patch    master
10        0.321    0.286
20        0.669    0.602
30        1.016    0.942
40        1.358    1.280
50        1.727    1.637

#14Robert Haas
robertmhaas@gmail.com
In reply to: Dmitry Dolgov (#13)
Re: [HACKERS] log_destination=file

On Sun, Sep 10, 2017 at 5:29 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote:

average latency:

clients patch master
10 0.321 0.286
20 0.669 0.602
30 1.016 0.942
40 1.358 1.280
50 1.727 1.637

That's still a noticeable slowdown, though. And we've had previous
reports of the overhead of logging being significant as well:

/messages/by-id/CACLsApsA7U0GCFpojVQem6SGTEkv8vnwdBfhVi+dqO+gu5gdCA@mail.gmail.com

I seem to recall a discussion, perhaps in-person, around the time Theo
submitted that patch where it was reported that the logging collector
could not be used on some systems he was working with because it
became a major performance bottleneck. With each backend writing its
own messages to a file, it was tolerable, but when you tried to funnel
everything through a single process, the back-pressure slowed down the
entire system unacceptably.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#15Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#14)
Re: [HACKERS] log_destination=file

On Wed, Nov 15, 2017 at 12:33 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sun, Sep 10, 2017 at 5:29 AM, Dmitry Dolgov <9erthalion6@gmail.com> wrote:

average latency:

clients patch master
10 0.321 0.286
20 0.669 0.602
30 1.016 0.942
40 1.358 1.280
50 1.727 1.637

That's still a noticeable slowdown, though. And we've had previous
reports of the overhead of logging being significant as well:

/messages/by-id/CACLsApsA7U0GCFpojVQem6SGTEkv8vnwdBfhVi+dqO+gu5gdCA@mail.gmail.com

I seem to recall a discussion, perhaps in-person, around the time Theo
submitted that patch where it was reported that the logging collector
could not be used on some systems he was working with because it
became a major performance bottleneck. With each backend writing its
own messages to a file, it was tolerable, but when you tried to funnel
everything through a single process, the back-pressure slowed down the
entire system unacceptably.

I have moved this patch to the next CF for now. The original patch still applies.
--
Michael

#16Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#14)
Re: [HACKERS] log_destination=file

On Tue, Nov 14, 2017 at 5:33 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sun, Sep 10, 2017 at 5:29 AM, Dmitry Dolgov <9erthalion6@gmail.com>
wrote:

average latency:

clients patch master
10 0.321 0.286
20 0.669 0.602
30 1.016 0.942
40 1.358 1.280
50 1.727 1.637

That's still a noticeable slowdown, though. And we've had previous
reports of the overhead of logging being significant as well:

/messages/by-id/CACLsApsA7U0GCFpojVQem6SGTEkv8vnwdBfhVi+dqO+gu5gdCA@mail.gmail.com

I seem to recall a discussion, perhaps in-person, around the time Theo
submitted that patch where it was reported that the logging collector
could not be used on some systems he was working with because it
became a major performance bottleneck. With each backend writing its
own messages to a file, it was tolerable, but when you tried to funnel
everything through a single process, the back-pressure slowed down the
entire system unacceptably.

Finally found myself back at this one, because I still think this is a
problem we definitely need to address (whether with this file or not).

The funneling into a single process is definitely an issue.

But we don't really solve that problem today with logging to stderr, do we?
Because somebody has to pick up the log as it came from stderr. Yes, you
get more overhead when sending the log to devnull, but that isn't really a
realistic scenario. The question is what to do when you actually want to
collect that much logging that quickly.

If each backend could actually log to *its own file*, then things would get
sped up. But we can't do that today. Unless you use the hooks and build it
yourself.

Per the thread referenced, using the hooks to handle the
very-high-rate-logging case seems to be the conclusion. But is that still
the conclusion, or do we feel we need to also have a native solution?

And if the conclusion is that hooks is the way to go for that, then is the
slowdown of this patch actually a relevant problem to it?

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#17Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#16)
Re: [HACKERS] log_destination=file

On Sat, Jan 20, 2018 at 7:51 AM, Magnus Hagander <magnus@hagander.net> wrote:

Finally found myself back at this one, because I still think this is a
problem we definitely need to address (whether with this file or not).

The funneling into a single process is definitely an issue.

But we don't really solve that problem today with logging to stderr, do we?
Because somebody has to pick up the log as it came from stderr. Yes, you get
more overhead when sending the log to devnull, but that isn't really a
realistic scenario. The question is what to do when you actually want to
collect that much logging that quickly.

I think it depends on where the bottleneck is. If you're limited by
the speed at which a single process can write, shutting the logging
collector off and letting everyone write fixes it, because now you can
bring the CPU cycles of many processes to bear rather than just one.
If you're limited by the rate at which you can lay the file down on
disk, then turning off the logging collector doesn't help, but I don't
think that's the main problem. Now, of course, if you're writing the
file to disk faster than a single process could do all those writes,
then you're probably also going to need multiple processes to keep up
with reading it, parsing it, etc. But that's not a problem for
PostgreSQL core unless we decide to start shipping an in-core log
analyzer.

If each backend could actually log to *its own file*, then things would get
sped up. But we can't do that today. Unless you use the hooks and build it
yourself.

That seems like a useful thing to support in core.

Per the thread referenced, using the hooks to handle the
very-high-rate-logging case seems to be the conclusion. But is that still
the conclusion, or do we feel we need to also have a native solution?

And if the conclusion is that hooks is the way to go for that, then is the
slowdown of this patch actually a relevant problem to it?

I think that if we commit what you've proposed, we're making it harder
for people who have a high volume of logging but are not currently
using hooks. I think we should try really hard to avoid the situation
where our suggested workaround for a server change is "go write some C
code and maybe you can get back to the performance you had with
release N-1". That's just not friendly.

I wonder if it would be feasible to set things up so that the logging
collector was always started, but whether or not backends used it or
wrote directly to their original stderr was configurable (e.g. dup
stderr elsewhere, then dup whichever output source is currently
selected onto stderr, then dup the other one if the config is changed
later).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#18Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Robert Haas (#17)
Re: [HACKERS] log_destination=file

On 01/22/2018 08:52 PM, Robert Haas wrote:

On Sat, Jan 20, 2018 at 7:51 AM, Magnus Hagander <magnus@hagander.net> wrote:

Finally found myself back at this one, because I still think this is a
problem we definitely need to address (whether with this file or not).

The funneling into a single process is definitely an issue.

But we don't really solve that problem today with logging to stderr, do we?
Because somebody has to pick up the log as it came from stderr. Yes, you get
more overhead when sending the log to devnull, but that isn't really a
realistic scenario. The question is what to do when you actually want to
collect that much logging that quickly.

I think it depends on where the bottleneck is. If you're limited by
the speed at which a single process can write, shutting the logging
collector off and letting everyone write fixes it, because now you
can bring the CPU cycles of many processes to bear rather than just
one. If you're limited by the rate at which you can lay the file down
on disk, then turning off the logging collector doesn't help, but I
don't think that's the main problem. Now, of course, if you're
writing the file to disk faster than a single process could do all
those writes, then you're probably also going to need multiple
processes to keep up with reading it, parsing it, etc. But that's not
a problem for PostgreSQL core unless we decide to start shipping an
in-core log analyzer.

Sorry for the naive question, but which of these bottlenecks are we
actually hitting? I don't recall dealing with an actual production
system where the log collector would be an issue, so I don't have a very
good idea where the actual bottleneck is in this case.

I find it hard to believe the collector would be limited by I/O when
writing the data to disk (buffered sequential writes and all that).

So I guess it's more about the process of collecting the data from all
the processes through the pipe, with the PIPE_MAX_PAYLOAD dance, etc.

I plan to do some experiments with log_statement=all, but my guess is that
the main issue is in transferring the individual messages (which may be
further split into multiple chunks). What if we instead sent the log
messages in larger batches? Of course, that would require an alternative
transport mechanism (say, through shared memory), and the collector would
have to merge the batches to maintain ordering. And then there's the issue
that the collector is a pretty critical component.

FWIW I'm pretty sure we're not the only project facing the need to
collect high volume of logs from many processes, so how do the other
projects handle this issue?

If each backend could actually log to *its own file*, then things would get
sped up. But we can't do that today. Unless you use the hooks and build it
yourself.

That seems like a useful thing to support in core.

Yeah, I agree with that.

Per the thread referenced, using the hooks to handle the
very-high-rate-logging case seems to be the conclusion. But is that still
the conclusion, or do we feel we need to also have a native solution?

And if the conclusion is that hooks is the way to go for that, then is the
slowdown of this patch actually a relevant problem to it?

I think that if we commit what you've proposed, we're making it harder
for people who have a high volume of logging but are not currently
using hooks. I think we should try really hard to avoid the situation
where our suggested workaround for a server change is "go write some C
code and maybe you can get back to the performance you had with
release N-1". That's just not friendly.

I wonder if it would be feasible to set things up so that the logging
collector was always started, but whether or not backends used it or
wrote directly to their original stderr was configurable (e.g. dup
stderr elsewhere, then dup whichever output source is currently
selected onto stderr, then dup the other one if the config is changed
later).

I think the hook system is a really powerful tool, but it seems a bit
awkward to force people to use it to improve performance like this...
That seems like something core should do out of the box.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#19Andres Freund
andres@anarazel.de
In reply to: Magnus Hagander (#16)
Re: [HACKERS] log_destination=file

Hi,

On 2018-01-20 14:51:12 +0200, Magnus Hagander wrote:

Finally found myself back at this one, because I still think this is a
problem we definitely need to address (whether with this file or not).

This has an open CF entry for CF 2018-03. Given that there's still a lot
of uncertainty what we want, I don't think that's a reasonable target.
I would suggest moving it to 'returned with feedback'. Arguments
against?

- Andres

#20Magnus Hagander
magnus@hagander.net
In reply to: Andres Freund (#19)
Re: [HACKERS] log_destination=file

On Fri, Mar 2, 2018 at 8:27 AM, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2018-01-20 14:51:12 +0200, Magnus Hagander wrote:

Finally found myself back at this one, because I still think this is a
problem we definitely need to address (whether with this file or not).

This has an open CF entry for CF 2018-03. Given that there's still a lot
of uncertainty what we want, I don't think that's a reasonable target.
I would suggest moving it to 'returned with feedback'. Arguments
against?

I agree, let's boot it. It needs more discussions about what to actually
do, and after that how. Let's focus on other things for 11, and have this
discussion again when the time comes. I'll go mark it.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#21Robert Haas
robertmhaas@gmail.com
In reply to: Tomas Vondra (#18)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#21)
#23Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#22)