pg_waldump stucks with options --follow or -f and --stats or -z
Hi,
pg_waldump options, --follow or -f(to keep polling once per second for
new WAL to appear) and --stats or -z don't work well together i.e. the
command stucks [1]The following commands stuck: ./pg_waldump -p data/ -s 0/7000060 -f -z ./pg_waldump -p data/ -s 0/7000060 -f --stats='record' ./pg_waldump -p data/ -s 0/7000060 -f --stats='rmgr'. I think the pg_waldump should emit an error. Note
that the pg_basebakup does error out on incompatible options.
Here's a small patch for fixing above along with a note in the documentation.
Thoughts?
[1]: The following commands stuck: ./pg_waldump -p data/ -s 0/7000060 -f -z ./pg_waldump -p data/ -s 0/7000060 -f --stats='record' ./pg_waldump -p data/ -s 0/7000060 -f --stats='rmgr'
./pg_waldump -p data/ -s 0/7000060 -f -z
./pg_waldump -p data/ -s 0/7000060 -f --stats='record'
./pg_waldump -p data/ -s 0/7000060 -f --stats='rmgr'
[2]: ubuntu3:~/postgres/inst/bin$ ./pg_waldump -p data/ -s 0/7000060 -f -z pg_waldump: error: --follow and --stats are incompatible options Try "pg_waldump --help" for more information. bharath@bharathubuntu3:~/postgres/inst/bin$
ubuntu3:~/postgres/inst/bin$ ./pg_waldump -p data/ -s 0/7000060 -f -z
pg_waldump: error: --follow and --stats are incompatible options
Try "pg_waldump --help" for more information.
bharath@bharathubuntu3:~/postgres/inst/bin$
Regards,
Bharath Rupireddy.
Attachments:
v1-0001-pg_waldump-error-out-with-options-follow-and-stat.patchapplication/octet-stream; name=v1-0001-pg_waldump-error-out-with-options-follow-and-stat.patchDownload
From 8b02f81af635357a2ee70d561fc023aa3950e0cb Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 15 Nov 2021 13:59:44 +0000
Subject: [PATCH v1] pg_waldump:error out with options --follow and --stats
pg_waldump options, --follow or -f(to keep polling once per second
for new WAL to appear) and --stats or -z don't work well together
i.e. the command stucks [1]. This patch emits an error.
---
doc/src/sgml/ref/pg_waldump.sgml | 12 ++++++++++--
src/bin/pg_waldump/pg_waldump.c | 12 ++++++++++++
2 files changed, 22 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index 432254d2d5..b97534ced0 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -95,7 +95,8 @@ PostgreSQL documentation
<listitem>
<para>
After reaching the end of valid WAL, keep polling once per second for
- new WAL to appear.
+ new WAL to appear. Note that <option>--stats</option> cannot be
+ specified with this option.
</para>
</listitem>
</varlistentry>
@@ -200,7 +201,9 @@ PostgreSQL documentation
<para>
Display summary statistics (number and size of records and
full-page images) instead of individual records. Optionally
- generate statistics per-record instead of per-rmgr.
+ generate statistics per-record instead of per-rmgr. Note
+ that <option>--follow</option> cannot be specified with
+ this option.
</para>
</listitem>
</varlistentry>
@@ -261,6 +264,11 @@ PostgreSQL documentation
<literal>.partial</literal>. If those files need to be read, <literal>.partial</literal>
suffix needs to be removed from the file name.
</para>
+
+ <para>
+ Only the specified timeline is displayed (or the default, if none is
+ specified). Records in other timelines are ignored.
+ </para>
</refsect1>
<refsect1>
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 1e3894b9c4..58d6e4b2bd 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -952,6 +952,18 @@ main(int argc, char **argv)
goto bad_argument;
}
+ /*
+ * Mutually exclusive arguments
+ */
+ if (config.follow && config.stats)
+ {
+ pg_log_error("%s and %s are incompatible options",
+ "--follow", "--stats");
+ fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
+ progname);
+ exit(1);
+ }
+
if (waldir != NULL)
{
/* validate path points to directory */
--
2.25.1
On Mon, Nov 15, 2021 at 07:32:38PM +0530, Bharath Rupireddy wrote:
pg_waldump options, --follow or -f(to keep polling once per second for
new WAL to appear) and --stats or -z don't work well together i.e. the
command stucks [1]. I think the pg_waldump should emit an error. Note
that the pg_basebakup does error out on incompatible options.Here's a small patch for fixing above along with a note in the documentation.
Thoughts?
I don't think that we should block this combination of options as you
are proposing. The existing behavior is useful for users when it
comes to an end position specified with -e, to be able to gather some
stats on a cluster or an archive where we may not have all the
contents wanted yet.
[1] The following commands stuck:
./pg_waldump -p data/ -s 0/7000060 -f -z
./pg_waldump -p data/ -s 0/7000060 -f --stats='record'
./pg_waldump -p data/ -s 0/7000060 -f --stats='rmgr'
Saying that, you are not completely wrong either, as following
something while we won't print any stats at all is not really
helpful. Another thing I can think of here is to make pg_waldump
more responsive to the dump of the stats when interrupted, via
XLogDumpDisplayStats().
--
Michael
On Tue, Nov 16, 2021 at 8:26 AM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Nov 15, 2021 at 07:32:38PM +0530, Bharath Rupireddy wrote:
pg_waldump options, --follow or -f(to keep polling once per second for
new WAL to appear) and --stats or -z don't work well together i.e. the
command stucks [1]. I think the pg_waldump should emit an error. Note
that the pg_basebakup does error out on incompatible options.Here's a small patch for fixing above along with a note in the documentation.
Thoughts?
I don't think that we should block this combination of options as you
are proposing. The existing behavior is useful for users when it
comes to an end position specified with -e, to be able to gather some
stats on a cluster or an archive where we may not have all the
contents wanted yet.
You are right. The "./pg_waldump -p data/ -s 0/14CC090 -e 0/14CC390 -f
-z" would fail with what I proposed which isn't a good thing at all.
Should we block both the combinations "-s/-f/-z", "-s/-e/-f/-z"?
[1] The following commands stuck:
./pg_waldump -p data/ -s 0/7000060 -f -z
./pg_waldump -p data/ -s 0/7000060 -f --stats='record'
./pg_waldump -p data/ -s 0/7000060 -f --stats='rmgr'Saying that, you are not completely wrong either, as following
something while we won't print any stats at all is not really
helpful. Another thing I can think of here is to make pg_waldump
more responsive to the dump of the stats when interrupted, via
XLogDumpDisplayStats().
It looks okay to be more responsive with the -f option so that the
above commands keep emitting stats for every 1 second(the --follow
option waits for 1 second). Should we really be emitting stats for
every 1 second? Isn't there going to be too much information on the
stdout? Or should we be emitting the stats for every 5 or 10 seconds?
If we be more responsive and keep emitting stats per 1 or 5 or 10 etc.
seconds(we can give an option to the user to choose when he/she would
like to see the stats with -f option, but this is going to be an
overkill IMO), can the user make anything out of those loads of stats
getting emitted? Another idea is to give some incremental stats but it
is overkill again IMO. And when the pg_waldump has the capability to
emit the stats, why to block it?
In summary, we have the following options:
1) block the combinations "-s/-f/-z", "-s/-e/-f/-z"
2) be more responsive and keep emitting the stats per 1 sec default
with -f option
3) be more responsive and keep emitting the stats per user's choice
of seconds (a new option that can be used with the -s/-e/-f/-z).
Thoughts?
Regards,
Bharath Rupireddy.
On Tue, Nov 16, 2021 at 09:34:25AM +0530, Bharath Rupireddy wrote:
It looks okay to be more responsive with the -f option so that the
above commands keep emitting stats for every 1 second(the --follow
option waits for 1 second). Should we really be emitting stats for
every 1 second? Isn't there going to be too much information on the
stdout? Or should we be emitting the stats for every 5 or 10 seconds?
I don't have a perfect answer to this question, but dumping the stats
with a fixed frequency is not going to be useful if we don't have in
those logs a current timestamp and/or a current LSN. This is
basically about how much we want to call XLogDumpDisplayStats() and
how useful it is, but note that my argument is a bit different than
what you are describing here: we could try to make
XLogDumpDisplayStats() responsive on SIGINT or SIGTERM to show
statistics reports. This way, a --follow without an --end LSN
specified could still provide some information rather than nothing.
That could also be useful if defining an --end but interrupting the
call.
At the same time, we could also just let things as they are. --follow
and --stats being specified together is what the user looked for, so
they get what they wanted.
In summary, we have the following options:
1) block the combinations "-s/-f/-z", "-s/-e/-f/-z"
2) be more responsive and keep emitting the stats per 1 sec default
with -f option
3) be more responsive and keep emitting the stats per user's choice
of seconds (a new option that can be used with the -s/-e/-f/-z).
A frequency cannot be measured only in time here, but also in bytes in
terms of a minimum amount of WAL replayed between two reports. I
don't like much the idea of multiple stats reports emitted in a single
pg_waldump call, TBH. This makes things more complicated than they
should, and the gain is not obvious, at least to me.
--
Michael
On Wed, Nov 17, 2021 at 7:49 AM Michael Paquier <michael@paquier.xyz> wrote:
At the same time, we could also just let things as they are. --follow
and --stats being specified together is what the user looked for, so
they get what they wanted.
I think the existing way of pg_waldump getting stuck with the
combination of options "-s/-f/-z" and "-s/-e/-f/-z" doesn't look good
for an end user. In the simplest way, the pg_waldump can just emit an
error for these combinations. That shouldn't a big problem as we do
emit errors for a lot of mutually exclusive combination of options
other SQL statements/commands.
Regards,
Bharath Rupireddy.
On 2021-Nov-17, Bharath Rupireddy wrote:
On Wed, Nov 17, 2021 at 7:49 AM Michael Paquier <michael@paquier.xyz> wrote:
At the same time, we could also just let things as they are. --follow
and --stats being specified together is what the user looked for, so
they get what they wanted.I think the existing way of pg_waldump getting stuck with the
combination of options "-s/-f/-z" and "-s/-e/-f/-z" doesn't look good
for an end user.
I agree that it's not right, but I don't think the solution is to ban
the combination. I think what we should do is change the behavior to
make the combination potentially useful, so that it waits until program
termination and print the summary then. Consider "strace -c" as a
precedent: it will also "become stuck" until you kill it, and then it'll
print a nice table, just like the pg_waldump -z gives you.
I think I would even have had occasion to use this as a feature in the
past ...
--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/
On Thu, Nov 18, 2021 at 12:05 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Nov-17, Bharath Rupireddy wrote:
On Wed, Nov 17, 2021 at 7:49 AM Michael Paquier <michael@paquier.xyz> wrote:
At the same time, we could also just let things as they are. --follow
and --stats being specified together is what the user looked for, so
they get what they wanted.I think the existing way of pg_waldump getting stuck with the
combination of options "-s/-f/-z" and "-s/-e/-f/-z" doesn't look good
for an end user.I agree that it's not right, but I don't think the solution is to ban
the combination. I think what we should do is change the behavior to
make the combination potentially useful, so that it waits until program
termination and print the summary then. Consider "strace -c" as a
precedent: it will also "become stuck" until you kill it, and then it'll
print a nice table, just like the pg_waldump -z gives you.I think I would even have had occasion to use this as a feature in the
past ...
Hm. So, the pg_waldump can have handlers for SIGINT, SIGTERM, SIGQUIT
and then it should emit the computed stats in those handlers the
comobinations - "-s/-f/-z" and "-s/-e/-f/-z". I'm okay with this
behaviour. Michael Paquier had suggested the same upthread. If okay, I
will work on that patch.
Thoughts?
Regards,
Bharath Rupireddy.
On Thu, Nov 18, 2021 at 01:32:08PM +0530, Bharath Rupireddy wrote:
Hm. So, the pg_waldump can have handlers for SIGINT, SIGTERM, SIGQUIT
and then it should emit the computed stats in those handlers the
comobinations - "-s/-f/-z" and "-s/-e/-f/-z". I'm okay with this
behaviour. Michael Paquier had suggested the same upthread. If okay, I
will work on that patch.
While on it, I am pretty sure that we should add in the stats report
the start and end LSNs matching with the reports. If you use --follow
and the call is interrupted, we would not really know to which part of
the WAL stream a report applies to without this information, likely in
the shape of an extra header.
--
Michael
On Thu, Nov 18, 2021 at 1:51 PM Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Nov 18, 2021 at 01:32:08PM +0530, Bharath Rupireddy wrote:
Hm. So, the pg_waldump can have handlers for SIGINT, SIGTERM, SIGQUIT
and then it should emit the computed stats in those handlers the
comobinations - "-s/-f/-z" and "-s/-e/-f/-z". I'm okay with this
behaviour. Michael Paquier had suggested the same upthread. If okay, I
will work on that patch.While on it, I am pretty sure that we should add in the stats report
the start and end LSNs matching with the reports. If you use --follow
and the call is interrupted, we would not really know to which part of
the WAL stream a report applies to without this information, likely in
the shape of an extra header.
Currently, the SIGINT, SIGTERM, SIGQUIT interruptions will cause the
pg_waldump process with --follow option to get killed. And it is a good
idea to specify the range of LSNs (start and end) upto which the stats are
computed. Maybe as another printf message at the end of the stats report,
after the "Total" section or at the beginning of the stats report.
Something like "summary of the statistics computed from << LSN >> to
<<LSN>> are:", see [1]ubuntu3:~/postgres/src/bin/pg_waldump$ ./pg_waldump -p data -s 0/1480000 --stats summary of the statistics computed from 0/1480000 and 0/158ABCD are: Type N (%) Record size (%) FPI size (%) Combined size (%) ---- - --- ----------- --- -------- --- ------------- --- XLOG 13 ( 0.13) 1138 ( 0.18) 8424 ( 2.46) 9562 ( 0.99) Transaction 14 ( 0.14) 1407 ( 0.22) 0 ( 0.00) 1407 ( 0.15) Storage 4 ( 0.04) 172 ( 0.03) 0 ( 0.00) 172 ( 0.02) CLOG 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) Database 1 ( 0.01) 42 ( 0.01) 0 ( 0.00) 42 ( 0.00) Tablespace 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) MultiXact 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) RelMap 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) Standby 37 ( 0.36) 2318 ( 0.37) 0 ( 0.00) 2318 ( 0.24) Heap2 74 ( 0.73) 12841 ( 2.05) 121184 ( 35.46) 134025 ( 13.84) Heap 9005 ( 88.23) 532313 ( 84.91) 84208 ( 24.64) 616521 ( 63.64) Btree 1058 ( 10.37) 76710 ( 12.24) 127932 ( 37.43) 204642 ( 21.13) Hash 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) Gin 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) Gist 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) Sequence 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) SPGist 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) BRIN 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) CommitTs 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) ReplicationOrigin 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) Generic 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) LogicalMessage 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) -------- -------- -------- -------- Total 10206 626941 [64.72%] 341748 [35.28%] 968689 [100%].
[1]: ubuntu3:~/postgres/src/bin/pg_waldump$ ./pg_waldump -p data -s 0/1480000 --stats summary of the statistics computed from 0/1480000 and 0/158ABCD are: Type N (%) Record size (%) FPI size (%) Combined size (%) ---- - --- ----------- --- -------- --- ------------- --- XLOG 13 ( 0.13) 1138 ( 0.18) 8424 ( 2.46) 9562 ( 0.99) Transaction 14 ( 0.14) 1407 ( 0.22) 0 ( 0.00) 1407 ( 0.15) Storage 4 ( 0.04) 172 ( 0.03) 0 ( 0.00) 172 ( 0.02) CLOG 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) Database 1 ( 0.01) 42 ( 0.01) 0 ( 0.00) 42 ( 0.00) Tablespace 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) MultiXact 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) RelMap 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) Standby 37 ( 0.36) 2318 ( 0.37) 0 ( 0.00) 2318 ( 0.24) Heap2 74 ( 0.73) 12841 ( 2.05) 121184 ( 35.46) 134025 ( 13.84) Heap 9005 ( 88.23) 532313 ( 84.91) 84208 ( 24.64) 616521 ( 63.64) Btree 1058 ( 10.37) 76710 ( 12.24) 127932 ( 37.43) 204642 ( 21.13) Hash 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) Gin 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) Gist 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) Sequence 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) SPGist 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) BRIN 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) CommitTs 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) ReplicationOrigin 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) Generic 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) LogicalMessage 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) -------- -------- -------- -------- Total 10206 626941 [64.72%] 341748 [35.28%] 968689 [100%]
0/1480000 --stats
summary of the statistics computed from 0/1480000 and 0/158ABCD are:
Type N (%) Record
size (%) FPI size (%) Combined size (%)
---- - ---
----------- --- -------- --- -------------
---
XLOG 13 ( 0.13)
1138 ( 0.18) 8424 ( 2.46) 9562 ( 0.99)
Transaction 14 ( 0.14)
1407 ( 0.22) 0 ( 0.00) 1407 ( 0.15)
Storage 4 ( 0.04)
172 ( 0.03) 0 ( 0.00) 172 ( 0.02)
CLOG 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Database 1 ( 0.01)
42 ( 0.01) 0 ( 0.00) 42 ( 0.00)
Tablespace 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
MultiXact 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
RelMap 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Standby 37 ( 0.36)
2318 ( 0.37) 0 ( 0.00) 2318 ( 0.24)
Heap2 74 ( 0.73)
12841 ( 2.05) 121184 ( 35.46) 134025 ( 13.84)
Heap 9005 ( 88.23)
532313 ( 84.91) 84208 ( 24.64) 616521 ( 63.64)
Btree 1058 ( 10.37)
76710 ( 12.24) 127932 ( 37.43) 204642 ( 21.13)
Hash 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Gin 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Gist 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Sequence 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
SPGist 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
BRIN 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
CommitTs 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
ReplicationOrigin 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Generic 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
LogicalMessage 0 ( 0.00)
0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
--------
-------- -------- --------
Total 10206
626941 [64.72%] 341748 [35.28%] 968689 [100%]
Regards,
Bharath Rupireddy.
On Thu, Nov 18, 2021 at 2:03 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Thu, Nov 18, 2021 at 1:51 PM Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Nov 18, 2021 at 01:32:08PM +0530, Bharath Rupireddy wrote:
Hm. So, the pg_waldump can have handlers for SIGINT, SIGTERM, SIGQUIT
and then it should emit the computed stats in those handlers the
comobinations - "-s/-f/-z" and "-s/-e/-f/-z". I'm okay with this
behaviour. Michael Paquier had suggested the same upthread. If okay, I
will work on that patch.While on it, I am pretty sure that we should add in the stats report
the start and end LSNs matching with the reports. If you use --follow
and the call is interrupted, we would not really know to which part of
the WAL stream a report applies to without this information, likely in
the shape of an extra header.Currently, the SIGINT, SIGTERM, SIGQUIT interruptions will cause the pg_waldump process with --follow option to get killed. And it is a good idea to specify the range of LSNs (start and end) upto which the stats are computed. Maybe as another printf message at the end of the stats report, after the "Total" section or at the beginning of the stats report. Something like "summary of the statistics computed from << LSN >> to <<LSN>> are:", see [1].
Here's the v2 patch with the above changes i.e. pg_waldump now eimits
the computed stats at the time of termination. Please review it.
Regards,
Bharath Rupireddy.
Attachments:
v2-0001-pg_waldump-emit-stats-while-terminating.patchapplication/octet-stream; name=v2-0001-pg_waldump-emit-stats-while-terminating.patchDownload
From f879d58dd8beb8408563baf8b40634628b5f2086 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 20 Nov 2021 04:31:11 +0000
Subject: [PATCH v2] pg_waldump: emit stats while terminating
Currently, pg_waldump keeps accumulating the stats with options
--follow and --stats, but outputs nothing while terminating/exiting.
This patch adds signal handlers for SIGINT, SIGTERM and SIGQUIT to
display the stats computed as of the termination.
---
doc/src/sgml/ref/pg_waldump.sgml | 9 ++++
src/bin/pg_waldump/pg_waldump.c | 84 ++++++++++++++++++++++++--------
2 files changed, 74 insertions(+), 19 deletions(-)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index 432254d2d5..cf75d1dd21 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -261,6 +261,15 @@ PostgreSQL documentation
<literal>.partial</literal>. If those files need to be read, <literal>.partial</literal>
suffix needs to be removed from the file name.
</para>
+
+ <para>
+ When <option>--follow</option> is used with <option>--stats</option> and
+ the <application>pg_waldump</application> is terminated or interrupted
+ with signal <systemitem>SIGINT</systemitem> or <systemitem>SIGTERM</systemitem>
+ or <systemitem>SIGQUIT</systemitem>, the summary statistics computed
+ as of the termination will be displayed.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 1e3894b9c4..d7ec48a658 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include <dirent.h>
+#include <signal.h>
#include <sys/stat.h>
#include <unistd.h>
@@ -25,10 +26,8 @@
#include "getopt_long.h"
#include "rmgrdesc.h"
-static const char *progname;
-
-static int WalSegSz;
-
+#define MAX_XLINFO_TYPES 16
+#define fatal_error(...) do { pg_log_fatal(__VA_ARGS__); exit(EXIT_FAILURE); } while(0)
typedef struct XLogDumpPrivate
{
TimeLineID timeline;
@@ -62,8 +61,6 @@ typedef struct Stats
uint64 fpi_len;
} Stats;
-#define MAX_XLINFO_TYPES 16
-
typedef struct XLogDumpStats
{
uint64 count;
@@ -71,7 +68,35 @@ typedef struct XLogDumpStats
Stats record_stats[RM_NEXT_ID][MAX_XLINFO_TYPES];
} XLogDumpStats;
-#define fatal_error(...) do { pg_log_fatal(__VA_ARGS__); exit(EXIT_FAILURE); } while(0)
+static const char *progname;
+static int WalSegSz;
+static XLogDumpConfig *opts = NULL;
+static XLogDumpStats *stats = NULL;
+static XLogRecPtr stats_startptr = InvalidXLogRecPtr;
+static XLogRecPtr stats_endptr = InvalidXLogRecPtr;
+
+static void XLogDumpDisplayStats(void);
+
+static void
+SignalHandlerForTermination(int signum)
+{
+ /* Display the stats only if they are valid */
+ if (stats &&
+ !XLogRecPtrIsInvalid(stats_startptr) &&
+ !XLogRecPtrIsInvalid(stats_endptr))
+ {
+ XLogDumpDisplayStats();
+ pg_free(stats);
+ }
+
+ /*
+ * Although the pg_free calls (above and below) are unnecessary here on the
+ * process exit, let's not leak any memory knowingly.
+ */
+ pg_free(opts);
+
+ exit(EXIT_FAILURE);
+}
static void
print_rmgr_list(void)
@@ -410,8 +435,7 @@ XLogDumpRecordLen(XLogReaderState *record, uint32 *rec_len, uint32 *fpi_len)
* Store per-rmgr and per-record statistics for a given record.
*/
static void
-XLogDumpCountRecord(XLogDumpConfig *config, XLogDumpStats *stats,
- XLogReaderState *record)
+XLogDumpCountRecord(XLogReaderState *record)
{
RmgrId rmid;
uint8 recid;
@@ -457,7 +481,7 @@ XLogDumpCountRecord(XLogDumpConfig *config, XLogDumpStats *stats,
* Print a record to stdout
*/
static void
-XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
+XLogDumpDisplayRecord(XLogReaderState *record)
{
const char *id;
const RmgrDescData *desc = &RmgrDescTable[XLogRecGetRmid(record)];
@@ -491,7 +515,7 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
printf("%s", s.data);
pfree(s.data);
- if (!config->bkp_details)
+ if (!opts->bkp_details)
{
/* print block references (short format) */
for (block_id = 0; block_id <= record->max_block_id; block_id++)
@@ -621,7 +645,7 @@ XLogDumpStatsRow(const char *name,
* Display summary statistics about the records seen so far.
*/
static void
-XLogDumpDisplayStats(XLogDumpConfig *config, XLogDumpStats *stats)
+XLogDumpDisplayStats(void)
{
int ri,
rj;
@@ -645,6 +669,12 @@ XLogDumpDisplayStats(XLogDumpConfig *config, XLogDumpStats *stats)
}
total_len = total_rec_len + total_fpi_len;
+ Assert(!XLogRecPtrIsInvalid(stats_startptr) &&
+ !XLogRecPtrIsInvalid(stats_endptr) &&
+ stats_startptr <= stats_endptr);
+
+ printf("Summary of the WAL statistics computed between LSN %X/%X and LSN %X/%X is:\n",
+ LSN_FORMAT_ARGS(stats_startptr), LSN_FORMAT_ARGS(stats_endptr));
/*
* 27 is strlen("Transaction/COMMIT_PREPARED"), 20 is strlen(2^64), 8 is
* strlen("(100.00%)")
@@ -663,7 +693,7 @@ XLogDumpDisplayStats(XLogDumpConfig *config, XLogDumpStats *stats)
tot_len;
const RmgrDescData *desc = &RmgrDescTable[ri];
- if (!config->stats_per_record)
+ if (!opts->stats_per_record)
{
count = stats->rmgr_stats[ri].count;
rec_len = stats->rmgr_stats[ri].rec_len;
@@ -768,7 +798,6 @@ main(int argc, char **argv)
XLogReaderState *xlogreader_state;
XLogDumpPrivate private;
XLogDumpConfig config;
- XLogDumpStats stats;
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
@@ -794,6 +823,10 @@ main(int argc, char **argv)
int option;
int optindex = 0;
+ pqsignal(SIGINT, SignalHandlerForTermination);
+ pqsignal(SIGTERM, SignalHandlerForTermination);
+ pqsignal(SIGQUIT, SignalHandlerForTermination);
+
pg_logging_init(argv[0]);
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_waldump"));
progname = get_progname(argv[0]);
@@ -813,8 +846,8 @@ main(int argc, char **argv)
}
memset(&private, 0, sizeof(XLogDumpPrivate));
- memset(&config, 0, sizeof(XLogDumpConfig));
- memset(&stats, 0, sizeof(XLogDumpStats));
+ opts = (XLogDumpConfig *) pg_malloc0(sizeof(XLogDumpConfig));
+ config = *opts;
private.timeline = 1;
private.startptr = InvalidXLogRecPtr;
@@ -1084,6 +1117,12 @@ main(int argc, char **argv)
LSN_FORMAT_ARGS(first_record),
(uint32) (first_record - private.startptr));
+ if (config.stats == true && !config.quiet)
+ {
+ stats = (XLogDumpStats *) pg_malloc0(sizeof(XLogDumpStats));
+ stats_startptr = first_record;
+ }
+
for (;;)
{
/* try to read the next record */
@@ -1112,9 +1151,12 @@ main(int argc, char **argv)
if (!config.quiet)
{
if (config.stats == true)
- XLogDumpCountRecord(&config, &stats, xlogreader_state);
+ {
+ XLogDumpCountRecord(xlogreader_state);
+ stats_endptr = xlogreader_state->currRecPtr;
+ }
else
- XLogDumpDisplayRecord(&config, xlogreader_state);
+ XLogDumpDisplayRecord(xlogreader_state);
}
/* check whether we printed enough */
@@ -1125,7 +1167,7 @@ main(int argc, char **argv)
}
if (config.stats == true && !config.quiet)
- XLogDumpDisplayStats(&config, &stats);
+ XLogDumpDisplayStats();
if (errormsg)
fatal_error("error in WAL record at %X/%X: %s",
@@ -1134,9 +1176,13 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ pg_free(opts);
+ pg_free(stats);
+
return EXIT_SUCCESS;
bad_argument:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
+ pg_free(opts);
return EXIT_FAILURE;
}
--
2.25.1
On Sat, Nov 20, 2021 at 10:03:27AM +0530, Bharath Rupireddy wrote:
Here's the v2 patch with the above changes i.e. pg_waldump now eimits
the computed stats at the time of termination. Please review it.
+#define MAX_XLINFO_TYPES 16
+#define fatal_error(...) do { pg_log_fatal(__VA_ARGS__);
exit(EXIT_FAILURE); } while(0)
typedef struct XLogDumpPrivate
Moving around those declarations does not relate to this patch, so I
would recommend to leave this out to ease reviews.
+ /*
+ * Although the pg_free calls (above and below) are unnecessary here on the
+ * process exit, let's not leak any memory knowingly.
+ */
+ pg_free(opts);
+
+ exit(EXIT_FAILURE);
There is no need to care about the two pg_free() calls here, so let's
remove all that.
+ pg_free(opts);
+ pg_free(stats);
+
return EXIT_SUCCESS;
[...]
bad_argument:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
progname);
+ pg_free(opts);
return EXIT_FAILURE;
Same here.
+static void
+SignalHandlerForTermination(int signum)
+{
+ /* Display the stats only if they are valid */
+ if (stats &&
+ !XLogRecPtrIsInvalid(stats_startptr) &&
+ !XLogRecPtrIsInvalid(stats_endptr))
+ {
+ XLogDumpDisplayStats();
+ pg_free(stats);
+ }
+
+ /*
+ * Although the pg_free calls (above and below) are unnecessary here on the
+ * process exit, let's not leak any memory knowingly.
+ */
+ pg_free(opts);
+
+ exit(EXIT_FAILURE);
+}
Doing the dump of the stats followed by the exit() within the signal
handler callback is incorrect. The only safe thing that can be set
and manipulated in a signal handler is roughly a sig_atomic_t flag.
What you should do is to set such a flag in the signal handler, and
then check its value in the main loop of pg_waldump, dumping the stats
if it is detected as turned on because of a signal.
--
Michael
On Sat, Nov 20, 2021 at 11:10 AM Michael Paquier <michael@paquier.xyz> wrote:
What you should do is to set such a flag in the signal handler, and
then check its value in the main loop of pg_waldump, dumping the stats
if it is detected as turned on because of a signal.
Thanks. Here's the v3 patch, a much simpler one. Please review it.
Regards,
Bharath Rupireddy.
Attachments:
v3-0001-pg_waldump-emit-stats-while-terminating.patchapplication/octet-stream; name=v3-0001-pg_waldump-emit-stats-while-terminating.patchDownload
From 2a9bbeb0dd8cab27738e31e20c2c3f4ca735e988 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 20 Nov 2021 06:14:58 +0000
Subject: [PATCH v3] pg_waldump: emit stats while terminating
Currently, pg_waldump keeps accumulating the stats with options
--follow and --stats, but outputs nothing while terminating/exiting.
This patch adds signal handlers for SIGINT, SIGTERM and SIGQUIT to
display the stats computed as of the termination.
---
doc/src/sgml/ref/pg_waldump.sgml | 9 +++++++++
src/bin/pg_waldump/pg_waldump.c | 32 ++++++++++++++++++++++++++++++++
2 files changed, 41 insertions(+)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index 432254d2d5..cf75d1dd21 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -261,6 +261,15 @@ PostgreSQL documentation
<literal>.partial</literal>. If those files need to be read, <literal>.partial</literal>
suffix needs to be removed from the file name.
</para>
+
+ <para>
+ When <option>--follow</option> is used with <option>--stats</option> and
+ the <application>pg_waldump</application> is terminated or interrupted
+ with signal <systemitem>SIGINT</systemitem> or <systemitem>SIGTERM</systemitem>
+ or <systemitem>SIGQUIT</systemitem>, the summary statistics computed
+ as of the termination will be displayed.
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 1e3894b9c4..a3d80feda0 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include <dirent.h>
+#include <signal.h>
#include <sys/stat.h>
#include <unistd.h>
@@ -28,6 +29,7 @@
static const char *progname;
static int WalSegSz;
+static volatile sig_atomic_t TerminationRequestPending = false;
typedef struct XLogDumpPrivate
{
@@ -67,12 +69,20 @@ typedef struct Stats
typedef struct XLogDumpStats
{
uint64 count;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
Stats rmgr_stats[RM_NEXT_ID];
Stats record_stats[RM_NEXT_ID][MAX_XLINFO_TYPES];
} XLogDumpStats;
#define fatal_error(...) do { pg_log_fatal(__VA_ARGS__); exit(EXIT_FAILURE); } while(0)
+static void
+SignalHandlerForTermination(int signum)
+{
+ TerminationRequestPending = true;
+}
+
static void
print_rmgr_list(void)
{
@@ -645,6 +655,9 @@ XLogDumpDisplayStats(XLogDumpConfig *config, XLogDumpStats *stats)
}
total_len = total_rec_len + total_fpi_len;
+ printf("Summary of the WAL statistics computed between LSN %X/%X and LSN %X/%X is:\n",
+ LSN_FORMAT_ARGS(stats->startptr), LSN_FORMAT_ARGS(stats->endptr));
+
/*
* 27 is strlen("Transaction/COMMIT_PREPARED"), 20 is strlen(2^64), 8 is
* strlen("(100.00%)")
@@ -794,6 +807,10 @@ main(int argc, char **argv)
int option;
int optindex = 0;
+ pqsignal(SIGINT, SignalHandlerForTermination);
+ pqsignal(SIGTERM, SignalHandlerForTermination);
+ pqsignal(SIGQUIT, SignalHandlerForTermination);
+
pg_logging_init(argv[0]);
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_waldump"));
progname = get_progname(argv[0]);
@@ -833,6 +850,9 @@ main(int argc, char **argv)
config.stats = false;
config.stats_per_record = false;
+ stats.startptr = InvalidXLogRecPtr;
+ stats.endptr = InvalidXLogRecPtr;
+
if (argc <= 1)
{
pg_log_error("no arguments specified");
@@ -1084,8 +1104,14 @@ main(int argc, char **argv)
LSN_FORMAT_ARGS(first_record),
(uint32) (first_record - private.startptr));
+ if (config.stats == true && !config.quiet)
+ stats.startptr = first_record;
+
for (;;)
{
+ if (TerminationRequestPending)
+ break;
+
/* try to read the next record */
record = XLogReadRecord(xlogreader_state, &errormsg);
if (!record)
@@ -1112,7 +1138,10 @@ main(int argc, char **argv)
if (!config.quiet)
{
if (config.stats == true)
+ {
XLogDumpCountRecord(&config, &stats, xlogreader_state);
+ stats.endptr = xlogreader_state->currRecPtr;
+ }
else
XLogDumpDisplayRecord(&config, xlogreader_state);
}
@@ -1127,6 +1156,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
XLogDumpDisplayStats(&config, &stats);
+ if (TerminationRequestPending)
+ return EXIT_FAILURE;
+
if (errormsg)
fatal_error("error in WAL record at %X/%X: %s",
LSN_FORMAT_ARGS(xlogreader_state->ReadRecPtr),
--
2.25.1
On Sat, Nov 20, 2021 at 11:46:35AM +0530, Bharath Rupireddy wrote:
Thanks. Here's the v3 patch, a much simpler one. Please review it.
+ pqsignal(SIGINT, SignalHandlerForTermination);
+ pqsignal(SIGTERM, SignalHandlerForTermination);
+ pqsignal(SIGQUIT, SignalHandlerForTermination);
FWIW, I think that we should do this stuff only on SIGINT. I would
imagine that this behavior becomes handy mainly when one wishes to
Ctrl+C the terminal running pg_waldump but still get some
information.
XLogDumpCountRecord(&config, &stats, xlogreader_state);
+ stats.endptr = xlogreader_state->currRecPtr;
Isn't what you are looking for here EndRecPtr rather than currRecPtr,
to track the end of the last record read?
+ When <option>--follow</option> is used with <option>--stats</option> and
+ the <application>pg_waldump</application> is terminated or interrupted
+ with signal <systemitem>SIGINT</systemitem> or <systemitem>SIGTERM</systemitem>
+ or <systemitem>SIGQUIT</systemitem>, the summary statistics computed
+ as of the termination will be displayed.
This description is not completely correct, as the set of stats would
show up only by using --stats, in non-quiet mode. Rather than
describing this behavior at the end of the docs, I think that it would
be better to add a new paragraph in the description of --stats.
--
Michael
On Fri, Nov 26, 2021 at 11:50 AM Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Nov 20, 2021 at 11:46:35AM +0530, Bharath Rupireddy wrote:
Thanks. Here's the v3 patch, a much simpler one. Please review it.
+ pqsignal(SIGINT, SignalHandlerForTermination); + pqsignal(SIGTERM, SignalHandlerForTermination); + pqsignal(SIGQUIT, SignalHandlerForTermination); FWIW, I think that we should do this stuff only on SIGINT. I would imagine that this behavior becomes handy mainly when one wishes to Ctrl+C the terminal running pg_waldump but still get some information.
Done.
XLogDumpCountRecord(&config, &stats, xlogreader_state);
+ stats.endptr = xlogreader_state->currRecPtr;
Isn't what you are looking for here EndRecPtr rather than currRecPtr,
to track the end of the last record read?
You are right.
+ When <option>--follow</option> is used with <option>--stats</option> and + the <application>pg_waldump</application> is terminated or interrupted + with signal <systemitem>SIGINT</systemitem> or <systemitem>SIGTERM</systemitem> + or <systemitem>SIGQUIT</systemitem>, the summary statistics computed + as of the termination will be displayed. This description is not completely correct, as the set of stats would show up only by using --stats, in non-quiet mode. Rather than describing this behavior at the end of the docs, I think that it would be better to add a new paragraph in the description of --stats.
Done.
PSA v4 patch.
Regards,
Bharath Rupireddy.
Attachments:
v4-0001-pg_waldump-emit-stats-while-terminating.patchapplication/octet-stream; name=v4-0001-pg_waldump-emit-stats-while-terminating.patchDownload
From 7cb7a06b7b32eaa2c76696eab8933e67e6789b92 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Fri, 26 Nov 2021 10:15:58 +0000
Subject: [PATCH v4] pg_waldump: emit stats while terminating
Currently, pg_waldump keeps accumulating the stats with options
--follow and --stats, but outputs nothing while terminating/exiting.
This patch adds signal handler for SIGINT to display the stats
computed as of the termination.
---
doc/src/sgml/ref/pg_waldump.sgml | 9 +++++++++
src/bin/pg_waldump/pg_waldump.c | 30 ++++++++++++++++++++++++++++++
2 files changed, 39 insertions(+)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index 432254d2d5..5fd6e1ef52 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -202,6 +202,15 @@ PostgreSQL documentation
full-page images) instead of individual records. Optionally
generate statistics per-record instead of per-rmgr.
</para>
+
+ <para>
+ When <option>--follow</option> is used with <option>--stats</option>
+ and the <application>pg_waldump</application> is terminated or
+ interrupted with signal <systemitem>SIGINT</systemitem>, the summary
+ statistics computed as of the termination will be displayed in
+ non-quiet mode or without <option>--quiet</option>.
+ </para>
+
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 4690e0f515..699b7d567b 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include <dirent.h>
+#include <signal.h>
#include <sys/stat.h>
#include <unistd.h>
@@ -28,6 +29,7 @@
static const char *progname;
static int WalSegSz;
+static volatile sig_atomic_t time_to_stop = false;
typedef struct XLogDumpPrivate
{
@@ -67,12 +69,20 @@ typedef struct Stats
typedef struct XLogDumpStats
{
uint64 count;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
Stats rmgr_stats[RM_NEXT_ID];
Stats record_stats[RM_NEXT_ID][MAX_XLINFO_TYPES];
} XLogDumpStats;
#define fatal_error(...) do { pg_log_fatal(__VA_ARGS__); exit(EXIT_FAILURE); } while(0)
+static void
+sigint_handler(int signum)
+{
+ time_to_stop = true;
+}
+
static void
print_rmgr_list(void)
{
@@ -645,6 +655,9 @@ XLogDumpDisplayStats(XLogDumpConfig *config, XLogDumpStats *stats)
}
total_len = total_rec_len + total_fpi_len;
+ printf("Summary of the WAL statistics computed between LSN %X/%X and LSN %X/%X is:\n",
+ LSN_FORMAT_ARGS(stats->startptr), LSN_FORMAT_ARGS(stats->endptr));
+
/*
* 27 is strlen("Transaction/COMMIT_PREPARED"), 20 is strlen(2^64), 8 is
* strlen("(100.00%)")
@@ -794,6 +807,8 @@ main(int argc, char **argv)
int option;
int optindex = 0;
+ pqsignal(SIGINT, sigint_handler);
+
pg_logging_init(argv[0]);
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_waldump"));
progname = get_progname(argv[0]);
@@ -833,6 +848,9 @@ main(int argc, char **argv)
config.stats = false;
config.stats_per_record = false;
+ stats.startptr = InvalidXLogRecPtr;
+ stats.endptr = InvalidXLogRecPtr;
+
if (argc <= 1)
{
pg_log_error("no arguments specified");
@@ -1084,8 +1102,14 @@ main(int argc, char **argv)
LSN_FORMAT_ARGS(first_record),
(uint32) (first_record - private.startptr));
+ if (config.stats == true && !config.quiet)
+ stats.startptr = first_record;
+
for (;;)
{
+ if (time_to_stop)
+ break;
+
/* try to read the next record */
record = XLogReadRecord(xlogreader_state, &errormsg);
if (!record)
@@ -1112,7 +1136,10 @@ main(int argc, char **argv)
if (!config.quiet)
{
if (config.stats == true)
+ {
XLogDumpCountRecord(&config, &stats, xlogreader_state);
+ stats.endptr = xlogreader_state->EndRecPtr;
+ }
else
XLogDumpDisplayRecord(&config, xlogreader_state);
}
@@ -1127,6 +1154,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
XLogDumpDisplayStats(&config, &stats);
+ if (time_to_stop)
+ return EXIT_FAILURE;
+
if (errormsg)
fatal_error("error in WAL record at %X/%X: %s",
LSN_FORMAT_ARGS(xlogreader_state->ReadRecPtr),
--
2.25.1
On Fri, Nov 26, 2021 at 03:47:30PM +0530, Bharath Rupireddy wrote:
On Fri, Nov 26, 2021 at 11:50 AM Michael Paquier <michael@paquier.xyz> wrote:
+ When <option>--follow</option> is used with <option>--stats</option> and + the <application>pg_waldump</application> is terminated or interrupted + with signal <systemitem>SIGINT</systemitem> or <systemitem>SIGTERM</systemitem> + or <systemitem>SIGQUIT</systemitem>, the summary statistics computed + as of the termination will be displayed. This description is not completely correct, as the set of stats would show up only by using --stats, in non-quiet mode. Rather than describing this behavior at the end of the docs, I think that it would be better to add a new paragraph in the description of --stats.Done.
v4 is just moving this paragraph in its correct subsection. My
wording may have been confusing here, sorry about that. What I meant
is that there is no need to mention --follow at all, as an
interruption done before reaching the end LSN or the end of WAL would
still print out the stats accumulated up to the interrupted point.
--
Michael
On Sun, Nov 28, 2021 at 9:50 AM Michael Paquier <michael@paquier.xyz> wrote:
v4 is just moving this paragraph in its correct subsection. My
wording may have been confusing here, sorry about that. What I meant
is that there is no need to mention --follow at all, as an
interruption done before reaching the end LSN or the end of WAL would
still print out the stats accumulated up to the interrupted point.
Thanks. Here's the v5.
Regards,
Bharath Rupireddy.
Attachments:
v5-0001-pg_waldump-emit-stats-when-interrupted-with-SIGIN.patchapplication/octet-stream; name=v5-0001-pg_waldump-emit-stats-when-interrupted-with-SIGIN.patchDownload
From 29c3daacd8032e25e671422648cf4afc839a9c76 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sun, 28 Nov 2021 06:27:05 +0000
Subject: [PATCH v5] pg_waldump: emit stats when interrupted with SIGINT
Currently, pg_waldump doesn't display WAL statistics if it gets
interrupted/terminated with SIGINT before reaching the end of WAL
although it accumulates statistics as of termination. This
behavior makes --stats with --follow option to appear as
blocking or hang even though the statistics are computed in the
background by the pg_waldump.
This patch traps the SIGINT signal and prints the accumulated
statistics as of the termination.
---
doc/src/sgml/ref/pg_waldump.sgml | 9 +++++++++
src/bin/pg_waldump/pg_waldump.c | 30 ++++++++++++++++++++++++++++++
2 files changed, 39 insertions(+)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index 432254d2d5..1a48288041 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -202,6 +202,15 @@ PostgreSQL documentation
full-page images) instead of individual records. Optionally
generate statistics per-record instead of per-rmgr.
</para>
+
+ <para>
+ If the <application>pg_waldump</application> is terminated or
+ interrupted with signal <systemitem>SIGINT</systemitem> while
+ calculating the statistics, the summary statistics computed as
+ of the termination will be displayed in non-quiet mode or without
+ <option>--quiet</option>.
+ </para>
+
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 4690e0f515..699b7d567b 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include <dirent.h>
+#include <signal.h>
#include <sys/stat.h>
#include <unistd.h>
@@ -28,6 +29,7 @@
static const char *progname;
static int WalSegSz;
+static volatile sig_atomic_t time_to_stop = false;
typedef struct XLogDumpPrivate
{
@@ -67,12 +69,20 @@ typedef struct Stats
typedef struct XLogDumpStats
{
uint64 count;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
Stats rmgr_stats[RM_NEXT_ID];
Stats record_stats[RM_NEXT_ID][MAX_XLINFO_TYPES];
} XLogDumpStats;
#define fatal_error(...) do { pg_log_fatal(__VA_ARGS__); exit(EXIT_FAILURE); } while(0)
+static void
+sigint_handler(int signum)
+{
+ time_to_stop = true;
+}
+
static void
print_rmgr_list(void)
{
@@ -645,6 +655,9 @@ XLogDumpDisplayStats(XLogDumpConfig *config, XLogDumpStats *stats)
}
total_len = total_rec_len + total_fpi_len;
+ printf("Summary of the WAL statistics computed between LSN %X/%X and LSN %X/%X is:\n",
+ LSN_FORMAT_ARGS(stats->startptr), LSN_FORMAT_ARGS(stats->endptr));
+
/*
* 27 is strlen("Transaction/COMMIT_PREPARED"), 20 is strlen(2^64), 8 is
* strlen("(100.00%)")
@@ -794,6 +807,8 @@ main(int argc, char **argv)
int option;
int optindex = 0;
+ pqsignal(SIGINT, sigint_handler);
+
pg_logging_init(argv[0]);
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_waldump"));
progname = get_progname(argv[0]);
@@ -833,6 +848,9 @@ main(int argc, char **argv)
config.stats = false;
config.stats_per_record = false;
+ stats.startptr = InvalidXLogRecPtr;
+ stats.endptr = InvalidXLogRecPtr;
+
if (argc <= 1)
{
pg_log_error("no arguments specified");
@@ -1084,8 +1102,14 @@ main(int argc, char **argv)
LSN_FORMAT_ARGS(first_record),
(uint32) (first_record - private.startptr));
+ if (config.stats == true && !config.quiet)
+ stats.startptr = first_record;
+
for (;;)
{
+ if (time_to_stop)
+ break;
+
/* try to read the next record */
record = XLogReadRecord(xlogreader_state, &errormsg);
if (!record)
@@ -1112,7 +1136,10 @@ main(int argc, char **argv)
if (!config.quiet)
{
if (config.stats == true)
+ {
XLogDumpCountRecord(&config, &stats, xlogreader_state);
+ stats.endptr = xlogreader_state->EndRecPtr;
+ }
else
XLogDumpDisplayRecord(&config, xlogreader_state);
}
@@ -1127,6 +1154,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
XLogDumpDisplayStats(&config, &stats);
+ if (time_to_stop)
+ return EXIT_FAILURE;
+
if (errormsg)
fatal_error("error in WAL record at %X/%X: %s",
LSN_FORMAT_ARGS(xlogreader_state->ReadRecPtr),
--
2.25.1
On Sun, Nov 28, 2021 at 12:13:08PM +0530, Bharath Rupireddy wrote:
Thanks. Here's the v5.
By the way, one thing that I completely forgot here is that SIGINT is
not handled on Windows. If we want to make that work for a WIN32
terminal, we would need to do something similar to
src/fe_utils/cancel.c where we need to use SetConsoleCtrlHandler() and
handle the stats print when facing CTRL_C_EVENT or CTRL_BREAK_EVENT as
events. Perhaps we should try to think harder and have a more
centralized facility for the handler part between a WIN32 terminal and
SIGINT, as it is not the first time that we need this level of
handling. Or we could just discard this issue, document its WIN32
limitation and paint some "#ifdef WIN32" around all the handler
portions of the patch.
I would be fine with just doing the latter for now, as this stuff is
still useful for most users, but that's worth mentioning. Any
opinions?
--
Michael
On Mon, Nov 29, 2021 at 11:09 AM Michael Paquier <michael@paquier.xyz> wrote:
On Sun, Nov 28, 2021 at 12:13:08PM +0530, Bharath Rupireddy wrote:
Thanks. Here's the v5.
By the way, one thing that I completely forgot here is that SIGINT is
not handled on Windows. If we want to make that work for a WIN32
terminal, we would need to do something similar to
src/fe_utils/cancel.c where we need to use SetConsoleCtrlHandler() and
handle the stats print when facing CTRL_C_EVENT or CTRL_BREAK_EVENT as
events. Perhaps we should try to think harder and have a more
centralized facility for the handler part between a WIN32 terminal and
SIGINT, as it is not the first time that we need this level of
handling. Or we could just discard this issue, document its WIN32
limitation and paint some "#ifdef WIN32" around all the handler
portions of the patch.I would be fine with just doing the latter for now, as this stuff is
still useful for most users, but that's worth mentioning. Any
opinions?
I'm okay to have the same behaviour as pg_receivewal and pg_recvlogical tools.
On my Windows 64-bit setup (see below), the CTRL+C works and the
summary stats are printed as with Linux. I believe on 32-bit platforms
we don't have CTRL+C with SIGINT handling right? If yes, I'm not sure
how we go about mentioning it in the pg_waldump documentation. I see
that the pg_receivewal and pg_recvlogical don't mention anything in
the documentation even though their SIGINT handler is defined for
#ifndef WIN32 platforms.
postgres=# select version();
version
---------------------------------------------------------------
PostgreSQL 15devel, compiled by Visual C++ build 1916, 64-bit
(1 row)
postgres=#
Regards,
Bharath Rupireddy.
On Mon, Nov 29, 2021 at 1:16 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Mon, Nov 29, 2021 at 11:09 AM Michael Paquier <michael@paquier.xyz> wrote:
On Sun, Nov 28, 2021 at 12:13:08PM +0530, Bharath Rupireddy wrote:
Thanks. Here's the v5.
By the way, one thing that I completely forgot here is that SIGINT is
not handled on Windows. If we want to make that work for a WIN32
terminal, we would need to do something similar to
src/fe_utils/cancel.c where we need to use SetConsoleCtrlHandler() and
handle the stats print when facing CTRL_C_EVENT or CTRL_BREAK_EVENT as
events. Perhaps we should try to think harder and have a more
centralized facility for the handler part between a WIN32 terminal and
SIGINT, as it is not the first time that we need this level of
handling. Or we could just discard this issue, document its WIN32
limitation and paint some "#ifdef WIN32" around all the handler
portions of the patch.I would be fine with just doing the latter for now, as this stuff is
still useful for most users, but that's worth mentioning. Any
opinions?I'm okay to have the same behaviour as pg_receivewal and pg_recvlogical tools.
Here's the v6 patch that has the SIGINT handling enabled for non-WIN32
platforms (note that I haven't specified anything in the
documentation) much like pg_receivewal and pg_recvlogical.
Regards,
Bharath Rupireddy.
Attachments:
v6-0001-pg_waldump-emit-stats-when-interrupted-with-SIGIN.patchapplication/octet-stream; name=v6-0001-pg_waldump-emit-stats-when-interrupted-with-SIGIN.patchDownload
From 446aa222483af66f9c19b1d81e0195b7637c6bfc Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 1 Dec 2021 06:10:54 +0000
Subject: [PATCH v6] pg_waldump: emit stats when interrupted with SIGINT
Currently, pg_waldump doesn't display WAL statistics if it gets
interrupted/terminated with SIGINT before reaching the end of WAL
although it accumulates statistics as of termination. This
behavior makes --stats with --follow option to appear as
blocking or hang even though the statistics are computed in the
background by the pg_waldump.
This patch traps the SIGINT signal and prints the accumulated
statistics as of the termination.
---
doc/src/sgml/ref/pg_waldump.sgml | 9 +++++++++
src/bin/pg_waldump/pg_waldump.c | 34 ++++++++++++++++++++++++++++++++
2 files changed, 43 insertions(+)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index 432254d2d5..1a48288041 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -202,6 +202,15 @@ PostgreSQL documentation
full-page images) instead of individual records. Optionally
generate statistics per-record instead of per-rmgr.
</para>
+
+ <para>
+ If the <application>pg_waldump</application> is terminated or
+ interrupted with signal <systemitem>SIGINT</systemitem> while
+ calculating the statistics, the summary statistics computed as
+ of the termination will be displayed in non-quiet mode or without
+ <option>--quiet</option>.
+ </para>
+
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 4690e0f515..98bfc820ab 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -13,6 +13,7 @@
#include "postgres.h"
#include <dirent.h>
+#include <signal.h>
#include <sys/stat.h>
#include <unistd.h>
@@ -28,6 +29,7 @@
static const char *progname;
static int WalSegSz;
+static volatile sig_atomic_t time_to_stop = false;
typedef struct XLogDumpPrivate
{
@@ -67,12 +69,22 @@ typedef struct Stats
typedef struct XLogDumpStats
{
uint64 count;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
Stats rmgr_stats[RM_NEXT_ID];
Stats record_stats[RM_NEXT_ID][MAX_XLINFO_TYPES];
} XLogDumpStats;
#define fatal_error(...) do { pg_log_fatal(__VA_ARGS__); exit(EXIT_FAILURE); } while(0)
+#ifndef WIN32
+static void
+sigint_handler(int signum)
+{
+ time_to_stop = true;
+}
+#endif
+
static void
print_rmgr_list(void)
{
@@ -645,6 +657,9 @@ XLogDumpDisplayStats(XLogDumpConfig *config, XLogDumpStats *stats)
}
total_len = total_rec_len + total_fpi_len;
+ printf("Summary of the WAL statistics computed between LSN %X/%X and LSN %X/%X is:\n",
+ LSN_FORMAT_ARGS(stats->startptr), LSN_FORMAT_ARGS(stats->endptr));
+
/*
* 27 is strlen("Transaction/COMMIT_PREPARED"), 20 is strlen(2^64), 8 is
* strlen("(100.00%)")
@@ -794,6 +809,10 @@ main(int argc, char **argv)
int option;
int optindex = 0;
+#ifndef WIN32
+ pqsignal(SIGINT, sigint_handler);
+#endif
+
pg_logging_init(argv[0]);
set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_waldump"));
progname = get_progname(argv[0]);
@@ -833,6 +852,9 @@ main(int argc, char **argv)
config.stats = false;
config.stats_per_record = false;
+ stats.startptr = InvalidXLogRecPtr;
+ stats.endptr = InvalidXLogRecPtr;
+
if (argc <= 1)
{
pg_log_error("no arguments specified");
@@ -1084,8 +1106,14 @@ main(int argc, char **argv)
LSN_FORMAT_ARGS(first_record),
(uint32) (first_record - private.startptr));
+ if (config.stats == true && !config.quiet)
+ stats.startptr = first_record;
+
for (;;)
{
+ if (time_to_stop)
+ break;
+
/* try to read the next record */
record = XLogReadRecord(xlogreader_state, &errormsg);
if (!record)
@@ -1112,7 +1140,10 @@ main(int argc, char **argv)
if (!config.quiet)
{
if (config.stats == true)
+ {
XLogDumpCountRecord(&config, &stats, xlogreader_state);
+ stats.endptr = xlogreader_state->EndRecPtr;
+ }
else
XLogDumpDisplayRecord(&config, xlogreader_state);
}
@@ -1127,6 +1158,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
XLogDumpDisplayStats(&config, &stats);
+ if (time_to_stop)
+ return EXIT_FAILURE;
+
if (errormsg)
fatal_error("error in WAL record at %X/%X: %s",
LSN_FORMAT_ARGS(xlogreader_state->ReadRecPtr),
--
2.25.1
On Wed, Dec 01, 2021 at 11:44:02AM +0530, Bharath Rupireddy wrote:
Here's the v6 patch that has the SIGINT handling enabled for non-WIN32
platforms (note that I haven't specified anything in the
documentation) much like pg_receivewal and pg_recvlogical.
I have added a safeguard in XLogDumpDisplayStats() to not print any
stats if the end LSN is not set yet, which would be possible if
pg_waldump is signaled at a very early stage, and I have added some
more comments. That looked fine after that, so applied.
--
Michael