Multiple pg_waldump --rmgr options
I wanted to dump all heap WAL records with pg_waldump, so I did this:
$ pg_waldump --rmgr=heap --rmgr=heap2 data/pg_wal/000000010000000000000001 --stat=record
Type N (%) Record size (%) FPI size (%) Combined size (%)
---- - --- ----------- --- -------- --- ------------- ---
Heap2/PRUNE 268 ( 8.74) 18192 ( 2.73) 0 ( 0.00) 18192 ( 1.74)
Heap2/VACUUM 55 ( 1.79) 4940 ( 0.74) 0 ( 0.00) 4940 ( 0.47)
Heap2/FREEZE_PAGE 277 ( 9.03) 186868 ( 28.03) 0 ( 0.00) 186868 ( 17.86)
Heap2/VISIBLE 467 ( 15.23) 27783 ( 4.17) 376832 ( 99.34) 404615 ( 38.68)
Heap2/MULTI_INSERT 1944 ( 63.38) 354800 ( 53.21) 2520 ( 0.66) 357320 ( 34.16)
Heap2/MULTI_INSERT+INIT 56 ( 1.83) 74152 ( 11.12) 0 ( 0.00) 74152 ( 7.09)
-------- -------- -------- --------
Total 3067 666735 [63.74%] 379352 [36.26%] 1046087 [100%]
pg_waldump: fatal: error in WAL record at 0/1680118: invalid record length at 0/1680150: wanted 24, got 0
That didn't do what I wanted. It only printed the Heap2 records, not
Heap, even though I specified both. The reason is that if you specify
multiple --rmgr options, only the last one takes effect.
I propose the attached to allow selecting multiple rmgrs, by giving
multiple --rmgr options. With that, it works the way I expected:
$ pg_waldump --rmgr=heap --rmgr=heap2 data/pg_wal/000000010000000000000001 --stat=record
Type N (%) Record size (%) FPI size (%) Combined size (%)
---- - --- ----------- --- -------- --- ------------- ---
Heap2/PRUNE 268 ( 1.77) 18192 ( 0.71) 0 ( 0.00) 18192 ( 0.55)
Heap2/VACUUM 55 ( 0.36) 4940 ( 0.19) 0 ( 0.00) 4940 ( 0.15)
Heap2/FREEZE_PAGE 277 ( 1.83) 186868 ( 7.33) 0 ( 0.00) 186868 ( 5.67)
Heap2/VISIBLE 467 ( 3.09) 27783 ( 1.09) 376832 ( 50.37) 404615 ( 12.27)
Heap2/MULTI_INSERT 1944 ( 12.86) 354800 ( 13.91) 2520 ( 0.34) 357320 ( 10.83)
Heap2/MULTI_INSERT+INIT 56 ( 0.37) 74152 ( 2.91) 0 ( 0.00) 74152 ( 2.25)
Heap/INSERT 9948 ( 65.80) 1433891 ( 56.22) 8612 ( 1.15) 1442503 ( 43.73)
Heap/DELETE 942 ( 6.23) 50868 ( 1.99) 0 ( 0.00) 50868 ( 1.54)
Heap/UPDATE 193 ( 1.28) 101265 ( 3.97) 9556 ( 1.28) 110821 ( 3.36)
Heap/HOT_UPDATE 349 ( 2.31) 36910 ( 1.45) 1300 ( 0.17) 38210 ( 1.16)
Heap/LOCK 209 ( 1.38) 11481 ( 0.45) 316828 ( 42.35) 328309 ( 9.95)
Heap/INPLACE 212 ( 1.40) 44279 ( 1.74) 32444 ( 4.34) 76723 ( 2.33)
Heap/INSERT+INIT 184 ( 1.22) 188803 ( 7.40) 0 ( 0.00) 188803 ( 5.72)
Heap/UPDATE+INIT 15 ( 0.10) 16273 ( 0.64) 0 ( 0.00) 16273 ( 0.49)
-------- -------- -------- --------
Total 15119 2550505 [77.32%] 748092 [22.68%] 3298597 [100%]
pg_waldump: fatal: error in WAL record at 0/1680150: invalid record length at 0/16801C8: wanted 24, got 0
- Heikki
Attachments:
0001-Allow-specifying-pg_waldump-rmgr-option-multiple-tim.patchtext/x-patch; charset=UTF-8; name=0001-Allow-specifying-pg_waldump-rmgr-option-multiple-tim.patchDownload
From 991296f690d79a12670f1dca341ecffccf78f907 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Tue, 18 May 2021 16:46:53 +0300
Subject: [PATCH 1/1] Allow specifying pg_waldump --rmgr option multiple times.
Before, if you specified multiple --rmgr options, only the last one took
effect. It seems more sensible to select all the specfied resource
managers.
---
doc/src/sgml/ref/pg_waldump.sgml | 3 ++-
src/bin/pg_waldump/pg_waldump.c | 16 ++++++++++------
2 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index 5fcdfe210ac..432254d2d5d 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -142,7 +142,8 @@ PostgreSQL documentation
<term><option>--rmgr=<replaceable>rmgr</replaceable></option></term>
<listitem>
<para>
- Only display records generated by the specified resource manager.
+ Only display records generated by the specified resource manager. You can
+ specify the option multiple times to select multiple resource managers.
If <literal>list</literal> is passed as name, print a list of valid resource manager
names, and exit.
</para>
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f8b8afe4a7b..5c6aa0df03e 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -49,7 +49,8 @@ typedef struct XLogDumpConfig
bool stats_per_record;
/* filter options */
- int filter_by_rmgr;
+ bool filter_by_rmgr[RM_MAX_ID + 1];
+ bool filter_by_rmgr_enabled;
TransactionId filter_by_xid;
bool filter_by_xid_enabled;
} XLogDumpConfig;
@@ -814,7 +815,8 @@ main(int argc, char **argv)
config.stop_after_records = -1;
config.already_displayed_records = 0;
config.follow = false;
- config.filter_by_rmgr = -1;
+ /* filter_by_rmgr array was zeroed by memset above */
+ config.filter_by_rmgr_enabled = false;
config.filter_by_xid = InvalidTransactionId;
config.filter_by_xid_enabled = false;
config.stats = false;
@@ -869,16 +871,17 @@ main(int argc, char **argv)
exit(EXIT_SUCCESS);
}
+ config.filter_by_rmgr_enabled = true;
for (i = 0; i <= RM_MAX_ID; i++)
{
if (pg_strcasecmp(optarg, RmgrDescTable[i].rm_name) == 0)
{
- config.filter_by_rmgr = i;
+ config.filter_by_rmgr[i] = true;
break;
}
}
- if (config.filter_by_rmgr == -1)
+ if (i > RM_MAX_ID)
{
pg_log_error("resource manager \"%s\" does not exist",
optarg);
@@ -1087,8 +1090,9 @@ main(int argc, char **argv)
}
/* apply all specified filters */
- if (config.filter_by_rmgr != -1 &&
- config.filter_by_rmgr != record->xl_rmid)
+ if (config.filter_by_rmgr_enabled &&
+ (record->xl_rmid < 0 || record->xl_rmid > RM_MAX_ID ||
+ !config.filter_by_rmgr[record->xl_rmid]))
continue;
if (config.filter_by_xid_enabled &&
--
2.30.2
On Tue, May 18, 2021 at 04:50:31PM +0300, Heikki Linnakangas wrote:
I wanted to dump all heap WAL records with pg_waldump, so I did this:
$ pg_waldump --rmgr=heap --rmgr=heap2 data/pg_wal/000000010000000000000001 --stat=record
Type N (%) Record size (%) FPI size (%) Combined size (%)
---- - --- ----------- --- -------- --- ------------- ---
Heap2/PRUNE 268 ( 8.74) 18192 ( 2.73) 0 ( 0.00) 18192 ( 1.74)
Heap2/VACUUM 55 ( 1.79) 4940 ( 0.74) 0 ( 0.00) 4940 ( 0.47)
Heap2/FREEZE_PAGE 277 ( 9.03) 186868 ( 28.03) 0 ( 0.00) 186868 ( 17.86)
Heap2/VISIBLE 467 ( 15.23) 27783 ( 4.17) 376832 ( 99.34) 404615 ( 38.68)
Heap2/MULTI_INSERT 1944 ( 63.38) 354800 ( 53.21) 2520 ( 0.66) 357320 ( 34.16)
Heap2/MULTI_INSERT+INIT 56 ( 1.83) 74152 ( 11.12) 0 ( 0.00) 74152 ( 7.09)
-------- -------- -------- --------
Total 3067 666735 [63.74%] 379352 [36.26%] 1046087 [100%]
pg_waldump: fatal: error in WAL record at 0/1680118: invalid record length at 0/1680150: wanted 24, got 0That didn't do what I wanted. It only printed the Heap2 records, not Heap,
even though I specified both. The reason is that if you specify multiple
--rmgr options, only the last one takes effect.I propose the attached to allow selecting multiple rmgrs, by giving multiple
--rmgr options. With that, it works the way I expected:
The change and the patch look sensible to me.
At Tue, 18 May 2021 23:23:02 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
On Tue, May 18, 2021 at 04:50:31PM +0300, Heikki Linnakangas wrote:
That didn't do what I wanted. It only printed the Heap2 records, not Heap,
even though I specified both. The reason is that if you specify multiple
--rmgr options, only the last one takes effect.I propose the attached to allow selecting multiple rmgrs, by giving multiple
--rmgr options. With that, it works the way I expected:The change and the patch look sensible to me.
+1.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, May 19, 2021 at 11:50:52AM +0900, Kyotaro Horiguchi wrote:
At Tue, 18 May 2021 23:23:02 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
The change and the patch look sensible to me.
+1.
Agreed.
--
Michael
On 18 May 2021, at 15:50, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
The reason is that if you specify multiple --rmgr options, only the last one takes effect.
That's in line with how options are handled for most binaries, so this will go
against that. That being said, I don't think thats a problem here really given
what this tool is and it's intended usecase.
This patch makes the special case "--rmgr=list" a bit more awkward than before
IMO, as it breaks the list processing, but nothing we can't live with.
I propose the attached to allow selecting multiple rmgrs
I agree with the other +1's in this thread, and am marking this as ready for
committer.
As a tiny nitpick for readability, I would move this line inside the string
comparison case where the rmgr is selected. Not that it makes any difference
in practice, but since that's where the filtering is set it seems a hair
tidier.
+ config.filter_by_rmgr_enabled = true;
--
Daniel Gustafsson https://vmware.com/
On 28/06/2021 13:34, Daniel Gustafsson wrote:
On 18 May 2021, at 15:50, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
The reason is that if you specify multiple --rmgr options, only the last one takes effect.
That's in line with how options are handled for most binaries, so this will go
against that. That being said, I don't think thats a problem here really given
what this tool is and it's intended usecase.
There is some precedent for this with the pg_dump --table option, for
example.
In general, I think it's weird that the latest option wins. If you
specify the same option multiple times, and it's not something like
--rmgr or --table where it makes sense, it's most likely user error.
Printing an error would be nicer than ignoring all but the last
instance. But I'm not going to try changing that now.
I propose the attached to allow selecting multiple rmgrs
I agree with the other +1's in this thread, and am marking this as ready for
committer.As a tiny nitpick for readability, I would move this line inside the string
comparison case where the rmgr is selected. Not that it makes any difference
in practice, but since that's where the filtering is set it seems a hair
tidier.
+ config.filter_by_rmgr_enabled = true;
Ok, changed it that way.
I tried to be defensive against WAL records with bogus xl_rmid values here:
@@ -1098,8 +1100,9 @@ main(int argc, char **argv)
}/* apply all specified filters */ - if (config.filter_by_rmgr != -1 && - config.filter_by_rmgr != record->xl_rmid) + if (config.filter_by_rmgr_enabled && + (record->xl_rmid < 0 || record->xl_rmid > RM_MAX_ID || + !config.filter_by_rmgr[record->xl_rmid])) continue;
But looking closer, that's pointless. We use record->xl_rmid directly as
array index elsewhere, and that's OK because ValidXLogRecordHeader()
checks that xl_rmid <= RM_MAX_ID. And the 'xl_rmid < 0' check is
unnecessary because the field is unsigned. So I'll remove those, and
commit this tomorrow.
- Heikki
On 30 Jun 2021, at 22:39, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
In general, I think it's weird that the latest option wins. If you specify the same option multiple times, and it's not something like --rmgr or --table where it makes sense, it's most likely user error. Printing an error would be nicer than ignoring all but the last instance. But I'm not going to try changing that now.
AFAIK, the traditional "defense" for it when building a commandline with
scripts which loop over input, to avoid the need for any data structure holding
the options for deduplication. No idea how common that is these days, but I've
seen it in production in the past for sure.
I tried to be defensive against WAL records with bogus xl_rmid values here:
@@ -1098,8 +1100,9 @@ main(int argc, char **argv) } /* apply all specified filters */ - if (config.filter_by_rmgr != -1 && - config.filter_by_rmgr != record->xl_rmid) + if (config.filter_by_rmgr_enabled && + (record->xl_rmid < 0 || record->xl_rmid > RM_MAX_ID || + !config.filter_by_rmgr[record->xl_rmid])) continue;But looking closer, that's pointless. We use record->xl_rmid directly as array index elsewhere, and that's OK because ValidXLogRecordHeader() checks that xl_rmid <= RM_MAX_ID. And the 'xl_rmid < 0' check is unnecessary because the field is unsigned. So I'll remove those, and commit this tomorrow.
+1
--
Daniel Gustafsson https://vmware.com/