Add --system-identifier / -s option to pg_resetwal

Started by Nikolay Samokhvalov11 months ago10 messageshackers
Jump to latest
#1Nikolay Samokhvalov
samokhvalov@gmail.com

hi hackers,

I was just involved in a DR case, where we needed to change system ID in
control data, and noticed that there is no such capability, officially – so
I thought, it would be good to have it

the attached patch adds a new -s / --system-identifier option to
pg_resetwal that allows users to change the database cluster's system
identifier; it can be useful when you need to make a restored cluster
distinct from the original, or when cloning for testing

some aspects about the patch:
- accepts positive 64-bit integers only (zero not allowed)
- requires interactive confirmation or --force flag for safety
- detects non-TTY environments and requires --force in scripts
- included input validation and error handling
- updated docs with clear warnings about compatibility
- added tests, including some edge cases

looking forward to hearing feedback!

Nik

Attachments:

0001-pg_resetwal--system-identifier.patchapplication/octet-stream; name=0001-pg_resetwal--system-identifier.patchDownload+279-8
#2Kirk Wolak
wolakk@gmail.com
In reply to: Nikolay Samokhvalov (#1)
Re: Add --system-identifier / -s option to pg_resetwal

On Sat, May 31, 2025 at 2:52 PM Nikolay Samokhvalov <nik@postgres.ai> wrote:

I was just involved in a DR case, where we needed to change system ID in
control data, and noticed that there is no such capability, officially – so
I thought, it would be good to have it

the attached patch adds a new -s / --system-identifier option to
pg_resetwal that allows users to change the database cluster's system
identifier; it can be useful when you need to make a restored cluster
distinct from the original, or when cloning for testing

some aspects about the patch:
- accepts positive 64-bit integers only (zero not allowed)
- requires interactive confirmation or --force flag for safety
- detects non-TTY environments and requires --force in scripts
- included input validation and error handling
- updated docs with clear warnings about compatibility
- added tests, including some edge cases

looking forward to hearing feedback!

+1

Like the idea. I did a manual code review. It's a small patch, very
focused, and includes the proper docs.
It's very well thought out (I Iike the --force requirement for scripts).

The only thing I saw was that the code basically accepts Y or y, and all
else is considered "N". for that reason,
I suggest that this line:
printf(_("Continue? (y/n) "));

Would be more clear as:
printf(_("Continue? (y/n) [n]"));

indicating the N is the default answer. if you put anything other than Y
or y... N will be picked.

But this is almost trivial, but I wanted to be thorough.

Regards!

#3Peter Eisentraut
peter_e@gmx.net
In reply to: Nikolay Samokhvalov (#1)
Re: Add --system-identifier / -s option to pg_resetwal

On 31.05.25 20:52, Nikolay Samokhvalov wrote:

the attached patch adds a new -s / --system-identifier option to
pg_resetwal that allows users to change the database cluster's system
identifier; it can be useful when you need to make a restored cluster
distinct from the original, or when cloning for testing

Maybe we can stop eating up short options? A long option seems
sufficient for this.

- requires interactive confirmation or --force flag for safety

I don't see the need for interactive confirmation here. The user would
have explicitly chosen the option, so they get what they asked for.

#4Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#3)
Re: Add --system-identifier / -s option to pg_resetwal

On Wed, Jun 04, 2025 at 06:49:31AM +0200, Peter Eisentraut wrote:

Maybe we can stop eating up short options? A long option seems sufficient
for this.

Agreed.

I don't see the need for interactive confirmation here. The user would have
explicitly chosen the option, so they get what they asked for.

Yes, let's remove that. The patch is making --force hold a different
meaning that it currently has. And the tests will have less bloat
with their TTY parts gone.

Honestly, I don't think that we need that many tests in the scope of
this patch. Let's drop the hexa test, the empty test, the test with a
value of 1, the very-large-but-valid-value, and the second update.
These overlap with other tests or just rely on internals of getopt()
or of strtou64(), which eat cycles without really any added value.

+   The system identifier is a unique 64-bit number that identifies the
+   database cluster. It is used by replication systems and backup tools
+   to ensure they are working with the correct cluster. Changing the
+   system identifier makes the cluster incompatible with existing
+   backups, standby servers, and replication setups.

The first sentence of this paragraph is a close copy-paste of what's
in protocol.sgml. It would be nice to avoid these duplications.

Note the report generated by `git diff --check`, while on it, with a
couple of whitespaces reported.

FWIW, I've wanted that in the past when err.. Performing some
chirurgy on clusters.
--
Michael

#5Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Michael Paquier (#4)
Re: Add --system-identifier / -s option to pg_resetwal

Thank you, Peter and Michael, for the reviews

Attached is v2, simplified as suggested:
- Removed short option -s
- Removed interactive confirmation and --force
- Simplified tests leaving only essential ones

Additionally, there was an off-list review done by Andrey Borodin, so his
comments also addressed:
- Simplified logic, getting rid of '-' check (negative numbers) -- decided
to accept negative input values (they wrap to valid positive uint64)

Nik

Attachments:

0002-pg_resetwal--system-identifier.patchapplication/octet-stream; name=0002-pg_resetwal--system-identifier.patchDownload+77-1
#6Peter Eisentraut
peter_e@gmx.net
In reply to: Nikolay Samokhvalov (#5)
Re: Add --system-identifier / -s option to pg_resetwal

On 04.06.25 20:46, Nikolay Samokhvalov wrote:

- Simplified logic, getting rid of '-' check (negative numbers) --
decided to accept negative input values (they wrap to valid positive uint64)

That doesn't seem like a good idea to me.

#7Fujii Masao
masao.fujii@gmail.com
In reply to: Nikolay Samokhvalov (#5)
Re: Add --system-identifier / -s option to pg_resetwal

On 2025/06/05 3:46, Nikolay Samokhvalov wrote:

Thank you, Peter and Michael, for the reviews

Attached is v2, simplified as suggested:
- Removed short option -s
- Removed interactive confirmation and --force
- Simplified tests leaving only essential ones

In my environment, the test failed with the following output:

# Failed test 'system identifier was changed correctly'
# at t/001_basic.pl line 203.
# got: '-8570200862721897295'
# expected: '9876543210987654321'
# Looks like you failed 1 test of 84.
t/001_basic.pl ......
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/84 subtests

I think this happens because the system_identifier returned
by pg_control_system() is a bigint, not an unsigned one.
So either we need to use a different method to retrieve
the value correctly, or we should fix pg_control_system() to
report the system identifier properly (though that would be
a separate issue).

+	if (set_sysid != 0)
+	{
+		printf(_("System identifier:                    " UINT64_FORMAT "\n"),
+			   ControlFile.system_identifier);
+	}

For consistency with PrintControlValues() and pg_controldata,
the label should be "Database system identifier", and we should
use PRIu64 instead of UINT64_FORMAT for gettext()?

+ {"system-identifier", required_argument, NULL, 3},
{"next-transaction-id", required_argument, NULL, 'x'},
{"wal-segsize", required_argument, NULL, 1},
{"char-signedness", required_argument, NULL, 2},

It would be more consistent to place the "system-identifier"
entry just after "char-signedness".

One question about this feature: why do we need to allow
explicitly setting the system identifier? If the goal is
simply to ensure it's different from the original,
wouldn't it be sufficient to let pg_resetwal generate
a new (hopefully unique) value using existing logic,
like what's done in GuessControlValues() or BootStrapXLOG()?

Regards,

--
Fujii Masao
NTT DATA Japan Corporation

#8Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Fujii Masao (#7)
Re: Add --system-identifier / -s option to pg_resetwal

Thank you for the review!

Attached is v3 addressing your feedback:

- Changed label to "Database system identifier" for consistency
- Use PRIu64 instead of UINT64_FORMAT for gettext
- Moved option entry after "char-signedness" in getopt array
- Fixed test to use pg_controldata instead of pg_control_system()

On Wed, Jun 4, 2025 at 4:56 PM Fujii Masao <masao.fujii@oss.nttdata.com>
wrote:

One question about this feature: why do we need to allow
explicitly setting the system identifier? If the goal is
simply to ensure it's different from the original,
wouldn't it be sufficient to let pg_resetwal generate
a new (hopefully unique) value using existing logic,
like what's done in GuessControlValues() or BootStrapXLOG()?

Auto-generation would cover most cases. But explicit setting is a superset
-- users who want random can generate one themselves. Explicit control
costs nothing extra and gives full flexibility for edge cases (eg,
reproducible test environments, coordinating multiple clones). pg_resetwal
users are expected to be experts, so giving more control can cover more
possible scenarios.

Nik

Attachments:

v3-0001-Add-system-identifier-option-to-pg_resetwal.patchtext/plain; charset=US-ASCII; name=v3-0001-Add-system-identifier-option-to-pg_resetwal.patchDownload+80-4
#9Fujii Masao
masao.fujii@gmail.com
In reply to: Nikolay Samokhvalov (#8)
Re: Add --system-identifier / -s option to pg_resetwal

On Sun, Feb 1, 2026 at 6:33 AM Nikolay Samokhvalov <nik@postgres.ai> wrote:

Thank you for the review!

Attached is v3 addressing your feedback:

Thanks for updating the patch!

With this patch, the test failed on my laptop as follows.
Could you please take a look and fix it?

$ make -s check -C src/bin/pg_resetwal/ PROVE_TESTS="t/001*"
# +++ tap check in src/bin/pg_resetwal +++
t/001_basic.pl .. 15/? # Tests were run but no plan was declared and
done_testing() was not seen.
# Looks like your test exited with 4 just after 82.
t/001_basic.pl .. Dubious, test returned 4 (wstat 1024, 0x400)
All 82 subtests passed

Test Summary Report
-------------------
t/001_basic.pl (Wstat: 1024 Tests: 82 Failed: 0)
Non-zero exit status: 4
Parse errors: No plan found in TAP output
Files=1, Tests=82, 4 wallclock secs ( 0.03 usr 0.01 sys + 0.31 cusr
0.65 csys = 1.00 CPU)
Result: FAIL
make: *** [check] Error 1

Regards,

--
Fujii Masao

#10Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Fujii Masao (#9)
Re: Add --system-identifier / -s option to pg_resetwal

On Wed, Feb 4, 2026 at 8:21 AM Fujii Masao <masao.fujii@gmail.com> wrote:

On Sun, Feb 1, 2026 at 6:33 AM Nikolay Samokhvalov <nik@postgres.ai>
wrote:

Thank you for the review!

Attached is v3 addressing your feedback:

Thanks for updating the patch!

With this patch, the test failed on my laptop as follows.
Could you please take a look and fix it?

$ make -s check -C src/bin/pg_resetwal/ PROVE_TESTS="t/001*"
# +++ tap check in src/bin/pg_resetwal +++
t/001_basic.pl .. 15/? # Tests were run but no plan was declared and
done_testing() was not seen.
# Looks like your test exited with 4 just after 82.
t/001_basic.pl .. Dubious, test returned 4 (wstat 1024, 0x400)
All 82 subtests passed

Test Summary Report
-------------------
t/001_basic.pl (Wstat: 1024 Tests: 82 Failed: 0)
Non-zero exit status: 4
Parse errors: No plan found in TAP output
Files=1, Tests=82, 4 wallclock secs ( 0.03 usr 0.01 sys + 0.31 cusr
0.65 csys = 1.00 CPU)
Result: FAIL
make: *** [check] Error 1

v4 attached: rebased, fixed the failing test + added coverage
for --dry-run.

Attachments:

v4-0001-Add-system-identifier-option-to-pg_resetwal.patchapplication/x-patch; name=v4-0001-Add-system-identifier-option-to-pg_resetwal.patchDownload+87-7