Making pg_rewind faster

Started by vignesh ravichandranover 3 years ago51 messages
Jump to latest
#1vignesh ravichandran
admin@viggy28.dev

Hi Hackers,

I have been using pg_rewind in production for 2 years. One of the things that I noticed in pg_rewind is if it doesn't know what to do with a file "it copies". I understand it's the more safer option. After all, the alternative, pg_basebackup copies all the files from source to target.

However, this is making pg_rewind inefficient when we have a high number of WAL files. Majority of the data (in most of my cases 95%+) that it copies are WAL files which are anyway same between the source and target. Skipping those same WAL files from copying will improve the speed of pg_rewind a lot.

1. Does pg_rewind need to copy WAL files before the WAL that contains the last common check point?

Heikki's presentation https://pgsessions.com/assets/archives/pg_rewind-presentation-paris.pdf gave me a good overview and also explained the behavior what I mentioned.

Thanks,

Vignesh

#2Justin Kwan
justinpkwan@outlook.com
In reply to: vignesh ravichandran (#1)
Re: Making pg_rewind faster

Hi everyone!

Here's the attached patch submission to optimize pg_rewind performance when many WAL files are retained on server. This patch avoids replaying (copying over) older WAL segment files that fall before the point of divergence between the source and target servers.

Thanks,
Justin
________________________________
From: Justin Kwan <jkwan@cloudflare.com>
Sent: July 15, 2022 6:13 PM
To: vignesh ravichandran <admin@viggy28.dev>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>; vignesh <vignesh@cloudflare.com>; justinpkwan@outlook.com <justinpkwan@outlook.com>
Subject: Re: Making pg_rewind faster

Looping in my other email.

On Thu, Jun 30, 2022 at 6:22 AM vignesh ravichandran <admin@viggy28.dev<mailto:admin@viggy28.dev>> wrote:
Hi Hackers,

I have been using pg_rewind in production for 2 years. One of the things that I noticed in pg_rewind is if it doesn't know what to do with a file "it copies". I understand it's the more safer option. After all, the alternative, pg_basebackup copies all the files from source to target.

However, this is making pg_rewind inefficient when we have a high number of WAL files. Majority of the data (in most of my cases 95%+) that it copies are WAL files which are anyway same between the source and target. Skipping those same WAL files from copying will improve the speed of pg_rewind a lot.

1. Does pg_rewind need to copy WAL files before the WAL that contains the last common check point?

Heikki's presentation https://pgsessions.com/assets/archives/pg_rewind-presentation-paris.pdf gave me a good overview and also explained the behavior what I mentioned.

Thanks,
Vignesh

Attachments:

v1-0001-Avoid-copying-WAL-segments-before-divergence-to-spee.patchapplication/octet-stream; name=v1-0001-Avoid-copying-WAL-segments-before-divergence-to-spee.patchDownload+193-8
#3Justin Kwan
justinpkwan@outlook.com
In reply to: vignesh ravichandran (#1)
Re: Making pg_rewind faster

Hi everyone!

I've also attached the pg_rewind optimization patch file for Postgres version 14.4. The previous patch file targets version Postgres version 15 Beta 1/2.

Thanks,
Justin
________________________________
From: Justin Kwan <jkwan@cloudflare.com>
Sent: July 15, 2022 6:13 PM
To: vignesh ravichandran <admin@viggy28.dev>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>; vignesh <vignesh@cloudflare.com>; justinpkwan@outlook.com <justinpkwan@outlook.com>
Subject: Re: Making pg_rewind faster

Looping in my other email.

On Thu, Jun 30, 2022 at 6:22 AM vignesh ravichandran <admin@viggy28.dev<mailto:admin@viggy28.dev>> wrote:
Hi Hackers,

I have been using pg_rewind in production for 2 years. One of the things that I noticed in pg_rewind is if it doesn't know what to do with a file "it copies". I understand it's the more safer option. After all, the alternative, pg_basebackup copies all the files from source to target.

However, this is making pg_rewind inefficient when we have a high number of WAL files. Majority of the data (in most of my cases 95%+) that it copies are WAL files which are anyway same between the source and target. Skipping those same WAL files from copying will improve the speed of pg_rewind a lot.

1. Does pg_rewind need to copy WAL files before the WAL that contains the last common check point?

Heikki's presentation https://pgsessions.com/assets/archives/pg_rewind-presentation-paris.pdf gave me a good overview and also explained the behavior what I mentioned.

Thanks,
Vignesh

Attachments:

v1-pg14.4-0001-Avoid-copying-WAL-segments-before-divergence-to-spee.patchapplication/octet-stream; name=v1-pg14.4-0001-Avoid-copying-WAL-segments-before-divergence-to-spee.patchDownload+196-10
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Justin Kwan (#3)
Re: Making pg_rewind faster

Justin Kwan <justinpkwan@outlook.com> writes:

I've also attached the pg_rewind optimization patch file for Postgres version 14.4. The previous patch file targets version Postgres version 15 Beta 1/2.

It's very unlikely that we would consider committing such changes into
released branches. In fact, it's too late even for v15. You should
be submitting non-bug-fix patches against master (v16-to-be).

regards, tom lane

#5Justin Kwan
justinpkwan@outlook.com
In reply to: Tom Lane (#4)
Re: Making pg_rewind faster

Hi Tom,

Thank you for taking a look at this and that sounds good. I will send over a patch compatible with Postgres v16.

Justin
________________________________
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: July 17, 2022 2:40 PM
To: Justin Kwan <justinpkwan@outlook.com>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>; vignesh <vignesh@cloudflare.com>; jkwan@cloudflare.com <jkwan@cloudflare.com>; vignesh ravichandran <admin@viggy28.dev>; hlinnaka@iki.fi <hlinnaka@iki.fi>
Subject: Re: Making pg_rewind faster

Justin Kwan <justinpkwan@outlook.com> writes:

I've also attached the pg_rewind optimization patch file for Postgres version 14.4. The previous patch file targets version Postgres version 15 Beta 1/2.

It's very unlikely that we would consider committing such changes into
released branches. In fact, it's too late even for v15. You should
be submitting non-bug-fix patches against master (v16-to-be).

regards, tom lane

#6Michael Paquier
michael@paquier.xyz
In reply to: Justin Kwan (#5)
Re: Making pg_rewind faster

On Mon, Jul 18, 2022 at 05:14:00PM +0000, Justin Kwan wrote:

Thank you for taking a look at this and that sounds good. I will
send over a patch compatible with Postgres v16.

+$node_2->psql(
+       'postgres',
+       "SELECT extract(epoch from modification) FROM pg_stat_file('pg_wal/000000010000000000000003');",
+       stdout => \my $last_common_tli1_wal_last_modified_at);
Please note that you should not rely on the FS-level stats for
anything that touches the WAL segments.  A rough guess about what you
could here to make sure that only the set of WAL segments you are
looking for is being copied over would be to either:
- Scan the logs produced by pg_rewind and see if the segments are
copied or not, depending on the divergence point (aka the last
checkpoint before WAL forked).
- Clean up pg_wal/ in the target node before running pg_rewind,
checking that only the segments you want are available once the
operation completes.
--
Michael
#7Justin Kwan
justinpkwan@outlook.com
In reply to: Michael Paquier (#6)
Re: Making pg_rewind faster

Hi Michael,

Thank you for your feedback, I've incorporated your suggestions by scanning the logs produced from pg_rewind when asserting that certain WAL segment files were skipped from being copied over to the target server.

I've also updated the pg_rewind patch file to target the Postgres master branch (version 16 to be). Please see attached.

Thanks,
Justin

________________________________
From: Michael Paquier
Sent: Tuesday, July 19, 2022 1:36 AM
To: Justin Kwan
Cc: Tom Lane; pgsql-hackers; vignesh; jkwan@cloudflare.com; vignesh ravichandran; hlinnaka@iki.fi
Subject: Re: Making pg_rewind faster

On Mon, Jul 18, 2022 at 05:14:00PM +0000, Justin Kwan wrote:

Thank you for taking a look at this and that sounds good. I will
send over a patch compatible with Postgres v16.

+$node_2->psql(
+       'postgres',
+       "SELECT extract(epoch from modification) FROM pg_stat_file('pg_wal/000000010000000000000003');",
+       stdout => \my $last_common_tli1_wal_last_modified_at);
Please note that you should not rely on the FS-level stats for
anything that touches the WAL segments.  A rough guess about what you
could here to make sure that only the set of WAL segments you are
looking for is being copied over would be to either:
- Scan the logs produced by pg_rewind and see if the segments are
copied or not, depending on the divergence point (aka the last
checkpoint before WAL forked).
- Clean up pg_wal/ in the target node before running pg_rewind,
checking that only the segments you want are available once the
operation completes.
--
Michael

Attachments:

v2-pg16-0001-Avoid-copying-WAL-segments-before-divergence-to-spee.patchapplication/octet-stream; name=v2-pg16-0001-Avoid-copying-WAL-segments-before-divergence-to-spee.patchDownload+183-3
#8Justin Kwan
justinpkwan@outlook.com
In reply to: Justin Kwan (#5)
Re: Making pg_rewind faster

Hi Michael,

Not sure if this email went through previously but thank you for your feedback, I've incorporated your suggestions by scanning the logs produced from pg_rewind when asserting that certain WAL segment files were skipped from being copied over to the target server.

I've also updated the pg_rewind patch file to target the Postgres master branch (version 16 to be). Please see attached.

Thanks,
Justin

________________________________
From: Justin Kwan <justinpkwan@outlook.com>
Sent: July 18, 2022 1:14 PM
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>; vignesh <vignesh@cloudflare.com>; jkwan@cloudflare.com <jkwan@cloudflare.com>; vignesh ravichandran <admin@viggy28.dev>; hlinnaka@iki.fi <hlinnaka@iki.fi>
Subject: Re: Making pg_rewind faster

Hi Tom,

Thank you for taking a look at this and that sounds good. I will send over a patch compatible with Postgres v16.

Justin
________________________________
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: July 17, 2022 2:40 PM
To: Justin Kwan <justinpkwan@outlook.com>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>; vignesh <vignesh@cloudflare.com>; jkwan@cloudflare.com <jkwan@cloudflare.com>; vignesh ravichandran <admin@viggy28.dev>; hlinnaka@iki.fi <hlinnaka@iki.fi>
Subject: Re: Making pg_rewind faster

Justin Kwan <justinpkwan@outlook.com> writes:

I've also attached the pg_rewind optimization patch file for Postgres version 14.4. The previous patch file targets version Postgres version 15 Beta 1/2.

It's very unlikely that we would consider committing such changes into
released branches. In fact, it's too late even for v15. You should
be submitting non-bug-fix patches against master (v16-to-be).

regards, tom lane

Attachments:

v2-pg16-0001-Avoid-copying-WAL-segments-before-divergence-to-spee.patchapplication/octet-stream; name=v2-pg16-0001-Avoid-copying-WAL-segments-before-divergence-to-spee.patchDownload+183-3
#9Alexander Korotkov
aekorotkov@gmail.com
In reply to: Justin Kwan (#8)
Re: Making pg_rewind faster

Hi, Justin!

On Fri, Jul 29, 2022 at 1:05 PM Justin Kwan <justinpkwan@outlook.com> wrote:

Not sure if this email went through previously but thank you for your feedback, I've incorporated your suggestions by scanning the logs produced from pg_rewind when asserting that certain WAL segment files were skipped from being copied over to the target server.

I've also updated the pg_rewind patch file to target the Postgres master branch (version 16 to be). Please see attached.

Thank you for the revision.

I've taken a look at this patch. Overall it looks good to me. I also
don't see any design objections in the thread.

A couple of points from me:
1) I would prefer to evade hard-coded names for WAL segments in the
tap tests. Could we calculate the names in the tap tests based on the
diverge point, etc.?
2) Patch contains some indentation with spaces, which should be done
in tabs. Please consider either manually fixing this or running
pgindent over modified files.

------
Regards,
Alexander Korotkov

#10Andres Freund
andres@anarazel.de
In reply to: Alexander Korotkov (#9)
Re: Making pg_rewind faster

Hi,

On 2022-09-13 20:50:20 +0300, Alexander Korotkov wrote:

On Fri, Jul 29, 2022 at 1:05 PM Justin Kwan <justinpkwan@outlook.com> wrote:

Not sure if this email went through previously but thank you for your feedback, I've incorporated your suggestions by scanning the logs produced from pg_rewind when asserting that certain WAL segment files were skipped from being copied over to the target server.

I've also updated the pg_rewind patch file to target the Postgres master branch (version 16 to be). Please see attached.

Thank you for the revision.

I've taken a look at this patch. Overall it looks good to me. I also
don't see any design objections in the thread.

A couple of points from me:
1) I would prefer to evade hard-coded names for WAL segments in the
tap tests. Could we calculate the names in the tap tests based on the
diverge point, etc.?
2) Patch contains some indentation with spaces, which should be done
in tabs. Please consider either manually fixing this or running
pgindent over modified files.

This patch currently fails because it hasn't been adjusted for
commit c47885bd8b6
Author: Andres Freund <andres@anarazel.de>
Date: 2022-09-19 18:03:17 -0700

Split TESTDIR into TESTLOGDIR and TESTDATADIR

The adjustment is trivial. Attached, together with also producing an error
message rather than just dying wordlessly.

It doesn't seem quite right to read pg_rewind's logs by reading
regress_log_001_basic. Too easy to confuse different runs of pg_rewind
etc. I'd suggest trying to redirect the log to a different file.

With regard to Alexander's point about whitespace:

.git/rebase-apply/patch:25: indent with spaces.
/* Handle WAL segment file. */
.git/rebase-apply/patch:26: indent with spaces.
const char *fname;
.git/rebase-apply/patch:27: indent with spaces.
char *slash;
.git/rebase-apply/patch:29: indent with spaces.
/* Split filepath into directory & filename. */
.git/rebase-apply/patch:30: indent with spaces.
slash = strrchr(path, '/');
warning: squelched 29 whitespace errors
warning: 34 lines add whitespace errors.

Greetings,

Andres Freund

Attachments:

v3-0001-Avoid-copying-WAL-segments-before-divergence-to-s.patchtext/x-diff; charset=us-asciiDownload+183-3
v3-0002-Fix-log-file-names-used-in-tests.patchtext/x-diff; charset=us-asciiDownload+4-5
#11Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#10)
Re: Making pg_rewind faster

On Sun, Oct 02, 2022 at 10:44:25AM -0700, Andres Freund wrote:

It doesn't seem quite right to read pg_rewind's logs by reading
regress_log_001_basic. Too easy to confuse different runs of pg_rewind
etc. I'd suggest trying to redirect the log to a different file.

Hardcoding log file names in the test increases the overall
maintenance, even if renaming these would be easy to track and fix if
the naming convention is changed. Anyway, I think that what this
patch should do is to use command_checks_all() in RewindTest.pm as it
is the test API able to check after a status and multiple expected
outputs, which is what the changes in 001 and 008 are doing.
RewindTest::run_pg_rewind() needs to be a bit tweaked to accept these
regex patterns in input.

+    if (file_segno < last_common_segno)
+    {
+        pg_log_debug("WAL file entry \"%s\" not copied to target", fname);
+        return FILE_ACTION_NONE;
+    }
There may be something I am missing here, but there is no need to care
about segments with a TLI older than lastcommontliIndex, no?

decide_wal_file_action() assumes that the WAL segment exists on the
target and the source. This looks bug-prone to me without at least an
assertion.

file_entry_t has an entry to track if a file is a relation file. I
think that it would be much cleaner to track if we are handling a WAL
segment when inserting an entry in insert_filehash_entry(), so
isrelfile could be replaced by an enum with three values: relation
file, WAL segment and the rest.
--
Michael

#12Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#11)
Re: Making pg_rewind faster

On Thu, Oct 06, 2022 at 04:08:45PM +0900, Michael Paquier wrote:

file_entry_t has an entry to track if a file is a relation file. I
think that it would be much cleaner to track if we are handling a WAL
segment when inserting an entry in insert_filehash_entry(), so
isrelfile could be replaced by an enum with three values: relation
file, WAL segment and the rest.

This review has been done a few weeks ago, and there has been no
update since, so I am marking this entry as returned with feedback in
the CF app.
--
Michael

#13John H
johnhyvr@gmail.com
In reply to: Michael Paquier (#12)
Re: Making pg_rewind faster

Hi,

I've attached an updated version of the patch against master with the changes
suggested.

On Tue, Nov 29, 2022 at 10:03 PM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Oct 06, 2022 at 04:08:45PM +0900, Michael Paquier wrote:

There may be something I am missing here, but there is no need to care
about segments with a TLI older than lastcommontliIndex, no?

Hard to say. pg_rewind is intended to make the same "copy" of the cluster which
implies pg_wal/ should look the same. There might be use cases around logical
replication where you would want these WAL files to still exist even
across promotions?

decide_wal_file_action() assumes that the WAL segment exists on the
target and the source. This looks bug-prone to me without at least an
assertion.

From previous refactors there is now an Assertion in filemap.c
decide_file_action that handles this.

Assert(entry->target_exists && entry->source_exists);

decide_wal_file_action is called after the assertion.

Thanks,

--
John Hsu - Amazon Web Services

Attachments:

0004-Avoid-copying-WAL-segments-before-divergence-to-spee.patchapplication/x-patch; name=0004-Avoid-copying-WAL-segments-before-divergence-to-spee.patchDownload+104-7
#14Japin Li
japinli@hotmail.com
In reply to: John H (#13)
Re: Making pg_rewind faster

On Tue, 01 Jul 2025 at 11:13, John H <johnhyvr@gmail.com> wrote:

Hi,

I've attached an updated version of the patch against master with the changes
suggested.

On Tue, Nov 29, 2022 at 10:03 PM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Oct 06, 2022 at 04:08:45PM +0900, Michael Paquier wrote:

There may be something I am missing here, but there is no need to care
about segments with a TLI older than lastcommontliIndex, no?

Hard to say. pg_rewind is intended to make the same "copy" of the cluster which
implies pg_wal/ should look the same. There might be use cases around logical
replication where you would want these WAL files to still exist even
across promotions?

decide_wal_file_action() assumes that the WAL segment exists on the
target and the source. This looks bug-prone to me without at least an
assertion.

From previous refactors there is now an Assertion in filemap.c
decide_file_action that handles this.

Assert(entry->target_exists && entry->source_exists);

decide_wal_file_action is called after the assertion.

Hi, John

Thanks for updating the patch.

1.
+/* Determine the type of file content (relation, WAL, or other) */
+static file_content_type_t
+getFileType(const char *path)

Considering the existence of file_type_t, would getFileContentType() be a
suitable function for handling file content types?

2.
Perhaps decide_wal_file_action() could be defined in filemap.c.

While this is unrelated to WAL logging, it could also contribute to faster
pg_rewind operations. Should we consider ignoring log files under PGDATA
(e.g., those in the default log/ directory)?

--
Regards,
Japin Li

#15wenhui qiu
qiuwenhuifx@gmail.com
In reply to: Japin Li (#14)
Re: Making pg_rewind faster

HI

2.
Perhaps decide_wal_file_action() could be defined in filemap.c.

While this is unrelated to WAL logging, it could also contribute to

faster

pg_rewind operations. Should we consider ignoring log files under PGDATA
(e.g., those in the default log/ directory)?

Agree ,Usually the log file directory also takes up a lot of space, and the
number of log files is quite large

On Wed, Jul 2, 2025 at 10:21 AM Japin Li <japinli@hotmail.com> wrote:

Show quoted text

On Tue, 01 Jul 2025 at 11:13, John H <johnhyvr@gmail.com> wrote:

Hi,

I've attached an updated version of the patch against master with the

changes

suggested.

On Tue, Nov 29, 2022 at 10:03 PM Michael Paquier <michael@paquier.xyz>

wrote:

On Thu, Oct 06, 2022 at 04:08:45PM +0900, Michael Paquier wrote:

There may be something I am missing here, but there is no need to care
about segments with a TLI older than lastcommontliIndex, no?

Hard to say. pg_rewind is intended to make the same "copy" of the

cluster which

implies pg_wal/ should look the same. There might be use cases around

logical

replication where you would want these WAL files to still exist even
across promotions?

decide_wal_file_action() assumes that the WAL segment exists on the
target and the source. This looks bug-prone to me without at least an
assertion.

From previous refactors there is now an Assertion in filemap.c
decide_file_action that handles this.

Assert(entry->target_exists && entry->source_exists);

decide_wal_file_action is called after the assertion.

Hi, John

Thanks for updating the patch.

1.
+/* Determine the type of file content (relation, WAL, or other) */
+static file_content_type_t
+getFileType(const char *path)

Considering the existence of file_type_t, would getFileContentType() be a
suitable function for handling file content types?

2.
Perhaps decide_wal_file_action() could be defined in filemap.c.

While this is unrelated to WAL logging, it could also contribute to faster
pg_rewind operations. Should we consider ignoring log files under PGDATA
(e.g., those in the default log/ directory)?

--
Regards,
Japin Li

#16John H
johnhyvr@gmail.com
In reply to: wenhui qiu (#15)
Re: Making pg_rewind faster

Hi,

Thanks for the quick review.

On Tue, Jul 1, 2025 at 8:16 PM wenhui qiu <qiuwenhuifx@gmail.com> wrote:

Perhaps decide_wal_file_action() could be defined in filemap.c.

That's a good point. I updated the patch to reflect that.

While this is unrelated to WAL logging, it could also contribute to faster
pg_rewind operations. Should we consider ignoring log files under PGDATA
(e.g., those in the default log/ directory)?

Agree ,Usually the log file directory also takes up a lot of space, and the number of log files is quite large

Should we handle this use case? I do agree that for the more common
use-cases of pg_rewind which is synchronizing an old writer to the new
leader after failover, avoiding syncing the logging directory is
useful.
At the same time, pg_rewind is intended to make the same copy of the
source cluster as efficiently as possible which would include "all"
directories if they exist in $PGDATA. By default logging_collector is
off as well. I'd also think you would want to avoid putting the logs
in $PGDATA to have smaller backups if you are using tools like
pg_basebackup.

On Wed, Jul 2, 2025 at 10:21 AM Japin Li <japinli@hotmail.com> wrote:

Hi, John

Thanks for updating the patch.

1.
+/* Determine the type of file content (relation, WAL, or other) */
+static file_content_type_t
+getFileType(const char *path)

Considering the existence of file_type_t, would getFileContentType() be a
suitable function for handling file content types?

Do you mean naming getFileType to getFileContentType?

Thanks,

--
John Hsu - Amazon Web Services

Attachments:

0005-Avoid-copying-WAL-segments-before-divergence-to-spee.patchapplication/octet-stream; name=0005-Avoid-copying-WAL-segments-before-divergence-to-spee.patchDownload+102-12
#17Japin Li
japinli@hotmail.com
In reply to: John H (#16)
Re: Making pg_rewind faster

On Wed, 02 Jul 2025 at 11:21, John H <johnhyvr@gmail.com> wrote:

Hi,

Thanks for the quick review.

On Tue, Jul 1, 2025 at 8:16 PM wenhui qiu <qiuwenhuifx@gmail.com> wrote:

Perhaps decide_wal_file_action() could be defined in filemap.c.

That's a good point. I updated the patch to reflect that.

Thanks for updating the patch.

While this is unrelated to WAL logging, it could also contribute to faster
pg_rewind operations. Should we consider ignoring log files under PGDATA
(e.g., those in the default log/ directory)?

Agree ,Usually the log file directory also takes up a lot of space, and the number of log files is quite large

Should we handle this use case? I do agree that for the more common
use-cases of pg_rewind which is synchronizing an old writer to the new
leader after failover, avoiding syncing the logging directory is
useful.
At the same time, pg_rewind is intended to make the same copy of the
source cluster as efficiently as possible which would include "all"
directories if they exist in $PGDATA. By default logging_collector is
off as well. I'd also think you would want to avoid putting the logs
in $PGDATA to have smaller backups if you are using tools like
pg_basebackup.

Splitting the logs from $PGDATA is definitely better. The question is whether
it's worth implementing this directly in core or if a prominent note in the
documentation would suffice.

On Wed, Jul 2, 2025 at 10:21 AM Japin Li <japinli@hotmail.com> wrote:

Hi, John

Thanks for updating the patch.

1.
+/* Determine the type of file content (relation, WAL, or other) */
+static file_content_type_t
+getFileType(const char *path)

Considering the existence of file_type_t, would getFileContentType() be a
suitable function for handling file content types?

Do you mean naming getFileType to getFileContentType?

Exactly! It's confusing that getFileType() returns file_content_type_t
instead of file_type_t.

For v5 patch:

1.
We could simply use the global WalSegSz variable within decide_file_action(),
eliminating the need to pass wal_segsz_bytes as an argument.

2.
For last_common_segno, we could implement it similarly to WalSegSz, avoiding a
signature change for decide_file_actions() and decide_file_action(). I'm not
insisting on this approach, however.

--
Regards,
Japin Li

#18John H
johnhyvr@gmail.com
In reply to: Japin Li (#17)
Re: Making pg_rewind faster

Hi,

On Wed, Jul 2, 2025 at 6:40 PM Japin Li <japinli@hotmail.com> wrote:

Splitting the logs from $PGDATA is definitely better. The question is whether
it's worth implementing this directly in core or if a prominent note in the
documentation would suffice.

I can work on the documentation update as a separate patch if folks
think this is worthwhile.

On Wed, Jul 2, 2025 at 10:21 AM Japin Li <japinli@hotmail.com> wrote:

Exactly! It's confusing that getFileType() returns file_content_type_t
instead of file_type_t.

Ah yes that is confusing, updated in patch.

For v5 patch:

1.
We could simply use the global WalSegSz variable within decide_file_action(),
eliminating the need to pass wal_segsz_bytes as an argument.

Good point.

2.
For last_common_segno, we could implement it similarly to WalSegSz, avoiding a
signature change for decide_file_actions() and decide_file_action(). I'm not
insisting on this approach, however.

I made it a global as well, and had to include access/xlog_internal.h
in pg_rewind.h but I don't feel strongly about it either way.

Thanks,

--
John Hsu - Amazon Web Services

Attachments:

0006-Avoid-copying-WAL-segments-before-divergence-to-spee.patchapplication/octet-stream; name=0006-Avoid-copying-WAL-segments-before-divergence-to-spee.patchDownload+101-8
#19Japin Li
japinli@hotmail.com
In reply to: John H (#18)
Re: Making pg_rewind faster

On Thu, 03 Jul 2025 at 12:59, John H <johnhyvr@gmail.com> wrote:

Hi,

On Wed, Jul 2, 2025 at 6:40 PM Japin Li <japinli@hotmail.com> wrote:

Splitting the logs from $PGDATA is definitely better. The question is whether
it's worth implementing this directly in core or if a prominent note in the
documentation would suffice.

I can work on the documentation update as a separate patch if folks
think this is worthwhile.

On Wed, Jul 2, 2025 at 10:21 AM Japin Li <japinli@hotmail.com> wrote:

Exactly! It's confusing that getFileType() returns file_content_type_t
instead of file_type_t.

Ah yes that is confusing, updated in patch.

For v5 patch:

1.
We could simply use the global WalSegSz variable within decide_file_action(),
eliminating the need to pass wal_segsz_bytes as an argument.

Good point.

2.
For last_common_segno, we could implement it similarly to WalSegSz, avoiding a
signature change for decide_file_actions() and decide_file_action(). I'm not
insisting on this approach, however.

I made it a global as well, and had to include access/xlog_internal.h
in pg_rewind.h but I don't feel strongly about it either way.

Thanks, LGTM.

--
Regards,
Japin Li

#20wenhui qiu
qiuwenhuifx@gmail.com
In reply to: Japin Li (#19)
Re: Making pg_rewind faster

Hi

Thanks, LGTM.

I think it's possible to register to https://commitfest.postgresql.org/54,
https://commitfest.postgresql.org/53 will closed soon

Thanks

On Fri, Jul 4, 2025 at 10:50 AM Japin Li <japinli@hotmail.com> wrote:

Show quoted text

On Thu, 03 Jul 2025 at 12:59, John H <johnhyvr@gmail.com> wrote:

Hi,

On Wed, Jul 2, 2025 at 6:40 PM Japin Li <japinli@hotmail.com> wrote:

Splitting the logs from $PGDATA is definitely better. The question is

whether

it's worth implementing this directly in core or if a prominent note in

the

documentation would suffice.

I can work on the documentation update as a separate patch if folks
think this is worthwhile.

On Wed, Jul 2, 2025 at 10:21 AM Japin Li <japinli@hotmail.com>

wrote:

Exactly! It's confusing that getFileType() returns file_content_type_t
instead of file_type_t.

Ah yes that is confusing, updated in patch.

For v5 patch:

1.
We could simply use the global WalSegSz variable within

decide_file_action(),

eliminating the need to pass wal_segsz_bytes as an argument.

Good point.

2.
For last_common_segno, we could implement it similarly to WalSegSz,

avoiding a

signature change for decide_file_actions() and decide_file_action().

I'm not

insisting on this approach, however.

I made it a global as well, and had to include access/xlog_internal.h
in pg_rewind.h but I don't feel strongly about it either way.

Thanks, LGTM.

--
Regards,
Japin Li

#21John H
johnhyvr@gmail.com
In reply to: wenhui qiu (#20)
#22Robert Haas
robertmhaas@gmail.com
In reply to: John H (#21)
#23John H
johnhyvr@gmail.com
In reply to: Robert Haas (#22)
#24Srinath Reddy Sadipiralla
srinath2133@gmail.com
In reply to: John H (#23)
#25John H
johnhyvr@gmail.com
In reply to: Srinath Reddy Sadipiralla (#24)
#26Srinath Reddy Sadipiralla
srinath2133@gmail.com
In reply to: John H (#23)
#27John H
johnhyvr@gmail.com
In reply to: Srinath Reddy Sadipiralla (#26)
#28Srinath Reddy Sadipiralla
srinath2133@gmail.com
In reply to: John H (#27)
#29Srinath Reddy Sadipiralla
srinath2133@gmail.com
In reply to: Srinath Reddy Sadipiralla (#28)
#30Robert Haas
robertmhaas@gmail.com
In reply to: Srinath Reddy Sadipiralla (#24)
#31Robert Haas
robertmhaas@gmail.com
In reply to: Srinath Reddy Sadipiralla (#28)
#32John H
johnhyvr@gmail.com
In reply to: Srinath Reddy Sadipiralla (#28)
#33John H
johnhyvr@gmail.com
In reply to: Robert Haas (#31)
#34Justin Kwan
justinpkwan@outlook.com
In reply to: John H (#33)
#35Srinath Reddy Sadipiralla
srinath2133@gmail.com
In reply to: Robert Haas (#30)
#36Srinath Reddy Sadipiralla
srinath2133@gmail.com
In reply to: John H (#32)
#37John H
johnhyvr@gmail.com
In reply to: Srinath Reddy Sadipiralla (#36)
#38Srinath Reddy Sadipiralla
srinath2133@gmail.com
In reply to: John H (#37)
#39John H
johnhyvr@gmail.com
In reply to: Srinath Reddy Sadipiralla (#38)
#40Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#22)
#41Michael Paquier
michael@paquier.xyz
In reply to: John H (#39)
#42Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#41)
#43Srinath Reddy Sadipiralla
srinath2133@gmail.com
In reply to: Michael Paquier (#41)
#44John H
johnhyvr@gmail.com
In reply to: Srinath Reddy Sadipiralla (#43)
#45Michael Paquier
michael@paquier.xyz
In reply to: John H (#44)
#46Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#45)
#47Srinath Reddy Sadipiralla
srinath2133@gmail.com
In reply to: Michael Paquier (#46)
#48Michael Paquier
michael@paquier.xyz
In reply to: Srinath Reddy Sadipiralla (#47)
#49Srinath Reddy Sadipiralla
srinath2133@gmail.com
In reply to: Michael Paquier (#48)
#50Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#42)
#51Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#50)