Move backup-related code to xlogbackup.c/.h

Started by Bharath Rupireddyover 3 years ago29 messageshackers
Jump to latest
#1Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com

Hi,

xlog.c currently has ~9000 LOC, out of which ~700 LOC is backup
related, making the file really unmanageable. The commit
7d708093b7400327658a30d1aa1d5e284d37622c added new files
xlogbackup.c/.h for hosting all backup related code eventually. I
propose to move all the backp related code from xlog.c and xlogfuncs.c
to xlogbackup.c/.h. In doing so, I had to add a few Get/Set functions
for XLogCtl variables so that xlogbackup.c can use them.

I'm attaching a patch set where 0001 and 0002 move backup code from
xlogfuncs.c and xlog.c to xlogbackup.c/.h respectively. The advantage
is that all the core's backup code is in one single file making it
more readable and manageable while reducing the xlog.c's file size.

Thoughts?

Thanks Michael Paquier for suggesting to have new files for backup related code.

[1]: /messages/by-id/CALj2ACX0wjo+49hbUmvc_zT1zwdqFOQyhorN0Ox-Rk6v97Nejw@mail.gmail.com

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Move-backup-related-code-from-xlogfuncs.c-to-xlog.patchapplication/x-patch; name=v1-0001-Move-backup-related-code-from-xlogfuncs.c-to-xlog.patchDownload+132-132
v1-0002-Move-backup-related-code-from-xlog.c-to-xlogbacku.patchapplication/x-patch; name=v1-0002-Move-backup-related-code-from-xlog.c-to-xlogbacku.patchDownload+880-746
#2Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#1)
Re: Move backup-related code to xlogbackup.c/.h

On Wed, Sep 28, 2022 at 08:16:08PM +0530, Bharath Rupireddy wrote:

In doing so, I had to add a few Get/Set functions
for XLogCtl variables so that xlogbackup.c can use them.

I would suggest moving this to a separate prerequisite patch that can be
reviewed independently from the patches that simply move code to a
different file.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#3Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#2)
Re: Move backup-related code to xlogbackup.c/.h

On Tue, Oct 04, 2022 at 03:54:20PM -0700, Nathan Bossart wrote:

I would suggest moving this to a separate prerequisite patch that can be
reviewed independently from the patches that simply move code to a
different file.

And FWIW, the SQL interfaces for pg_backup_start() and
pg_backup_stop() could stay in xlogfuncs.c. This has the advantage to
centralize in the same file all the SQL-function-specific checks.
--
Michael

#4Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#3)
Re: Move backup-related code to xlogbackup.c/.h

On Wed, Oct 5, 2022 at 1:20 PM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Oct 04, 2022 at 03:54:20PM -0700, Nathan Bossart wrote:

I would suggest moving this to a separate prerequisite patch that can be
reviewed independently from the patches that simply move code to a
different file.

I added the new functions in 0001 patch for ease of review.

And FWIW, the SQL interfaces for pg_backup_start() and
pg_backup_stop() could stay in xlogfuncs.c. This has the advantage to
centralize in the same file all the SQL-function-specific checks.

Agreed.

+extern void WALInsertLockAcquire(void);
+extern void WALInsertLockAcquireExclusive(void);
+extern void WALInsertLockRelease(void);
+extern void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);

Note that I had moved all WAL insert lock related functions to xlog.h
despite xlogbackup.c using 2 of them. This is done to keep all the
functions together.

Please review the attached v2 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v2-0001-Add-functions-for-xlogbackup.c-to-call-back-into-.patchapplication/x-patch; name=v2-0001-Add-functions-for-xlogbackup.c-to-call-back-into-.patchDownload+128-9
v2-0002-Move-backup-related-code-from-xlog.c-to-xlogbacku.patchapplication/x-patch; name=v2-0002-Move-backup-related-code-from-xlog.c-to-xlogbacku.patchDownload+756-738
#5Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#3)
Re: Move backup-related code to xlogbackup.c/.h

On 2022-Oct-05, Michael Paquier wrote:

And FWIW, the SQL interfaces for pg_backup_start() and
pg_backup_stop() could stay in xlogfuncs.c. This has the advantage to
centralize in the same file all the SQL-function-specific checks.

As I recall, that has the disadvantage that the API exposure is a bit
higher -- I mean, with the patch as originally posted, there was less
cross-inclusion of header files, but that is gone in the version Bharat
posted as reply to this. I'm not sure if that's caused by *this*
comment, or even that it's terribly significant, but it seems worth
considering at least.

xlog.h is included by a lot of stuff, so it would be great if it
itself included the smallest set of other files possible.

... that said, looking at the chart in
https://doxygen.postgresql.org//xlog_8h.html looks like the only file
we'd avoid indirectly including is pgtime.h (in addition to xlogbackup.h
itself).

(It's strange that xlog.h seems to have become included into rel.h by
commit 848ef42bb8c7 that did not otherwise touch either rel.h nor xlog.h.)

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Nunca se desea ardientemente lo que solo se desea por razón" (F. Alexandre)

#6Nathan Bossart
nathandbossart@gmail.com
In reply to: Bharath Rupireddy (#4)
Re: Move backup-related code to xlogbackup.c/.h

On Wed, Oct 05, 2022 at 03:22:01PM +0530, Bharath Rupireddy wrote:

On Tue, Oct 04, 2022 at 03:54:20PM -0700, Nathan Bossart wrote:

I would suggest moving this to a separate prerequisite patch that can be
reviewed independently from the patches that simply move code to a
different file.

I added the new functions in 0001 patch for ease of review.

Can we also replace the relevant code with calls to these functions in
0001? That way, we can more easily review the changes you are making to
this code separately from the large patch that just moves the code.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#7Andres Freund
andres@anarazel.de
In reply to: Bharath Rupireddy (#4)
Re: Move backup-related code to xlogbackup.c/.h

Hi,

On 2022-10-05 15:22:01 +0530, Bharath Rupireddy wrote:

+extern void WALInsertLockAcquire(void);
+extern void WALInsertLockAcquireExclusive(void);
+extern void WALInsertLockRelease(void);
+extern void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);

Note that I had moved all WAL insert lock related functions to xlog.h
despite xlogbackup.c using 2 of them. This is done to keep all the
functions together.

Please review the attached v2 patch set.

I'm doubtful it's a good idea to expose these to outside of xlog.c - they are
very low level, and it's very easy to break stuff by using them wrongly. IMO,
if that's necessary, the split isn't right.

Greetings,

Andres Freund

#8Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Andres Freund (#7)
Re: Move backup-related code to xlogbackup.c/.h

On Thu, Oct 6, 2022 at 4:50 AM Andres Freund <andres@anarazel.de> wrote:

I'm doubtful it's a good idea to expose these to outside of xlog.c - they are
very low level, and it's very easy to break stuff by using them wrongly.

Hm. Here's the v3 patch set without exposing WAL insert lock related
functions. Please have a look.

On Thu, Oct 6, 2022 at 4:22 AM Nathan Bossart <nathandbossart@gmail.com> wrote:

Can we also replace the relevant code with calls to these functions in
0001? That way, we can more easily review the changes you are making to
this code separately from the large patch that just moves the code.

Done. Please have a look at 0001.

On Wed, Oct 5, 2022 at 11:28 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2022-Oct-05, Michael Paquier wrote:

And FWIW, the SQL interfaces for pg_backup_start() and
pg_backup_stop() could stay in xlogfuncs.c. This has the advantage to
centralize in the same file all the SQL-function-specific checks.

As I recall, that has the disadvantage that the API exposure is a bit
higher -- I mean, with the patch as originally posted, there was less
cross-inclusion of header files, but that is gone in the version Bharat
posted as reply to this. I'm not sure if that's caused by *this*
comment, or even that it's terribly significant, but it seems worth
considering at least.

FWIW, I'm attaching 0003 patch for moving backup functions from
xlogfuncs.c to xlogbackup.c. It's natural to have them there when
we're moving backup related things, this also reduces backup code
footprint. We can leave xlogfuncs.c for WAL related SQL-callable
functions.

Please review the attached v3 patch set.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v3-0001-Add-functions-for-xlogbackup.c-to-call-back-into-.patchapplication/x-patch; name=v3-0001-Add-functions-for-xlogbackup.c-to-call-back-into-.patchDownload+142-65
v3-0002-Move-backup-related-code-from-xlog.c-to-xlogbacku.patchapplication/x-patch; name=v3-0002-Move-backup-related-code-from-xlog.c-to-xlogbacku.patchDownload+717-703
v3-0003-Move-backup-related-code-from-xlogfuncs.c-to-xlog.patchapplication/x-patch; name=v3-0003-Move-backup-related-code-from-xlogfuncs.c-to-xlog.patchDownload+133-131
#9Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bharath Rupireddy (#8)
Re: Move backup-related code to xlogbackup.c/.h

On 2022-Oct-06, Bharath Rupireddy wrote:

On Thu, Oct 6, 2022 at 4:50 AM Andres Freund <andres@anarazel.de> wrote:

I'm doubtful it's a good idea to expose these to outside of xlog.c - they are
very low level, and it's very easy to break stuff by using them wrongly.

Hm. Here's the v3 patch set without exposing WAL insert lock related
functions. Please have a look.

Hmm, I don't like your 0001 very much. This sort of thing:

+/*
+ * Get the ControlFile.
+ */
+ControlFileData *
+GetControlFile(void)
+{
+       return ControlFile;
+}

looks too easy to misuse; what about locking? Also, isn't the addition
of ControlFile as a variable in do_pg_backup_start going to cause shadow
variable warnings? Given the locking requirements, I think it would be
feasible to copy stuff out of ControlFile under lock, then return the
copies.

+/*
+ * Increment runningBackups and forcePageWrites.
+ *
+ * NOTE: This function is tailor-made for use in xlogbackup.c. It doesn't set
+ * the respective XLogCtl members directly, and acquires and releases locks.
+ * Hence be careful when using it elsewhere.
+ */
+void
+SetXLogBackupRelatedInfo(void)

I understand that naming is difficult, but I think "Set foo Related
Info" seems way too vague. And the comment says "it doesn't set stuff
directly", and then it goes and sets stuff directly. What gives?

You added some commentary that these functions are tailor-made for
internal operations, and then declared them in the most public header
function that xlog has? I think at the bare minimum, these prototypes
should be in xlog_internal.h, not xlog.h.

I didn't look at 0002 and 0003 other than to notice that xlogbackup.h is
no longer removed from xlog.h. So what is the point of all this?

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/

#10Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Alvaro Herrera (#9)
Re: Move backup-related code to xlogbackup.c/.h

On Wed, Oct 12, 2022 at 1:04 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Hm. Here's the v3 patch set without exposing WAL insert lock related
functions. Please have a look.

Hmm, I don't like your 0001 very much. This sort of thing:

Thanks for reviewing.

+ControlFileData *
+GetControlFile(void)

looks too easy to misuse; what about locking? Also, isn't the addition
of ControlFile as a variable in do_pg_backup_start going to cause shadow
variable warnings? Given the locking requirements, I think it would be
feasible to copy stuff out of ControlFile under lock, then return the
copies.

+1. Done that way.

+/*
+ * Increment runningBackups and forcePageWrites.
+ *
+ * NOTE: This function is tailor-made for use in xlogbackup.c. It doesn't set
+ * the respective XLogCtl members directly, and acquires and releases locks.
+ * Hence be careful when using it elsewhere.
+ */
+void
+SetXLogBackupRelatedInfo(void)

I understand that naming is difficult, but I think "Set foo Related
Info" seems way too vague.

I've used SetXLogBackupActivity() and ResetXLogBackupActivity()
because they match with the members that these functions deal with.

And the comment says "it doesn't set stuff
directly", and then it goes and sets stuff directly. What gives?

My bad. That comment was meant for the reset function above. However,
I've removed it entirely now because one can look at the function and
infer that the forcePageWrites isn't set directly but only when
runningBackups is 0.

You added some commentary that these functions are tailor-made for
internal operations, and then declared them in the most public header
function that xlog has? I think at the bare minimum, these prototypes
should be in xlog_internal.h, not xlog.h.

I removed such comments. These are functions used by xlogbackup.c to
call back into xlog.c similar to the call back functions defined in
xlog.h for xlogrecovery.c. And, most of the XLogCtl set/get sort of
function declarations are in xlog.h. So, I'm retaining them in xlog.h.

I didn't look at 0002 and 0003 other than to notice that xlogbackup.h is
no longer removed from xlog.h. So what is the point of all this?

The whole idea is to move as much as possible backup related code to
xlogbackup.c/.h because xlog.c has already grown.

I've earlier moved macros BACKUP_LABEL_FILE, TABLESPACE_MAP to
xlogbackup.h, but I think they're good to stay in xlog.h as they're
being used in xlog.c and xlogrecovery.c. This reduces the xlogbackup.h
footprint a bit - we don't need xlogbackup.h in xlogrecovery.c.

Another reason we need xlogbackup.h in xlog.h is for
SessionBackupState and it needs to be set before we release WAL insert
locks, see the comment [1]* You might think that WALInsertLockRelease() can be called before * cleaning up session-level lock because session-level lock doesn't need * to be protected with WAL insertion lock. But since * CHECK_FOR_INTERRUPTS() can occur in it, session-level lock must be * cleaned up before it. */. Well, for this reason, should we move all
xlogbackup.c callbacks for xlog.c to xlog_internal.h? Or should we
just remove the SessionBackupState enum and convert
SESSION_BACKUP_NONE and SESSION_BACKUP_RUNNING to just macros in
xlogbackup.h and use integer type to pass the state across? I don't
know what's better here. Thoughts?

I'm attaching the v4 patch set, please review it further.

[1]: * You might think that WALInsertLockRelease() can be called before * cleaning up session-level lock because session-level lock doesn't need * to be protected with WAL insertion lock. But since * CHECK_FOR_INTERRUPTS() can occur in it, session-level lock must be * cleaned up before it. */
* You might think that WALInsertLockRelease() can be called before
* cleaning up session-level lock because session-level lock doesn't need
* to be protected with WAL insertion lock. But since
* CHECK_FOR_INTERRUPTS() can occur in it, session-level lock must be
* cleaned up before it.
*/

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v4-0001-Add-functions-for-xlogbackup.c-to-call-back-into-.patchapplication/x-patch; name=v4-0001-Add-functions-for-xlogbackup.c-to-call-back-into-.patchDownload+145-76
v4-0002-Move-backup-related-code-from-xlog.c-to-xlogbacku.patchapplication/x-patch; name=v4-0002-Move-backup-related-code-from-xlog.c-to-xlogbacku.patchDownload+694-683
v4-0003-Move-backup-related-code-from-xlogfuncs.c-to-xlog.patchapplication/x-patch; name=v4-0003-Move-backup-related-code-from-xlogfuncs.c-to-xlog.patchDownload+135-132
#11Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bharath Rupireddy (#10)
Re: Move backup-related code to xlogbackup.c/.h

On 2022-Oct-13, Bharath Rupireddy wrote:

On Wed, Oct 12, 2022 at 1:04 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

You added some commentary that these functions are tailor-made for
internal operations, and then declared them in the most public header
function that xlog has? I think at the bare minimum, these prototypes
should be in xlog_internal.h, not xlog.h.

I removed such comments. These are functions used by xlogbackup.c to
call back into xlog.c similar to the call back functions defined in
xlog.h for xlogrecovery.c. And, most of the XLogCtl set/get sort of
function declarations are in xlog.h. So, I'm retaining them in xlog.h.

As I see it, xlog.h is a header that exports XLOG manipulations to the
outside world (everything that produces WAL, as well as stuff that
controls operation); xlog_internal is the header that exports xlog*.c
internal stuff for other xlog*.c files and specialized frontends to use.
These new functions are internal to xlogbackup.c and xlog.c, so IMO they
belong in xlog_internal.h.

Stuff that is used from xlog.c only by xlogrecovery.c should also appear
in xlog_internal.h only, not xlog.h, so I suggest not to take that as
precedent. Also, that file (xlogrecovery.c) is pretty new so we haven't
had time to nail down the .h layout yet.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/

#12Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Alvaro Herrera (#11)
Re: Move backup-related code to xlogbackup.c/.h

On Thu, Oct 13, 2022 at 3:12 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

As I see it, xlog.h is a header that exports XLOG manipulations to the
outside world (everything that produces WAL, as well as stuff that
controls operation); xlog_internal is the header that exports xlog*.c
internal stuff for other xlog*.c files and specialized frontends to use.
These new functions are internal to xlogbackup.c and xlog.c, so IMO they
belong in xlog_internal.h.

Stuff that is used from xlog.c only by xlogrecovery.c should also appear
in xlog_internal.h only, not xlog.h, so I suggest not to take that as
precedent. Also, that file (xlogrecovery.c) is pretty new so we haven't
had time to nail down the .h layout yet.

Hm. Agree. But, that requires us to include xlogbackup.h in
xlog_internal.h for SessionBackupState enum in
ResetXLogBackupActivity(). Is that okay?

SessionBackupState and it needs to be set before we release WAL insert
locks, see the comment [1]* You might think that WALInsertLockRelease() can be called before * cleaning up session-level lock because session-level lock doesn't need * to be protected with WAL insertion lock. But since * CHECK_FOR_INTERRUPTS() can occur in it, session-level lock must be * cleaned up before it. */. Should we just remove the
SessionBackupState enum and convert SESSION_BACKUP_NONE and
SESSION_BACKUP_RUNNING to just macros in xlogbackup.h and use integer
type to pass the state across? I don't know what's better here. Do you
have any thoughts on this?

[1]: * You might think that WALInsertLockRelease() can be called before * cleaning up session-level lock because session-level lock doesn't need * to be protected with WAL insertion lock. But since * CHECK_FOR_INTERRUPTS() can occur in it, session-level lock must be * cleaned up before it. */
* You might think that WALInsertLockRelease() can be called before
* cleaning up session-level lock because session-level lock doesn't need
* to be protected with WAL insertion lock. But since
* CHECK_FOR_INTERRUPTS() can occur in it, session-level lock must be
* cleaned up before it.
*/

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#13Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bharath Rupireddy (#12)
Re: Move backup-related code to xlogbackup.c/.h

On 2022-Oct-13, Bharath Rupireddy wrote:

Hm. Agree. But, that requires us to include xlogbackup.h in
xlog_internal.h for SessionBackupState enum in
ResetXLogBackupActivity(). Is that okay?

It's not great, but it's not *that* bad, ISTM, mainly because
xlog_internal.h will affect less stuff than xlog.h.

SessionBackupState and it needs to be set before we release WAL insert
locks, see the comment [1].

I see. Maybe we could keep that enum in xlog.h, instead.

While looking at how that works: I think calling a local variable
"session_backup_state" is super confusing, seeing that we have a
file-global variable called sessionBackupState. I recommend naming the
local "newstate" or something along those lines instead.

I wonder why does pg_backup_start_callback() not change the backup state
before your patch. This seems a gratuitous difference, or is it? If
you change that code so that it also sets the status to BACKUP_NONE,
then you can pass a bare SessionBackupState to ResetXLogBackupActivity
rather than a pointer to one, which is a very strange arrangement that
exists only so that you can have a third state (NULL) meaning "don't
change state" -- that looks quite weird.

Alternatively, if you don't want or can't change
pg_backup_start_callback to pass a valid state value, another solution
might be to pass a separate boolean "change state".

But I would look at having another patch before your series that changes
pg_backup_start_callback to make the code identical for the three
callers, then you can simplify the patched code.

Should we just remove the
SessionBackupState enum and convert SESSION_BACKUP_NONE and
SESSION_BACKUP_RUNNING to just macros in xlogbackup.h and use integer
type to pass the state across? I don't know what's better here. Do you
have any thoughts on this?

No, please, no passing of unadorned magic numbers.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/

#14Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#13)
Re: Move backup-related code to xlogbackup.c/.h

On Thu, Oct 13, 2022 at 7:13 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2022-Oct-13, Bharath Rupireddy wrote:

Hm. Agree. But, that requires us to include xlogbackup.h in
xlog_internal.h for SessionBackupState enum in
ResetXLogBackupActivity(). Is that okay?

It's not great, but it's not *that* bad, ISTM, mainly because
xlog_internal.h will affect less stuff than xlog.h.

This is unfortunately a lot less true than I would like. I count 75
places where we #include "access/xlog.h" and 53 where we #include
"access/xlog_internal.h". And many of those are in frontend code. I
feel like the contents of xlog_internal.h are a bit too eclectic.
Maybe stuff that has to do with the on-disk directory structure, like
XLOGDIR and XLOG_FNAME_LEN, as well as stuff that has to do with where
bytes are located, like XLByteToSeg, should move to another file.
Besides that, which is the biggest part of the file, there's also
stuff that has to do with the page and record format generally (like
XLOG_PAGE_MAGIC and SizeOfXLogShortPHD) and stuff that is used for
certain specific WAL record types (like xl_parameter_change and
xl_restore_point) and some random rmgr-related things (like RmgrData
and the stuff that folllows) and the usual assortment of random GUCs
and global variables (like RecoveryTargetAction and
ArchiveRecoveryRequested). Maybe it doesn't make sense to split this
up into a thousand tiny little header files, but I think some
rethinking would be a good idea, because it really doesn't make much
sense to me to mix stuff that has to do with file-naming conventions,
which a bunch of frontend code needs to know about, together with a
bunch of backend-only things.

--
Robert Haas
EDB: http://www.enterprisedb.com

#15Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Alvaro Herrera (#13)
Re: Move backup-related code to xlogbackup.c/.h

On Thu, Oct 13, 2022 at 4:43 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

Thanks for reviewing.

Hm. Agree. But, that requires us to include xlogbackup.h in
xlog_internal.h for SessionBackupState enum in
ResetXLogBackupActivity(). Is that okay?

It's not great, but it's not *that* bad, ISTM, mainly because
xlog_internal.h will affect less stuff than xlog.h.

Moved them to xlog_internal.h without xlogbackup.h included, please see below.

SessionBackupState and it needs to be set before we release WAL insert
locks, see the comment [1].

I see. Maybe we could keep that enum in xlog.h, instead.

It's not required now, please see below.

While looking at how that works: I think calling a local variable
"session_backup_state" is super confusing, seeing that we have a
file-global variable called sessionBackupState. I recommend naming the
local "newstate" or something along those lines instead.

I wonder why does pg_backup_start_callback() not change the backup state
before your patch. This seems a gratuitous difference, or is it? If
you change that code so that it also sets the status to BACKUP_NONE,
then you can pass a bare SessionBackupState to ResetXLogBackupActivity
rather than a pointer to one, which is a very strange arrangement that
exists only so that you can have a third state (NULL) meaning "don't
change state" -- that looks quite weird.

Alternatively, if you don't want or can't change
pg_backup_start_callback to pass a valid state value, another solution
might be to pass a separate boolean "change state".

But I would look at having another patch before your series that changes
pg_backup_start_callback to make the code identical for the three
callers, then you can simplify the patched code.

The pg_backup_start_callback() can just go ahead and reset
sessionBackupState. However, it leads us to the complete removal of
pg_backup_start_callback() itself and use do_pg_abort_backup()
consistently across, saving 20 LOC attached as v5-0001.

With this, the other patches would get simplified a bit too,
xlogbackup.h footprint got reduced now.

Please find the v5 patch-set. 0002-0004 moves the backup code to
xlogbackup.c/.h.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v5-0001-Use-do_pg_abort_backup-consistently-across-the-ba.patchapplication/x-patch; name=v5-0001-Use-do_pg_abort_backup-consistently-across-the-ba.patchDownload+2-22
v5-0002-Add-functions-for-xlogbackup.c-to-call-back-into-.patchapplication/x-patch; name=v5-0002-Add-functions-for-xlogbackup.c-to-call-back-into-.patchDownload+141-66
v5-0003-Move-backup-related-code-from-xlog.c-to-xlogbacku.patchapplication/x-patch; name=v5-0003-Move-backup-related-code-from-xlog.c-to-xlogbacku.patchDownload+685-672
v5-0004-Move-backup-related-code-from-xlogfuncs.c-to-xlog.patchapplication/x-patch; name=v5-0004-Move-backup-related-code-from-xlogfuncs.c-to-xlog.patchDownload+133-133
#16Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bharath Rupireddy (#15)
Re: Move backup-related code to xlogbackup.c/.h

On 2022-Oct-13, Bharath Rupireddy wrote:

The pg_backup_start_callback() can just go ahead and reset
sessionBackupState. However, it leads us to the complete removal of
pg_backup_start_callback() itself and use do_pg_abort_backup()
consistently across, saving 20 LOC attached as v5-0001.

OK, that's not bad -- but there is a fatal flaw here: do_pg_backup_start
only sets sessionBackupState *after* it has finished setting things up,
so if you only change it like this, do_pg_abort_backup will indeed run,
but it'll do nothing because it hits the "quick exit" test. Therefore,
if a backup aborts while setting up, you'll keep running with forced
page writes until next postmaster crash or restart. Not good.

ISTM we need to give another flag to the callback function besides
emit_warning: one that says whether to test sessionBackupState or not.
I suppose the easiest way to do it with no other changes is to turn
'arg' into a bitmask.
But alternatively, we could just remove emit_warning as a flag and have
the warning be emitted always; then we can use the boolean for the other
purpose. I don't think the extra WARNING thrown during backup set-up is
going to be a problem, since it will mostly never be seen anyway (and if
you do see it, it's not a lie.)

However, what's most problematic about this patch is that it introduces
a pretty serious bug, yet that bug goes unnoticed if you just run the
builtin test suites. I only noticed because I added an elog(ERROR,
"oops") in the area protected by ENSURE_ERROR_CLEANUP and a debug
elog(WARNING) in the cleanup area, then examined the server log after
the pg_basebackup test filed; but this is not very workable. I wonder
what would be a good way to keep this in check. The naive way seems to
be to run a pg_basebackup, have it abort partway through (how?), then
test the server and see if forced page writes are enabled or not.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"The problem with the facetime model is not just that it's demoralizing, but
that the people pretending to work interrupt the ones actually working."
(Paul Graham)

#17Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#16)
Re: Move backup-related code to xlogbackup.c/.h

On Fri, Oct 14, 2022 at 10:24:41AM +0200, Alvaro Herrera wrote:

However, what's most problematic about this patch is that it introduces
a pretty serious bug, yet that bug goes unnoticed if you just run the
builtin test suites. I only noticed because I added an elog(ERROR,
"oops") in the area protected by ENSURE_ERROR_CLEANUP and a debug
elog(WARNING) in the cleanup area, then examined the server log after
the pg_basebackup test filed; but this is not very workable. I wonder
what would be a good way to keep this in check. The naive way seems to
be to run a pg_basebackup, have it abort partway through (how?), then
test the server and see if forced page writes are enabled or not.

See around the bottom of 010_pg_basebackup.pl, where a combination of
IPC::Run::start('pg_basebackup') with --max-rate and
pg_terminate_backend() is able to achieve that.
--
Michael

#18Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Alvaro Herrera (#16)
Re: Move backup-related code to xlogbackup.c/.h

On Fri, Oct 14, 2022 at 1:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2022-Oct-13, Bharath Rupireddy wrote:

The pg_backup_start_callback() can just go ahead and reset
sessionBackupState. However, it leads us to the complete removal of
pg_backup_start_callback() itself and use do_pg_abort_backup()
consistently across, saving 20 LOC attached as v5-0001.

OK, that's not bad -- but there is a fatal flaw here: do_pg_backup_start
only sets sessionBackupState *after* it has finished setting things up,
so if you only change it like this, do_pg_abort_backup will indeed run,
but it'll do nothing because it hits the "quick exit" test. Therefore,
if a backup aborts while setting up, you'll keep running with forced
page writes until next postmaster crash or restart. Not good.

Ugh.

ISTM we need to give another flag to the callback function besides
emit_warning: one that says whether to test sessionBackupState or not.

I think this needs a new structure, something like below, which makes
things complex.
typedef struct pg_abort_backup_params
{
/* This tells whether or not the do_pg_abort_backup callback can
quickly exit. */
bool can_quick_exit;
/* This tells whether or not the do_pg_abort_backup callback can
emit a warning. */
bool emit_warning;
} pg_abort_backup_params;

I suppose the easiest way to do it with no other changes is to turn
'arg' into a bitmask.

This one too isn't good IMO.

But alternatively, we could just remove emit_warning as a flag and have
the warning be emitted always; then we can use the boolean for the other
purpose. I don't think the extra WARNING thrown during backup set-up is
going to be a problem, since it will mostly never be seen anyway (and if
you do see it, it's not a lie.)

+1 for this.

Please review the v6 patch-set further.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v6-0001-Use-do_pg_abort_backup-consistently-across-the-ba.patchapplication/octet-stream; name=v6-0001-Use-do_pg_abort_backup-consistently-across-the-ba.patchDownload+11-30
v6-0002-Add-functions-for-xlogbackup.c-to-call-back-into-.patchapplication/octet-stream; name=v6-0002-Add-functions-for-xlogbackup.c-to-call-back-into-.patchDownload+141-66
v6-0003-Move-backup-related-code-from-xlog.c-to-xlogbacku.patchapplication/octet-stream; name=v6-0003-Move-backup-related-code-from-xlog.c-to-xlogbacku.patchDownload+686-673
v6-0004-Move-backup-related-code-from-xlogfuncs.c-to-xlog.patchapplication/octet-stream; name=v6-0004-Move-backup-related-code-from-xlogfuncs.c-to-xlog.patchDownload+133-133
#19Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bharath Rupireddy (#18)
Re: Move backup-related code to xlogbackup.c/.h

On 2022-Oct-14, Bharath Rupireddy wrote:

On Fri, Oct 14, 2022 at 1:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

But alternatively, we could just remove emit_warning as a flag and have
the warning be emitted always; then we can use the boolean for the other
purpose. I don't think the extra WARNING thrown during backup set-up is
going to be a problem, since it will mostly never be seen anyway (and if
you do see it, it's not a lie.)

+1 for this.

OK, pushed 0001, but I modified it some more, because the flag is not
really a "quick exit" optimization but actually critical for
correctness; so I reworked the function to have an if block around it
rather than an early return, and I added an assert about the flag and
session backup state. CI was green for it and on manual testing it
seems to work correctly.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"No es bueno caminar con un hombre muerto"

#20Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#19)
Re: Move backup-related code to xlogbackup.c/.h

Another point before we move on with your 0002 is that forcePageWrites
is no longer useful and we can remove it, as per the attached.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"No deja de ser humillante para una persona de ingenio saber
que no hay tonto que no le pueda enseñar algo." (Jean B. Say)

Attachments:

0001-get-rid-of-forcePageWrites.patchtext/x-diff; charset=us-asciiDownload+19-32
#21Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Alvaro Herrera (#20)
#22Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bharath Rupireddy (#21)
#23Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Alvaro Herrera (#22)
#24Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#23)
#25Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bharath Rupireddy (#24)
#26Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Alvaro Herrera (#25)
#27Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#26)
#28Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#27)
#29Gregory Stark (as CFM)
stark.cfm@gmail.com
In reply to: Bharath Rupireddy (#28)