Two pg_rewind patches (auto generate recovery conf and ensure clean shutdown)
Hello, Postgres hackers,
Please see the attached patches.
The first patch adds an option to automatically generate recovery conf
contents in related files, following pg_basebackup. In the patch,
GenerateRecoveryConf(), WriteRecoveryConf() and escape_quotes() are almost
same as them on pg_basebackup. The main difference is due to replication
slot support and code (variables) limit. It seems that we could slightly
refactor later to put some common code into another file after aligning
pg_rewind with pg_basebackup. This was tested manually and was done by
Jimmy (cc-ed), Ashiwin (cc-ed) and me.
Another patch does automatic clean shutdown by running a single mode
postgres instance if the target was not clean shut down since that is
required by pg_rewind. This was manually tested and was done by Jimmy
(cc-ed) and me. I'm not sure if we want a test case for that though.
Thanks.
Attachments:
0001-Ensure-target-clean-shutdown-at-beginning-of-pg_rewi.patchapplication/octet-stream; name=0001-Ensure-target-clean-shutdown-at-beginning-of-pg_rewi.patchDownload+60-1
0001-Auto-generate-recovery-conf-at-the-end-of-pg_re.patchapplication/octet-stream; name=0001-Auto-generate-recovery-conf-at-the-end-of-pg_re.patchDownload+166-3
On Tue, Mar 19, 2019 at 02:09:03PM +0800, Paul Guo wrote:
The first patch adds an option to automatically generate recovery conf
contents in related files, following pg_basebackup. In the patch,
GenerateRecoveryConf(), WriteRecoveryConf() and escape_quotes() are almost
same as them on pg_basebackup. The main difference is due to replication
slot support and code (variables) limit. It seems that we could slightly
refactor later to put some common code into another file after aligning
pg_rewind with pg_basebackup. This was tested manually and was done by
Jimmy (cc-ed), Ashiwin (cc-ed) and me.
Interesting. The two routines have really the same logic, I would
recommend to have a first patch which does the refactoring and have
pg_rewind use it, and then a second patch which writes recovery.conf
and uses the first patch to get the contents. Please note that the
common routine needs to be version-aware as pg_basebackup requires
compatibility with past versions, but you could just pass the version
number from the connection, and have pg_rewind pass the compiled-in
version value.
Another patch does automatic clean shutdown by running a single mode
postgres instance if the target was not clean shut down since that is
required by pg_rewind. This was manually tested and was done by Jimmy
(cc-ed) and me. I'm not sure if we want a test case for that though.
I am not sure that I see the value in that. I'd rather let the
required service start and stop out of pg_rewind and not introduce
dependencies with other binaries. This step can also take quite some
time depending on the amount of WAL to replay post-crash at recovery
and the shutdown checkpoint which is required to reach a consistent
on-disk state.
--
Michael
On Tue, Mar 19, 2019 at 2:18 PM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Mar 19, 2019 at 02:09:03PM +0800, Paul Guo wrote:
The first patch adds an option to automatically generate recovery conf
contents in related files, following pg_basebackup. In the patch,
GenerateRecoveryConf(), WriteRecoveryConf() and escape_quotes() arealmost
same as them on pg_basebackup. The main difference is due to replication
slot support and code (variables) limit. It seems that we could slightly
refactor later to put some common code into another file after aligning
pg_rewind with pg_basebackup. This was tested manually and was done by
Jimmy (cc-ed), Ashiwin (cc-ed) and me.Interesting. The two routines have really the same logic, I would
recommend to have a first patch which does the refactoring and have
pg_rewind use it, and then a second patch which writes recovery.conf
and uses the first patch to get the contents. Please note that the
This is a good suggestion also. Will do it.
common routine needs to be version-aware as pg_basebackup requires
compatibility with past versions, but you could just pass the version
number from the connection, and have pg_rewind pass the compiled-in
version value.Another patch does automatic clean shutdown by running a single mode
postgres instance if the target was not clean shut down since that is
required by pg_rewind. This was manually tested and was done by Jimmy
(cc-ed) and me. I'm not sure if we want a test case for that though.I am not sure that I see the value in that. I'd rather let the
required service start and stop out of pg_rewind and not introduce
dependencies with other binaries. This step can also take quite some
This makes recovery more automatically. Yes, it will add the dependency on
the postgres
binary, but it seems that most time pg_rewind should be shipped as postgres
in the same install directory. From my experience of manually testing
pg_rewind,
I feel that this besides auto-recovery-conf writing really alleviate my
burden. I'm not sure how
other users usually do before running pg_rewind when the target is not
cleanly shut down,
but probably we can add an argument to pg_rewind to give those people who
want to
handle target separately another chance? default on or off whatever.
time depending on the amount of WAL to replay post-crash at recovery
and the shutdown checkpoint which is required to reach a consistent
on-disk state.
The time is still required for people who want to make the target ready for
pg_rewind in another way.
Thanks.
On Wed, Mar 20, 2019 at 12:48:52PM +0800, Paul Guo wrote:
This is a good suggestion also. Will do it.
Please note also that we don't care about recovery.conf since v12 as
recovery parameters are now GUCs. I would suggest appending those
extra parameters to postgresql.auto.conf, which is what pg_basebackup
does.
--
Michael
On Wed, Mar 20, 2019 at 1:20 PM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Mar 20, 2019 at 12:48:52PM +0800, Paul Guo wrote:
This is a good suggestion also. Will do it.
Please note also that we don't care about recovery.conf since v12 as
recovery parameters are now GUCs. I would suggest appending those
extra parameters to postgresql.auto.conf, which is what pg_basebackup
does.
Yes, the recovery conf patch in the first email did like this, i.e. writing
postgresql.auto.conf & standby.signal
Hi Michael,
I updated the patches as attached following previous discussions.
The two patches:
v2-0001-Extact-common-functions-from-pg_basebackup-into-s.patch
v2-0002-Add-option-to-write-recovery-configuration-inform.patch
1. 0001 does move those common functions & variables to two new files (one
.c and one .h) for both pg_rewind and pg_basebackup use,
note the functions are slightly modified (e.g. because conn is probably
NULL on pg_rewind). I do not know where is more proper to put the
new files. Currently, they are under pg_basebackup and are used in
pg_rewind (Makefile modified to support that).
2. 0002 adds the option to write recovery conf.
The below patch runs single mode Postgres if needed to make sure the target
is cleanly shutdown. A new option is added (off by default).
v2-0001-Ensure-target-clean-shutdown-at-beginning-of-pg_r.patch
I've manually tested them and installcheck passes.
Thanks.
On Wed, Mar 20, 2019 at 1:23 PM Paul Guo <pguo@pivotal.io> wrote:
Show quoted text
On Wed, Mar 20, 2019 at 1:20 PM Michael Paquier <michael@paquier.xyz>
wrote:On Wed, Mar 20, 2019 at 12:48:52PM +0800, Paul Guo wrote:
This is a good suggestion also. Will do it.
Please note also that we don't care about recovery.conf since v12 as
recovery parameters are now GUCs. I would suggest appending those
extra parameters to postgresql.auto.conf, which is what pg_basebackup
does.Yes, the recovery conf patch in the first email did like this, i.e.
writing postgresql.auto.conf & standby.signal
Attachments:
v2-0001-Ensure-target-clean-shutdown-at-beginning-of-pg_r.patchapplication/octet-stream; name=v2-0001-Ensure-target-clean-shutdown-at-beginning-of-pg_r.patchDownload+68-1
v2-0001-Extact-common-functions-from-pg_basebackup-into-s.patchapplication/octet-stream; name=v2-0001-Extact-common-functions-from-pg_basebackup-into-s.patchDownload+234-183
v2-0002-Add-option-to-write-recovery-configuration-inform.patchapplication/octet-stream; name=v2-0002-Add-option-to-write-recovery-configuration-inform.patchDownload+18-2
On Fri, Apr 19, 2019 at 3:41 PM Paul Guo <pguo@pivotal.io> wrote:
I updated the patches as attached following previous discussions.
Hi Paul,
Could we please have a fresh rebase now that the CF is here?
Thanks,
--
Thomas Munro
https://enterprisedb.com
On 2019-Apr-19, Paul Guo wrote:
The below patch runs single mode Postgres if needed to make sure the target
is cleanly shutdown. A new option is added (off by default).
v2-0001-Ensure-target-clean-shutdown-at-beginning-of-pg_r.patch
Why do we need an option for this? Is there a reason not to do this
unconditionally?
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Rebased, aligned with recent changes in pg_rewind/pg_basebackup and then
retested. Thanks.
On Mon, Jul 1, 2019 at 7:35 PM Thomas Munro <thomas.munro@gmail.com> wrote:
Show quoted text
On Fri, Apr 19, 2019 at 3:41 PM Paul Guo <pguo@pivotal.io> wrote:
I updated the patches as attached following previous discussions.
Hi Paul,
Could we please have a fresh rebase now that the CF is here?
Thanks,
--
Thomas Munro
Attachments:
v3-0001-Extact-common-functions-from-pg_basebackup-into-s.patchapplication/octet-stream; name=v3-0001-Extact-common-functions-from-pg_basebackup-into-s.patchDownload+230-179
v3-0002-Add-option-to-write-recovery-configuration-inform.patchapplication/octet-stream; name=v3-0002-Add-option-to-write-recovery-configuration-inform.patchDownload+18-2
v3-0003-Ensure-target-clean-shutdown-at-the-beginning-of-.patchapplication/octet-stream; name=v3-0003-Ensure-target-clean-shutdown-at-the-beginning-of-.patchDownload+69-2
On Tue, Jul 2, 2019 at 12:35 AM Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
On 2019-Apr-19, Paul Guo wrote:
The below patch runs single mode Postgres if needed to make sure the
target
is cleanly shutdown. A new option is added (off by default).
v2-0001-Ensure-target-clean-shutdown-at-beginning-of-pg_r.patchWhy do we need an option for this? Is there a reason not to do this
unconditionally?
There is concern about this (see previous emails in this thread). On
greenplum (MPP DB based on Postgres),
we unconditionally do this. I'm not sure about usually how Postgres users
do this when there is an unclean shutdown,
but providing an option seem to be safer to avoid breaking existing
script/service whatever. If many people
think this option is unnecessary, I'm fine to remove the option and keep
the code logic.
On Tue, Jul 2, 2019 at 5:46 PM Paul Guo <pguo@pivotal.io> wrote:
Rebased, aligned with recent changes in pg_rewind/pg_basebackup and then retested. Thanks.
Hi Paul,
A minor build problem on Windows:
src/bin/pg_rewind/pg_rewind.c(32): fatal error C1083: Cannot open
include file: 'backup_common.h': No such file or directory
[C:\projects\postgresql\pg_rewind.vcxproj]
https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.46422
http://cfbot.cputube.org/paul-guo.html
--
Thomas Munro
https://enterprisedb.com
Yes, the patches changed Makefile so that pg_rewind and pg_basebackup could
use some common code, but for Windows build, I'm not sure where are those
window build files. Does anyone know about that? Thanks.
On Tue, Jul 9, 2019 at 6:55 AM Thomas Munro <thomas.munro@gmail.com> wrote:
Show quoted text
On Tue, Jul 2, 2019 at 5:46 PM Paul Guo <pguo@pivotal.io> wrote:
Rebased, aligned with recent changes in pg_rewind/pg_basebackup and then
retested. Thanks.
Hi Paul,
A minor build problem on Windows:
src/bin/pg_rewind/pg_rewind.c(32): fatal error C1083: Cannot open
include file: 'backup_common.h': No such file or directory
[C:\projects\postgresql\pg_rewind.vcxproj]--
Thomas Munro
On Tue, Jul 09, 2019 at 10:48:49PM +0800, Paul Guo wrote:
Yes, the patches changed Makefile so that pg_rewind and pg_basebackup could
use some common code, but for Windows build, I'm not sure where are those
window build files. Does anyone know about that? Thanks.
The VS scripts are located in src/tools/msvc/. You will likely need
to tweak things like $frontend_extraincludes or variables in the same
area for this patch (please see Mkvcbuild.pm).
--
Michael
On Wed, Jul 10, 2019 at 3:28 PM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Jul 09, 2019 at 10:48:49PM +0800, Paul Guo wrote:
Yes, the patches changed Makefile so that pg_rewind and pg_basebackup
could
use some common code, but for Windows build, I'm not sure where are those
window build files. Does anyone know about that? Thanks.The VS scripts are located in src/tools/msvc/. You will likely need
to tweak things like $frontend_extraincludes or variables in the same
area for this patch (please see Mkvcbuild.pm).
Thanks. Both Mkvcbuild.pm and pg_rewind/Makefile are modified to make
Windows build pass in a
local environment (Hopefully this passes the CI testing), also now
pg_rewind/Makefile does not
create soft link for backup_common.h anymore. Instead -I is used to specify
the header directory.
I also noticed that doc change is needed so modified documents for the two
new options accordingly.
Please see the attached new patches.
Attachments:
v4-0002-Add-option-to-write-recovery-configuration-inform.patchapplication/octet-stream; name=v4-0002-Add-option-to-write-recovery-configuration-inform.patchDownload+29-2
v4-0003-Ensure-target-clean-shutdown-at-the-beginning-of-.patchapplication/octet-stream; name=v4-0003-Ensure-target-clean-shutdown-at-the-beginning-of-.patchDownload+86-2
v4-0001-Extact-common-functions-from-pg_basebackup-into-s.patchapplication/octet-stream; name=v4-0001-Extact-common-functions-from-pg_basebackup-into-s.patchDownload+228-180
On 2019-Jul-15, Paul Guo wrote:
Thanks. Both Mkvcbuild.pm and pg_rewind/Makefile are modified to make
Windows build pass in a
local environment (Hopefully this passes the CI testing), also now
pg_rewind/Makefile does not
create soft link for backup_common.h anymore. Instead -I is used to specify
the header directory.
It seems there's minor breakage in the build, per CFbot. Can you
please rebase this?
Thanks
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Show quoted text
It seems there's minor breakage in the build, per CFbot. Can you
please rebase this?There is a code conflict. See attached for the new version. Thanks.
Attachments:
v5-0001-Extact-common-functions-from-pg_basebackup-into-s.patchapplication/octet-stream; name=v5-0001-Extact-common-functions-from-pg_basebackup-into-s.patchDownload+228-180
v5-0002-Add-option-to-write-recovery-configuration-inform.patchapplication/octet-stream; name=v5-0002-Add-option-to-write-recovery-configuration-inform.patchDownload+29-2
v5-0003-Ensure-target-clean-shutdown-at-the-beginning-of-.patchapplication/octet-stream; name=v5-0003-Ensure-target-clean-shutdown-at-the-beginning-of-.patchDownload+86-2
Thank for rebasing.
I didn't like 0001 very much.
* It seems now would be the time to stop pretending we're using a file
called recovery.conf; I know we still support older server versions that
use that file, but it sounds like we should take the opportunity to
rename the function to be less misleading once those versions vanish out
of existance.
* disconnect_and_exit seems a bit out of place compared to the other
parts of this new module. I think you only put it there so that the
'conn' can be a global, and that you can stop passing 'conn' as a
variable to GenerateRecoveryConf. It seems more modular to me to keep
it as a separate variable in each program and have it passed down to the
routine that writes the file.
* From modularity also seems better to me to avoid a global variable
'recoveryconfcontents' and instead return the string from
GenerateRecoveryConf to pass as a param to WriteRecoveryConf.
(In fact, I wonder why the current structure is like it is, namely to
have ReceiveAndUnpackTarFile write the file; why wouldn't be its caller
be responsible for writing it?)
I wonder about putting this new file in src/fe_utils instead of keeping
it in pg_basebackup and symlinking to pg_rewind. Maybe if we make it a
true module (recovery_config_gen.c) it makes more sense there.
0002 seems okay as far as it goes.
0003:
I still don't understand why we need a command-line option to do this.
Why should it not be the default behavior?
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Thank for rebasing.
I didn't like 0001 very much.
* It seems now would be the time to stop pretending we're using a file
called recovery.conf; I know we still support older server versions that
use that file, but it sounds like we should take the opportunity to
rename the function to be less misleading once those versions vanish out
of existance.
How about renaming the function names to
GenerateRecoveryConf -> GenerateRecoveryConfContents
WriteRecoveryConf -> WriteRecoveryConfInfo <- it writes standby.signal
on pg12+, so function name WriteRecoveryConfContents is not accurate.
and
variable writerecoveryconf -> write_recovery_conf_info?
* disconnect_and_exit seems a bit out of place compared to the other
parts of this new module. I think you only put it there so that the
'conn' can be a global, and that you can stop passing 'conn' as a
variable to GenerateRecoveryConf. It seems more modular to me to keep
it as a separate variable in each program and have it passed down to the
routine that writes the file.* From modularity also seems better to me to avoid a global variable
'recoveryconfcontents' and instead return the string from
GenerateRecoveryConf to pass as a param to WriteRecoveryConf.
(In fact, I wonder why the current structure is like it is, namely to
have ReceiveAndUnpackTarFile write the file; why wouldn't be its caller
be responsible for writing it?)
Reasonable to make common code include less variables. I can try modifying
the patches to remove the previously added variables below in the common
code.
+/* Contents of configuration file to be generated */
+extern PQExpBuffer recoveryconfcontents;
+
+extern bool writerecoveryconf;
+extern char *replication_slot;
+PGconn *conn;
I wonder about putting this new file in src/fe_utils instead of keeping
it in pg_basebackup and symlinking to pg_rewind. Maybe if we make it a
true module (recovery_config_gen.c) it makes more sense there.I thought some about where to put the common code also. It seems pg_rewind
and pg_basebackup are the only consumers of the small common code. I doubt
it deserves a separate file under src/fe_utils.
0003:
I still don't understand why we need a command-line option to do this.
Why should it not be the default behavior?
This was discussed but frankly speaking I do not know how other postgres
users or enterprise providers handle this (probably some have own scripts?).
I could easily remove the option code if more and more people agree on that
or at least we could turn it on by default?
Thanks
On 2019-Sep-09, Paul Guo wrote:
Thank for rebasing.
I didn't like 0001 very much.
* It seems now would be the time to stop pretending we're using a file
called recovery.conf; I know we still support older server versions that
use that file, but it sounds like we should take the opportunity to
rename the function to be less misleading once those versions vanish out
of existance.How about renaming the function names to
GenerateRecoveryConf -> GenerateRecoveryConfContents
WriteRecoveryConf -> WriteRecoveryConfInfo <- it writes standby.signal
on pg12+, so function name WriteRecoveryConfContents is not accurate.
GenerateRecoveryConfig / WriteRecoveryConfig ?
I wonder about putting this new file in src/fe_utils instead of keeping
it in pg_basebackup and symlinking to pg_rewind. Maybe if we make it a
true module (recovery_config_gen.c) it makes more sense there.I thought some about where to put the common code also. It seems pg_rewind
and pg_basebackup are the only consumers of the small common code. I doubt
it deserves a separate file under src/fe_utils.
Hmm, but other things there are also used by only two programs, say
psqlscan.l and conditional.c are just for psql and pgbench.
0003:
I still don't understand why we need a command-line option to do this.
Why should it not be the default behavior?This was discussed but frankly speaking I do not know how other postgres
users or enterprise providers handle this (probably some have own scripts?).
I could easily remove the option code if more and more people agree on that
or at least we could turn it on by default?
Well, I've seen no contrary votes, and frankly I see no use for the
opposite (current) behavior.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I've updated the patch series following the suggestions. Thanks.