speed up a logical replica setup
Hi,
Logical replication has been used to migration with minimal downtime. However,
if you are dealing with a big database, the amount of required resources (disk
-- due to WAL retention) increases as the backlog (WAL) increases. Unless you
have a generous amount of resources and can wait for long period of time until
the new replica catches up, creating a logical replica is impracticable on
large databases.
The general idea is to create and convert a physical replica or a base backup
(archived WAL files available) into a logical replica. The initial data copy
and catchup tends to be faster on a physical replica. This technique has been
successfully used in pglogical_create_subscriber [1]https://github.com/2ndQuadrant/pglogical.
A new tool called pg_subscriber does this conversion and is tightly integrated
with Postgres.
DESIGN
The conversion requires 8 steps.
1. Check if the target data directory has the same system identifier than the
source data directory.
2. Stop the target server if it is running as a standby server. (Modify
recovery parameters requires a restart.)
3. Create one replication slot per specified database on the source server. One
additional replication slot is created at the end to get the consistent LSN
(This consistent LSN will be used as (a) a stopping point for the recovery
process and (b) a starting point for the subscriptions).
4. Write recovery parameters into the target data directory and start the
target server (Wait until the target server is promoted).
5. Create one publication (FOR ALL TABLES) per specified database on the source
server.
6. Create one subscription per specified database on the target server (Use
replication slot and publication created in a previous step. Don't enable the
subscriptions yet).
7. Sets the replication progress to the consistent LSN that was got in a
previous step.
8. Enable the subscription for each specified database on the target server.
This tool does not take a base backup. It can certainly be included later.
There is already a tool do it: pg_basebackup.
There is a --subscriber-conninfo option to inform the subscriber connection
string, however, we could remove it since this tool runs on the subscriber and
we can build a connection string.
NAME
I'm not sure about the proposed name. I came up with this one because it is not
so long. The last added tools uses pg_ prefix, verb (action) and object.
pg_initsubscriber and pg_createsubscriber are names that I thought but I'm not
excited about it.
DOCUMENTATION
It is available and describes this tool.
TESTS
Basic tests are included. It requires some tests to exercise this tool.
Comments?
[1]: https://github.com/2ndQuadrant/pglogical
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v1-0001-Move-readfile-and-free_readfile-to-file_utils.h.patchtext/x-patch; name=v1-0001-Move-readfile-and-free_readfile-to-file_utils.h.patchDownload+136-135
v1-0002-Create-a-new-logical-replica-from-a-base-backup-o.patchtext/x-patch; name=v1-0002-Create-a-new-logical-replica-from-a-base-backup-o.patchDownload+1789-2
Hi,
On 2022-02-21 09:09:12 -0300, Euler Taveira wrote:
Logical replication has been used to migration with minimal downtime. However,
if you are dealing with a big database, the amount of required resources (disk
-- due to WAL retention) increases as the backlog (WAL) increases. Unless you
have a generous amount of resources and can wait for long period of time until
the new replica catches up, creating a logical replica is impracticable on
large databases.
Indeed.
DESIGN
The conversion requires 8 steps.
1. Check if the target data directory has the same system identifier than the
source data directory.
2. Stop the target server if it is running as a standby server. (Modify
recovery parameters requires a restart.)
3. Create one replication slot per specified database on the source server. One
additional replication slot is created at the end to get the consistent LSN
(This consistent LSN will be used as (a) a stopping point for the recovery
process and (b) a starting point for the subscriptions).
4. Write recovery parameters into the target data directory and start the
target server (Wait until the target server is promoted).
5. Create one publication (FOR ALL TABLES) per specified database on the source
server.
6. Create one subscription per specified database on the target server (Use
replication slot and publication created in a previous step. Don't enable the
subscriptions yet).
7. Sets the replication progress to the consistent LSN that was got in a
previous step.
8. Enable the subscription for each specified database on the target server.
I think the system identifier should also be changed, otherwise you can way
too easily get into situations trying to apply WAL from different systems to
each other. Not going to end well, obviously.
This tool does not take a base backup. It can certainly be included later.
There is already a tool do it: pg_basebackup.
It would make sense to allow to call pg_basebackup from the new tool. Perhaps
with a --pg-basebackup-parameters or such.
Greetings,
Andres Freund
On Mon, Feb 21, 2022, at 8:28 PM, Andres Freund wrote:
I think the system identifier should also be changed, otherwise you can way
too easily get into situations trying to apply WAL from different systems to
each other. Not going to end well, obviously.
Good point.
This tool does not take a base backup. It can certainly be included later.
There is already a tool do it: pg_basebackup.It would make sense to allow to call pg_basebackup from the new tool. Perhaps
with a --pg-basebackup-parameters or such.
Yeah. I'm planning to do that in a near future. There are a few questions in my
mind. Should we call the pg_basebackup directly (like
pglogical_create_subscriber does) or use a base backup machinery to obtain the
backup? If we choose the former, it should probably sanitize the
--pg-basebackup-parameters to allow only a subset of the command-line options
(?). AFAICS the latter requires some refactors in the pg_basebackup code --
e.g. expose at least one function (BaseBackup?) that accepts a struct of
command-line options as a parameter and returns success/failure. Another
possibility is to implement a simple BASE_BACKUP command via replication
protocol. The disadvantages are: (a) it could duplicate code and (b) it might
require maintenance if new options are added to the BASE_BACKUP command.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Mon, Feb 21, 2022 at 5:41 PM Euler Taveira <euler@eulerto.com> wrote:
Logical replication has been used to migration with minimal downtime. However,
if you are dealing with a big database, the amount of required resources (disk
-- due to WAL retention) increases as the backlog (WAL) increases. Unless you
have a generous amount of resources and can wait for long period of time until
the new replica catches up, creating a logical replica is impracticable on
large databases.The general idea is to create and convert a physical replica or a base backup
(archived WAL files available) into a logical replica. The initial data copy
and catchup tends to be faster on a physical replica. This technique has been
successfully used in pglogical_create_subscriber [1].
Sounds like a promising idea.
A new tool called pg_subscriber does this conversion and is tightly integrated
with Postgres.DESIGN
The conversion requires 8 steps.
1. Check if the target data directory has the same system identifier than the
source data directory.
2. Stop the target server if it is running as a standby server. (Modify
recovery parameters requires a restart.)
3. Create one replication slot per specified database on the source server. One
additional replication slot is created at the end to get the consistent LSN
(This consistent LSN will be used as (a) a stopping point for the recovery
process and (b) a starting point for the subscriptions).
What is the need to create an extra slot other than the slot for each
database? Can't we use the largest LSN returned by slots as the
recovery-target-lsn and starting point for subscriptions?
How, these additional slots will get freed or reused when say the
server has crashed/stopped after creating the slots but before
creating the subscriptions? Users won't even know the names of such
slots as they are internally created.
4. Write recovery parameters into the target data directory and start the
target server (Wait until the target server is promoted).
5. Create one publication (FOR ALL TABLES) per specified database on the source
server.
6. Create one subscription per specified database on the target server (Use
replication slot and publication created in a previous step. Don't enable the
subscriptions yet).
7. Sets the replication progress to the consistent LSN that was got in a
previous step.
8. Enable the subscription for each specified database on the target server.This tool does not take a base backup. It can certainly be included later.
There is already a tool do it: pg_basebackup.
The backup will take the backup of all the databases present on the
source server. Do we need to provide the way/recommendation to remove
the databases that are not required?
Can we see some numbers with various sizes of databases (cluster) to
see how it impacts the time for small to large size databases as
compared to the traditional method? This might help giving users
advice on when to use this tool?
--
With Regards,
Amit Kapila.
On 2/21/22 13:09, Euler Taveira wrote:
DESIGN
The conversion requires 8 steps.
1. Check if the target data directory has the same system identifier
than the
source data directory.
2. Stop the target server if it is running as a standby server. (Modify
recovery parameters requires a restart.)
3. Create one replication slot per specified database on the source
server. One
additional replication slot is created at the end to get the consistent LSN
(This consistent LSN will be used as (a) a stopping point for the recovery
process and (b) a starting point for the subscriptions).
4. Write recovery parameters into the target data directory and start the
target server (Wait until the target server is promoted).
5. Create one publication (FOR ALL TABLES) per specified database on the
source
server.
6. Create one subscription per specified database on the target server (Use
replication slot and publication created in a previous step. Don't
enable the
subscriptions yet).
7. Sets the replication progress to the consistent LSN that was got in a
previous step.
8. Enable the subscription for each specified database on the target server.
Very interesting!
I actually just a couple of weeks ago proposed a similar design for
upgrading a database of a customer of mine. We have not tried it yet so
it is not decided if we should go ahead with it.
In our case the goal is a bit different so my idea is that we will use
pg_dump/pg_restore (or pg_upgrade and then some manual cleanup if
pg_dump/pg_restore is too slow) on the target server. The goal of this
design is to get a nice clean logical replica at the new version of
PostgreSQL with indexes with the correct collations, all old invalid
constraints validated, minimal bloat, etc. And all of this without
creating bloat or putting too much load on the old master during the
process. We have plenty of disk space and plenty of time so those are
not limitations in our case. I can go into more detail if there is interest.
It is nice to see that our approach is not entirely unique. :) And I
will take a look at this patch when I find the time.
Andreas
On 21.02.22 13:09, Euler Taveira wrote:
A new tool called pg_subscriber does this conversion and is tightly
integrated
with Postgres.
Are we comfortable with the name pg_subscriber? It seems too general.
Are we planning other subscriber-related operations in the future? If
so, we should at least make this one use a --create option or
something like that.
doc/src/sgml/ref/pg_subscriber.sgml
Attached is a patch that reorganizes the man page a bit. I moved the
description of the steps to the Notes section and formatted it
differently. I think the steps are interesting but not essential for
the using of the program, so I wanted to get them out of the main
description.
src/bin/pg_subscriber/pg_subscriber.c
+ if (made_temp_replslot)
+ {
+ conn = connect_database(dbinfo[0].pubconninfo, true);
+ drop_replication_slot(conn, &dbinfo[0], temp_replslot);
+ disconnect_database(conn);
+ }
Temp slots don't need to be cleaned up.
+/*
+ * Obtain the system identifier from control file. It will be used to
compare
+ * if a data directory is a clone of another one. This routine is used
locally
+ * and avoids a replication connection.
+ */
+static char *
+get_control_from_datadir(const char *datadir)
This could return uint64 directly, without string conversion.
get_sysid_from_conn() could then convert to uint64 internally.
+ {"verbose", no_argument, NULL, 'v'},
I'm not sure if the --verbose option is all that useful.
+ {"stop-subscriber", no_argument, NULL, 1},
This option doesn't seem to be otherwise supported or documented.
+ pub_sysid = pg_malloc(32);
+ pub_sysid = get_sysid_from_conn(dbinfo[0].pubconninfo);
+ sub_sysid = pg_malloc(32);
+ sub_sysid = get_control_from_datadir(subscriber_dir);
These mallocs don't appears to be of any use.
+ dbname_conninfo = pg_malloc(NAMEDATALEN);
This seems wrong.
Overall, this code could use a little bit more structuring. There are
a lot of helper functions that don't seem to do a lot and are mostly
duplicate runs-this-SQL-command calls. But the main() function is
still huge. There is room for refinement.
src/bin/pg_subscriber/t/001_basic.pl
Good start, but obviously, we'll need some real test cases here also.
src/bin/initdb/initdb.c
src/bin/pg_ctl/pg_ctl.c
src/common/file_utils.c
src/include/common/file_utils.h
I recommend skipping this refactoring. The readfile() function from
pg_ctl is not general enough to warrant the pride of place of a
globally available function. Note that it is specifically geared
toward some of pg_ctl's requirements, for example that the underlying
file can change while it is being read.
The requirements of pg_subscriber can be satisfied more easily: Just
call pg_ctl to start the server. You are already using that in
pg_subscriber. Is there a reason it can't be used here as well?
Attachments:
0001-fixup-Create-a-new-logical-replica-from-a-base-backu.patchtext/plain; charset=UTF-8; name=0001-fixup-Create-a-new-logical-replica-from-a-base-backu.patchDownload+99-74
On 3/15/22 09:51, Peter Eisentraut wrote:
On 21.02.22 13:09, Euler Taveira wrote:
A new tool called pg_subscriber does this conversion and is tightly
integrated
with Postgres.Are we comfortable with the name pg_subscriber? It seems too general.
Are we planning other subscriber-related operations in the future? If
so, we should at least make this one use a --create option or
something like that.
Not really sold on the name (and I didn't much like the name
pglogical_create_subscriber either, although it's a cool facility and
I'm happy to see us adopting something like it).
ISTM we should have a name that conveys that we are *converting* a
replica or equivalent to a subscriber.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Hi,
On 2022-02-21 09:09:12 -0300, Euler Taveira wrote:
A new tool called pg_subscriber does this conversion and is tightly integrated
with Postgres.
Given that this has been submitted just before the last CF and is a patch of
nontrivial size, has't made significant progress ISTM it should be moved to
the next CF?
It currently fails in cfbot, but that's likely just due to Peter's incremental
patch. Perhaps you could make sure the patch still applies and repost?
Greetings,
Andres Freund
On Fri, 18 Mar 2022 at 19:34 Andrew Dunstan <andrew@dunslane.net> wrote:
On 3/15/22 09:51, Peter Eisentraut wrote:
On 21.02.22 13:09, Euler Taveira wrote:
A new tool called pg_subscriber does this conversion and is tightly
integrated
with Postgres.Are we comfortable with the name pg_subscriber? It seems too general.
Are we planning other subscriber-related operations in the future? If
so, we should at least make this one use a --create option or
something like that.Not really sold on the name (and I didn't much like the name
pglogical_create_subscriber either, although it's a cool facility and
I'm happy to see us adopting something like it).ISTM we should have a name that conveys that we are *converting* a
replica or equivalent to a subscriber.
Some time ago I did a POC on it [1]https://github.com/fabriziomello/pg_create_subscriber -- Fabrízio de Royes Mello and I used the name pg_create_subscriber
[1]: https://github.com/fabriziomello/pg_create_subscriber -- Fabrízio de Royes Mello
https://github.com/fabriziomello/pg_create_subscriber
--
Fabrízio de Royes Mello
On 18.03.22 23:34, Andrew Dunstan wrote:
On 3/15/22 09:51, Peter Eisentraut wrote:
On 21.02.22 13:09, Euler Taveira wrote:
A new tool called pg_subscriber does this conversion and is tightly
integrated
with Postgres.Are we comfortable with the name pg_subscriber? It seems too general.
Are we planning other subscriber-related operations in the future? If
so, we should at least make this one use a --create option or
something like that.Not really sold on the name (and I didn't much like the name
pglogical_create_subscriber either, although it's a cool facility and
I'm happy to see us adopting something like it).ISTM we should have a name that conveys that we are *converting* a
replica or equivalent to a subscriber.
The pglogical tool includes the pg_basebackup run, so it actually
"creates" the subscriber from scratch. Whether this tool is also doing
that is still being discussed.
On 22.03.22 02:25, Andres Freund wrote:
On 2022-02-21 09:09:12 -0300, Euler Taveira wrote:
A new tool called pg_subscriber does this conversion and is tightly integrated
with Postgres.Given that this has been submitted just before the last CF and is a patch of
nontrivial size, has't made significant progress ISTM it should be moved to
the next CF?
done
This entry has been waiting on author input for a while (our current
threshold is roughly two weeks), so I've marked it Returned with
Feedback.
Once you think the patchset is ready for review again, you (or any
interested party) can resurrect the patch entry by visiting
https://commitfest.postgresql.org/38/3556/
and changing the status to "Needs Review", and then changing the
status again to "Move to next CF". (Don't forget the second step;
hopefully we will have streamlined this in the near future!)
Thanks,
--Jacob
On Mon, Feb 21, 2022, at 9:09 AM, Euler Taveira wrote:
A new tool called pg_subscriber does this conversion and is tightly integrated
with Postgres.
After a long period of inactivity, I'm back to this client tool. As suggested
by Andres, I added a new helper function to change the system identifier as the
last step. I also thought about including the pg_basebackup support but decided
to keep it simple (at least for this current version). The user can always
execute pg_basebackup as a preliminary step to create a standby replica and it
will work. (I will post a separate patch that includes the pg_basebackup
support on the top of this one.)
Amit asked if an extra replication slot is required. It is not. The reason I
keep it is to remember that at least the latest replication slot needs to be
created after the pg_basebackup finishes (pg_backup_stop() call). Regarding the
atexit() routine, it tries to do the best to remove all the objects it created,
however, there is no guarantee it can remove them because it depends on
external resources such as connectivity and authorization. I added a new
warning message if it cannot drop the transient replication slot. It is
probably a good idea to add such warning message into the cleanup routine too.
More to this point, another feature that checks and remove all left objects.
The transient replication slot is ok because it should always be removed at the
end. However, the big question is how to detect that you are not removing
objects (publications, subscriptions, replication slots) from a successful
conversion.
Amit also asked about setup a logical replica with m databases where m is less
than the total number of databases. One option is to remove the "extra"
databases in the target server after promoting the physical replica or in one
of the latest steps. Maybe it is time to propose partial physical replica that
contains only a subset of databases on primary. (I'm not volunteering to it.)
Hence, pg_basebackup has an option to remove these "extra" databases so this
tool can take advantage of it.
Let's continue with the bike shedding... I agree with Peter E that this name
does not express what this tool is. At the moment, it only have one action:
create. If I have to suggest other actions I would say that it could support
switchover option too (that removes the infrastructure created by this tool).
If we decide to keep this name, it should be a good idea to add an option to
indicate what action it is executing (similar to pg_recvlogical) as suggested
by Peter.
I included the documentation cleanups that Peter E shared. I also did small
adjustments into the documentation. It probably deserves a warning section that
advertises about the cleanup.
I refactored the transient replication slot code and decided to use a permanent
(instead of temporary) slot to avoid keeping a replication connection open for
a long time until the target server catches up.
The system identifier functions (get_control_from_datadir() and
get_sysid_from_conn()) now returns uint64 as suggested by Peter.
After reflection, the --verbose option should be renamed to --progress. There
are also some messages that should be converted to debug messages.
I fixed the useless malloc. I rearrange the code a bit but the main still has ~
370 lines (without options/validation ~ 255 lines. I'm trying to rearrange the
code to make the code easier to read and at the same time reduce the main size.
I already have a few candidates in mind such as the code that stops the standby
and the part that includes the recovery parameters. I removed the refactor I
proposed in the previous patch and the current code is relying on pg_ctl --wait
behavior. Are there issues with this choice? Well, one annoying situation is
that pg_ctl does not have a "wait forever" option. If one of the pg_ctl calls
fails, you could probably have to start again (unless you understand the
pg_subscriber internals and fix the setup by yourself). You have to choose an
arbitrary timeout value and expect that pg_ctl *does* perform the action less
than the timeout.
Real tests are included. The cleanup code does not have coverage because a
simple reproducible case isn't easy. I'm also not sure if it is worth it. We
can explain it in the warning section that was proposed.
It is still a WIP but I would like to share it and get some feedback.
--
Euler Taveira
EDB https://www.enterprisedb.com/
Attachments:
v2-0001-Creates-a-new-logical-replica-from-a-standby-serv.patchtext/x-patch; name="=?UTF-8?Q?v2-0001-Creates-a-new-logical-replica-from-a-standby-serv.patc?= =?UTF-8?Q?h?="Download+2038-6
On Mon, Oct 23, 2023 at 9:34 AM Euler Taveira <euler@eulerto.com> wrote:
It is still a WIP but I would like to share it and get some feedback.
I have started reviewing the patch. I have just read through all the
code. It's well documented and clear. Next I will review the design in
detail. Here are a couple of minor comments
1.
+tests += {
+ 'name': 'pg_subscriber',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_basic.pl',
COMMENT
Shouldn't we include 002_standby.pl?
2. CreateReplicationSlotLSN, is not used anywhere. Instead I see
create_logical_replication_slot() in pg_subscriber.c. Which of these
two you intend to use finally?
--
Best Wishes,
Ashutosh Bapat
I think this is duplicate with https://commitfest.postgresql.org/45/4637/
The new status of this patch is: Waiting on Author
On Thu, Oct 26, 2023 at 5:17 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
On Mon, Oct 23, 2023 at 9:34 AM Euler Taveira <euler@eulerto.com> wrote:
It is still a WIP but I would like to share it and get some feedback.
I have started reviewing the patch. I have just read through all the
code. It's well documented and clear. Next I will review the design in
detail.
Here are some comments about functionality and design.
+ <step>
+ <para>
+ <application>pg_subscriber</application> creates one replication slot for
+ each specified database on the source server. The replication slot name
+ contains a <literal>pg_subscriber</literal> prefix. These replication
+ slots will be used by the subscriptions in a future step. Another
+ replication slot is used to get a consistent start location. This
+ consistent LSN will be used as a stopping point in the <xref
+ linkend="guc-recovery-target-lsn"/> parameter and by the
+ subscriptions as a replication starting point. It guarantees that no
+ transaction will be lost.
+ </para>
+ </step>
CREATE_REPLICATION_SLOT would wait for any incomplete transaction to
complete. So it may not be possible to have an incomplete transaction
on standby when it comes out of recovery. Am I correct? Can we please
have a testcase where we test this scenario? What about a prepared
transactions?
+
+ <step>
+ <para>
+ <application>pg_subscriber</application> writes recovery parameters into
+ the target data directory and start the target server. It specifies a LSN
+ (consistent LSN that was obtained in the previous step) of write-ahead
+ log location up to which recovery will proceed. It also specifies
+ <literal>promote</literal> as the action that the server should take once
+ the recovery target is reached. This step finishes once the server ends
+ standby mode and is accepting read-write operations.
+ </para>
+ </step>
At this stage the standby would have various replication objects like
publications, subscriptions, origins inherited from the upstream
server and possibly very much active. With failover slots, it might
inherit replication slots. Is it intended that the new subscriber also
acts as publisher for source's subscribers OR that the new subscriber
should subscribe to the upstreams of the source? Some use cases like
logical standby might require that but a multi-master multi-node setup
may not. The behaviour should be user configurable.
There may be other objects in this category which need special consideration on
the subscriber. I haven't fully thought through the list of such objects.
+ uses the replication slot that was created in a previous step. The
+ subscription is created but it is not enabled yet. The reason is the
+ replication progress must be set to the consistent LSN but replication
+ origin name contains the subscription oid in its name. Hence, the
Not able to understand the sentence "The reason is ... in its name".
Why is subscription OID in origin name matters?
+ <para>
+ <application>pg_subscriber</application> stops the target server to change
+ its system identifier.
+ </para>
I expected the subscriber to be started after this step.
Why do we need pg_resetwal?
+ appendPQExpBuffer(str, "CREATE_REPLICATION_SLOT \"%s\"", slot_name);
+ appendPQExpBufferStr(str, " LOGICAL \"pgoutput\" NOEXPORT_SNAPSHOT");
Hardcoding output plugin name would limit this utility only to
built-in plugin. Any reason for that limitation?
In its current form the utility creates a logical subscriber which
subscribes to all the tables (and sequences when we have sequence
replication). But it will be useful even in case of selective
replication from a very large database. In such a case the new
subscriber will need to a. remove the unwanted objects b. subscriber
will need to subscribe to publications publishing the "interesting"
objects. We don't need to support this case, but the current
functionality (including the interface) and design shouldn't limit us
from doing so. Have you thought about this case?
I noticed some differences between this and a similar utility
https://github.com/2ndQuadrant/pglogical/blob/REL2_x_STABLE/pglogical_create_subscriber.c.
I will be reviewing these differences next to see if we are missing
anything here.
--
Best Wishes,
Ashutosh Bapat
On Tue, Oct 31, 2023, at 11:46 PM, shihao zhong wrote:
I think this is duplicate with https://commitfest.postgresql.org/45/4637/
The new status of this patch is: Waiting on Author
I withdrew the other entry.
--
Euler Taveira
EDB https://www.enterprisedb.com/
On Wed, Nov 1, 2023 at 7:10 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
I noticed some differences between this and a similar utility
https://github.com/2ndQuadrant/pglogical/blob/REL2_x_STABLE/pglogical_create_subscriber.c.
I will be reviewing these differences next to see if we are missing
anything here.
Some more missing things to discuss
Handling signals - The utility cleans up left over objects on exit.
But default signal handlers will make the utility exit without a
proper cleanup [1]NOTEs section in man atexit().. The signal handlers may clean up the objects
themselves or at least report the objects that need tobe cleaned up.
Idempotent behaviour - Given that the utility will be used when very
large amount of data is involved, redoing everything after a network
glitch or a temporary failure should be avoided. This is true when the
users start with base backup. Again, I don't think we should try to be
idempotent in v1 but current design shouldn't stop us from doing so. I
didn't find anything like that in my review. But something to keep in
mind.
That finishes my first round of review. I will wait for your updated
patches before the next round.
[1]: NOTEs section in man atexit().
--
Best Wishes,
Ashutosh Bapat
On 23.10.23 05:53, Euler Taveira wrote:
Let's continue with the bike shedding... I agree with Peter E that this name
does not express what this tool is. At the moment, it only have one action:
create. If I have to suggest other actions I would say that it could support
switchover option too (that removes the infrastructure created by this
tool).
If we decide to keep this name, it should be a good idea to add an option to
indicate what action it is executing (similar to pg_recvlogical) as
suggested
by Peter.
Speaking of which, would it make sense to put this tool (whatever the
name) into the pg_basebackup directory? It's sort of related, and it
also shares some code.
On Tue, Nov 07, 2023 at 10:00:39PM +0100, Peter Eisentraut wrote:
Speaking of which, would it make sense to put this tool (whatever the name)
into the pg_basebackup directory? It's sort of related, and it also shares
some code.
I've read the patch, and the additions to streamutil.h and
streamutil.c make it kind of natural to have it sit in pg_basebackup/.
There's pg_recvlogical already there. I am wondering about two
things, though:
- Should the subdirectory pg_basebackup be renamed into something more
generic at this point? All these things are frontend tools that deal
in some way with the replication protocol to do their work. Say
a replication_tools?
- And if it would be better to refactor some of the code generic to
all these streaming tools to fe_utils. What makes streamutil.h a bit
less pluggable are all its extern variables to control the connection,
but perhaps that can be an advantage, as well, in some cases.
--
Michael