Have pg_basebackup write "dbname" in "primary_conninfo"?

Started by Ian Lawrence Barwickalmost 2 years ago36 messages
#1Ian Lawrence Barwick
barwick@gmail.com
1 attachment(s)

Hi

Hi

With the addition of "pg_sync_replication_slots()", there is now a use-case for
including "dbname" in "primary_conninfo" and the docs have changed from
stating [1]https://www.postgresql.org/docs/16/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-STANDBY:

Do not specify a database name in the primary_conninfo string.

to [2]https://www.postgresql.org/docs/devel/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-STANDBY:

For replication slot synchronization (see Section 48.2.3), it is also
necessary to specify a valid dbname in the primary_conninfo string. This will
only be used for slot synchronization. It is ignored for streaming.

[1]: https://www.postgresql.org/docs/16/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-STANDBY
[2]: https://www.postgresql.org/docs/devel/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-STANDBY

However, when setting up a standby (with the intent of creating a logical
standby) with pg_basebackup, providing the -R/-write-recovery-conf option
results in a "primary_conninfo" string being generated without a "dbname"
parameter, which requires a manual change should one intend to use
"pg_sync_replication_slots()".

I can't see any reason for continuing to omit "dbname", so suggest it should
only continue to be omitted for 16 and earlier; see attached patch.

Note that this does mean that if the conninfo string provided to pg_basebackup
does not contain "dbname", the generated "primary_conninfo" GUC will default to
"dbname=replication". It would be easy enough to suppress this, but AFAICS
there's no way to tell if it was explicitly supplied by the user, in which case
it would be surprising if it were omitted from the final "primary_conninfo"
string.

The only other place where GenerateRecoveryConfig() is called is pg_rewind;
I can't see any adverse affects from the proposed change. But it's perfectly
possible there's something blindingly obvious I'm overlooking.

Regards

Ian Barwick

Attachments:

pg_basebackup-write-dbname.v0001.patchtext/x-patch; charset=US-ASCII; name=pg_basebackup-write-dbname.v0001.patchDownload
diff --git a/src/fe_utils/recovery_gen.c b/src/fe_utils/recovery_gen.c
index 2585f11939..de463f3956 100644
--- a/src/fe_utils/recovery_gen.c
+++ b/src/fe_utils/recovery_gen.c
@@ -49,12 +49,16 @@ GenerateRecoveryConfig(PGconn *pgconn, char *replication_slot)
 	{
 		/* Omit empty settings and those libpqwalreceiver overrides. */
 		if (strcmp(opt->keyword, "replication") == 0 ||
-			strcmp(opt->keyword, "dbname") == 0 ||
 			strcmp(opt->keyword, "fallback_application_name") == 0 ||
 			(opt->val == NULL) ||
 			(opt->val != NULL && opt->val[0] == '\0'))
 			continue;
 
+		/* Omit "dbname" in PostgreSQL 16 and earlier. */
+		if (strcmp(opt->keyword, "dbname") == 0 &&
+			PQserverVersion(pgconn) < MINIMUM_VERSION_FOR_DBNAME_PARAM)
+			continue;
+
 		/* Separate key-value pairs with spaces */
 		if (conninfo_buf.len != 0)
 			appendPQExpBufferChar(&conninfo_buf, ' ');
diff --git a/src/include/fe_utils/recovery_gen.h b/src/include/fe_utils/recovery_gen.h
index ca2c4800d0..740d90c9d8 100644
--- a/src/include/fe_utils/recovery_gen.h
+++ b/src/include/fe_utils/recovery_gen.h
@@ -20,6 +20,12 @@
  */
 #define MINIMUM_VERSION_FOR_RECOVERY_GUC 120000
 
+/*
+ * from version 17, there is a use-case for adding the dbname parameter
+ * to the generated "primary_conninfo" value
+ */
+#define MINIMUM_VERSION_FOR_DBNAME_PARAM 170000
+
 extern PQExpBuffer GenerateRecoveryConfig(PGconn *pgconn,
 										  char *replication_slot);
 extern void WriteRecoveryConfig(PGconn *pgconn, char *target_dir,
#2Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Ian Lawrence Barwick (#1)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Tue, 20 Feb 2024 at 00:34, Ian Lawrence Barwick <barwick@gmail.com> wrote:

With the addition of "pg_sync_replication_slots()", there is now a use-case for
including "dbname" in "primary_conninfo" and the docs have changed from
stating [1]:

Do not specify a database name in the primary_conninfo string.

to [2]:

For replication slot synchronization (see Section 48.2.3), it is also
necessary to specify a valid dbname in the primary_conninfo string. This will
only be used for slot synchronization. It is ignored for streaming.

[1] https://www.postgresql.org/docs/16/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-STANDBY
[2] https://www.postgresql.org/docs/devel/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-STANDBY

Sounds like that documentation should be updated in the same way as
was done for pg_basebackup/pg_receivewal in commit cca97ce6a665. When
considering middleware/proxies having dbname in there can be useful
even for older PG versions.

I can't see any reason for continuing to omit "dbname", so suggest it should
only continue to be omitted for 16 and earlier; see attached patch.

Yeah, that seems like a good change. Though, I'm wondering if the
version check is actually necessary.

#3Amit Kapila
amit.kapila16@gmail.com
In reply to: Ian Lawrence Barwick (#1)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Tue, Feb 20, 2024 at 5:04 AM Ian Lawrence Barwick <barwick@gmail.com> wrote:

With the addition of "pg_sync_replication_slots()", there is now a use-case for
including "dbname" in "primary_conninfo" and the docs have changed from
stating [1]:

Do not specify a database name in the primary_conninfo string.

to [2]:

For replication slot synchronization (see Section 48.2.3), it is also
necessary to specify a valid dbname in the primary_conninfo string. This will
only be used for slot synchronization. It is ignored for streaming.

[1] https://www.postgresql.org/docs/16/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-STANDBY
[2] https://www.postgresql.org/docs/devel/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-STANDBY

However, when setting up a standby (with the intent of creating a logical
standby) with pg_basebackup, providing the -R/-write-recovery-conf option
results in a "primary_conninfo" string being generated without a "dbname"
parameter, which requires a manual change should one intend to use
"pg_sync_replication_slots()".

I can't see any reason for continuing to omit "dbname", so suggest it should
only continue to be omitted for 16 and earlier; see attached patch.

Note that this does mean that if the conninfo string provided to pg_basebackup
does not contain "dbname", the generated "primary_conninfo" GUC will default to
"dbname=replication".

When I tried, it defaulted to user name:
Default case connection string:
primary_conninfo = 'user=KapilaAm
passfile=''C:\\\\Users\\\\kapilaam\\\\AppData\\\\Roaming/postgresql/pgpass.conf''
channel_binding=disable dbname=KapilaAm port=5432 sslmode=disable
sslcompression=0 sslcertmode=disable sslsni=1
ssl_min_protocol_version=TLSv1.2 gssencmode=disable
krbsrvname=postgres gssdelegation=0 target_session_attrs=any
load_balance_hosts=disable'

When I specified -d "dbname = postgres" during backup:
primary_conninfo = 'user=KapilaAm
passfile=''C:\\\\Users\\\\kapilaam\\\\AppData\\\\Roaming/postgresql/pgpass.conf''
channel_binding=disable dbname=KapilaAm port=5432 sslmode=disable
sslcompression=0 sslcertmode=disable sslsni=1
ssl_min_protocol_version=TLSv1.2 gssencmode=disable
krbsrvname=postgres gssdelegation=0 target_session_attrs=any
load_balance_hosts=disable'

I think it makes sense to include dbname in the connection string
programmatically (with some option) for slot sync functionality but it
is not clear to me if there is any impact of the same when the standby
is not set up to sync up logical slots. I haven't checked but if there
is a way to distinguish the case where we write dbname only when
specified by the user then that would be great but this is worth
considering even without that.

--
With Regards,
Amit Kapila.

#4Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Ian Lawrence Barwick (#1)
RE: Have pg_basebackup write "dbname" in "primary_conninfo"?

Dear Ian,

Thanks for making the patch.

With the addition of "pg_sync_replication_slots()", there is now a use-case for
including "dbname" in "primary_conninfo" and the docs have changed from
stating [1]:

Do not specify a database name in the primary_conninfo string.

to [2]:

For replication slot synchronization (see Section 48.2.3), it is also
necessary to specify a valid dbname in the primary_conninfo string. This will
only be used for slot synchronization. It is ignored for streaming.

[1]
https://www.postgresql.org/docs/16/runtime-config-replication.html#RUNTIME
-CONFIG-REPLICATION-STANDBY
[2]
https://www.postgresql.org/docs/devel/runtime-config-replication.html#RUNTI
ME-CONFIG-REPLICATION-STANDBY

However, when setting up a standby (with the intent of creating a logical
standby) with pg_basebackup, providing the -R/-write-recovery-conf option
results in a "primary_conninfo" string being generated without a "dbname"
parameter, which requires a manual change should one intend to use
"pg_sync_replication_slots()".

I can't see any reason for continuing to omit "dbname", so suggest it should
only continue to be omitted for 16 and earlier; see attached patch.

Hmm, I also cannot find a reason, but we can document the change.

Note that this does mean that if the conninfo string provided to pg_basebackup
does not contain "dbname", the generated "primary_conninfo" GUC will default to
"dbname=replication". It would be easy enough to suppress this, but AFAICS
there's no way to tell if it was explicitly supplied by the user, in which case
it would be surprising if it were omitted from the final "primary_conninfo"
string.

I found an inconsistency. When I ran ` pg_basebackup -D data_N2 -U postgres -R`,
dbname would be set as username.

```
primary_conninfo = 'user=postgres ... dbname=postgres
```

However, when I ran `pg_basebackup -D data_N2 -d "user=postgres" -R`,
dbname would be set as "replication". Is it an intentional item?

```
primary_conninfo = 'user=postgres ... dbname=replication...
```

The only other place where GenerateRecoveryConfig() is called is pg_rewind;
I can't see any adverse affects from the proposed change. But it's perfectly
possible there's something blindingly obvious I'm overlooking.

On-going feature pg_createsubscriber[1]https://commitfest.postgresql.org/47/4637/ also uses GenerateRecoveryConfig(), but
I can't see any bad effects. The output is being used to make consistent standby
from the primary. Even if dbname is set in the primary_conninfo, it would be ignored.

[1]: https://commitfest.postgresql.org/47/4637/

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

#5Robert Haas
robertmhaas@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#4)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Tue, Feb 20, 2024 at 4:18 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

I found an inconsistency. When I ran ` pg_basebackup -D data_N2 -U postgres -R`,
dbname would be set as username.

```
primary_conninfo = 'user=postgres ... dbname=postgres
```

However, when I ran `pg_basebackup -D data_N2 -d "user=postgres" -R`,
dbname would be set as "replication". Is it an intentional item?

```
primary_conninfo = 'user=postgres ... dbname=replication...
```

Seems weird to me. You don't use dbname=replication to ask for a
replication connection, so why would we ever end up with that
anywhere? And especially in only one of two such closely related
cases?

--
Robert Haas
EDB: http://www.enterprisedb.com

#6Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Robert Haas (#5)
RE: Have pg_basebackup write "dbname" in "primary_conninfo"?

Dear Robert,

Seems weird to me. You don't use dbname=replication to ask for a
replication connection, so why would we ever end up with that
anywhere? And especially in only one of two such closely related
cases?

Just FYI - here is an extreme case. And note that I have applied proposed patch.

When `pg_basebackup -D data_N2 -R` is used:
```
primary_conninfo = 'user=hayato ... dbname=hayato ...
```

But when `pg_basebackup -d "" -D data_N2 -R` is used:
```
primary_conninfo = 'user=hayato ... dbname=replication
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

#7Robert Haas
robertmhaas@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#6)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Tue, Feb 20, 2024 at 5:58 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Seems weird to me. You don't use dbname=replication to ask for a
replication connection, so why would we ever end up with that
anywhere? And especially in only one of two such closely related
cases?

Just FYI - here is an extreme case. And note that I have applied proposed patch.

When `pg_basebackup -D data_N2 -R` is used:
```
primary_conninfo = 'user=hayato ... dbname=hayato ...
```

But when `pg_basebackup -d "" -D data_N2 -R` is used:
```
primary_conninfo = 'user=hayato ... dbname=replication
```

It seems like maybe somebody should look into why this is happening,
and perhaps fix it.

--
Robert Haas
EDB: http://www.enterprisedb.com

#8Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Robert Haas (#7)
1 attachment(s)
RE: Have pg_basebackup write "dbname" in "primary_conninfo"?

Dear Robert,

Just FYI - here is an extreme case. And note that I have applied proposed patch.

When `pg_basebackup -D data_N2 -R` is used:
```
primary_conninfo = 'user=hayato ... dbname=hayato ...
```

But when `pg_basebackup -d "" -D data_N2 -R` is used:
```
primary_conninfo = 'user=hayato ... dbname=replication
```

It seems like maybe somebody should look into why this is happening,
and perhaps fix it.

I think this caused from below part [1]``` else { keywords = pg_malloc0((argcount + 1) * sizeof(*keywords)); values = pg_malloc0((argcount + 1) * sizeof(*values)); keywords[i] = "dbname"; values[i] = dbname; i++; } ``` in GetConnection().

If both dbname and connection_string are the NULL, we will enter the else part
and NULL would be substituted - {"dbnmae", NULL} key-value pair is generated
only here.

Then, in PQconnectdbParams()->PQconnectStartParams->pqConnectOptions2(),
the strange part would be found and replaced to the username [2]``` /* * If database name was not given, default it to equal user name */ if (conn->dbName == NULL || conn->dbName[0] == '\0') { free(conn->dbName); conn->dbName = strdup(conn->pguser); if (!conn->dbName) goto oom_error; } ```.

I think if both the connection string and the dbname are NULL, it should be
considered as the physical replication connection. here is a patch to fix it.
After the application, below two examples can output "dbname=replication".
You can also confirm.

```
pg_basebackup -D data_N2 -U postgres
pg_basebackup -D data_N2 -R -v

-> primary_conninfo = 'user=postgres ... dbname=replication ...
```

[1]: ``` else { keywords = pg_malloc0((argcount + 1) * sizeof(*keywords)); values = pg_malloc0((argcount + 1) * sizeof(*values)); keywords[i] = "dbname"; values[i] = dbname; i++; } ```
```
else
{
keywords = pg_malloc0((argcount + 1) * sizeof(*keywords));
values = pg_malloc0((argcount + 1) * sizeof(*values));
keywords[i] = "dbname";
values[i] = dbname;
i++;
}
```

[2]: ``` /* * If database name was not given, default it to equal user name */ if (conn->dbName == NULL || conn->dbName[0] == '\0') { free(conn->dbName); conn->dbName = strdup(conn->pguser); if (!conn->dbName) goto oom_error; } ```
```
/*
* If database name was not given, default it to equal user name
*/
if (conn->dbName == NULL || conn->dbName[0] == '\0')
{
free(conn->dbName);
conn->dbName = strdup(conn->pguser);
if (!conn->dbName)
goto oom_error;
}
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

Attachments:

add_NULL_check.txttext/plain; name=add_NULL_check.txtDownload
diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c
index 56d1b15951..aea1ccba36 100644
--- a/src/bin/pg_basebackup/streamutil.c
+++ b/src/bin/pg_basebackup/streamutil.c
@@ -119,7 +119,7 @@ GetConnection(void)
 		keywords = pg_malloc0((argcount + 1) * sizeof(*keywords));
 		values = pg_malloc0((argcount + 1) * sizeof(*values));
 		keywords[i] = "dbname";
-		values[i] = dbname;
+		values[i] = dbname == NULL ? "replication" : dbname;
 		i++;
 	}
 
#9Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Robert Haas (#7)
1 attachment(s)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

At Tue, 20 Feb 2024 19:56:10 +0530, Robert Haas <robertmhaas@gmail.com> wrote in

It seems like maybe somebody should look into why this is happening,
and perhaps fix it.

GetConnection()@streamutil.c wants to ensure conninfo has a fallback
database name ("replication"). However, the function seems to be
ignoring the case where neither dbname nor connection string is given,
which is the first case Kuroda-san raised. The second case is the
intended behavior of the function.

/* pg_recvlogical uses dbname only; others use connection_string only. */
Assert(dbname == NULL || connection_string == NULL);

And the function incorrectly assumes that the connection string
requires "dbname=replication".

* Merge the connection info inputs given in form of connection string,
* options and default values (dbname=replication, replication=true, etc.)

But the name is a pseudo database name only used by pg_hba.conf
(maybe) , which cannot be used as an actual database name (without
quotation marks, or unless it is actually created). The function
should not add the fallback database name because the connection
string for physical replication connection doesn't require the dbname
parameter. (attached patch)

About the proposed patch, pg_basebackup cannot verify the validity of
the dbname. It could be problematic.

Although I haven't looked the original thread, it seems that the
dbname is used only by pg_sync_replication_slots(). If it is true,
couldn't we make the SQL function require a database name to make a
connection, instead of requiring it in physical-replication conninfo?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

do_not_set_fallback_dbname.txttext/plain; charset=us-asciiDownload
diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c
index 56d1b15951..da63a04de4 100644
--- a/src/bin/pg_basebackup/streamutil.c
+++ b/src/bin/pg_basebackup/streamutil.c
@@ -78,7 +78,7 @@ GetConnection(void)
 
 	/*
 	 * Merge the connection info inputs given in form of connection string,
-	 * options and default values (dbname=replication, replication=true, etc.)
+	 * options and default values (replication=true, etc.)
 	 */
 	i = 0;
 	if (connection_string)
@@ -96,14 +96,6 @@ GetConnection(void)
 		keywords = pg_malloc0((argcount + 1) * sizeof(*keywords));
 		values = pg_malloc0((argcount + 1) * sizeof(*values));
 
-		/*
-		 * Set dbname here already, so it can be overridden by a dbname in the
-		 * connection string.
-		 */
-		keywords[i] = "dbname";
-		values[i] = "replication";
-		i++;
-
 		for (conn_opt = conn_opts; conn_opt->keyword != NULL; conn_opt++)
 		{
 			if (conn_opt->val != NULL && conn_opt->val[0] != '\0')
#10Ajin Cherian
itsajin@gmail.com
In reply to: Kyotaro Horiguchi (#9)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Wed, Feb 21, 2024 at 2:04 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
wrote:

About the proposed patch, pg_basebackup cannot verify the validity of
the dbname. It could be problematic.

Although I haven't looked the original thread, it seems that the
dbname is used only by pg_sync_replication_slots(). If it is true,
couldn't we make the SQL function require a database name to make a
connection, instead of requiring it in physical-replication conninfo?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

I agree. If the intention is to meet the new requirement of the sync-slot
patch which requires a dbname in the primary_conninfo, then pseudo dbnames
will not work, whether it be the username or just "replication". I feel if
the user does not specify dbname explicitly in pg_basebackup it should be
left blank in the generated primary_conninfo string as well.

regards,
Ajin Cherian
Fujitsu Australia

#11Ajin Cherian
itsajin@gmail.com
In reply to: Kyotaro Horiguchi (#9)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Wed, Feb 21, 2024 at 2:04 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
wrote:

Although I haven't looked the original thread, it seems that the
dbname is used only by pg_sync_replication_slots(). If it is true,
couldn't we make the SQL function require a database name to make a
connection, instead of requiring it in physical-replication conninfo?

In the original thread, the intention is to not just provide this
functionality using the function pg_sync_replication_slots(), but provide
a GUC option on standbys to sync logical replication slots periodically
even without calling that function. This requires connecting to a database.

regards,
Ajin Cherian
Fujitsu Australia

#12Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Kyotaro Horiguchi (#9)
RE: Have pg_basebackup write "dbname" in "primary_conninfo"?

Dear Horiguchi-san,

GetConnection()@streamutil.c wants to ensure conninfo has a fallback
database name ("replication"). However, the function seems to be
ignoring the case where neither dbname nor connection string is given,
which is the first case Kuroda-san raised. The second case is the
intended behavior of the function.

/* pg_recvlogical uses dbname only; others use connection_string only.

*/

Assert(dbname == NULL || connection_string == NULL);

And the function incorrectly assumes that the connection string
requires "dbname=replication".

* Merge the connection info inputs given in form of connection string,
* options and default values (dbname=replication, replication=true,

etc.)

But the name is a pseudo database name only used by pg_hba.conf
(maybe) , which cannot be used as an actual database name (without
quotation marks, or unless it is actually created). The function
should not add the fallback database name because the connection
string for physical replication connection doesn't require the dbname
parameter. (attached patch)

I was also missing, but the requirement was that the dbname should be included
only when the dbname option was explicitly specified [1]/messages/by-id/CAA4eK1KH1d1J5giPMZVOtMe0iqncf1CpNwkBKoYAmXdC-kEGZg@mail.gmail.com. Even mine and yours
cannot handle like that. Libpq function PQconnectdbParams()->pqConnectOptions2()
fills all the parameter to PGconn, at that time the information whether it is
intentionally specified or not is discarded. Then, GenerateRecoveryConfig() would
just write down all the connection parameters from PGconn, they cannot recognize.

One approach is that based on Horiguchi-san's one and initial patch, we can
avoid writing when the dbname is same as the username.

[1]: /messages/by-id/CAA4eK1KH1d1J5giPMZVOtMe0iqncf1CpNwkBKoYAmXdC-kEGZg@mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

#13Ajin Cherian
itsajin@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#12)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Wed, Feb 21, 2024 at 4:09 PM Hayato Kuroda (Fujitsu) <
kuroda.hayato@fujitsu.com> wrote:

Dear Horiguchi-san,

GetConnection()@streamutil.c wants to ensure conninfo has a fallback
database name ("replication"). However, the function seems to be
ignoring the case where neither dbname nor connection string is given,
which is the first case Kuroda-san raised. The second case is the
intended behavior of the function.

/* pg_recvlogical uses dbname only; others use connection_string

only.

*/

Assert(dbname == NULL || connection_string == NULL);

And the function incorrectly assumes that the connection string
requires "dbname=replication".

* Merge the connection info inputs given in form of connection

string,

* options and default values (dbname=replication,

replication=true,

etc.)

But the name is a pseudo database name only used by pg_hba.conf
(maybe) , which cannot be used as an actual database name (without
quotation marks, or unless it is actually created). The function
should not add the fallback database name because the connection
string for physical replication connection doesn't require the dbname
parameter. (attached patch)

I was also missing, but the requirement was that the dbname should be
included
only when the dbname option was explicitly specified [1]. Even mine and
yours
cannot handle like that. Libpq function
PQconnectdbParams()->pqConnectOptions2()
fills all the parameter to PGconn, at that time the information whether it
is
intentionally specified or not is discarded. Then,
GenerateRecoveryConfig() would
just write down all the connection parameters from PGconn, they cannot
recognize.

Well, one option is that if a default dbname should be written to the

configuration file, then "postgres' is a better option than "replication"
or "username" as the default option, at least a db of that name exists.

regards,
Ajin Cherian
Fujitsu Australia

#14Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#8)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Wed, Feb 21, 2024 at 7:46 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Just FYI - here is an extreme case. And note that I have applied proposed patch.

When `pg_basebackup -D data_N2 -R` is used:
```
primary_conninfo = 'user=hayato ... dbname=hayato ...
```

But when `pg_basebackup -d "" -D data_N2 -R` is used:
```
primary_conninfo = 'user=hayato ... dbname=replication
```

It seems like maybe somebody should look into why this is happening,
and perhaps fix it.

I think this caused from below part [1] in GetConnection().

If both dbname and connection_string are the NULL, we will enter the else part
and NULL would be substituted - {"dbnmae", NULL} key-value pair is generated
only here.

Then, in PQconnectdbParams()->PQconnectStartParams->pqConnectOptions2(),
the strange part would be found and replaced to the username [2].

I think if both the connection string and the dbname are NULL, it should be
considered as the physical replication connection. here is a patch to fix it.

When dbname is NULL or not given, it defaults to username. This
follows the specs of the connection string. See (dbname #
The database name. Defaults to be the same as the user name...) [1]https://www.postgresql.org/docs/devel/libpq-connect.html#LIBPQ-CONNSTRING.
Your patch breaks that specs, so I don't think it is correct.

[1]: https://www.postgresql.org/docs/devel/libpq-connect.html#LIBPQ-CONNSTRING

--
With Regards,
Amit Kapila.

#15Amit Kapila
amit.kapila16@gmail.com
In reply to: Kyotaro Horiguchi (#9)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Wed, Feb 21, 2024 at 8:34 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Tue, 20 Feb 2024 19:56:10 +0530, Robert Haas <robertmhaas@gmail.com> wrote in

It seems like maybe somebody should look into why this is happening,
and perhaps fix it.

GetConnection()@streamutil.c wants to ensure conninfo has a fallback
database name ("replication"). However, the function seems to be
ignoring the case where neither dbname nor connection string is given,
which is the first case Kuroda-san raised. The second case is the
intended behavior of the function.

/* pg_recvlogical uses dbname only; others use connection_string only. */
Assert(dbname == NULL || connection_string == NULL);

And the function incorrectly assumes that the connection string
requires "dbname=replication".

* Merge the connection info inputs given in form of connection string,
* options and default values (dbname=replication, replication=true, etc.)

But the name is a pseudo database name only used by pg_hba.conf
(maybe) , which cannot be used as an actual database name (without
quotation marks, or unless it is actually created). The function
should not add the fallback database name because the connection
string for physical replication connection doesn't require the dbname
parameter. (attached patch)

We do append dbname=replication even in libpqrcv_connect for .pgpass
lookup as mentioned in comments. See below:
libpqrcv_connect()
{
....
else
{
/*
* The database name is ignored by the server in replication mode,
* but specify "replication" for .pgpass lookup.
*/
keys[++i] = "dbname";
vals[i] = "replication";
}
...
}

I think as part of this effort we should just add dbname to
primary_conninfo written in postgresql.auto.conf file. As above, the
question is valid whether we should do it just for 17 or for prior
versions. Let's discuss more on that. I am not sure of the use case
for versions before 17 but commit cca97ce6a665 mentioned that some
middleware or proxies might however need to know the dbname to make
the correct routing decision for the connection. Does that apply here
as well? If so, we can do it and update the docs, otherwise, let's do
it just for 17.

--
With Regards,
Amit Kapila.

#16Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Amit Kapila (#14)
RE: Have pg_basebackup write "dbname" in "primary_conninfo"?

Dear Amit,

When dbname is NULL or not given, it defaults to username. This
follows the specs of the connection string. See (dbname #
The database name. Defaults to be the same as the user name...) [1].
Your patch breaks that specs, so I don't think it is correct.

I have proposed the point because I thought pg_basebackup basically wanted to do
a physical replication. But if the general libpq rule is stronger than it, we
should not apply my add_NULL_check.txt. Let's forget it.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

#17Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Amit Kapila (#15)
1 attachment(s)
RE: Have pg_basebackup write "dbname" in "primary_conninfo"?

Dear Amit,

We do append dbname=replication even in libpqrcv_connect for .pgpass
lookup as mentioned in comments. See below:
libpqrcv_connect()
{
....
else
{
/*
* The database name is ignored by the server in replication mode,
* but specify "replication" for .pgpass lookup.
*/
keys[++i] = "dbname";
vals[i] = "replication";
}
...
}

OK. So we must add the value for the authorization, right?
I think it should be described even in GetConnection().

I think as part of this effort we should just add dbname to
primary_conninfo written in postgresql.auto.conf file. As above, the
question is valid whether we should do it just for 17 or for prior
versions. Let's discuss more on that. I am not sure of the use case
for versions before 17 but commit cca97ce6a665 mentioned that some
middleware or proxies might however need to know the dbname to make
the correct routing decision for the connection. Does that apply here
as well? If so, we can do it and update the docs, otherwise, let's do
it just for 17.

Hmm, I might lose your requirements again. So, we must satisfy all of below
ones, right?
1) add {"dbname", "replication"} key-value pair to look up .pgpass file correctly.
2) If the -R is given, output dbname=xxx value to be used by slotsync worker.
3) If the dbname is not given in the connection string, the same string as
username must be used to keep the libpq connection rule.
4) No need to add dbname=replication pair

PSA the patch for implementing it. It is basically same as Ian's one.
However, this patch still cannot satisfy the condition 3).

`pg_basebackup -D data_N2 -d "user=postgres" -R`
-> dbname would not be appeared in primary_conninfo.

This is because `if (connection_string)` case in GetConnection() explicy override
a dbname to "replication". I've tried to add a dummy entry {"dbname", NULL} pair
before the overriding, but it is no-op. Because The replacement of the dbname in
pqConnectOptions2() would be done only for the valid (=lastly specified)
connection options.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

Attachments:

pg_basebackup-write-dbname.v0002.patchapplication/octet-stream; name=pg_basebackup-write-dbname.v0002.patchDownload
diff --git a/src/fe_utils/recovery_gen.c b/src/fe_utils/recovery_gen.c
index 2585f11939..c3eb1a59a0 100644
--- a/src/fe_utils/recovery_gen.c
+++ b/src/fe_utils/recovery_gen.c
@@ -49,12 +49,20 @@ GenerateRecoveryConfig(PGconn *pgconn, char *replication_slot)
 	{
 		/* Omit empty settings and those libpqwalreceiver overrides. */
 		if (strcmp(opt->keyword, "replication") == 0 ||
-			strcmp(opt->keyword, "dbname") == 0 ||
 			strcmp(opt->keyword, "fallback_application_name") == 0 ||
 			(opt->val == NULL) ||
 			(opt->val != NULL && opt->val[0] == '\0'))
 			continue;
 
+		/*
+		 * Omit "dbname" if the special value "replication" is specified, or
+		 * the server version is 16 and earlier.
+		 */
+		if (strcmp(opt->keyword, "dbname") == 0 &&
+			(strcmp(opt->val, "replication") == 0 ||
+			 PQserverVersion(pgconn) < MINIMUM_VERSION_FOR_DBNAME_PARAM))
+			continue;
+
 		/* Separate key-value pairs with spaces */
 		if (conninfo_buf.len != 0)
 			appendPQExpBufferChar(&conninfo_buf, ' ');
diff --git a/src/include/fe_utils/recovery_gen.h b/src/include/fe_utils/recovery_gen.h
index ca2c4800d0..740d90c9d8 100644
--- a/src/include/fe_utils/recovery_gen.h
+++ b/src/include/fe_utils/recovery_gen.h
@@ -20,6 +20,12 @@
  */
 #define MINIMUM_VERSION_FOR_RECOVERY_GUC 120000
 
+/*
+ * from version 17, there is a use-case for adding the dbname parameter
+ * to the generated "primary_conninfo" value
+ */
+#define MINIMUM_VERSION_FOR_DBNAME_PARAM 170000
+
 extern PQExpBuffer GenerateRecoveryConfig(PGconn *pgconn,
 										  char *replication_slot);
 extern void WriteRecoveryConfig(PGconn *pgconn, char *target_dir,
#18Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Hayato Kuroda (Fujitsu) (#17)
RE: Have pg_basebackup write "dbname" in "primary_conninfo"?

PSA the patch for implementing it. It is basically same as Ian's one.
However, this patch still cannot satisfy the condition 3).

`pg_basebackup -D data_N2 -d "user=postgres" -R`
-> dbname would not be appeared in primary_conninfo.

This is because `if (connection_string)` case in GetConnection() explicy override
a dbname to "replication". I've tried to add a dummy entry {"dbname", NULL} pair
before the overriding, but it is no-op. Because The replacement of the dbname in
pqConnectOptions2() would be done only for the valid (=lastly specified)
connection options.

Oh, this patch missed the straightforward case:

pg_basebackup -D data_N2 -d "user=postgres dbname=replication" -R
-> dbname would not be appeared in primary_conninfo.

So I think it cannot be applied as-is. Sorry for sharing the bad item.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

#19Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#17)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Tue, Feb 27, 2024 at 2:00 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

We do append dbname=replication even in libpqrcv_connect for .pgpass
lookup as mentioned in comments. See below:
libpqrcv_connect()
{
....
else
{
/*
* The database name is ignored by the server in replication mode,
* but specify "replication" for .pgpass lookup.
*/
keys[++i] = "dbname";
vals[i] = "replication";
}
...
}

OK. So we must add the value for the authorization, right?
I think it should be described even in GetConnection().

I think as part of this effort we should just add dbname to
primary_conninfo written in postgresql.auto.conf file. As above, the
question is valid whether we should do it just for 17 or for prior
versions. Let's discuss more on that. I am not sure of the use case
for versions before 17 but commit cca97ce6a665 mentioned that some
middleware or proxies might however need to know the dbname to make
the correct routing decision for the connection. Does that apply here
as well? If so, we can do it and update the docs, otherwise, let's do
it just for 17.

Hmm, I might lose your requirements again. So, we must satisfy all of below
ones, right?
1) add {"dbname", "replication"} key-value pair to look up .pgpass file correctly.
2) If the -R is given, output dbname=xxx value to be used by slotsync worker.
3) If the dbname is not given in the connection string, the same string as
username must be used to keep the libpq connection rule.
4) No need to add dbname=replication pair

Point 1) and 4) seems contradictory or maybe I am missing something.

--
With Regards,
Amit Kapila.

#20Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#18)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Tue, Feb 27, 2024 at 2:07 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

PSA the patch for implementing it. It is basically same as Ian's one.
However, this patch still cannot satisfy the condition 3).

`pg_basebackup -D data_N2 -d "user=postgres" -R`
-> dbname would not be appeared in primary_conninfo.

This is because `if (connection_string)` case in GetConnection() explicy override
a dbname to "replication". I've tried to add a dummy entry {"dbname", NULL} pair
before the overriding, but it is no-op. Because The replacement of the dbname in
pqConnectOptions2() would be done only for the valid (=lastly specified)
connection options.

Oh, this patch missed the straightforward case:

pg_basebackup -D data_N2 -d "user=postgres dbname=replication" -R
-> dbname would not be appeared in primary_conninfo.

So I think it cannot be applied as-is. Sorry for sharing the bad item.

Can you please share the patch that can be considered for review?

--
With Regards,
Amit Kapila.

#21vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#20)
1 attachment(s)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Mon, 11 Mar 2024 at 17:16, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Feb 27, 2024 at 2:07 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

PSA the patch for implementing it. It is basically same as Ian's one.
However, this patch still cannot satisfy the condition 3).

`pg_basebackup -D data_N2 -d "user=postgres" -R`
-> dbname would not be appeared in primary_conninfo.

This is because `if (connection_string)` case in GetConnection() explicy override
a dbname to "replication". I've tried to add a dummy entry {"dbname", NULL} pair
before the overriding, but it is no-op. Because The replacement of the dbname in
pqConnectOptions2() would be done only for the valid (=lastly specified)
connection options.

Oh, this patch missed the straightforward case:

pg_basebackup -D data_N2 -d "user=postgres dbname=replication" -R
-> dbname would not be appeared in primary_conninfo.

So I think it cannot be applied as-is. Sorry for sharing the bad item.

Can you please share the patch that can be considered for review?

Here is a patch with few changes: a) Removed the version check based
on discussion from [1]/messages/by-id/CAGECzQTh9oB3nu98DsHMpRaVaqXPDRgTDEikY82OAKYF0=hVMA@mail.gmail.com b) updated the documentation.
I have tested various scenarios with the attached patch, the dbname
that is written in postgresql.auto.conf for each of the cases with
pg_basebackup is given below:
1) ./pg_basebackup -U vignesh -R
-> primary_conninfo = "dbname=vignesh" (In this case primary_conninfo
will have dbname as username specified)
2) ./pg_basebackup -D data -R
-> primary_conninfo = "dbname=vignesh" (In this case primary_conninfo
will have the dbname as username (which is the same as the OS user
while setting defaults))
3) ./pg_basebackup -d "user=vignesh" -D data -R
-> primary_conninfo = "dbname=replication" (In this case
primary_conninfo will have dbname as replication which is the default
value from GetConnection as connection string is specified)
4) ./pg_basebackup -d "user=vignesh dbname=postgres" -D data -R
-> primary_conninfo = "dbname=postgres" (In this case primary_conninfo
will have the dbname as the dbname specified)
5) ./pg_basebackup -d "" -D data -R
-> primary_conninfo = "dbname=replication" (In this case
primary_conninfo will have dbname as replication which is the default
value from GetConnection as connection string is specified)

I have mentioned the reasons as to why dbname is being set to
replication or username in each of the cases. How about replacing
these values in postgresql.auto.conf manually.

[1]: /messages/by-id/CAGECzQTh9oB3nu98DsHMpRaVaqXPDRgTDEikY82OAKYF0=hVMA@mail.gmail.com

Regards,
Vignesh

Attachments:

pg_basebackup-write-dbname.v0003.patchtext/x-patch; charset=US-ASCII; name=pg_basebackup-write-dbname.v0003.patchDownload
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 88c689e725..1a6980d1b0 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -243,7 +243,10 @@ PostgreSQL documentation
         The <filename>postgresql.auto.conf</filename> file will record the connection
         settings and, if specified, the replication slot
         that <application>pg_basebackup</application> is using, so that
-        streaming replication will use the same settings later on.
+        a) synchronization of logical replication slots on the primary to the
+        hot_standby from <link linkend="pg-sync-replication-slots">
+        <function>pg_sync_replication_slots</function></link> or slot sync worker
+        and b) streaming replication will use the same settings later on.
        </para>
 
       </listitem>
diff --git a/src/fe_utils/recovery_gen.c b/src/fe_utils/recovery_gen.c
index 2585f11939..cb37703ab9 100644
--- a/src/fe_utils/recovery_gen.c
+++ b/src/fe_utils/recovery_gen.c
@@ -49,7 +49,6 @@ GenerateRecoveryConfig(PGconn *pgconn, char *replication_slot)
 	{
 		/* Omit empty settings and those libpqwalreceiver overrides. */
 		if (strcmp(opt->keyword, "replication") == 0 ||
-			strcmp(opt->keyword, "dbname") == 0 ||
 			strcmp(opt->keyword, "fallback_application_name") == 0 ||
 			(opt->val == NULL) ||
 			(opt->val != NULL && opt->val[0] == '\0'))
#22Amit Kapila
amit.kapila16@gmail.com
In reply to: vignesh C (#21)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Tue, Mar 12, 2024 at 5:13 PM vignesh C <vignesh21@gmail.com> wrote:

On Mon, 11 Mar 2024 at 17:16, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Feb 27, 2024 at 2:07 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

PSA the patch for implementing it. It is basically same as Ian's one.
However, this patch still cannot satisfy the condition 3).

`pg_basebackup -D data_N2 -d "user=postgres" -R`
-> dbname would not be appeared in primary_conninfo.

This is because `if (connection_string)` case in GetConnection() explicy override
a dbname to "replication". I've tried to add a dummy entry {"dbname", NULL} pair
before the overriding, but it is no-op. Because The replacement of the dbname in
pqConnectOptions2() would be done only for the valid (=lastly specified)
connection options.

Oh, this patch missed the straightforward case:

pg_basebackup -D data_N2 -d "user=postgres dbname=replication" -R
-> dbname would not be appeared in primary_conninfo.

So I think it cannot be applied as-is. Sorry for sharing the bad item.

Can you please share the patch that can be considered for review?

Here is a patch with few changes: a) Removed the version check based
on discussion from [1] b) updated the documentation.
I have tested various scenarios with the attached patch, the dbname
that is written in postgresql.auto.conf for each of the cases with
pg_basebackup is given below:
1) ./pg_basebackup -U vignesh -R
-> primary_conninfo = "dbname=vignesh" (In this case primary_conninfo
will have dbname as username specified)
2) ./pg_basebackup -D data -R
-> primary_conninfo = "dbname=vignesh" (In this case primary_conninfo
will have the dbname as username (which is the same as the OS user
while setting defaults))
3) ./pg_basebackup -d "user=vignesh" -D data -R
-> primary_conninfo = "dbname=replication" (In this case
primary_conninfo will have dbname as replication which is the default
value from GetConnection as connection string is specified)
4) ./pg_basebackup -d "user=vignesh dbname=postgres" -D data -R
-> primary_conninfo = "dbname=postgres" (In this case primary_conninfo
will have the dbname as the dbname specified)
5) ./pg_basebackup -d "" -D data -R
-> primary_conninfo = "dbname=replication" (In this case
primary_conninfo will have dbname as replication which is the default
value from GetConnection as connection string is specified)

I have mentioned the reasons as to why dbname is being set to
replication or username in each of the cases.

IIUC, the dbname is already allowed in connstring for pg_basebackup by
commit cca97ce6a6 and with this patch we are just writing it in
postgresql.auto.conf if -R option is used to write recovery info. This
will help slot syncworker to automatically connect with database
without user manually specifying the dbname and replication
connection, the same will still be ignored. I don't see any problem
with this. Does anyone else see any problem?

The <filename>postgresql.auto.conf</filename> file will record the connection
         settings and, if specified, the replication slot
         that <application>pg_basebackup</application> is using, so that
-        streaming replication will use the same settings later on.
+        a) synchronization of logical replication slots on the primary to the
+        hot_standby from <link linkend="pg-sync-replication-slots">
+        <function>pg_sync_replication_slots</function></link> or slot
sync worker
+        and b) streaming replication will use the same settings later on.

We can simply modify the last line as: ".. streaming replication or
logical replication slots synchronization will use the same settings
later on." Additionally, you can give a link reference to [1]https://www.postgresql.org/docs/devel/logicaldecoding-explanation.html#LOGICALDECODING-REPLICATION-SLOTS-SYNCHRONIZATION -- With Regards, Amit Kapila..

[1]: https://www.postgresql.org/docs/devel/logicaldecoding-explanation.html#LOGICALDECODING-REPLICATION-SLOTS-SYNCHRONIZATION -- With Regards, Amit Kapila.
--
With Regards,
Amit Kapila.

#23vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#22)
1 attachment(s)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Wed, 13 Mar 2024 at 16:58, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Mar 12, 2024 at 5:13 PM vignesh C <vignesh21@gmail.com> wrote:

On Mon, 11 Mar 2024 at 17:16, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Feb 27, 2024 at 2:07 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

PSA the patch for implementing it. It is basically same as Ian's one.
However, this patch still cannot satisfy the condition 3).

`pg_basebackup -D data_N2 -d "user=postgres" -R`
-> dbname would not be appeared in primary_conninfo.

This is because `if (connection_string)` case in GetConnection() explicy override
a dbname to "replication". I've tried to add a dummy entry {"dbname", NULL} pair
before the overriding, but it is no-op. Because The replacement of the dbname in
pqConnectOptions2() would be done only for the valid (=lastly specified)
connection options.

Oh, this patch missed the straightforward case:

pg_basebackup -D data_N2 -d "user=postgres dbname=replication" -R
-> dbname would not be appeared in primary_conninfo.

So I think it cannot be applied as-is. Sorry for sharing the bad item.

Can you please share the patch that can be considered for review?

Here is a patch with few changes: a) Removed the version check based
on discussion from [1] b) updated the documentation.
I have tested various scenarios with the attached patch, the dbname
that is written in postgresql.auto.conf for each of the cases with
pg_basebackup is given below:
1) ./pg_basebackup -U vignesh -R
-> primary_conninfo = "dbname=vignesh" (In this case primary_conninfo
will have dbname as username specified)
2) ./pg_basebackup -D data -R
-> primary_conninfo = "dbname=vignesh" (In this case primary_conninfo
will have the dbname as username (which is the same as the OS user
while setting defaults))
3) ./pg_basebackup -d "user=vignesh" -D data -R
-> primary_conninfo = "dbname=replication" (In this case
primary_conninfo will have dbname as replication which is the default
value from GetConnection as connection string is specified)
4) ./pg_basebackup -d "user=vignesh dbname=postgres" -D data -R
-> primary_conninfo = "dbname=postgres" (In this case primary_conninfo
will have the dbname as the dbname specified)
5) ./pg_basebackup -d "" -D data -R
-> primary_conninfo = "dbname=replication" (In this case
primary_conninfo will have dbname as replication which is the default
value from GetConnection as connection string is specified)

I have mentioned the reasons as to why dbname is being set to
replication or username in each of the cases.

IIUC, the dbname is already allowed in connstring for pg_basebackup by
commit cca97ce6a6 and with this patch we are just writing it in
postgresql.auto.conf if -R option is used to write recovery info. This
will help slot syncworker to automatically connect with database
without user manually specifying the dbname and replication
connection, the same will still be ignored. I don't see any problem
with this. Does anyone else see any problem?

The <filename>postgresql.auto.conf</filename> file will record the connection
settings and, if specified, the replication slot
that <application>pg_basebackup</application> is using, so that
-        streaming replication will use the same settings later on.
+        a) synchronization of logical replication slots on the primary to the
+        hot_standby from <link linkend="pg-sync-replication-slots">
+        <function>pg_sync_replication_slots</function></link> or slot
sync worker
+        and b) streaming replication will use the same settings later on.

We can simply modify the last line as: ".. streaming replication or
logical replication slots synchronization will use the same settings
later on." Additionally, you can give a link reference to [1].

Thanks for the comments, the attached v4 version patch has the changes
for the same.

Regards,
Vignesh

Attachments:

pg_basebackup-write-dbname.v0004.patchtext/x-patch; charset=US-ASCII; name=pg_basebackup-write-dbname.v0004.patchDownload
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 88c689e725..563346dd71 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -243,7 +243,9 @@ PostgreSQL documentation
         The <filename>postgresql.auto.conf</filename> file will record the connection
         settings and, if specified, the replication slot
         that <application>pg_basebackup</application> is using, so that
-        streaming replication will use the same settings later on.
+        streaming replication or <link linkend="logicaldecoding-replication-slots-synchronization">
+        logical replication slots synchronization</link> will use the same
+        settings later on.
        </para>
 
       </listitem>
diff --git a/src/fe_utils/recovery_gen.c b/src/fe_utils/recovery_gen.c
index 2585f11939..cb37703ab9 100644
--- a/src/fe_utils/recovery_gen.c
+++ b/src/fe_utils/recovery_gen.c
@@ -49,7 +49,6 @@ GenerateRecoveryConfig(PGconn *pgconn, char *replication_slot)
 	{
 		/* Omit empty settings and those libpqwalreceiver overrides. */
 		if (strcmp(opt->keyword, "replication") == 0 ||
-			strcmp(opt->keyword, "dbname") == 0 ||
 			strcmp(opt->keyword, "fallback_application_name") == 0 ||
 			(opt->val == NULL) ||
 			(opt->val != NULL && opt->val[0] == '\0'))
#24Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#14)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Fri, Feb 23, 2024 at 3:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Feb 21, 2024 at 7:46 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Just FYI - here is an extreme case. And note that I have applied proposed patch.

When `pg_basebackup -D data_N2 -R` is used:
```
primary_conninfo = 'user=hayato ... dbname=hayato ...
```

But when `pg_basebackup -d "" -D data_N2 -R` is used:
```
primary_conninfo = 'user=hayato ... dbname=replication
```

It seems like maybe somebody should look into why this is happening,
and perhaps fix it.

I think this caused from below part [1] in GetConnection().

If both dbname and connection_string are the NULL, we will enter the else part
and NULL would be substituted - {"dbnmae", NULL} key-value pair is generated
only here.

Then, in PQconnectdbParams()->PQconnectStartParams->pqConnectOptions2(),
the strange part would be found and replaced to the username [2].

I think if both the connection string and the dbname are NULL, it should be
considered as the physical replication connection. here is a patch to fix it.

When dbname is NULL or not given, it defaults to username. This
follows the specs of the connection string.

This fact makes me think that the slotsync worker might be able to
accept the primary_conninfo value even if there is no dbname in the
value. That is, if there is no dbname in the primary_conninfo, it uses
the username in accordance with the specs of the connection string.
Currently, the slotsync worker connects to the local database first
and then establishes the connection to the primary server. But if we
can reverse the two steps, it can get the dbname that has actually
been used to establish the remote connection and use it for the local
connection too. That way, the primary_conninfo generated by
pg_basebackup could work even without the patch. For example, if the
OS user executing pg_basebackup is 'postgres', the slotsync worker
would connect to the postgres database. Given the 'postgres' database
is created by default and 'postgres' OS user is used in common, I
guess it could cover many cases in practice actually.

Having said that, even with (or without) the above change, we might
want to change the pg_basebackup so that it writes the dbname to the
primary_conninfo if -R option is specified. Since the database where
the slotsync worker connects cannot be dropped while the slotsync
worker is running, the user might want to change the database to
connect, and it would be useful if they can do that using
pg_basebackup instead of modifying the configuration file manually.

While the current approach makes sense to me, I'm a bit concerned that
we might end up having the pg_basebackup search the actual database
name (e.g. 'dbname=template1') from the .pgpass file instead of
'dbname=replication'. As far as I tested on my environment, suppose
that I execute:

pg_basebackup -D tmp -d "dbname=testdb" -R

The pg_basebackup established a replication connection but looked for
the password of the 'testdb' database. This could be another
inconvenience for the existing users who want to use the slot
synchronization.

A random idea I came up with is, we add a new option to the
pg_basebackup to overwrite the full or some portion of the connection
string that is eventually written in the primary_conninfo in
postgresql.auto.conf. For example, the command:

pg_basebackup -D tmp -d "host=1.1.1.1 port=5555" -R
--primary-coninfo-ext "host=2.2.2.2 dbname=postgres"

will produce the connection string that is based on -d option value
but is overwritten by --primary-conninfo-ext option value, which will
be like:

host=2.2.2.2 dbname=postgres port=5555

This option might help not only for users who want to use the slotsync
worker but also for users who want to take a basebackup from a standby
but have the new standby connect to the primary.

But it's still just an idea and I might be missing something. And
given we're getting closer to the feature freeze, it would be a PG18
item.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#25Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#24)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Thu, Mar 14, 2024 at 5:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

This fact makes me think that the slotsync worker might be able to
accept the primary_conninfo value even if there is no dbname in the
value. That is, if there is no dbname in the primary_conninfo, it uses
the username in accordance with the specs of the connection string.
Currently, the slotsync worker connects to the local database first
and then establishes the connection to the primary server. But if we
can reverse the two steps, it can get the dbname that has actually
been used to establish the remote connection and use it for the local
connection too. That way, the primary_conninfo generated by
pg_basebackup could work even without the patch. For example, if the
OS user executing pg_basebackup is 'postgres', the slotsync worker
would connect to the postgres database. Given the 'postgres' database
is created by default and 'postgres' OS user is used in common, I
guess it could cover many cases in practice actually.

I think this is worth investigating but I suspect that in most cases
users will end up using a replication connection without specifying
the user name and we may not be able to give a meaningful error
message when slotsync worker won't be able to connect. The same will
be true even when the dbname same as the username would be used.

Having said that, even with (or without) the above change, we might
want to change the pg_basebackup so that it writes the dbname to the
primary_conninfo if -R option is specified. Since the database where
the slotsync worker connects cannot be dropped while the slotsync
worker is running, the user might want to change the database to
connect, and it would be useful if they can do that using
pg_basebackup instead of modifying the configuration file manually.

While the current approach makes sense to me, I'm a bit concerned that
we might end up having the pg_basebackup search the actual database
name (e.g. 'dbname=template1') from the .pgpass file instead of
'dbname=replication'. As far as I tested on my environment, suppose
that I execute:

pg_basebackup -D tmp -d "dbname=testdb" -R

The pg_basebackup established a replication connection but looked for
the password of the 'testdb' database. This could be another
inconvenience for the existing users who want to use the slot
synchronization.

This is true because it is internally using logical replication
connection (as we will set set replication=database). I feel the
mentioned behavior is an expected one with or without slotsync worker
usage. Anyway, this is an optional feature, so users can still choose
to ignore specifying dbname in the connection string. The point is
that commit cca97ce6a6 allowed using dbname in the connection string
in pg_basebackup and we are just extending to write that dbname along
with other things in connection_info.

A random idea I came up with is, we add a new option to the
pg_basebackup to overwrite the full or some portion of the connection
string that is eventually written in the primary_conninfo in
postgresql.auto.conf. For example, the command:

pg_basebackup -D tmp -d "host=1.1.1.1 port=5555" -R
--primary-coninfo-ext "host=2.2.2.2 dbname=postgres"

will produce the connection string that is based on -d option value
but is overwritten by --primary-conninfo-ext option value, which will
be like:

host=2.2.2.2 dbname=postgres port=5555

This option might help not only for users who want to use the slotsync
worker but also for users who want to take a basebackup from a standby
but have the new standby connect to the primary.

Agreed, this could be another way though it would be good to get some
inputs from users or otherwise about the preferred way to specify
dbname. One can also imagine using the Alter System for this purpose.

But it's still just an idea and I might be missing something. And
given we're getting closer to the feature freeze, it would be a PG18
item.

+1. At this stage, it is important to discuss whether we should allow
pg_baseback to write dbname (either a specified one or a default one)
along with other parameters in primary_conninfo?

--
With Regards,
Amit Kapila.

#26shveta malik
shveta.malik@gmail.com
In reply to: Amit Kapila (#25)
1 attachment(s)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Thu, Mar 14, 2024 at 10:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Mar 14, 2024 at 5:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

This fact makes me think that the slotsync worker might be able to
accept the primary_conninfo value even if there is no dbname in the
value. That is, if there is no dbname in the primary_conninfo, it uses
the username in accordance with the specs of the connection string.
Currently, the slotsync worker connects to the local database first
and then establishes the connection to the primary server. But if we
can reverse the two steps, it can get the dbname that has actually
been used to establish the remote connection and use it for the local
connection too. That way, the primary_conninfo generated by
pg_basebackup could work even without the patch. For example, if the
OS user executing pg_basebackup is 'postgres', the slotsync worker
would connect to the postgres database. Given the 'postgres' database
is created by default and 'postgres' OS user is used in common, I
guess it could cover many cases in practice actually.

I think this is worth investigating but I suspect that in most cases
users will end up using a replication connection without specifying
the user name and we may not be able to give a meaningful error
message when slotsync worker won't be able to connect. The same will
be true even when the dbname same as the username would be used.

I attempted the change as suggested by Swada-San. Attached the PoC
patch .For it to work, I have to expose a new get api in libpq-fe
which gets dbname from stream-connection. Please have a look.

Without this PoC patch, the errors in slot-sync worker:

-----------------
a) If dbname is missing:
[1230932]: ERROR: slot synchronization requires dbname to be specified in primary_conninfo
[1230932]: ERROR: slot synchronization requires dbname to be specified in primary_conninfo
in primary_conninfo

b) If specified db does not exist
[1230913]: FATAL: database "postgres1" does not exist -----------------
[1230913]: FATAL: database "postgres1" does not exist -----------------
-----------------

Now with this patch:
-----------------
a) If the dbname same as user does not exist:
[1232473]: ERROR: could not connect to the primary server: connection to server at "127.0.0.1", port 5433 failed: FATAL: database "bckp_user" does not exist
[1232473]: ERROR: could not connect to the primary server: connection to server at "127.0.0.1", port 5433 failed: FATAL: database "bckp_user" does not exist
to server at "127.0.0.1", port 5433 failed: FATAL: database
"bckp_user" does not exist

b) If user itself is removed from primary_conninfo, libpq takes user
who has authenticated the system by default and gives error if db of
same name does not exist
ERROR: could not connect to the primary server: connection to server
at "127.0.0.1", port 5433 failed: FATAL: database "shveta" does not
exist
-----------------

The errors in second case look slightly confusing to me.

thanks
Shveta

Attachments:

v1-0001-Use-user-as-dbname-for-slot-sync.patchapplication/octet-stream; name=v1-0001-Use-user-as-dbname-for-slot-sync.patchDownload
From 4511eb68cf8471ed31f0a690b507b7fdd9ccc9e7 Mon Sep 17 00:00:00 2001
From: Shveta Malik <shveta.malik@gmail.com>
Date: Thu, 14 Mar 2024 11:20:00 +0530
Subject: [PATCH v1] Use user as dbname for slot sync

---
 .../libpqwalreceiver/libpqwalreceiver.c       | 45 ++-----------
 src/backend/replication/logical/slotsync.c    | 63 ++++++-------------
 src/backend/replication/slotfuncs.c           |  2 -
 src/include/replication/slotsync.h            |  1 -
 src/include/replication/walreceiver.h         | 12 ++--
 src/interfaces/libpq/exports.txt              |  1 +
 src/interfaces/libpq/fe-exec.c                |  9 +++
 src/interfaces/libpq/libpq-fe.h               |  1 +
 8 files changed, 42 insertions(+), 92 deletions(-)

diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 761bf0f677..9c28f956c1 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -59,7 +59,7 @@ static void libpqrcv_get_senderinfo(WalReceiverConn *conn,
 									char **sender_host, int *sender_port);
 static char *libpqrcv_identify_system(WalReceiverConn *conn,
 									  TimeLineID *primary_tli);
-static char *libpqrcv_get_dbname_from_conninfo(const char *conninfo);
+static char *libpqrcv_get_dbname_from_conn(WalReceiverConn *conn);
 static int	libpqrcv_server_version(WalReceiverConn *conn);
 static void libpqrcv_readtimelinehistoryfile(WalReceiverConn *conn,
 											 TimeLineID tli, char **filename,
@@ -102,7 +102,7 @@ static WalReceiverFunctionsType PQWalReceiverFunctions = {
 	.walrcv_send = libpqrcv_send,
 	.walrcv_create_slot = libpqrcv_create_slot,
 	.walrcv_alter_slot = libpqrcv_alter_slot,
-	.walrcv_get_dbname_from_conninfo = libpqrcv_get_dbname_from_conninfo,
+	.walrcv_get_dbname_from_conn = libpqrcv_get_dbname_from_conn,
 	.walrcv_get_backend_pid = libpqrcv_get_backend_pid,
 	.walrcv_exec = libpqrcv_exec,
 	.walrcv_disconnect = libpqrcv_disconnect
@@ -494,47 +494,12 @@ libpqrcv_server_version(WalReceiverConn *conn)
 }
 
 /*
- * Get database name from the primary server's conninfo.
- *
- * If dbname is not found in connInfo, return NULL value.
+ * Get database name from the connection.
  */
 static char *
-libpqrcv_get_dbname_from_conninfo(const char *connInfo)
+libpqrcv_get_dbname_from_conn(WalReceiverConn *conn)
 {
-	PQconninfoOption *opts;
-	char	   *dbname = NULL;
-	char	   *err = NULL;
-
-	opts = PQconninfoParse(connInfo, &err);
-	if (opts == NULL)
-	{
-		/* The error string is malloc'd, so we must free it explicitly */
-		char	   *errcopy = err ? pstrdup(err) : "out of memory";
-
-		PQfreemem(err);
-		ereport(ERROR,
-				(errcode(ERRCODE_SYNTAX_ERROR),
-				 errmsg("invalid connection string syntax: %s", errcopy)));
-	}
-
-	for (PQconninfoOption *opt = opts; opt->keyword != NULL; ++opt)
-	{
-		/*
-		 * If multiple dbnames are specified, then the last one will be
-		 * returned
-		 */
-		if (strcmp(opt->keyword, "dbname") == 0 && opt->val &&
-			*opt->val)
-		{
-			if (dbname)
-				pfree(dbname);
-
-			dbname = pstrdup(opt->val);
-		}
-	}
-
-	PQconninfoFree(opts);
-	return dbname;
+	return PQgetDbname(conn->streamConn);
 }
 
 /*
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index 5074c8409f..57c59457c8 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -891,33 +891,6 @@ validate_remote_info(WalReceiverConn *wrconn)
 		CommitTransactionCommand();
 }
 
-/*
- * Checks if dbname is specified in 'primary_conninfo'.
- *
- * Error out if not specified otherwise return it.
- */
-char *
-CheckAndGetDbnameFromConninfo(void)
-{
-	char	   *dbname;
-
-	/*
-	 * The slot synchronization needs a database connection for walrcv_exec to
-	 * work.
-	 */
-	dbname = walrcv_get_dbname_from_conninfo(PrimaryConnInfo);
-	if (dbname == NULL)
-		ereport(ERROR,
-
-		/*
-		 * translator: dbname is a specific option; %s is a GUC variable name
-		 */
-				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				errmsg("slot synchronization requires dbname to be specified in %s",
-					   "primary_conninfo"));
-	return dbname;
-}
-
 /*
  * Return true if all necessary GUCs for slot synchronization are set
  * appropriately, otherwise, return false.
@@ -1228,22 +1201,6 @@ ReplSlotSyncWorkerMain(int argc, char *argv[])
 	 */
 	SetConfigOption("search_path", "", PGC_SUSET, PGC_S_OVERRIDE);
 
-	dbname = CheckAndGetDbnameFromConninfo();
-
-	/*
-	 * Connect to the database specified by the user in primary_conninfo. We
-	 * need a database connection for walrcv_exec to work which we use to
-	 * fetch slot information from the remote node. See comments atop
-	 * libpqrcv_exec.
-	 *
-	 * We do not specify a specific user here since the slot sync worker will
-	 * operate as a superuser. This is safe because the slot sync worker does
-	 * not interact with user tables, eliminating the risk of executing
-	 * arbitrary code within triggers.
-	 */
-	InitPostgres(dbname, InvalidOid, NULL, InvalidOid, 0, NULL);
-
-	SetProcessingMode(NormalProcessing);
 
 	initStringInfo(&app_name);
 	if (cluster_name[0])
@@ -1264,6 +1221,26 @@ ReplSlotSyncWorkerMain(int argc, char *argv[])
 				errcode(ERRCODE_CONNECTION_FAILURE),
 				errmsg("could not connect to the primary server: %s", err));
 
+	/* There must be dbName initialized in wrconn */
+	dbname = walrcv_get_dbname_from_conn(wrconn);
+	Assert(dbname);
+
+	/*
+	 * Connect to the database which is used by libpq to make connection(wrconn).
+	 * We need a database connection for walrcv_exec to work which we use to
+	 * fetch slot information from the remote node. See comments atop
+	 * libpqrcv_exec.
+	 *
+	 * We do not specify a specific user here since the slot sync worker will
+	 * operate as a superuser. This is safe because the slot sync worker does
+	 * not interact with user tables, eliminating the risk of executing
+	 * arbitrary code within triggers.
+	 */
+	InitPostgres(dbname, InvalidOid, NULL, InvalidOid, 0, NULL);
+
+	SetProcessingMode(NormalProcessing);
+
+
 	/*
 	 * Register the failure callback once we have the connection.
 	 *
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index ad79e1fccd..205e790309 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -977,8 +977,6 @@ pg_sync_replication_slots(PG_FUNCTION_ARGS)
 	/* Load the libpq-specific functions */
 	load_file("libpqwalreceiver", false);
 
-	(void) CheckAndGetDbnameFromConninfo();
-
 	initStringInfo(&app_name);
 	if (cluster_name[0])
 		appendStringInfo(&app_name, "%s_slotsync", cluster_name);
diff --git a/src/include/replication/slotsync.h b/src/include/replication/slotsync.h
index dca57c5020..3f556a9623 100644
--- a/src/include/replication/slotsync.h
+++ b/src/include/replication/slotsync.h
@@ -23,7 +23,6 @@ extern PGDLLIMPORT bool sync_replication_slots;
 extern PGDLLIMPORT char *PrimaryConnInfo;
 extern PGDLLIMPORT char *PrimarySlotName;
 
-extern char *CheckAndGetDbnameFromConninfo(void);
 extern bool ValidateSlotSyncParams(int elevel);
 
 #ifdef EXEC_BACKEND
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index b906bb5ce8..6ede80c250 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -283,11 +283,11 @@ typedef char *(*walrcv_identify_system_fn) (WalReceiverConn *conn,
 											TimeLineID *primary_tli);
 
 /*
- * walrcv_get_dbname_from_conninfo_fn
+ * walrcv_get_dbname_from_conn_fn
  *
- * Returns the database name from the primary_conninfo
+ * Returns the database name from the connection.
  */
-typedef char *(*walrcv_get_dbname_from_conninfo_fn) (const char *conninfo);
+typedef char *(*walrcv_get_dbname_from_conn_fn) (WalReceiverConn *conn);
 
 /*
  * walrcv_server_version_fn
@@ -413,7 +413,7 @@ typedef struct WalReceiverFunctionsType
 	walrcv_get_conninfo_fn walrcv_get_conninfo;
 	walrcv_get_senderinfo_fn walrcv_get_senderinfo;
 	walrcv_identify_system_fn walrcv_identify_system;
-	walrcv_get_dbname_from_conninfo_fn walrcv_get_dbname_from_conninfo;
+	walrcv_get_dbname_from_conn_fn walrcv_get_dbname_from_conn;
 	walrcv_server_version_fn walrcv_server_version;
 	walrcv_readtimelinehistoryfile_fn walrcv_readtimelinehistoryfile;
 	walrcv_startstreaming_fn walrcv_startstreaming;
@@ -439,8 +439,8 @@ extern PGDLLIMPORT WalReceiverFunctionsType *WalReceiverFunctions;
 	WalReceiverFunctions->walrcv_get_senderinfo(conn, sender_host, sender_port)
 #define walrcv_identify_system(conn, primary_tli) \
 	WalReceiverFunctions->walrcv_identify_system(conn, primary_tli)
-#define walrcv_get_dbname_from_conninfo(conninfo) \
-	WalReceiverFunctions->walrcv_get_dbname_from_conninfo(conninfo)
+#define walrcv_get_dbname_from_conn(conn) \
+	WalReceiverFunctions->walrcv_get_dbname_from_conn(conn)
 #define walrcv_server_version(conn) \
 	WalReceiverFunctions->walrcv_server_version(conn)
 #define walrcv_readtimelinehistoryfile(conn, tli, filename, content, size) \
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 9fbd3d3407..e879116a0d 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -202,3 +202,4 @@ PQcancelSocket            199
 PQcancelErrorMessage      200
 PQcancelReset             201
 PQcancelFinish            202
+PQgetDbname               203
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index c02a9180b2..4b61cebdf7 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -2917,6 +2917,15 @@ PQendcopy(PGconn *conn)
 }
 
 
+char *
+PQgetDbname(PGconn *conn)
+{
+	if (!conn)
+		return 0;
+
+	return conn->dbName;
+}
+
 /* ----------------
  *		PQfn -	Send a function call to the POSTGRES backend.
  *
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 09b485bd2b..eac2984d1c 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -516,6 +516,7 @@ extern int	PQputline(PGconn *conn, const char *string);
 extern int	PQgetlineAsync(PGconn *conn, char *buffer, int bufsize);
 extern int	PQputnbytes(PGconn *conn, const char *buffer, int nbytes);
 extern int	PQendcopy(PGconn *conn);
+extern char *PQgetDbname(PGconn *conn);
 
 /* Set blocking/nonblocking connection to the backend */
 extern int	PQsetnonblocking(PGconn *conn, int arg);
-- 
2.34.1

#27Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#25)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Thu, Mar 14, 2024 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Mar 14, 2024 at 5:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

This fact makes me think that the slotsync worker might be able to
accept the primary_conninfo value even if there is no dbname in the
value. That is, if there is no dbname in the primary_conninfo, it uses
the username in accordance with the specs of the connection string.
Currently, the slotsync worker connects to the local database first
and then establishes the connection to the primary server. But if we
can reverse the two steps, it can get the dbname that has actually
been used to establish the remote connection and use it for the local
connection too. That way, the primary_conninfo generated by
pg_basebackup could work even without the patch. For example, if the
OS user executing pg_basebackup is 'postgres', the slotsync worker
would connect to the postgres database. Given the 'postgres' database
is created by default and 'postgres' OS user is used in common, I
guess it could cover many cases in practice actually.

I think this is worth investigating but I suspect that in most cases
users will end up using a replication connection without specifying
the user name and we may not be able to give a meaningful error
message when slotsync worker won't be able to connect. The same will
be true even when the dbname same as the username would be used.

What do you mean by not being able to give a meaningful error message?

If the slotsync worker uses the user name as the dbname, and such a
database doesn't exist, the error message the user will get is
"database "test_user" does not exist". ISTM the same is true when the
user specifies the wrong database in the primary_conninfo.

Having said that, even with (or without) the above change, we might
want to change the pg_basebackup so that it writes the dbname to the
primary_conninfo if -R option is specified. Since the database where
the slotsync worker connects cannot be dropped while the slotsync
worker is running, the user might want to change the database to
connect, and it would be useful if they can do that using
pg_basebackup instead of modifying the configuration file manually.

While the current approach makes sense to me, I'm a bit concerned that
we might end up having the pg_basebackup search the actual database
name (e.g. 'dbname=template1') from the .pgpass file instead of
'dbname=replication'. As far as I tested on my environment, suppose
that I execute:

pg_basebackup -D tmp -d "dbname=testdb" -R

The pg_basebackup established a replication connection but looked for
the password of the 'testdb' database. This could be another
inconvenience for the existing users who want to use the slot
synchronization.

This is true because it is internally using logical replication
connection (as we will set set replication=database).

Did you mean the pg_basebackup is using a logical replication
connection in this case? As far as I tested, even if we specify dbname
to the -d option of pg_basebackup, it uses a physical replication
connection. For example, it can take a backup even if I specify a
non-existing database name.

A random idea I came up with is, we add a new option to the
pg_basebackup to overwrite the full or some portion of the connection
string that is eventually written in the primary_conninfo in
postgresql.auto.conf. For example, the command:

pg_basebackup -D tmp -d "host=1.1.1.1 port=5555" -R
--primary-coninfo-ext "host=2.2.2.2 dbname=postgres"

will produce the connection string that is based on -d option value
but is overwritten by --primary-conninfo-ext option value, which will
be like:

host=2.2.2.2 dbname=postgres port=5555

This option might help not only for users who want to use the slotsync
worker but also for users who want to take a basebackup from a standby
but have the new standby connect to the primary.

Agreed, this could be another way though it would be good to get some
inputs from users or otherwise about the preferred way to specify
dbname. One can also imagine using the Alter System for this purpose.

Agreed.

But it's still just an idea and I might be missing something. And
given we're getting closer to the feature freeze, it would be a PG18
item.

+1. At this stage, it is important to discuss whether we should allow
pg_baseback to write dbname (either a specified one or a default one)
along with other parameters in primary_conninfo?

True. While I basically agree that pg_basebackup writes dbname in
primary_conninfo, I'm concerned that writing "dbname=replication"
could be problematic. Quoting the case 3) Vignesh summarized before:

3) ./pg_basebackup -d "user=vignesh" -D data -R
-> primary_conninfo = "dbname=replication" (In this case
primary_conninfo will have dbname as replication which is the default
value from GetConnection as connection string is specified)

The primary_conninfo generated by pg_basebackup -R is now used by
either a walreceiver (for physical replication connection) or a
slotsync worker (for normal connection). The "dbname=replication" is
okay for walreceiver. On the other hand, as for the slotsync worker,
it can pass the CheckAndGetDbnameFromConninfo() check but it's very
likely that it cannot connect to the primary since most users won't
create a database with "replication" name. The user will end up
getting an error message like 'database "replication" does not exist'
but I'm not sure it would be informative for users. Rather, the error
message "slot synchronization requires dbname to be specified in
primary_conninfo" might be more informative for users. So I personally
like to omit the dbname if "dbname=replication", at this point.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#28Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#27)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Thu, Mar 14, 2024 at 1:45 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Mar 14, 2024 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Mar 14, 2024 at 5:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

This fact makes me think that the slotsync worker might be able to
accept the primary_conninfo value even if there is no dbname in the
value. That is, if there is no dbname in the primary_conninfo, it uses
the username in accordance with the specs of the connection string.
Currently, the slotsync worker connects to the local database first
and then establishes the connection to the primary server. But if we
can reverse the two steps, it can get the dbname that has actually
been used to establish the remote connection and use it for the local
connection too. That way, the primary_conninfo generated by
pg_basebackup could work even without the patch. For example, if the
OS user executing pg_basebackup is 'postgres', the slotsync worker
would connect to the postgres database. Given the 'postgres' database
is created by default and 'postgres' OS user is used in common, I
guess it could cover many cases in practice actually.

I think this is worth investigating but I suspect that in most cases
users will end up using a replication connection without specifying
the user name and we may not be able to give a meaningful error
message when slotsync worker won't be able to connect. The same will
be true even when the dbname same as the username would be used.

What do you mean by not being able to give a meaningful error message?

If the slotsync worker uses the user name as the dbname, and such a
database doesn't exist, the error message the user will get is
"database "test_user" does not exist". ISTM the same is true when the
user specifies the wrong database in the primary_conninfo.

Right, the exact error message as mentioned by Shveta will be:
ERROR: could not connect to the primary server: connection to server
at "127.0.0.1", port 5433 failed: FATAL: database "bckp_user" does not
exist

Now, without this idea, the ERROR message will be:
ERROR: slot synchronization requires dbname to be specified in
primary_conninfo

I am not sure how much this matters but the second message sounds more useful.

Having said that, even with (or without) the above change, we might
want to change the pg_basebackup so that it writes the dbname to the
primary_conninfo if -R option is specified. Since the database where
the slotsync worker connects cannot be dropped while the slotsync
worker is running, the user might want to change the database to
connect, and it would be useful if they can do that using
pg_basebackup instead of modifying the configuration file manually.

While the current approach makes sense to me, I'm a bit concerned that
we might end up having the pg_basebackup search the actual database
name (e.g. 'dbname=template1') from the .pgpass file instead of
'dbname=replication'. As far as I tested on my environment, suppose
that I execute:

pg_basebackup -D tmp -d "dbname=testdb" -R

The pg_basebackup established a replication connection but looked for
the password of the 'testdb' database. This could be another
inconvenience for the existing users who want to use the slot
synchronization.

This is true because it is internally using logical replication
connection (as we will set set replication=database).

Did you mean the pg_basebackup is using a logical replication
connection in this case? As far as I tested, even if we specify dbname
to the -d option of pg_basebackup, it uses a physical replication
connection. For example, it can take a backup even if I specify a
non-existing database name.

You are right. I misunderstood some part of the code in GetConnection.
However, I think my point is still valid that if the user has provided
dbname in the connection string it means that she wants that database
entry to be looked upon not "replication" entry.

But it's still just an idea and I might be missing something. And
given we're getting closer to the feature freeze, it would be a PG18
item.

+1. At this stage, it is important to discuss whether we should allow
pg_baseback to write dbname (either a specified one or a default one)
along with other parameters in primary_conninfo?

True. While I basically agree that pg_basebackup writes dbname in
primary_conninfo, I'm concerned that writing "dbname=replication"
could be problematic. Quoting the case 3) Vignesh summarized before:

3) ./pg_basebackup -d "user=vignesh" -D data -R
-> primary_conninfo = "dbname=replication" (In this case
primary_conninfo will have dbname as replication which is the default
value from GetConnection as connection string is specified)

The primary_conninfo generated by pg_basebackup -R is now used by
either a walreceiver (for physical replication connection) or a
slotsync worker (for normal connection). The "dbname=replication" is
okay for walreceiver. On the other hand, as for the slotsync worker,
it can pass the CheckAndGetDbnameFromConninfo() check but it's very
likely that it cannot connect to the primary since most users won't
create a database with "replication" name. The user will end up
getting an error message like 'database "replication" does not exist'
but I'm not sure it would be informative for users. Rather, the error
message "slot synchronization requires dbname to be specified in
primary_conninfo" might be more informative for users. So I personally
like to omit the dbname if "dbname=replication", at this point.

How about if we write dbname in primary_conninfo to
postgresql.auto.conf file only when the user has explicitly specified
dbname in the connection string? To achieve this we need to somehow
pass this information via PGconn (say by having a new bool variable
dbname_specified) from GetConnection() or something like that?

--
With Regards,
Amit Kapila.

#29vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#28)
1 attachment(s)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Thu, 14 Mar 2024 at 15:49, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Mar 14, 2024 at 1:45 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Mar 14, 2024 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Mar 14, 2024 at 5:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

This fact makes me think that the slotsync worker might be able to
accept the primary_conninfo value even if there is no dbname in the
value. That is, if there is no dbname in the primary_conninfo, it uses
the username in accordance with the specs of the connection string.
Currently, the slotsync worker connects to the local database first
and then establishes the connection to the primary server. But if we
can reverse the two steps, it can get the dbname that has actually
been used to establish the remote connection and use it for the local
connection too. That way, the primary_conninfo generated by
pg_basebackup could work even without the patch. For example, if the
OS user executing pg_basebackup is 'postgres', the slotsync worker
would connect to the postgres database. Given the 'postgres' database
is created by default and 'postgres' OS user is used in common, I
guess it could cover many cases in practice actually.

I think this is worth investigating but I suspect that in most cases
users will end up using a replication connection without specifying
the user name and we may not be able to give a meaningful error
message when slotsync worker won't be able to connect. The same will
be true even when the dbname same as the username would be used.

What do you mean by not being able to give a meaningful error message?

If the slotsync worker uses the user name as the dbname, and such a
database doesn't exist, the error message the user will get is
"database "test_user" does not exist". ISTM the same is true when the
user specifies the wrong database in the primary_conninfo.

Right, the exact error message as mentioned by Shveta will be:
ERROR: could not connect to the primary server: connection to server
at "127.0.0.1", port 5433 failed: FATAL: database "bckp_user" does not
exist

Now, without this idea, the ERROR message will be:
ERROR: slot synchronization requires dbname to be specified in
primary_conninfo

I am not sure how much this matters but the second message sounds more useful.

Having said that, even with (or without) the above change, we might
want to change the pg_basebackup so that it writes the dbname to the
primary_conninfo if -R option is specified. Since the database where
the slotsync worker connects cannot be dropped while the slotsync
worker is running, the user might want to change the database to
connect, and it would be useful if they can do that using
pg_basebackup instead of modifying the configuration file manually.

While the current approach makes sense to me, I'm a bit concerned that
we might end up having the pg_basebackup search the actual database
name (e.g. 'dbname=template1') from the .pgpass file instead of
'dbname=replication'. As far as I tested on my environment, suppose
that I execute:

pg_basebackup -D tmp -d "dbname=testdb" -R

The pg_basebackup established a replication connection but looked for
the password of the 'testdb' database. This could be another
inconvenience for the existing users who want to use the slot
synchronization.

This is true because it is internally using logical replication
connection (as we will set set replication=database).

Did you mean the pg_basebackup is using a logical replication
connection in this case? As far as I tested, even if we specify dbname
to the -d option of pg_basebackup, it uses a physical replication
connection. For example, it can take a backup even if I specify a
non-existing database name.

You are right. I misunderstood some part of the code in GetConnection.
However, I think my point is still valid that if the user has provided
dbname in the connection string it means that she wants that database
entry to be looked upon not "replication" entry.

But it's still just an idea and I might be missing something. And
given we're getting closer to the feature freeze, it would be a PG18
item.

+1. At this stage, it is important to discuss whether we should allow
pg_baseback to write dbname (either a specified one or a default one)
along with other parameters in primary_conninfo?

True. While I basically agree that pg_basebackup writes dbname in
primary_conninfo, I'm concerned that writing "dbname=replication"
could be problematic. Quoting the case 3) Vignesh summarized before:

3) ./pg_basebackup -d "user=vignesh" -D data -R
-> primary_conninfo = "dbname=replication" (In this case
primary_conninfo will have dbname as replication which is the default
value from GetConnection as connection string is specified)

The primary_conninfo generated by pg_basebackup -R is now used by
either a walreceiver (for physical replication connection) or a
slotsync worker (for normal connection). The "dbname=replication" is
okay for walreceiver. On the other hand, as for the slotsync worker,
it can pass the CheckAndGetDbnameFromConninfo() check but it's very
likely that it cannot connect to the primary since most users won't
create a database with "replication" name. The user will end up
getting an error message like 'database "replication" does not exist'
but I'm not sure it would be informative for users. Rather, the error
message "slot synchronization requires dbname to be specified in
primary_conninfo" might be more informative for users. So I personally
like to omit the dbname if "dbname=replication", at this point.

How about if we write dbname in primary_conninfo to
postgresql.auto.conf file only when the user has explicitly specified
dbname in the connection string? To achieve this we need to somehow
pass this information via PGconn (say by having a new bool variable
dbname_specified) from GetConnection() or something like that?

Here is a patch which will write dbname in the primary_conninfo only
if the database name is specified explicitly. I have added a new
function GetDbnameFromConnectionString which will return the dbname
specified in the connection and GenerateRecoveryConfig will append
this database name.
Here are the test results with the patch:
case 1:
pg_basebackup -D test10 -p 5431 -X s -P -R -d "user=vignesh dbname=postgres"
primary_conninfo will have dbname=postgres

case 2:
pg_basebackup -D test10 -p 5431 -X s -P -R -d "user=vignesh dbname=replication"
primary_conninfo will have dbname=replication

case 3:
pg_basebackup -D test10 -p 5431 -X s -P -R -U vignesh
primary_conninfo will not have dbname

case 4:
pg_basebackup -D test10 -p 5431 -X s -P -R
primary_conninfo will not have dbname

case 5:
pg_basebackup -D test10 -p 5431 -X s -P -R -d "user=vignesh"
primary_conninfo will not have dbname

case 6:
pg_basebackup -D test10 -p 5431 -X s -P -R -d ""
primary_conninfo will not have dbname

Thoughts?

Regards,
Vignesh

Attachments:

pg_basebackup-write-dbname.v0005.patchapplication/x-patch; name=pg_basebackup-write-dbname.v0005.patchDownload
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 88c689e725..903da73df7 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -243,7 +243,10 @@ PostgreSQL documentation
         The <filename>postgresql.auto.conf</filename> file will record the connection
         settings and, if specified, the replication slot
         that <application>pg_basebackup</application> is using, so that
-        streaming replication will use the same settings later on.
+        streaming replication or <link linkend="logicaldecoding-replication-slots-synchronization">
+        logical replication slot synchronization</link> will use the same
+        settings later on. The dbname will be recorded only if the dbname was
+        specified explicitly in the connection string.
        </para>
 
       </listitem>
@@ -807,7 +810,9 @@ PostgreSQL documentation
         client applications, but because <application>pg_basebackup</application>
         doesn't connect to any particular database in the cluster, any database
         name in the connection string will be ignored
-        by <productname>PostgreSQL</productname>. Middleware, or proxies, used in
+        by <productname>PostgreSQL</productname>. Middleware, proxies, or
+        <link linkend="logicaldecoding-replication-slots-synchronization">
+        logical replication slot synchronization</link> used in
         connecting to <productname>PostgreSQL</productname> might however
         utilize the value.
        </para>
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 3a9940097c..8880e0bef6 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1810,7 +1810,8 @@ BaseBackup(char *compression_algorithm, char *compression_detail,
 	 * Build contents of configuration file if requested
 	 */
 	if (writerecoveryconf)
-		recoveryconfcontents = GenerateRecoveryConfig(conn, replication_slot);
+		recoveryconfcontents = GenerateRecoveryConfig(conn, replication_slot,
+													  GetDbnameFromConnectionString());
 
 	/*
 	 * Run IDENTIFY_SYSTEM so we can get the timeline
diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c
index 56d1b15951..a88b660c8a 100644
--- a/src/bin/pg_basebackup/streamutil.c
+++ b/src/bin/pg_basebackup/streamutil.c
@@ -267,6 +267,38 @@ GetConnection(void)
 	return tmpconn;
 }
 
+/*
+ * GetDbnameFromConnectionString
+ *
+ * If database is specified in the connection string, then return the last
+ * database name specified. If database is not specified in the connection
+ * string, then return NULL.
+ */
+char *
+GetDbnameFromConnectionString(void)
+{
+	PQconninfoOption *conn_opts = NULL;
+	PQconninfoOption *conn_opt;
+	char	   *err_msg = NULL;
+	char	   *dbname = NULL;
+
+	if (connection_string)
+	{
+		conn_opts = PQconninfoParse(connection_string, &err_msg);
+		if (conn_opts == NULL)
+			pg_fatal("%s", err_msg);
+
+		for (conn_opt = conn_opts; conn_opt->keyword != NULL; conn_opt++)
+		{
+			if ((strcmp(conn_opt->keyword, "dbname") == 0) &&
+				conn_opt->val != NULL && conn_opt->val[0] != '\0')
+				dbname = conn_opt->val;
+		}
+	}
+
+	return dbname;
+}
+
 /*
  * From version 10, explicitly set wal segment size using SHOW wal_segment_size
  * since ControlFile is not accessible here.
diff --git a/src/bin/pg_basebackup/streamutil.h b/src/bin/pg_basebackup/streamutil.h
index 7a3dd98da3..1ade59247a 100644
--- a/src/bin/pg_basebackup/streamutil.h
+++ b/src/bin/pg_basebackup/streamutil.h
@@ -31,6 +31,8 @@ extern PGconn *conn;
 
 extern PGconn *GetConnection(void);
 
+extern char *GetDbnameFromConnectionString(void);
+
 /* Replication commands */
 extern bool CreateReplicationSlot(PGconn *conn, const char *slot_name,
 								  const char *plugin, bool is_temporary,
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index bde90bf60b..8449ae78ef 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -451,7 +451,7 @@ main(int argc, char **argv)
 		pg_log_info("no rewind required");
 		if (writerecoveryconf && !dry_run)
 			WriteRecoveryConfig(conn, datadir_target,
-								GenerateRecoveryConfig(conn, NULL));
+								GenerateRecoveryConfig(conn, NULL, NULL));
 		exit(0);
 	}
 
@@ -525,7 +525,7 @@ main(int argc, char **argv)
 	/* Also update the standby configuration, if requested. */
 	if (writerecoveryconf && !dry_run)
 		WriteRecoveryConfig(conn, datadir_target,
-							GenerateRecoveryConfig(conn, NULL));
+							GenerateRecoveryConfig(conn, NULL, NULL));
 
 	/* don't need the source connection anymore */
 	source->destroy(source);
diff --git a/src/fe_utils/recovery_gen.c b/src/fe_utils/recovery_gen.c
index 2585f11939..0b5d9abc48 100644
--- a/src/fe_utils/recovery_gen.c
+++ b/src/fe_utils/recovery_gen.c
@@ -20,7 +20,7 @@ static char *escape_quotes(const char *src);
  * return it.
  */
 PQExpBuffer
-GenerateRecoveryConfig(PGconn *pgconn, char *replication_slot)
+GenerateRecoveryConfig(PGconn *pgconn, char *replication_slot, char *dbname)
 {
 	PQconninfoOption *connOptions;
 	PQExpBufferData conninfo_buf;
@@ -66,6 +66,20 @@ GenerateRecoveryConfig(PGconn *pgconn, char *replication_slot)
 		appendPQExpBuffer(&conninfo_buf, "%s=", opt->keyword);
 		appendConnStrVal(&conninfo_buf, opt->val);
 	}
+
+	if (dbname)
+	{
+		/*
+		 * If dbname is specified in the connection, append the dbname. This
+		 * will be used later for logical replication slot synchronization.
+		 */
+		if (conninfo_buf.len != 0)
+			appendPQExpBufferChar(&conninfo_buf, ' ');
+
+		appendPQExpBuffer(&conninfo_buf, "%s=", "dbname");
+		appendConnStrVal(&conninfo_buf, dbname);
+	}
+
 	if (PQExpBufferDataBroken(conninfo_buf))
 		pg_fatal("out of memory");
 
diff --git a/src/include/fe_utils/recovery_gen.h b/src/include/fe_utils/recovery_gen.h
index ca2c4800d0..5a986297a5 100644
--- a/src/include/fe_utils/recovery_gen.h
+++ b/src/include/fe_utils/recovery_gen.h
@@ -21,7 +21,8 @@
 #define MINIMUM_VERSION_FOR_RECOVERY_GUC 120000
 
 extern PQExpBuffer GenerateRecoveryConfig(PGconn *pgconn,
-										  char *replication_slot);
+										  char *replication_slot,
+										  char *dbname);
 extern void WriteRecoveryConfig(PGconn *pgconn, char *target_dir,
 								PQExpBuffer contents);
 
#30Masahiko Sawada
sawada.mshk@gmail.com
In reply to: vignesh C (#29)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Thu, Mar 14, 2024 at 11:46 PM vignesh C <vignesh21@gmail.com> wrote:

On Thu, 14 Mar 2024 at 15:49, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Mar 14, 2024 at 1:45 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Mar 14, 2024 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Mar 14, 2024 at 5:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

This fact makes me think that the slotsync worker might be able to
accept the primary_conninfo value even if there is no dbname in the
value. That is, if there is no dbname in the primary_conninfo, it uses
the username in accordance with the specs of the connection string.
Currently, the slotsync worker connects to the local database first
and then establishes the connection to the primary server. But if we
can reverse the two steps, it can get the dbname that has actually
been used to establish the remote connection and use it for the local
connection too. That way, the primary_conninfo generated by
pg_basebackup could work even without the patch. For example, if the
OS user executing pg_basebackup is 'postgres', the slotsync worker
would connect to the postgres database. Given the 'postgres' database
is created by default and 'postgres' OS user is used in common, I
guess it could cover many cases in practice actually.

I think this is worth investigating but I suspect that in most cases
users will end up using a replication connection without specifying
the user name and we may not be able to give a meaningful error
message when slotsync worker won't be able to connect. The same will
be true even when the dbname same as the username would be used.

What do you mean by not being able to give a meaningful error message?

If the slotsync worker uses the user name as the dbname, and such a
database doesn't exist, the error message the user will get is
"database "test_user" does not exist". ISTM the same is true when the
user specifies the wrong database in the primary_conninfo.

Right, the exact error message as mentioned by Shveta will be:
ERROR: could not connect to the primary server: connection to server
at "127.0.0.1", port 5433 failed: FATAL: database "bckp_user" does not
exist

Now, without this idea, the ERROR message will be:
ERROR: slot synchronization requires dbname to be specified in
primary_conninfo

I am not sure how much this matters but the second message sounds more useful.

Having said that, even with (or without) the above change, we might
want to change the pg_basebackup so that it writes the dbname to the
primary_conninfo if -R option is specified. Since the database where
the slotsync worker connects cannot be dropped while the slotsync
worker is running, the user might want to change the database to
connect, and it would be useful if they can do that using
pg_basebackup instead of modifying the configuration file manually.

While the current approach makes sense to me, I'm a bit concerned that
we might end up having the pg_basebackup search the actual database
name (e.g. 'dbname=template1') from the .pgpass file instead of
'dbname=replication'. As far as I tested on my environment, suppose
that I execute:

pg_basebackup -D tmp -d "dbname=testdb" -R

The pg_basebackup established a replication connection but looked for
the password of the 'testdb' database. This could be another
inconvenience for the existing users who want to use the slot
synchronization.

This is true because it is internally using logical replication
connection (as we will set set replication=database).

Did you mean the pg_basebackup is using a logical replication
connection in this case? As far as I tested, even if we specify dbname
to the -d option of pg_basebackup, it uses a physical replication
connection. For example, it can take a backup even if I specify a
non-existing database name.

You are right. I misunderstood some part of the code in GetConnection.
However, I think my point is still valid that if the user has provided
dbname in the connection string it means that she wants that database
entry to be looked upon not "replication" entry.

But it's still just an idea and I might be missing something. And
given we're getting closer to the feature freeze, it would be a PG18
item.

+1. At this stage, it is important to discuss whether we should allow
pg_baseback to write dbname (either a specified one or a default one)
along with other parameters in primary_conninfo?

True. While I basically agree that pg_basebackup writes dbname in
primary_conninfo, I'm concerned that writing "dbname=replication"
could be problematic. Quoting the case 3) Vignesh summarized before:

3) ./pg_basebackup -d "user=vignesh" -D data -R
-> primary_conninfo = "dbname=replication" (In this case
primary_conninfo will have dbname as replication which is the default
value from GetConnection as connection string is specified)

The primary_conninfo generated by pg_basebackup -R is now used by
either a walreceiver (for physical replication connection) or a
slotsync worker (for normal connection). The "dbname=replication" is
okay for walreceiver. On the other hand, as for the slotsync worker,
it can pass the CheckAndGetDbnameFromConninfo() check but it's very
likely that it cannot connect to the primary since most users won't
create a database with "replication" name. The user will end up
getting an error message like 'database "replication" does not exist'
but I'm not sure it would be informative for users. Rather, the error
message "slot synchronization requires dbname to be specified in
primary_conninfo" might be more informative for users. So I personally
like to omit the dbname if "dbname=replication", at this point.

How about if we write dbname in primary_conninfo to
postgresql.auto.conf file only when the user has explicitly specified
dbname in the connection string? To achieve this we need to somehow
pass this information via PGconn (say by having a new bool variable
dbname_specified) from GetConnection() or something like that?

Here is a patch which will write dbname in the primary_conninfo only
if the database name is specified explicitly. I have added a new
function GetDbnameFromConnectionString which will return the dbname
specified in the connection and GenerateRecoveryConfig will append
this database name.
Here are the test results with the patch:
case 1:
pg_basebackup -D test10 -p 5431 -X s -P -R -d "user=vignesh dbname=postgres"
primary_conninfo will have dbname=postgres

case 2:
pg_basebackup -D test10 -p 5431 -X s -P -R -d "user=vignesh dbname=replication"
primary_conninfo will have dbname=replication

case 3:
pg_basebackup -D test10 -p 5431 -X s -P -R -U vignesh
primary_conninfo will not have dbname

case 4:
pg_basebackup -D test10 -p 5431 -X s -P -R
primary_conninfo will not have dbname

case 5:
pg_basebackup -D test10 -p 5431 -X s -P -R -d "user=vignesh"
primary_conninfo will not have dbname

case 6:
pg_basebackup -D test10 -p 5431 -X s -P -R -d ""
primary_conninfo will not have dbname

Thoughts?

Thank you for updating the patch!

This behavior makes sense to me. But do we want to handle the case of
using environment variables too? IIUC,

pg_basebackup -D tmp -d "user=masahiko dbname=test_db"

is equivalent to:

PGDATABASE="user=masahiko dbname=test_db" pg_basebackup -D tmp

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#31Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Masahiko Sawada (#30)
1 attachment(s)
RE: Have pg_basebackup write "dbname" in "primary_conninfo"?

Dear Sawada-san,

Thanks for giving comments!

This behavior makes sense to me. But do we want to handle the case of
using environment variables too?

Yeah, v5 does not consider which libpq parameters are specified by environment
variables. Such a variable should be used when the dbname is not expressly written
in the connection string.
Such a path was added in the v6 patch. If the dbname is not determined after
parsing the connection string, we call PQconndefaults() to get settings from
environment variables and service files [1]https://www.postgresql.org/docs/devel/libpq-pgservice.html, then start to search dbname again.
Below shows an example.

```
PGPORT=5431 PGUSER=kuroda PGDATABASE=postgres pg_basebackup -D data_N2 -R -v
->
primary_conninfo = 'user=kuroda ... port=5431 ... dbname=postgres ... '
```

IIUC,

pg_basebackup -D tmp -d "user=masahiko dbname=test_db"

is equivalent to:

PGDATABASE="user=masahiko dbname=test_db" pg_basebackup -D tmp

The case won't work. I think You assumed that expanded_dbname like
PQconnectdbParams() [2]https://www.postgresql.org/docs/devel/libpq-connect.html#LIBPQ-PQCONNECTDBPARAMS can be used for enviroment variables, but it is not correct
- it won't parse as connection string again.

In the libpq layer, connection parameters are parsed in PQconnectStartParams()->conninfo_array_parse().
When expand_dbname is specified, the entry "dbname" is firstly checked and
parsed its value. They are done at fe-connect.c:5846.

The environment variables are checked and parsed in conninfo_add_defaults(), which
is called from conninfo_array_parse(). However, it is done at fe-connect.c:5956 - the
expand_dbname has already been done at that time. This means there is no chance
that PGDATABASE is parsed as an expanded style.

For example, if the pg_basebackup runs like below:

PGDATABASE="user=kuroda dbname=postgres" pg_basebackup -D data_N2 -R -v

The primary_conninfo written in the file will be:

primary_conninfo = 'user=hayato ... dbname=''user=kuroda dbname=postgres'''

[1]: https://www.postgresql.org/docs/devel/libpq-pgservice.html
[2]: https://www.postgresql.org/docs/devel/libpq-connect.html#LIBPQ-PQCONNECTDBPARAMS

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/

Attachments:

pg_basebackup-write-dbname.v0006.patchapplication/octet-stream; name=pg_basebackup-write-dbname.v0006.patchDownload
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 88c689e725..903da73df7 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -243,7 +243,10 @@ PostgreSQL documentation
         The <filename>postgresql.auto.conf</filename> file will record the connection
         settings and, if specified, the replication slot
         that <application>pg_basebackup</application> is using, so that
-        streaming replication will use the same settings later on.
+        streaming replication or <link linkend="logicaldecoding-replication-slots-synchronization">
+        logical replication slot synchronization</link> will use the same
+        settings later on. The dbname will be recorded only if the dbname was
+        specified explicitly in the connection string.
        </para>
 
       </listitem>
@@ -807,7 +810,9 @@ PostgreSQL documentation
         client applications, but because <application>pg_basebackup</application>
         doesn't connect to any particular database in the cluster, any database
         name in the connection string will be ignored
-        by <productname>PostgreSQL</productname>. Middleware, or proxies, used in
+        by <productname>PostgreSQL</productname>. Middleware, proxies, or
+        <link linkend="logicaldecoding-replication-slots-synchronization">
+        logical replication slot synchronization</link> used in
         connecting to <productname>PostgreSQL</productname> might however
         utilize the value.
        </para>
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 3a9940097c..2826af1252 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1807,10 +1807,15 @@ BaseBackup(char *compression_algorithm, char *compression_detail,
 	}
 
 	/*
-	 * Build contents of configuration file if requested
+	 * Build contents of configuration file if requested.
+	 *
+	 * The dbname dbname is either of 1) the value written in the connection
+	 * string, 2) specified as PGDATABASE or 3) the value written in the
+	 * service file.
 	 */
 	if (writerecoveryconf)
-		recoveryconfcontents = GenerateRecoveryConfig(conn, replication_slot);
+		recoveryconfcontents = GenerateRecoveryConfig(conn, replication_slot,
+													  GetDbnameFromConnectionStringOrDefaults());
 
 	/*
 	 * Run IDENTIFY_SYSTEM so we can get the timeline
diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c
index 56d1b15951..733eb89d49 100644
--- a/src/bin/pg_basebackup/streamutil.c
+++ b/src/bin/pg_basebackup/streamutil.c
@@ -34,6 +34,8 @@
 int			WalSegSz;
 
 static bool RetrieveDataDirCreatePerm(PGconn *conn);
+static void FindDbnameFromPQconninfoOptions(PQconninfoOption *conn_opts,
+											char **dbname);
 
 /* SHOW command for replication connection was introduced in version 10 */
 #define MINIMUM_VERSION_FOR_SHOW_CMD 100000
@@ -267,6 +269,71 @@ GetConnection(void)
 	return tmpconn;
 }
 
+/*
+ * FindDbnameFromPQconninfoOptions
+ *
+ * This is a helper function for GetDbnameFromConnectionStringOrDefaults().
+ * Read all parameters in PQconninfoOption and extract the dbname to the
+ * argument.
+ */
+static void
+FindDbnameFromPQconninfoOptions(PQconninfoOption *conn_opts, char **dbname)
+{
+	Assert(dbname != NULL);
+
+	for (PQconninfoOption *conn_opt = conn_opts; conn_opt->keyword != NULL;
+		 conn_opt++)
+	{
+		if ((strcmp(conn_opt->keyword, "dbname") == 0) &&
+			conn_opt->val != NULL && conn_opt->val[0] != '\0')
+			*dbname = pg_strdup(conn_opt->val);
+	}
+}
+
+/*
+ * GetDbnameFromConnectionStringOrDefaults
+ *
+ * If database is specified either in the connection string (higher priority)
+ * or enviroment variables, then return the last database name specified.
+ * If database is not specified, then return NULL.
+ */
+char *
+GetDbnameFromConnectionStringOrDefaults(void)
+{
+	PQconninfoOption *conn_opts = NULL;
+	char	   *err_msg = NULL;
+	char	   *dbname = NULL;
+
+	/* Parse the connection string first because this has higher priority */
+	if (connection_string)
+	{
+		conn_opts = PQconninfoParse(connection_string, &err_msg);
+		if (conn_opts == NULL)
+			pg_fatal("%s", err_msg);
+
+		FindDbnameFromPQconninfoOptions(conn_opts, &dbname);
+	}
+
+	/* Skip till the cleanup phase when found */
+	if (dbname)
+		goto cleanup;
+
+	/*
+	 * If we reach here, the connection string does not contain the dbname.
+	 * Let's check default settings, especially environment variables and the
+	 * service file.
+	 */
+	conn_opts = PQconndefaults();
+	if (conn_opts == NULL)
+		pg_fatal("%s", err_msg);
+
+	FindDbnameFromPQconninfoOptions(conn_opts, &dbname);
+
+cleanup:
+	PQconninfoFree(conn_opts);
+	return dbname;
+}
+
 /*
  * From version 10, explicitly set wal segment size using SHOW wal_segment_size
  * since ControlFile is not accessible here.
diff --git a/src/bin/pg_basebackup/streamutil.h b/src/bin/pg_basebackup/streamutil.h
index 7a3dd98da3..406c0070b2 100644
--- a/src/bin/pg_basebackup/streamutil.h
+++ b/src/bin/pg_basebackup/streamutil.h
@@ -31,6 +31,8 @@ extern PGconn *conn;
 
 extern PGconn *GetConnection(void);
 
+extern char *GetDbnameFromConnectionStringOrDefaults(void);
+
 /* Replication commands */
 extern bool CreateReplicationSlot(PGconn *conn, const char *slot_name,
 								  const char *plugin, bool is_temporary,
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index bde90bf60b..8449ae78ef 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -451,7 +451,7 @@ main(int argc, char **argv)
 		pg_log_info("no rewind required");
 		if (writerecoveryconf && !dry_run)
 			WriteRecoveryConfig(conn, datadir_target,
-								GenerateRecoveryConfig(conn, NULL));
+								GenerateRecoveryConfig(conn, NULL, NULL));
 		exit(0);
 	}
 
@@ -525,7 +525,7 @@ main(int argc, char **argv)
 	/* Also update the standby configuration, if requested. */
 	if (writerecoveryconf && !dry_run)
 		WriteRecoveryConfig(conn, datadir_target,
-							GenerateRecoveryConfig(conn, NULL));
+							GenerateRecoveryConfig(conn, NULL, NULL));
 
 	/* don't need the source connection anymore */
 	source->destroy(source);
diff --git a/src/fe_utils/recovery_gen.c b/src/fe_utils/recovery_gen.c
index 63c78c986c..5904f0840a 100644
--- a/src/fe_utils/recovery_gen.c
+++ b/src/fe_utils/recovery_gen.c
@@ -18,9 +18,15 @@ static char *escape_quotes(const char *src);
 /*
  * Write recovery configuration contents into a fresh PQExpBuffer, and
  * return it.
+ *
+ * This can also accept the dbname, which should be written in the
+ * primary_conninfo. The walreciever process will ignore this value, but it is
+ * essential for the slotsync worker. It connects to the specified database on
+ * the primary server.
  */
 PQExpBuffer
-GenerateRecoveryConfig(PGconn *pgconn, const char *replication_slot)
+GenerateRecoveryConfig(PGconn *pgconn, const char *replication_slot,
+					   char *dbname)
 {
 	PQconninfoOption *connOptions;
 	PQExpBufferData conninfo_buf;
@@ -66,6 +72,20 @@ GenerateRecoveryConfig(PGconn *pgconn, const char *replication_slot)
 		appendPQExpBuffer(&conninfo_buf, "%s=", opt->keyword);
 		appendConnStrVal(&conninfo_buf, opt->val);
 	}
+
+	if (dbname)
+	{
+		/*
+		 * If dbname is specified in the connection, append the dbname. This
+		 * will be used later for logical replication slot synchronization.
+		 */
+		if (conninfo_buf.len != 0)
+			appendPQExpBufferChar(&conninfo_buf, ' ');
+
+		appendPQExpBuffer(&conninfo_buf, "%s=", "dbname");
+		appendConnStrVal(&conninfo_buf, dbname);
+	}
+
 	if (PQExpBufferDataBroken(conninfo_buf))
 		pg_fatal("out of memory");
 
diff --git a/src/include/fe_utils/recovery_gen.h b/src/include/fe_utils/recovery_gen.h
index f1b760604b..73c1aa8e59 100644
--- a/src/include/fe_utils/recovery_gen.h
+++ b/src/include/fe_utils/recovery_gen.h
@@ -21,7 +21,8 @@
 #define MINIMUM_VERSION_FOR_RECOVERY_GUC 120000
 
 extern PQExpBuffer GenerateRecoveryConfig(PGconn *pgconn,
-										  const char *replication_slot);
+										  const char *replication_slot,
+										  char *dbname);
 extern void WriteRecoveryConfig(PGconn *pgconn, const char *target_dir,
 								PQExpBuffer contents);
 
#32vignesh C
vignesh21@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#31)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Tue, 19 Mar 2024 at 17:18, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Dear Sawada-san,

Thanks for giving comments!

This behavior makes sense to me. But do we want to handle the case of
using environment variables too?

Yeah, v5 does not consider which libpq parameters are specified by environment
variables. Such a variable should be used when the dbname is not expressly written
in the connection string.
Such a path was added in the v6 patch. If the dbname is not determined after
parsing the connection string, we call PQconndefaults() to get settings from
environment variables and service files [1], then start to search dbname again.
Below shows an example.

```
PGPORT=5431 PGUSER=kuroda PGDATABASE=postgres pg_basebackup -D data_N2 -R -v
->
primary_conninfo = 'user=kuroda ... port=5431 ... dbname=postgres ... '
```

IIUC,

pg_basebackup -D tmp -d "user=masahiko dbname=test_db"

is equivalent to:

PGDATABASE="user=masahiko dbname=test_db" pg_basebackup -D tmp

The case won't work. I think You assumed that expanded_dbname like
PQconnectdbParams() [2] can be used for enviroment variables, but it is not correct
- it won't parse as connection string again.

In the libpq layer, connection parameters are parsed in PQconnectStartParams()->conninfo_array_parse().
When expand_dbname is specified, the entry "dbname" is firstly checked and
parsed its value. They are done at fe-connect.c:5846.

The environment variables are checked and parsed in conninfo_add_defaults(), which
is called from conninfo_array_parse(). However, it is done at fe-connect.c:5956 - the
expand_dbname has already been done at that time. This means there is no chance
that PGDATABASE is parsed as an expanded style.

For example, if the pg_basebackup runs like below:

PGDATABASE="user=kuroda dbname=postgres" pg_basebackup -D data_N2 -R -v

The primary_conninfo written in the file will be:

primary_conninfo = 'user=hayato ... dbname=''user=kuroda dbname=postgres'''

Thanks for the patch.
Here are the test results for various tests by specifying connection
string, environment variable, service file, and connection URIs with
the patch:
case 1:
pg_basebackup -D test10 -p 5431 -X s -P -R -d "user=vignesh dbname=db1"
primary_conninfo will have dbname=db1

case 2:
pg_basebackup -D test10 -p 5431 -X s -P -R -d "user=vignesh dbname=replication"
primary_conninfo will have dbname=replication

case 3:
pg_basebackup -D test10 -p 5431 -X s -P -R -U vignesh
primary_conninfo will not have dbname

case 4:
pg_basebackup -D test10 -p 5431 -X s -P -R
primary_conninfo will not have dbname

case 5:
pg_basebackup -D test10 -p 5431 -X s -P -R -d "user=vignesh"
primary_conninfo will not have dbname

case 6:
pg_basebackup -D test10 -p 5431 -X s -P -R -d ""
primary_conninfo will not have dbname

--- Testing through PGDATABASE environment variable
case 7:
export PGDATABASE="user=postgres dbname=test"
./pg_basebackup -D test10 -p 5431 -X s -P  -R
primary_conninfo will have dbname=''user=postgres dbname=test'' like below:
primary_conninfo = 'user=vignesh passfile=''/home/vignesh/.pgpass''
channel_binding=prefer port=5431 sslmode=prefer sslcompression=0
sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2
gssencmode=disable krbsrvname=postgres gssdelegation=0
target_session_attrs=any load_balance_hosts=disable
dbname=''user=postgres dbname=test'''

case 8:
export PGDATABASE=db1
./pg_basebackup -D test10 -p 5431 -X s -P -R
primary_conninfo will have dbname=db1

--- Testing through pg_service
case 9:
Create .pg_service.conf with the following info:
[conn1]
dbname=db2

export PGSERVICE=conn1

./pg_basebackup -D test10 -p 5431 -X s -P -R
primary_conninfo will have dbname=db2

case 10:
Create .pg_service.conf with the following info, i.e. there is no
database specified:
[conn1]

./pg_basebackup -D test10 -p 5431 -X s -P -R
primary_conninfo will not have dbname

--- Testing through Connection URIs
case 11:
./pg_basebackup -D test10  -X s -P  -R -d "postgresql://localhost:5431"
primary_conninfo will not have dbname

case 12:
./pg_basebackup -D test10 -p 5431 -X s -P -R -d
"postgresql://localhost/db3:5431"
primary_conninfo will have dbname=''db3:5431'' like below:
primary_conninfo = 'user=vignesh passfile=''/home/vignesh/.pgpass''
channel_binding=prefer host=localhost port=5431 sslmode=prefer
sslcompression=0 sslcertmode=allow sslsni=1
ssl_min_protocol_version=TLSv1.2 gssencmode=disable
krbsrvname=postgres gssdelegation=0 target_session_attrs=any
load_balance_hosts=disable dbname=''db3:5431'''

case 13:
./pg_basebackup -D test10 -p 5431 -X s -P -R -d "postgresql://localhost/db3"
primary_conninfo will have dbname=db3

case 14:
./pg_basebackup -D test10 -X s -P -R -d "postgresql://localhost:5431/db3"
primary_conninfo will have dbname=db3

case 15:
./pg_basebackup -D test10 -X s -P -R -d
"postgresql://localhost:5431/db4,127.0.0.1:5431/db5"
primary_conninfo will have dbname=''db4,127.0.0.1:5431/db5'' like below:
primary_conninfo = 'user=vignesh passfile=''/home/vignesh/.pgpass''
channel_binding=prefer host=localhost port=5431 sslmode=prefer
sslcompression=0 sslcertmode=allow sslsni=1
ssl_min_protocol_version=TLSv1.2 gssencmode=disable
krbsrvname=postgres gssdelegation=0 target_session_attrs=any
load_balance_hosts=disable dbname=''db4,127.0.0.1:5431/db5'''

case 16:
./pg_basebackup -D test10 -X s -P -R -d
"postgresql://localhost:5431,127.0.0.1:5431/db5"
primary_conninfo will have dbname=db5

case 17:
./pg_basebackup -D test10 -X s -P -R -d
"postgresql:///db6?host=localhost&port=5431"
primary_conninfo will have dbname=db6

case 18:
./pg_basebackup -D test10 -p 5431 -X s -P -R -d
"postgresql:///db7?host=/home/vignesh/postgres/inst/bin"
primary_conninfo will have dbname=db7

case 19:
./pg_basebackup -D test10 -p 5431 -X s -P -R -d
"postgresql:///db8?host=%2Fhome%2Fvignesh%2Fpostgres%2Finst%2Fbin"
primary_conninfo will have dbname=db8

In these cases, the database name specified will be written to the
conf file. The test results look good to me.

Regards,
Vignesh

#33Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#31)
1 attachment(s)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Tue, Mar 19, 2024 at 5:18 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Thanks for giving comments!

This behavior makes sense to me. But do we want to handle the case of
using environment variables too?

Yeah, v5 does not consider which libpq parameters are specified by environment
variables. Such a variable should be used when the dbname is not expressly written
in the connection string.
Such a path was added in the v6 patch. If the dbname is not determined after
parsing the connection string, we call PQconndefaults() to get settings from
environment variables and service files [1], then start to search dbname again.

The functionality implemented by the patch looks good to me. I have
made minor modifications in the function names, error handling,
comments, and doc updates in the attached patch. Let me know what you
think of the attached.

--
With Regards,
Amit Kapila.

Attachments:

pg_basebackup-write-dbname.v0007.patchapplication/octet-stream; name=pg_basebackup-write-dbname.v0007.patchDownload
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 88c689e725..1a09904b81 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -243,7 +243,11 @@ PostgreSQL documentation
         The <filename>postgresql.auto.conf</filename> file will record the connection
         settings and, if specified, the replication slot
         that <application>pg_basebackup</application> is using, so that
-        streaming replication will use the same settings later on.
+        streaming replication and <link linkend="logicaldecoding-replication-slots-synchronization">
+        logical replication slot synchronization</link> will use the same
+        settings later on. The dbname will be recorded only if the dbname was
+        specified explicitly in the connection string or environment variable
+        (see <xref linkend="libpq-envars"/>).
        </para>
 
       </listitem>
@@ -809,7 +813,9 @@ PostgreSQL documentation
         name in the connection string will be ignored
         by <productname>PostgreSQL</productname>. Middleware, or proxies, used in
         connecting to <productname>PostgreSQL</productname> might however
-        utilize the value.
+        utilize the value. The database name specified in connection string can
+        also be used by <link linkend="logicaldecoding-replication-slots-synchronization">
+        logical replication slot synchronization</link>.
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 3a9940097c..8f3dd04fd2 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1807,10 +1807,18 @@ BaseBackup(char *compression_algorithm, char *compression_detail,
 	}
 
 	/*
-	 * Build contents of configuration file if requested
+	 * Build contents of configuration file if requested.
+	 *
+	 * Note that we don't use the dbname from key-value pair in conn as that
+	 * would have been filled by the default dbname (dbname=replication) in
+	 * case the user didn't specify the one. The dbname written in the config
+	 * file as part of primary_conninfo would be used by slotsync worker which
+	 * doesn't use a replication connection so the default won't work for it.
 	 */
 	if (writerecoveryconf)
-		recoveryconfcontents = GenerateRecoveryConfig(conn, replication_slot);
+		recoveryconfcontents = GenerateRecoveryConfig(conn,
+													  replication_slot,
+													  GetDbnameFromConnectionOptions());
 
 	/*
 	 * Run IDENTIFY_SYSTEM so we can get the timeline
diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c
index 56d1b15951..43c01fb1ca 100644
--- a/src/bin/pg_basebackup/streamutil.c
+++ b/src/bin/pg_basebackup/streamutil.c
@@ -34,6 +34,7 @@
 int			WalSegSz;
 
 static bool RetrieveDataDirCreatePerm(PGconn *conn);
+static void FindDbnameInConnParams(PQconninfoOption *conn_opts, char **dbname);
 
 /* SHOW command for replication connection was introduced in version 10 */
 #define MINIMUM_VERSION_FOR_SHOW_CMD 100000
@@ -267,6 +268,74 @@ GetConnection(void)
 	return tmpconn;
 }
 
+/*
+ * FindDbnameInConnParams
+ *
+ * This is a helper function for GetDbnameFromConnectionOptions(). Extract
+ * the value of dbname from PQconninfoOption parameters.
+ */
+static void
+FindDbnameInConnParams(PQconninfoOption *conn_opts, char **dbname)
+{
+	PQconninfoOption	*conn_opt;
+	Assert(dbname != NULL);
+
+	for (conn_opt = conn_opts; conn_opt->keyword != NULL; conn_opt++)
+	{
+		if ((strcmp(conn_opt->keyword, "dbname") == 0) &&
+			conn_opt->val != NULL && conn_opt->val[0] != '\0')
+			*dbname = pg_strdup(conn_opt->val);
+	}
+}
+
+/*
+ * GetDbnameFromConnectionOptions
+ *
+ * This is a special purpose function to retrieve the dbname from either the
+ * connection_string specified by the user or from the environment variables.
+ *
+ * We follow GetConnection() to fetch the dbname from various connection
+ * options.
+ *
+ * Returns NULL, if dbname is not specified by the user in the above
+ * mentioned connection options.
+ */
+char *
+GetDbnameFromConnectionOptions(void)
+{
+	PQconninfoOption *conn_opts = NULL;
+	char	   *err_msg = NULL;
+	char	   *dbname = NULL;
+
+	/* First try to get the dbname from connection string. */
+	if (connection_string)
+	{
+		conn_opts = PQconninfoParse(connection_string, &err_msg);
+		if (conn_opts == NULL)
+			pg_fatal("%s", err_msg);
+
+		FindDbnameInConnParams(conn_opts, &dbname);
+		if (dbname)
+		{
+			PQconninfoFree(conn_opts);
+			return dbname;
+		}
+	}
+
+	/*
+	 * Next try to get the dbname from default values that are available from the
+	 * environment.
+	 */
+	conn_opts = PQconndefaults();
+	if (conn_opts == NULL)
+		pg_fatal("out of memory");
+
+	FindDbnameInConnParams(conn_opts, &dbname);
+
+	PQconninfoFree(conn_opts);
+	return dbname;
+}
+
 /*
  * From version 10, explicitly set wal segment size using SHOW wal_segment_size
  * since ControlFile is not accessible here.
diff --git a/src/bin/pg_basebackup/streamutil.h b/src/bin/pg_basebackup/streamutil.h
index 7a3dd98da3..9b38e8c0f3 100644
--- a/src/bin/pg_basebackup/streamutil.h
+++ b/src/bin/pg_basebackup/streamutil.h
@@ -31,6 +31,8 @@ extern PGconn *conn;
 
 extern PGconn *GetConnection(void);
 
+extern char *GetDbnameFromConnectionOptions(void);
+
 /* Replication commands */
 extern bool CreateReplicationSlot(PGconn *conn, const char *slot_name,
 								  const char *plugin, bool is_temporary,
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index bde90bf60b..8449ae78ef 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -451,7 +451,7 @@ main(int argc, char **argv)
 		pg_log_info("no rewind required");
 		if (writerecoveryconf && !dry_run)
 			WriteRecoveryConfig(conn, datadir_target,
-								GenerateRecoveryConfig(conn, NULL));
+								GenerateRecoveryConfig(conn, NULL, NULL));
 		exit(0);
 	}
 
@@ -525,7 +525,7 @@ main(int argc, char **argv)
 	/* Also update the standby configuration, if requested. */
 	if (writerecoveryconf && !dry_run)
 		WriteRecoveryConfig(conn, datadir_target,
-							GenerateRecoveryConfig(conn, NULL));
+							GenerateRecoveryConfig(conn, NULL, NULL));
 
 	/* don't need the source connection anymore */
 	source->destroy(source);
diff --git a/src/fe_utils/recovery_gen.c b/src/fe_utils/recovery_gen.c
index 63c78c986c..733982a82f 100644
--- a/src/fe_utils/recovery_gen.c
+++ b/src/fe_utils/recovery_gen.c
@@ -18,9 +18,14 @@ static char *escape_quotes(const char *src);
 /*
  * Write recovery configuration contents into a fresh PQExpBuffer, and
  * return it.
+ *
+ * This accepts the dbname which will be appended to the primary_conninfo.
+ * The dbname will be ignored by walreciever process but slotsync worker uses
+ * it to connect to the primary server.
  */
 PQExpBuffer
-GenerateRecoveryConfig(PGconn *pgconn, const char *replication_slot)
+GenerateRecoveryConfig(PGconn *pgconn, const char *replication_slot,
+					   char *dbname)
 {
 	PQconninfoOption *connOptions;
 	PQExpBufferData conninfo_buf;
@@ -66,6 +71,20 @@ GenerateRecoveryConfig(PGconn *pgconn, const char *replication_slot)
 		appendPQExpBuffer(&conninfo_buf, "%s=", opt->keyword);
 		appendConnStrVal(&conninfo_buf, opt->val);
 	}
+
+	if (dbname)
+	{
+		/*
+		 * If dbname is specified in the connection, append the dbname. This
+		 * will be used later for logical replication slot synchronization.
+		 */
+		if (conninfo_buf.len != 0)
+			appendPQExpBufferChar(&conninfo_buf, ' ');
+
+		appendPQExpBuffer(&conninfo_buf, "%s=", "dbname");
+		appendConnStrVal(&conninfo_buf, dbname);
+	}
+
 	if (PQExpBufferDataBroken(conninfo_buf))
 		pg_fatal("out of memory");
 
diff --git a/src/include/fe_utils/recovery_gen.h b/src/include/fe_utils/recovery_gen.h
index f1b760604b..73c1aa8e59 100644
--- a/src/include/fe_utils/recovery_gen.h
+++ b/src/include/fe_utils/recovery_gen.h
@@ -21,7 +21,8 @@
 #define MINIMUM_VERSION_FOR_RECOVERY_GUC 120000
 
 extern PQExpBuffer GenerateRecoveryConfig(PGconn *pgconn,
-										  const char *replication_slot);
+										  const char *replication_slot,
+										  char *dbname);
 extern void WriteRecoveryConfig(PGconn *pgconn, const char *target_dir,
 								PQExpBuffer contents);
 
#34Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#31)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Tue, Mar 19, 2024 at 8:48 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Dear Sawada-san,

Thanks for giving comments!

This behavior makes sense to me. But do we want to handle the case of
using environment variables too?

Yeah, v5 does not consider which libpq parameters are specified by environment
variables. Such a variable should be used when the dbname is not expressly written
in the connection string.
Such a path was added in the v6 patch. If the dbname is not determined after
parsing the connection string, we call PQconndefaults() to get settings from
environment variables and service files [1], then start to search dbname again.
Below shows an example.

Thank you for updating the patch!

```
PGPORT=5431 PGUSER=kuroda PGDATABASE=postgres pg_basebackup -D data_N2 -R -v
->
primary_conninfo = 'user=kuroda ... port=5431 ... dbname=postgres ... '
```

IIUC,

pg_basebackup -D tmp -d "user=masahiko dbname=test_db"

is equivalent to:

PGDATABASE="user=masahiko dbname=test_db" pg_basebackup -D tmp

The case won't work. I think You assumed that expanded_dbname like
PQconnectdbParams() [2] can be used for enviroment variables, but it is not correct
- it won't parse as connection string again.

In the libpq layer, connection parameters are parsed in PQconnectStartParams()->conninfo_array_parse().
When expand_dbname is specified, the entry "dbname" is firstly checked and
parsed its value. They are done at fe-connect.c:5846.

The environment variables are checked and parsed in conninfo_add_defaults(), which
is called from conninfo_array_parse(). However, it is done at fe-connect.c:5956 - the
expand_dbname has already been done at that time. This means there is no chance
that PGDATABASE is parsed as an expanded style.

Thank you for pointing it out. I tested the use of PGDATABASE with
pg_basebackup and somehow missed the fact you explained.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#35Masahiko Sawada
sawada.mshk@gmail.com
In reply to: vignesh C (#32)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Wed, Mar 20, 2024 at 2:24 PM vignesh C <vignesh21@gmail.com> wrote:

On Tue, 19 Mar 2024 at 17:18, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Dear Sawada-san,

Thanks for giving comments!

This behavior makes sense to me. But do we want to handle the case of
using environment variables too?

Yeah, v5 does not consider which libpq parameters are specified by environment
variables. Such a variable should be used when the dbname is not expressly written
in the connection string.
Such a path was added in the v6 patch. If the dbname is not determined after
parsing the connection string, we call PQconndefaults() to get settings from
environment variables and service files [1], then start to search dbname again.
Below shows an example.

```
PGPORT=5431 PGUSER=kuroda PGDATABASE=postgres pg_basebackup -D data_N2 -R -v
->
primary_conninfo = 'user=kuroda ... port=5431 ... dbname=postgres ... '
```

IIUC,

pg_basebackup -D tmp -d "user=masahiko dbname=test_db"

is equivalent to:

PGDATABASE="user=masahiko dbname=test_db" pg_basebackup -D tmp

The case won't work. I think You assumed that expanded_dbname like
PQconnectdbParams() [2] can be used for enviroment variables, but it is not correct
- it won't parse as connection string again.

In the libpq layer, connection parameters are parsed in PQconnectStartParams()->conninfo_array_parse().
When expand_dbname is specified, the entry "dbname" is firstly checked and
parsed its value. They are done at fe-connect.c:5846.

The environment variables are checked and parsed in conninfo_add_defaults(), which
is called from conninfo_array_parse(). However, it is done at fe-connect.c:5956 - the
expand_dbname has already been done at that time. This means there is no chance
that PGDATABASE is parsed as an expanded style.

For example, if the pg_basebackup runs like below:

PGDATABASE="user=kuroda dbname=postgres" pg_basebackup -D data_N2 -R -v

The primary_conninfo written in the file will be:

primary_conninfo = 'user=hayato ... dbname=''user=kuroda dbname=postgres'''

Thanks for the patch.
Here are the test results for various tests by specifying connection
string, environment variable, service file, and connection URIs with
the patch:
case 1:
pg_basebackup -D test10 -p 5431 -X s -P -R -d "user=vignesh dbname=db1"
primary_conninfo will have dbname=db1

case 2:
pg_basebackup -D test10 -p 5431 -X s -P -R -d "user=vignesh dbname=replication"
primary_conninfo will have dbname=replication

case 3:
pg_basebackup -D test10 -p 5431 -X s -P -R -U vignesh
primary_conninfo will not have dbname

case 4:
pg_basebackup -D test10 -p 5431 -X s -P -R
primary_conninfo will not have dbname

case 5:
pg_basebackup -D test10 -p 5431 -X s -P -R -d "user=vignesh"
primary_conninfo will not have dbname

case 6:
pg_basebackup -D test10 -p 5431 -X s -P -R -d ""
primary_conninfo will not have dbname

--- Testing through PGDATABASE environment variable
case 7:
export PGDATABASE="user=postgres dbname=test"
./pg_basebackup -D test10 -p 5431 -X s -P  -R
primary_conninfo will have dbname=''user=postgres dbname=test'' like below:
primary_conninfo = 'user=vignesh passfile=''/home/vignesh/.pgpass''
channel_binding=prefer port=5431 sslmode=prefer sslcompression=0
sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2
gssencmode=disable krbsrvname=postgres gssdelegation=0
target_session_attrs=any load_balance_hosts=disable
dbname=''user=postgres dbname=test'''

case 8:
export PGDATABASE=db1
./pg_basebackup -D test10 -p 5431 -X s -P -R
primary_conninfo will have dbname=db1

--- Testing through pg_service
case 9:
Create .pg_service.conf with the following info:
[conn1]
dbname=db2

export PGSERVICE=conn1

./pg_basebackup -D test10 -p 5431 -X s -P -R
primary_conninfo will have dbname=db2

case 10:
Create .pg_service.conf with the following info, i.e. there is no
database specified:
[conn1]

./pg_basebackup -D test10 -p 5431 -X s -P -R
primary_conninfo will not have dbname

--- Testing through Connection URIs
case 11:
./pg_basebackup -D test10  -X s -P  -R -d "postgresql://localhost:5431"
primary_conninfo will not have dbname

case 12:
./pg_basebackup -D test10 -p 5431 -X s -P -R -d
"postgresql://localhost/db3:5431"
primary_conninfo will have dbname=''db3:5431'' like below:
primary_conninfo = 'user=vignesh passfile=''/home/vignesh/.pgpass''
channel_binding=prefer host=localhost port=5431 sslmode=prefer
sslcompression=0 sslcertmode=allow sslsni=1
ssl_min_protocol_version=TLSv1.2 gssencmode=disable
krbsrvname=postgres gssdelegation=0 target_session_attrs=any
load_balance_hosts=disable dbname=''db3:5431'''

case 13:
./pg_basebackup -D test10 -p 5431 -X s -P -R -d "postgresql://localhost/db3"
primary_conninfo will have dbname=db3

case 14:
./pg_basebackup -D test10 -X s -P -R -d "postgresql://localhost:5431/db3"
primary_conninfo will have dbname=db3

case 15:
./pg_basebackup -D test10 -X s -P -R -d
"postgresql://localhost:5431/db4,127.0.0.1:5431/db5"
primary_conninfo will have dbname=''db4,127.0.0.1:5431/db5'' like below:
primary_conninfo = 'user=vignesh passfile=''/home/vignesh/.pgpass''
channel_binding=prefer host=localhost port=5431 sslmode=prefer
sslcompression=0 sslcertmode=allow sslsni=1
ssl_min_protocol_version=TLSv1.2 gssencmode=disable
krbsrvname=postgres gssdelegation=0 target_session_attrs=any
load_balance_hosts=disable dbname=''db4,127.0.0.1:5431/db5'''

case 16:
./pg_basebackup -D test10 -X s -P -R -d
"postgresql://localhost:5431,127.0.0.1:5431/db5"
primary_conninfo will have dbname=db5

case 17:
./pg_basebackup -D test10 -X s -P -R -d
"postgresql:///db6?host=localhost&port=5431"
primary_conninfo will have dbname=db6

case 18:
./pg_basebackup -D test10 -p 5431 -X s -P -R -d
"postgresql:///db7?host=/home/vignesh/postgres/inst/bin"
primary_conninfo will have dbname=db7

case 19:
./pg_basebackup -D test10 -p 5431 -X s -P -R -d
"postgresql:///db8?host=%2Fhome%2Fvignesh%2Fpostgres%2Finst%2Fbin"
primary_conninfo will have dbname=db8

In these cases, the database name specified will be written to the
conf file. The test results look good to me.

Thank you for the tests! These results look good to me too.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#36vignesh C
vignesh21@gmail.com
In reply to: Amit Kapila (#33)
1 attachment(s)
Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

On Wed, 20 Mar 2024 at 17:09, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Mar 19, 2024 at 5:18 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

Thanks for giving comments!

This behavior makes sense to me. But do we want to handle the case of
using environment variables too?

Yeah, v5 does not consider which libpq parameters are specified by environment
variables. Such a variable should be used when the dbname is not expressly written
in the connection string.
Such a path was added in the v6 patch. If the dbname is not determined after
parsing the connection string, we call PQconndefaults() to get settings from
environment variables and service files [1], then start to search dbname again.

The functionality implemented by the patch looks good to me. I have
made minor modifications in the function names, error handling,
comments, and doc updates in the attached patch. Let me know what you
think of the attached.

While reviewing, I found the following changes could be done:
a) we can add one test in 010_pg_basebackup.pl to verify the change
b) Here two different styles of linking is used in the document, we
can try to keep it same:
+        streaming replication and <link
linkend="logicaldecoding-replication-slots-synchronization">
+        logical replication slot synchronization</link> will use the same
+        settings later on. The dbname will be recorded only if the dbname was
+        specified explicitly in the connection string or environment variable
+        (see <xref linkend="libpq-envars"/>).

The updated patch has the changes for the same.

Regards,
Vignesh

Attachments:

pg_basebackup-write-dbname.v0008.patchtext/x-patch; charset=US-ASCII; name=pg_basebackup-write-dbname.v0008.patchDownload
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 88c689e725..208530f393 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -243,7 +243,11 @@ PostgreSQL documentation
         The <filename>postgresql.auto.conf</filename> file will record the connection
         settings and, if specified, the replication slot
         that <application>pg_basebackup</application> is using, so that
-        streaming replication will use the same settings later on.
+        streaming replication and <link linkend="logicaldecoding-replication-slots-synchronization">
+        logical replication slot synchronization</link> will use the same
+        settings later on. The dbname will be recorded only if the dbname was
+        specified explicitly in the connection string or <link linkend="libpq-envars">
+        environment variable</link>.
        </para>
 
       </listitem>
@@ -809,7 +813,9 @@ PostgreSQL documentation
         name in the connection string will be ignored
         by <productname>PostgreSQL</productname>. Middleware, or proxies, used in
         connecting to <productname>PostgreSQL</productname> might however
-        utilize the value.
+        utilize the value. The database name specified in connection string can
+        also be used by <link linkend="logicaldecoding-replication-slots-synchronization">
+        logical replication slot synchronization</link>.
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 3a9940097c..8f3dd04fd2 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1807,10 +1807,18 @@ BaseBackup(char *compression_algorithm, char *compression_detail,
 	}
 
 	/*
-	 * Build contents of configuration file if requested
+	 * Build contents of configuration file if requested.
+	 *
+	 * Note that we don't use the dbname from key-value pair in conn as that
+	 * would have been filled by the default dbname (dbname=replication) in
+	 * case the user didn't specify the one. The dbname written in the config
+	 * file as part of primary_conninfo would be used by slotsync worker which
+	 * doesn't use a replication connection so the default won't work for it.
 	 */
 	if (writerecoveryconf)
-		recoveryconfcontents = GenerateRecoveryConfig(conn, replication_slot);
+		recoveryconfcontents = GenerateRecoveryConfig(conn,
+													  replication_slot,
+													  GetDbnameFromConnectionOptions());
 
 	/*
 	 * Run IDENTIFY_SYSTEM so we can get the timeline
diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c
index 56d1b15951..43c01fb1ca 100644
--- a/src/bin/pg_basebackup/streamutil.c
+++ b/src/bin/pg_basebackup/streamutil.c
@@ -34,6 +34,7 @@
 int			WalSegSz;
 
 static bool RetrieveDataDirCreatePerm(PGconn *conn);
+static void FindDbnameInConnParams(PQconninfoOption *conn_opts, char **dbname);
 
 /* SHOW command for replication connection was introduced in version 10 */
 #define MINIMUM_VERSION_FOR_SHOW_CMD 100000
@@ -267,6 +268,74 @@ GetConnection(void)
 	return tmpconn;
 }
 
+/*
+ * FindDbnameInConnParams
+ *
+ * This is a helper function for GetDbnameFromConnectionOptions(). Extract
+ * the value of dbname from PQconninfoOption parameters.
+ */
+static void
+FindDbnameInConnParams(PQconninfoOption *conn_opts, char **dbname)
+{
+	PQconninfoOption	*conn_opt;
+	Assert(dbname != NULL);
+
+	for (conn_opt = conn_opts; conn_opt->keyword != NULL; conn_opt++)
+	{
+		if ((strcmp(conn_opt->keyword, "dbname") == 0) &&
+			conn_opt->val != NULL && conn_opt->val[0] != '\0')
+			*dbname = pg_strdup(conn_opt->val);
+	}
+}
+
+/*
+ * GetDbnameFromConnectionOptions
+ *
+ * This is a special purpose function to retrieve the dbname from either the
+ * connection_string specified by the user or from the environment variables.
+ *
+ * We follow GetConnection() to fetch the dbname from various connection
+ * options.
+ *
+ * Returns NULL, if dbname is not specified by the user in the above
+ * mentioned connection options.
+ */
+char *
+GetDbnameFromConnectionOptions(void)
+{
+	PQconninfoOption *conn_opts = NULL;
+	char	   *err_msg = NULL;
+	char	   *dbname = NULL;
+
+	/* First try to get the dbname from connection string. */
+	if (connection_string)
+	{
+		conn_opts = PQconninfoParse(connection_string, &err_msg);
+		if (conn_opts == NULL)
+			pg_fatal("%s", err_msg);
+
+		FindDbnameInConnParams(conn_opts, &dbname);
+		if (dbname)
+		{
+			PQconninfoFree(conn_opts);
+			return dbname;
+		}
+	}
+
+	/*
+	 * Next try to get the dbname from default values that are available from the
+	 * environment.
+	 */
+	conn_opts = PQconndefaults();
+	if (conn_opts == NULL)
+		pg_fatal("out of memory");
+
+	FindDbnameInConnParams(conn_opts, &dbname);
+
+	PQconninfoFree(conn_opts);
+	return dbname;
+}
+
 /*
  * From version 10, explicitly set wal segment size using SHOW wal_segment_size
  * since ControlFile is not accessible here.
diff --git a/src/bin/pg_basebackup/streamutil.h b/src/bin/pg_basebackup/streamutil.h
index 7a3dd98da3..9b38e8c0f3 100644
--- a/src/bin/pg_basebackup/streamutil.h
+++ b/src/bin/pg_basebackup/streamutil.h
@@ -31,6 +31,8 @@ extern PGconn *conn;
 
 extern PGconn *GetConnection(void);
 
+extern char *GetDbnameFromConnectionOptions(void);
+
 /* Replication commands */
 extern bool CreateReplicationSlot(PGconn *conn, const char *slot_name,
 								  const char *plugin, bool is_temporary,
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 490a9822f0..63f7bd2735 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -783,6 +783,19 @@ my $checksum = $node->safe_psql('postgres', 'SHOW data_checksums;');
 is($checksum, 'on', 'checksums are enabled');
 rmtree("$tempdir/backupxs_sl_R");
 
+$node->command_ok(
+	[
+		@pg_basebackup_defs, '-D', "$tempdir/backup_dbname_R", '-X',
+		'stream', '-d', "dbname=db1", '-R',
+	],
+	'pg_basebackup with dbname and -R runs');
+like(
+	slurp_file("$tempdir/backup_dbname_R/postgresql.auto.conf"),
+	qr/dbname=db1/m,
+	'recovery conf file sets dbname');
+
+rmtree("$tempdir/backup_dbname_R");
+
 # create tables to corrupt and get their relfilenodes
 my $file_corrupt1 = $node->safe_psql('postgres',
 	q{CREATE TABLE corrupt1 AS SELECT a FROM generate_series(1,10000) AS a; ALTER TABLE corrupt1 SET (autovacuum_enabled=false); SELECT pg_relation_filepath('corrupt1')}
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index bde90bf60b..8449ae78ef 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -451,7 +451,7 @@ main(int argc, char **argv)
 		pg_log_info("no rewind required");
 		if (writerecoveryconf && !dry_run)
 			WriteRecoveryConfig(conn, datadir_target,
-								GenerateRecoveryConfig(conn, NULL));
+								GenerateRecoveryConfig(conn, NULL, NULL));
 		exit(0);
 	}
 
@@ -525,7 +525,7 @@ main(int argc, char **argv)
 	/* Also update the standby configuration, if requested. */
 	if (writerecoveryconf && !dry_run)
 		WriteRecoveryConfig(conn, datadir_target,
-							GenerateRecoveryConfig(conn, NULL));
+							GenerateRecoveryConfig(conn, NULL, NULL));
 
 	/* don't need the source connection anymore */
 	source->destroy(source);
diff --git a/src/fe_utils/recovery_gen.c b/src/fe_utils/recovery_gen.c
index 63c78c986c..733982a82f 100644
--- a/src/fe_utils/recovery_gen.c
+++ b/src/fe_utils/recovery_gen.c
@@ -18,9 +18,14 @@ static char *escape_quotes(const char *src);
 /*
  * Write recovery configuration contents into a fresh PQExpBuffer, and
  * return it.
+ *
+ * This accepts the dbname which will be appended to the primary_conninfo.
+ * The dbname will be ignored by walreciever process but slotsync worker uses
+ * it to connect to the primary server.
  */
 PQExpBuffer
-GenerateRecoveryConfig(PGconn *pgconn, const char *replication_slot)
+GenerateRecoveryConfig(PGconn *pgconn, const char *replication_slot,
+					   char *dbname)
 {
 	PQconninfoOption *connOptions;
 	PQExpBufferData conninfo_buf;
@@ -66,6 +71,20 @@ GenerateRecoveryConfig(PGconn *pgconn, const char *replication_slot)
 		appendPQExpBuffer(&conninfo_buf, "%s=", opt->keyword);
 		appendConnStrVal(&conninfo_buf, opt->val);
 	}
+
+	if (dbname)
+	{
+		/*
+		 * If dbname is specified in the connection, append the dbname. This
+		 * will be used later for logical replication slot synchronization.
+		 */
+		if (conninfo_buf.len != 0)
+			appendPQExpBufferChar(&conninfo_buf, ' ');
+
+		appendPQExpBuffer(&conninfo_buf, "%s=", "dbname");
+		appendConnStrVal(&conninfo_buf, dbname);
+	}
+
 	if (PQExpBufferDataBroken(conninfo_buf))
 		pg_fatal("out of memory");
 
diff --git a/src/include/fe_utils/recovery_gen.h b/src/include/fe_utils/recovery_gen.h
index f1b760604b..73c1aa8e59 100644
--- a/src/include/fe_utils/recovery_gen.h
+++ b/src/include/fe_utils/recovery_gen.h
@@ -21,7 +21,8 @@
 #define MINIMUM_VERSION_FOR_RECOVERY_GUC 120000
 
 extern PQExpBuffer GenerateRecoveryConfig(PGconn *pgconn,
-										  const char *replication_slot);
+										  const char *replication_slot,
+										  char *dbname);
 extern void WriteRecoveryConfig(PGconn *pgconn, const char *target_dir,
 								PQExpBuffer contents);